# What is the “Random Oracle Model” and why is it controversial?

This answer has three parts: **uniform random functions**, **cryptosystems built out of hash functions**, and **random oracle proofs**.

**Uniform random functions.** A die roll has a probability distribution on the possible outcomes {1, 2, 3, 4, 5, 6}. The outcomes all have equal probability 1/6 when it is a fair die roll, in which case we call the distribution uniform. We can also have a uniform distribution on coin tosses {heads, tails}, and a uniform distribution on sock colors {red, blue, green, teal-with-mauve-trim, …}, and so on, for any finite set of possible outcomes.

We can also have a uniform distribution on $t$-bit-to-$h$-bit functions $H\colon \{0,1\}^t \to \{0,1\}^h$. This space of functions is a finite set: you can write down a finite truth table for every bit of the $h$-bit output in terms of the $t$ bits of input, so there are exactly $(2^h)^{2^t}$ such functions; in the uniform distribution, each one has equal probability $1/(2^h)^{2^t}$.

One way to choose such a function uniformly at random is to wander through the Library of Babel and pick a book with $2^t$ pages, each of which has an $h$-bit string on it, so that the content of page $x$ is $H(x)$. Another way is to trap a gnome in a box with a coin and an empty book of $2^t$ pages; thus enslaved, when you ask the gnome for an input $x$, the gnome consults page $x$ in the book, and if it's empty, flips the coin $h$ times and writes down the outcome. Another way is to just flip a coin yourself $h 2^t$ times and write down a gigantic truth table.

However you choose a function $H$ uniformly at random (whether by randomly browsing a library like a civilized being, or by enslaving a gnome like a savage beast), for any particular function $f\colon \{0,1\}^t \to \{0,1\}^h$, the probability $\Pr[H = f]$ of getting that function is $1/(2^h)^{2^t}$. Another way to put this is that for any particular input $x$ and output $y$, $\Pr[H(x) = y] = 1/2^h$, and the value at each distinct input is independent, so $\Pr[H(x_1) = y_1, \dots, H(x_\ell) = y_\ell] = 1/2^{h\ell}$ if $(x_1, \dots, x_\ell)$ are all distinct. **This property makes the model of uniform random functions easy to reason about.**
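For very small parameters the counting claims above can be checked by brute force. The following is a minimal sketch, assuming toy sizes $t = 2$ and $h = 2$ (chosen only so that the whole function space fits in memory); the helper name `all_functions` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

def all_functions(t, h):
    """List every function {0,1}^t -> {0,1}^h as a truth table of 2**t outputs."""
    return list(product(range(2**h), repeat=2**t))

t, h = 2, 2                      # toy sizes, an assumption for illustration
tables = all_functions(t, h)
count = len(tables)              # should equal (2^h)^(2^t)

# Pr[H(x) = y] for one fixed input x and output y, with H uniform over all tables.
x, y = 3, 1
pr_single = Fraction(sum(1 for T in tables if T[x] == y), count)

# Joint probability at two distinct inputs: the values are independent.
pr_joint = Fraction(sum(1 for T in tables if T[0] == 2 and T[3] == 1), count)

print(count, pr_single, pr_joint)
```

The same enumeration is hopeless for realistic $t$ and $h$, which is why the gnome fills in the book lazily.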

**Cryptosystems built out of hash functions.** Some cryptosystems are defined in terms of a hash function. For example, RSA-FDH (Full Domain Hash) uses a hash function $H$ for public-key signatures:

- A public key is a large integer $n$.
- A signature on a message $m$ is an integer $s$ such that $$s^3 \equiv H(m) \pmod n.$$
- To make a signature, the signer, who knows the secret solution $d$ to the equation $$3 d \equiv 1 \pmod{\lambda(n)},$$ computes $$s := H(m)^d \bmod n.$$

The use of a hash in signatures is crucial for security, as Rabin first observed in 1979[1]: if we instead used the signature equation $s^3 \equiv m \pmod n$, then anyone could immediately forge the signature 0 on the message 0, or take two message/signature pairs $(m_0, s_0)$ and $(m_1, s_1)$ to forge a third $(m_0 m_1 \bmod n, s_0 s_1 \bmod n)$, or forge a signature $\sqrt[3]{m}$ on any integer cube $m$, etc.
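These hash-free forgeries are easy to check on a toy key. The sketch below assumes the tiny modulus $n = 11 \cdot 17 = 187$ with $\lambda(n) = 80$ (far too small for real use) and hypothetical helper names `sign_nohash` and `verify_nohash`; it needs Python 3.8 or later for the modular inverse via `pow`.

```python
n = 11 * 17                  # toy modulus 187, for illustration only
lam = 80                     # lcm(11 - 1, 17 - 1)
d = pow(3, -1, lam)          # secret exponent with 3*d ≡ 1 (mod lam)

def sign_nohash(m):
    """Signature equation s^3 ≡ m (mod n), with no hash involved."""
    return pow(m, d, n)

def verify_nohash(m, s):
    return pow(s, 3, n) == m % n

# Two legitimately signed messages...
m0, m1 = 5, 7
s0, s1 = sign_nohash(m0), sign_nohash(m1)

# ...yield a third valid pair that the signer never produced.
forged_product = ((m0 * m1) % n, (s0 * s1) % n)

# Forgeries that need no signatures at all.
forged_zero = (0, 0)
forged_cube = (8, 2)         # 8 is an integer cube, and 2**3 == 8

print(verify_nohash(*forged_product), verify_nohash(*forged_zero), verify_nohash(*forged_cube))
```

None of these tricks survive once $m$ is replaced by $H(m)$, unless the forger can find messages whose hashes have the needed multiplicative structure.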

The formulas are written in terms of $H$, so you can write a procedure that computes the various parts of the cryptosystem with $H$ as a parameter alongside all the others:

```
def modexp(b, e, n):
    return pow(b, e, n)  # modular exponentiation: b**e mod n

def sign(H, n, d, m):
    s = modexp(H(m), d, n)
    return s

def verify(H, n, m, s):
    return modexp(s, 3, n) == H(m)
```

What properties do we require of $H$? Typically some combination of preimage resistance, collision resistance, etc. For a uniform random function, the expected cost of finding a preimage or finding a collision is high. We could imagine enslaving a gnome in a box, and using `sign(gnomebox, n, d, m)` and `verify(gnomebox, n, m, s)`:

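```
import secrets

h = 2047   # output length in bits (an example value)
book = {}  # the gnome's book: input -> output, filled in lazily

def random(k):
    return secrets.randbelow(k)  # uniform random integer in [0, k)

def gnomebox(m):
    if m not in book:
        book[m] = random(2**h)   # flip the coin h times
    return book[m]
```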

However, for this cryptosystem to be useful, we need everyone to agree on the same function, so we need everyone to share the same gnome. Sharing gnomes is not a scalable way to do commerce over the internet, which is the only reason capitalism doesn't choose to rely on this particular kind of slavery to concentrate wealth.

Instead, when we actually use this cryptosystem, we agree to pass, say, SHAKE128-2047 as $H$, when we choose $n$ to be 2048 bits long: `s = sign(shake128_2047, n, d, m)`, `verify(shake128_2047, n, m, s)`.
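One possible way to write `shake128_2047` with Python's `hashlib` is sketched below. The encoding is an assumption, not a standard: it takes 2048 bits of SHAKE128 output, reads them as a big-endian integer, and drops one bit so the result is below $2^{2047}$ and therefore below any 2048-bit modulus $n$.

```python
import hashlib

def shake128_2047(m: bytes) -> int:
    """Hash a byte string to an integer below 2**2047 (assumed encoding)."""
    digest = hashlib.shake_128(m).digest(256)       # 256 bytes = 2048 bits of output
    return int.from_bytes(digest, "big") >> 1       # keep 2047 bits

value = shake128_2047(b"attack at dawn")
print(value.bit_length() <= 2047)
```

Unlike the gnome, this function needs no shared state: anyone who evaluates it on the same message gets the same integer.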

When we use a particular hash function like SHAKE128 together with particular fancy math like $s^e \equiv H(m) \pmod n$, the hash function could in principle interact with the fancy math in a way that destroys security, but the hash function we choose has been studied for many years to gain confidence that it has no useful properties other than being cheap to evaluate, and even if it did turn out to have bad interactions or bad properties (say because we used SHAKE128 but the fancy math internally uses the inverse of the Keccak permutation for some reason, or because we used MD5 as $H$) we could swap in a different hash function.

If we make a bad choice of hash function, there might be easy attacks that depend on the choice of hash function, like a way to compute $H(m \mathbin\| m')$ given $H(m)$ but not $m$ and thereby forge hashes of messages with unknown prefixes, or like finding MD5 collisions and thereby disrupting Iran's nuclear program. **But there might also be attacks that don't depend on the choice of hash function. Can we say anything in general about the rest of the cryptosystem?**

**Random oracle proofs.** To gain confidence that forging signatures is hard, we show that a forger can be used as a subroutine to solve the RSA problem and invert $x \mapsto x^3 \bmod n$ for uniform random $x$. We suppose that solving the RSA problem is hard; consequently, if a forger can be used to solve the RSA problem, forgery can't be much easier than solving the RSA problem.

Specifically, we give the forger access to $H$, the public key, and a signing oracle which returns the signature on any message of the forger's choice:

```
def forge(H, n, S):
    ... S(m0) ... S(m1) ...
    return (m, s)
```

Here we would obviously pass `lambda m: sign(H, n, d, m)` as $S$; the point is that the forger is only allowed to call the signing oracle $S$, but is not allowed to inspect it or to see what the private key $d$ is.

The forger is successful if, given `(m, s) = forge(H, n, S)`, the resulting message and signature pair pass `verify(H, n, m, s)`, subject to the restriction that $m$ was not passed to the signing oracle $S$. (Otherwise, the forger could win by asking $S$ for a signature on a message and returning that, which would not impress anyone as a method of forgery.) Obviously, a forger might win by guessing a signature at random, which has a very small but nonzero probability of success.
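The winning condition can be written down as a small experiment. This is only a sketch: the name `forgery_game`, the toy key $n = 187$, $d = 27$, and the two toy forgers are assumptions made for illustration, and the hash is deliberately the identity function so that one of the hash-free forgeries from Rabin's observation succeeds.

```python
def forgery_game(forge, H, n, d):
    """Run one forgery experiment and report whether the forger wins."""
    asked = set()                        # messages sent to the signing oracle

    def S(m):
        asked.add(m)
        return pow(H(m), d, n)           # same computation as sign(H, n, d, m)

    m, s = forge(H, n, S)
    valid = pow(s, 3, n) == H(m)         # same check as verify(H, n, m, s)
    return valid and m not in asked      # replayed signatures do not count

# Toy parameters (assumptions): n = 11 * 17 and 3 * 27 ≡ 1 (mod 80).
n, d = 187, 27

def identity(m):
    return m                             # "no hash at all"

def replay_forger(H, n, S):
    return 5, S(5)                       # valid signature, but on a queried message

def cube_forger(H, n, S):
    return 8, 2                          # 2**3 == 8, no oracle query needed

replay_wins = forgery_game(replay_forger, identity, n, d)
cube_wins = forgery_game(cube_forger, identity, n, d)
print(replay_wins, cube_wins)
```

The forger only ever sees `S` as a callable, which matches the rule that it may query the signing oracle but never inspect $d$.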

**Given such a forger, we will show how to compute cube roots modulo $n$ with comparable success probability**: specifically, a cube root procedure `cbrt` that uses `forge` as a subroutine and wins if `modexp(cbrt(n, y), 3, n) == y`. Let's assume that the forger makes at most $q$ queries to the hashing oracle $H$ or the signing oracle $S$.

**We will make our own specially crafted hashing and signing oracles for the forger to use: they will be specially crafted to let us extract an RSA problem solution, but the hashing oracle we construct still has uniform distribution, and the signing oracle we construct still produces valid signatures for the cryptosystem instantiated with the specially crafted hashing oracle.**

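```
def cbrt(n, y):
    j = random(q)    # Guess at which one the forger will invert.
    i = [0]          # Mutable counter.
    ms = {}          # Maps message we have seen to index i.
    hs = {}          # Maps index i to the image we gave out for it.
    ys = {}          # Maps image with a known cube root to index i.
    xs = {}          # Maps index i to preimage of H0 (None at index j).
    def H0(m):
        if m in ms:
            return hs[ms[m]]    # Repeated query: answer consistently.
        ms[m] = i[0]
        if i[0] == j:
            xi = None           # Unknown: this is the cube root we want.
            yi = y
        else:
            xi = random(n)
            yi = modexp(xi, 3, n)
            ys[yi] = i[0]
        xs[i[0]] = xi
        hs[i[0]] = yi
        i[0] += 1
        return yi
    def S0(m):
        H0(m)                   # Make sure m has an index.
        if y in ys:
            # We accidentally won without the forger.
            raise Exception
        if xs[ms[m]] is None:
            # Bad guess: we cannot sign the message we planted y on.
            raise Exception
        return xs[ms[m]]
    try:
        (m, s) = forge(H0, n, S0)
        return s
    except Exception:
        return xs.get(ys.get(y))    # Cube root if we stumbled on one, else None.
```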

(This procedure is the standard proof of RSA-FDH security by Mihir Bellare and Phil Rogaway[2], Theorem 3.1.)

When the forger returns an attempted forgery $(m, s)$, there's a high probability that it passed $m$ to the hash; there's a $1/q$ probability that it was the $j^{\mathit{th}}$ query to the hash, in which case we returned $y$ from our carefully crafted hash; then if the forger was successful, $s^3 \equiv y \pmod n$, as we hoped.

Of course, there's also a tiny chance that the forger stumbled upon a successful forgery by chance for another message it fed to the hash oracle, but that happens with probability $1/n$ which is very very very very very small. There's also a chance that our cube root procedure stumbles upon a successful cube root without the forger's help, but again, with probability $1/n$ for each query from the forger, which is very very very very very small.

Thus, if the forger has success probability $\varepsilon$, our cube root procedure has success probability approximately $\varepsilon/q$, with a little extra computation for some more calls to `modexp`. **This suggests that if there's a cheap algorithm to compute forgeries using $q$ oracle queries, then there's an algorithm to solve the RSA problem costing only $q$ times as much, provided the forgery algorithm is generic in terms of $H$.**
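The $\varepsilon/q$ estimate can be sanity-checked by simulation. The sketch below is not the procedure above verbatim: it is a compact variant that takes `q`, the forger, and a seeded random generator as explicit parameters, and it runs against a hypothetical idealized forger (`make_toy_forger`) that cheats by knowing $d$ and always succeeds, so $\varepsilon = 1$. The toy key $n = 187$, $d = 27$ and all helper names are assumptions for illustration.

```python
import random

def cbrt(n, y, q, forge, rng):
    """Reduction sketch: use `forge` to look for a cube root of y modulo n."""
    j = rng.randrange(q)             # guess which hash query will be forged on
    ms, hs, xs = {}, {}, {}          # message -> index, index -> image, index -> known root

    def H0(m):
        if m not in ms:
            i = len(ms)
            ms[m] = i
            if i == j:
                xs[i], hs[i] = None, y           # plant the challenge here
            else:
                xs[i] = rng.randrange(n)
                hs[i] = pow(xs[i], 3, n)         # uniform image with a known root
        return hs[ms[m]]

    def S0(m):
        H0(m)
        if xs[ms[m]] is None:
            raise LookupError("bad guess")       # cannot sign the planted message
        return xs[ms[m]]

    try:
        m, s = forge(H0, n, S0)
        return s
    except LookupError:
        return None

def make_toy_forger(d, q, rng):
    """Idealized forger that cheats by knowing d; a stand-in for a real attack."""
    def forge(H, n, S):
        hashes = [H(("msg", k)) for k in range(q)]   # q hash queries
        k = rng.randrange(q)                         # forge on one of them
        return ("msg", k), pow(hashes[k], d, n)
    return forge

n, d, q = 187, 27, 4                 # toy key: 3 * 27 ≡ 1 (mod 80)
rng = random.Random(1)
trials, wins = 2000, 0
for _ in range(trials):
    y = pow(rng.randrange(n), 3, n)  # uniform random RSA challenge
    x = cbrt(n, y, q, make_toy_forger(d, q, rng), rng)
    wins += x is not None and pow(x, 3, n) == y
success_rate = wins / trials
print(success_rate)                  # expected to be close to 1/q
```

With such a small modulus the accidental wins are not negligible, so the observed rate sits slightly above $1/q$; for a real 2048-bit $n$ that excess vanishes.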

This was a particularly simple ROM proof; others use more sophisticated techniques like the forking lemma, where we rerun the adversary's algorithm with the same random choices inside the algorithm, but a different oracle[3].

**Why is this model controversial?** In practical terms, it's not controversial: only academic cryptographers in an ivory tower worry about it, while practitioners have used ROM-based cryptosystems for decades largely without trouble. Hash functions like MD5 have gone bad, admitting collisions, and the Merkle–Damgård structure admits length extension, but these cause problems just as well in non-RO proofs. So what is their objection?


**It is tempting to draw the following inference:**

If a scheme is secure in the random oracle model, then it is secure if we instantiate it with a particular hash function like SHAKE128 as long as the hash function isn't too badly broken.

Obviously, as above, we could devise a cryptosystem that is broken if you instantiate it with SHAKE128, but works fine if you instantiate it with pretty much any other hash function. Ran Canetti, Oded Goldreich, and Shai Halevi proved an academically very cute result: there exists a signature scheme which is secure in the random oracle model (meaning there's a random oracle proof like above showing how to turn a forger into an algorithm to solve some hard problem) but which is insecure with any practical instantiation[4].

It can be built out of any secure signature scheme $(S, V)$ you like, and it works roughly as follows:

- To sign a message $m$ with secret key $\mathit{sk}$,
  - If $m$ is of the form $(i, \pi)$ where $\pi$ is a proof that $(i, H(i))$ is in the graph of the $i^{\mathit{th}}$ polynomial-time function in some enumeration of them,\* then the signature is $(\mathit{sk}, S_{\mathit{sk}}(m))$. (Such a proof can be verified in polynomial time.)
  - Otherwise, the signature is $(\bot, S_{\mathit{sk}}(m))$.
- To verify a signature $(z, s)$ on a message $m$ under public key $\mathit{pk}$, compute $V_{\mathit{pk}}(s, m)$. (We ignore $z$, which serves only as a back door.)

This signature scheme can be proven secure in the random oracle model, because the probability that $(i, H(i))$ is actually in the graph of the $i^{\mathit{th}}$ polynomial-time function in any particular enumeration of them is negligible for uniform random $H$, but if you choose any particular family of functions for $H$ then it is easy to construct a back door message that dumps out the private key by simply using its index in the enumeration.
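A drastically simplified toy version may make the back door easier to see. Everything in the sketch below is an assumption made for illustration and not the construction from the paper: the "enumeration" is a hypothetical three-entry list of concrete hash functions rather than all polynomial-time functions, the proof $\pi$ is dropped, the message itself encodes the index $i$, and HMAC (a MAC, not a signature scheme) stands in for $S_{\mathit{sk}}$ just to keep the code short.

```python
import hashlib
import hmac
import secrets

# Hypothetical, tiny "enumeration" of efficiently computable functions.
ENUMERATION = [
    lambda x: hashlib.blake2s(x).digest(),
    lambda x: hashlib.sha256(x).digest(),
    lambda x: hashlib.sha3_256(x).digest(),
]

def base_sign(sk, m):
    """Placeholder for S_sk(m); HMAC is used only to keep the sketch short."""
    return hmac.new(sk, m, hashlib.sha256).digest()

def backdoored_sign(H, sk, m):
    """Sign m, but leak sk if m names an enumerated function that agrees with H on m."""
    i = int.from_bytes(m, "big")
    if i < len(ENUMERATION) and H(m) == ENUMERATION[i](m):
        return (sk, base_sign(sk, m))        # back door triggered
    return (None, base_sign(sk, m))          # None plays the role of ⊥

def make_random_oracle(nbytes=32):
    """Lazily sampled uniform random function, like the gnome's book."""
    book = {}
    def H(m):
        if m not in book:
            book[m] = secrets.token_bytes(nbytes)
        return book[m]
    return H

sk = secrets.token_bytes(16)
backdoor_message = (1).to_bytes(1, "big")    # index of SHA-256 in ENUMERATION

leaked_concrete, _ = backdoored_sign(ENUMERATION[1], sk, backdoor_message)
leaked_random, _ = backdoored_sign(make_random_oracle(), sk, backdoor_message)
print(leaked_concrete == sk, leaked_random is None)
```

Instantiated with SHA-256, the scheme hands over the key on the very first back door message; instantiated with a lazily sampled random function, the back door fires only if 256 random bits happen to match.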

This is a complexity-theoretic trick to devise a pathological signature scheme that throws a temper tantrum if you try to instantiate it in the real world. **What the Canetti–Goldreich–Halevi scheme shows, by counterexample, is that the inference we would like to draw is not formally valid.**

One might infer that there is some technical criterion distinguishing pathological counterexamples like this from the multitude of ROM-based protocols actually devised for practical use like RSA-FDH, RSA-KEM, RSA-OAEP, RSA-PSS, DH key agreement, etc.

Some academics choose instead to leave the random oracle model in the dustbin on the basis of this counterexample, and focus on finding ways to convert attacks on (e.g.) a signature scheme into preimage or collision attacks on the hash function, or find systems that through extreme contortions avoid hash functions altogether (a setting which is dubbed the ‘standard model’ in passive-aggressive wording to cast shade on the random oracle model and its practitioners). This comes at considerable cost to the complexity of proof techniques and the efficiency of the resulting cryptosystems, which rarely if ever appear outside academic journals and conference proceedings, no matter how strongly they express feelings there[5][6][7][8].

**On the other hand, this doesn’t mean that random oracle proofs are useless in practice. Protocols with random oracles have been wildly successful in the real world, to the point that nearly every public-key cryptosystem used in practice takes advantage of them—as a design principle they are highly effective at thwarting attacks from the first secure signature scheme in history[1] to modern Diffie–Hellman security[9].**

Indeed, not only have we had no reason to doubt the security of (e.g.) RSA-FDH in practice in the quarter century of its existence, but it is hard to imagine that a $q$-query forger could actually be a factor of $q$ cheaper than an algorithm to solve the RSA problem, since the distribution on message hashes and signatures from the signing oracle, $(h_i, {h_i}^d \bmod n)$, is exactly the same as the distribution on quantities anyone could have computed without a signing oracle, $({s_i}^e \bmod n, s_i)$; and since the hash oracle is independent of the secret key. This suggests that there may be something amiss in our attempts at formalization.

It would not be the first thing amiss with formalization of cryptographic attacks in the literature. For example:

- There is no formalization of collision resistance of a fixed hash function like SHA3-256[10][11]. Among 257-bit inputs, there is guaranteed to be some collision $x_0 \ne x_1$, so there is a very cheap algorithm that prints collisions: it simply prints $(x_0, x_1)$ with no effort. But we have no idea how to find it without spending energy to compute an expected $2^{128}$ evaluations of SHA3-256.
- There is almost certainly a 128-bit string $s$ such that the first bit of $E \mapsto \operatorname{MD5}(s \mathbin\| E(0) \mathbin\| E(1))$ is a high-advantage distinguisher for $E = \operatorname{AES}_k$ under uniform random key $k$ from a uniform random permutation $E$[12], which violates the premises of most inferences drawn about bounds on the PRP advantage of AES, e.g. those justifying the use of AES-GCM in practice. But we have no idea how to find $s$ without spending an obscene amount of energy.
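The gap in the first item, between a collision existing and anyone being able to find it, can be seen at toy scale by truncating SHA3-256. The sketch below assumes a 24-bit truncation (an arbitrary choice so that the search finishes quickly) and hypothetical helper names; a generic search of this kind needs on the order of $2^{12}$ evaluations at 24 bits, versus the $2^{128}$ quoted above for the full 256-bit output.

```python
import hashlib

def truncated_sha3(x: bytes, bits: int) -> int:
    """First `bits` bits of SHA3-256(x), as an integer."""
    full = int.from_bytes(hashlib.sha3_256(x).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int):
    """Hash the inputs 0, 1, 2, ... until two of them share a truncated hash."""
    seen = {}                                    # truncated hash -> input
    counter = 0
    while True:
        x = counter.to_bytes(8, "big")
        value = truncated_sha3(x, bits)
        if value in seen:
            return seen[value], x, counter + 1   # the pair, and evaluations used
        seen[value] = x
        counter += 1

x0, x1, evaluations = find_collision(24)
print(x0.hex(), x1.hex(), evaluations)
```

Once the pair has been found, a program that merely prints it costs nothing to run, which is exactly why "there is no cheap algorithm that outputs a collision" cannot be the formal definition for a fixed hash function.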

**None of these technical issues of formalization prevent the widespread and highly successful use of collision-resistant hashes or of AES. Nor should they prevent the use of random oracles as a design principle or justify the summary rejection of essentially all public-key cryptography in practice.**

\* There are more technical details: actually we work in the asymptotic setting where everything is parametrized by an input size, and we consider families of functions keyed by a seed and indexed by the input size, and enumerating functions bounded by some superpolynomial cost, etc. See the paper for details if you're interested.