16 Chosen ciphertext and random oracles
- - - - - - - - - - - - - - - - - - - -

Many public key schemes including the ones using the RSA assumption
presented earlier make use of an unkeyed hash function to scramble
ciphertext and thus to enhance security, thus in particular preventing
chosen-ciphertext attacks. In this chapter, we will formulate these
attacks, remedies against them, and a semi-rigorous method for proving
there relative security.


Chosen ciphertext attacks
- - - - - - - - - - - - -

Chosen ciphertext attacks (CCA) model scenarios where the attacker can
request the decryption of ciphertexts of its choice. This is relevant
in a number of practical scenarios, where cryptosystems are used as
building blocks in larger protocols. Here a "man in the middle" can
intercept message and assume the role of one of the participants. The
messages it fabricates will then---at least up to a point---be
decrypted by the honest parties.

None of the schemes presented to far is secure against CCA. Consider
for example a symmetric key encryption given by Enc_k(m) =
(r,F_k(r)+m) where r is a random nonce and F_k() is a PRF.

Consider an attacker who wishes to distinguishes encryptions of m0 =
000...0 and m1 = 1111..1 as in the PrivK^eav experiment. If the
attacker is given (r,c) being the encryption of either m0 or m1 all it
has to do is to request the decryption of the chosen ciphertext (r,c')
where c' is c with the first bit flipped. Clearly, this will be
1000..000 or 0111...111 according to whether m0 or m1 has been
encrypted and hence the attacker wins.

The situation is similar in the public key setting. Suppose for
instance that (c1,c2) is an ElGamal encryption of some message
m. Then, as we know, c1=g^y and c2=h^y*m where y is random but private
and h=g^x is public. Now, for any m' we have that (c1,c2*m') is a
valid encryption of m*m'. It is clear that this allows an adversary to
win the PubK^eav experiment: Given (c1,c2), the adversary can request
the decryption of (c1,c2*g). The adversary thus obtains m*g and can compute
m by multiplication with g^(-1). A similar argument can be made for
various cryptosytems based on RSA.

The property of the cryptosystems making that form of attack possible
is known as *malleability*. From a given ciphertext it is possible to
produce another one that corresponds to a known transformation of the
plaintext.

We now define security against chosen ciphertext attacks formally in
the public key setting. The corresponding definition for private keys
is analogous.

Let A be a ppt adversary and Pi be a public key cryptosystem. The experiment
PubK^cca_{A,Pi}(n) is defined as follows.

1. (pk,sk)<-Gen(n)
2. A is given n, the public key pk, and access to a decryption oracle Dec_sk()
/* in the private key setting A also gets access to an encryption oracle like in CPA security */
   A outputs two messages m0,m1
3. b<-{0,1}
4. c<-Enc_pk(m_b) is given to A
5. A continues to have access to the decryption oracle but may, of course, not
   request the decryption of c itself. Any other ciphertext is allowed.
6. A outputs a bit b' and the outcome is 1 if b=b', 0 otherwise.

The cryptosystem is deemed CCA-secure if the success probability in
this experiment is 1/2+negl(n).


CCA secure private key cryptography using MACs
- - - - - - - - - - - - - - - - - - - - - - - -

In the private key case such CCA-secure schemes can be constructed by
combining a CPA-secure system PiE with a secure MAC PiM as follows:

A key pair (k1,k2) is generated, k1 for PiE and k2 for the MAC PiM.

The encryption of m under (k1,k2) is given by (c,t) where
c<-Enc_{k1}(m) and t<-Mac_{k2}(c).  Thus, we authenticate the
ciphertext so as to prevent any modification.

To decrypt (c,t) we first check that Mac_{k2}(c)=t and only then
return Dec_{k1}(c). If Mac_{k2}(c) is different from t we return a
default value _|_.

One can now show that---assuming PiE and PiM are themselves
secure---this scheme achieves CCA security. The proof idea is that
assuming PiM is secure the probability that the adversary makes a
query which is answered with a value different from _|_ is
negligible. We can therefore assume that all queries are answered with
_|_ and in this case the experiment degrades to a PrivK^cpa experiment
against which PiE is assumed secure. For details see KL4.8.


CCA-secure public key cryptography using hash functions
- - - - - - - - - - - - - - - - - - - - - - - - - - - -

Building CCA-secure public key systems is more difficult and typically
uses hybrid encryption with a CCA-secure private key system and,
additionally, a "cryptographic hash function" H(). Here we show one such
system using RSA, which at the same time gives an example of a
security proof based on the RSA assumption.

Assume thus that Pi0=(Enc0,Dec0) is a CCA-secure private key
cryptosystem which uses uniformly distributed keys from {0,1}^n (so
there is no "Gen0"). Assume further that H:{0,1}^n->{0,1}^n is a
"cryptographic hash function". We will later discuss what this should
mean formally in this context. We will need stronger properties than
just the collision-resistance that we have required of hash functions
before.

Finally, assume that GenRSA() is an RSA key generator for which the
RSA hardness assumption holds.

We then construct a public key cryptosystem Pi=(Gen,Enc,Dec) as follows.


Gen(n):
  (N,d,e)<- GenRSA(n)
  pk = (N,e), sk=(N,d)
  return (pk,sk)

Enc_{(N,e)}(m):
   1. r <- {0,1}^n
   2. k = H(r)
   3. return (r^e mod N, Enc0_k(m))

Dec_{(N,d)}(c1,c2):
   1. r = c1^d mod N
   2. k = H(r)
   3. return Dec0_k(c2)

Why should this scheme be CCA secure? Intuitively, given the RSA-assumption,
the attacker has no way of recovering r, hence the likelihood
that it applies (the publicly known function) H() to r is negligible
and as a result all H()-values it sees would look random to it. Thus,
the key used in the private key encryption of the actual message is as
good as random so that the security assumption on Pi2 applies and the
entire scheme is secure.

We need the function H to make the key k look random. It may be tempting
to just try to use r as the encryption key. But the RSA-assumption gives
us only that the attacker cannot recover r fully. It does not rule out
that the attacker obtains a few bits of r. If we used r as key, then
it would not be fully random and the security assumption on Pi2 could
not be used.


The random oracle model
- - - - - - - - - - - -

The question is under what assumptions on H() the above argument can
be made rigorous. Assuming H() to be a PRF does not work because those
always require a secret key. Once you know the key there is nothing
random about a PRF. Taking a PRG also does not work, as its output
looks random only if the input is random as well.
Assuming that H() is merely a secure hash function does not suffice
either because nothing prevents a good hash function from, for example,
producing hash strings which contain predictable portions.
E.g. if H0() is a secure hash function so is H(x) = 0.....0 H0(x).

Unfortunately, there have not been found any plausible assumptions on
the function H() under which schemes like the above could be proved
secure. A possible way out is the random oracle model which we now describe.

In this model, one instantiates H() with a function that is randomly
chosen at the beginning of each experiment, i.e., the probabilities
are taken over uniform random choices of H() which is therefore called
*random oracle*. All participants in the experiment get access to H.
It would be possible to actually implement this by using a trusted
party, e.g. an internet server which must called for any evaluation of
H().  It can either precompute the entire value table of H() (finite
for a given security parameter) using random choices or alternatively,
fill this table on demand, i.e. regurgitate H-values on previously
requested arguments and make random choices for new requests.

In practice, however, this approach is unfeasible, and so the random
oracle, for which schemes like the above one *can be proved secure*,
is replaced with a cryptographic hash function such as SHA-2. Whether
or not the security proof in the random oracle model has any bearing
on this instantiation remains open.

In fact, it has been shown that there exist contrived cryptographic
schemes which are secure in the random oracle model but insecure with
any concrete instantiation. Nevertheless, a proof in the random oracle
model consists strong evidence as to the security of a scheme and
perhaps even more importantly, failure to prove a system secure in the
random oracle model might point out actual flaws of a scheme. KL
contains a longer discussion of the metamathematical or epistemic
implications of the random oracle model without however being able to
get over the central issue that a proof in the random oracle model
does not imply anything---in the rigorous mathematical sense---about
the "standard model" which is used in actual practice.


Secure hash functions and pseudorandom functions in the random oracle model
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

As a warmup let us see how random oracles can be used to construct
secure hash functions and pseudo random functions.

We fix a length function l(n), e.g. l(n)=n or l(n)=n/2 and let H :
{0,1}^n->{0,1}^{l(n)} be a random function.

We now show that H() itself is a secure hash function. Consider a ppt
adversary A who gets oracle access to H() and conduct the Hash-coll
experiment from Chapter 11 with random H():

1. Choose H() at random
2. Adversary A gets oracle access to H() and outputs values x,x':{0,1}^n
   such that x != x'.
3. The outcome of the experiment is 1 iff H(x)=H(x')

Theorem:

The success probability of this experiment, taken over uniform random
choice of H and the random choices that A might make is a negligible
function of the security parameter.

Proof. Let x1...xk be the sequence of values that A queries of H prior to
outputting x,x'. Clearly, k is polynomial in the security parameter n
due to the polynomial time bound on the computation time of A. The
probability that H(xi)=H(xj) for some i!=j is therefore
negligible. Notice that we can assume that H-values are selected "on
demand". Thus, when x,x' are among the x1..xk the success probability
is negligible. If, however, A chooses at least one of them afresh
then, again, the success probability is negligible because only a
negligible portion of the possible H-values is favourable. QED

Notice that, in a way, the earlier account of hash functions in
Chapter 11 was also based on some kind of random oracle in that we
assumed a uniformly random selection of the seed which also does not
happen in practice.


A pseudorandom function from a random oracle
- - - - - - - - - - - - - - - - - - - - - - -

The random oracle can be used to obtain a PRF as follows:

Theorem: Let l(n)=n/2 and define F_k(x) = H(k || x). Notice that for
k,x:{0,1}^n we have F_k(x):{0,1}^n. Now, F_k() is a PRF in the
following sense. Let D be a ppt distinguisher with oracle access to a
function g():{0,1}^n->{0,1}^n *and* the random function H(). We write
this as D^{g(),H()}(n). We then have

| Pr[ D^{F_k(),H()}(n)=1 ] - Pr[ D^{f(),H()}(n)=1 ] | = negl(n)

with probabilities taken over a random function f(), the key k and the
random function H() appearing in F_k().

Proof: As before, we can condition on the queries made to H() by D
during its computation. So long as these are disjoint from those made
as part of evaluations of F_k() we are in the situation of the
previous theorem. But then the probability for such a coincidence is
negligible by the randomization over k. The details are left as an
exercise.  QED

It is now also possible to show that the abovedescribed hybrid
encryption system is CCA secure; however, for simplicity, we only show
its CPA security which is already nontrivial. Also, we have not
seen any provably secure RSA-based cryptosystem so far.

Theorem: Let Pi0 be a CPA-secure private key system and let
Pi be the hybrid system as described above. Then for any ppt adversary, which
is given additional access to the random oracle H(), one has

Pr[ PubK^eav_{Pi,A^{H()}}(n) = 1 ] = 1/2 + negl(n)

with probabilities taken over random choice of H(), the random steps
taken within A, and those in the PubK^eav experiment.

Proof sketch: For convenience we recall the steps of the experiment.

1. H() is chosen at random.
2. (N,e,d) <- GenRSA(n); pk=(N,e); sk=(N,d)
3. A is given access to pk and H()
4. A outputs messages m0, m1.
5. r<-{0,1}^n; k=H(r)
6. b<-{0,1}
7. c1<-r^e mod N
8. c2<-Enc0_k(m_b)
9. A is given (c1,c2) and returns b'
10. Outcome is 1 iff b=b'

The encryption and decryption algorithms that A is given oracle access
to essentially follow lines 5-9 and are not repeated here.

Let us call QUERY the event that A queries H() on one of the random
nonces r used either in line 5 or as part of its queries to the
encryption oracle.

The idea is that since---assuming as we do---GenRSA satisfies the RSA
hardness assumption, the probability of QUERY is negligible.
This is because we can use A to construct an adversary in the RSA-inv
experiment as follows. We use A as a subroutine and run it according
to protocol. We monitor all queries of A to H() and for each one of
those, call it r_q, check whether r_q^e mod N is correct. If this is
the case, then we output r_q. By the RSA hardness assumption, this
adversary can output the correct r only with negligible probability,
which means that the probability of QUERY must be negligible.

Once we know that QUERY is a negligible event we can assume without
loss of generality that it does not occur, i.e., oracle access to H()
returns random values that are completely unrelated to the experiment.
We can thus assume that A does not get access to H() at all and then
the CPA-secure scheme Pi0 is secure by assumption.

We refer to KL Theorem 13.1 for details and to KL Theorem 13.5 for a
proof that the hybrid scheme even yields CCA security assuming Pi0 is
CCA secure.

We also remark that the deterministic version of RSA used to encrypt
the symmetric key can be replaced by an arbitrary so-called key
encapsulation mechanism (KEM), without essentially changing the
proof. See KL, 2nd ed for details.