16 Chosen ciphertext and random oracles - - - - - - - - - - - - - - - - - - - - Many public key schemes including the ones using the RSA assumption presented earlier make use of an unkeyed hash function to scramble ciphertext and thus to enhance security, thus in particular preventing chosen-ciphertext attacks. In this chapter, we will formulate these attacks, remedies against them, and a semi-rigorous method for proving there relative security. Chosen ciphertext attacks - - - - - - - - - - - - - Chosen ciphertext attacks (CCA) model scenarios where the attacker can request the decryption of ciphertexts of its choice. This is relevant in a number of practical scenarios, where cryptosystems are used as building blocks in larger protocols. Here a "man in the middle" can intercept message and assume the role of one of the participants. The messages it fabricates will then---at least up to a point---be decrypted by the honest parties. None of the schemes presented to far is secure against CCA. Consider for example a symmetric key encryption given by Enc_k(m) = (r,F_k(r)+m) where r is a random nonce and F_k() is a PRF. Consider an attacker who wishes to distinguishes encryptions of m0 = 000...0 and m1 = 1111..1 as in the PrivK^eav experiment. If the attacker is given (r,c) being the encryption of either m0 or m1 all it has to do is to request the decryption of the chosen ciphertext (r,c') where c' is c with the first bit flipped. Clearly, this will be 1000..000 or 0111...111 according to whether m0 or m1 has been encrypted and hence the attacker wins. The situation is similar in the public key setting. Suppose for instance that (c1,c2) is an ElGamal encryption of some message m. Then, as we know, c1=g^y and c2=h^y*m where y is random but private and h=g^x is public. Now, for any m' we have that (c1,c2*m') is a valid encryption of m*m'. It is clear that this allows an adversary to win the PubK^eav experiment: Given (c1,c2), the adversary can request the decryption of (c1,c2*g). The adversary thus obtains m*g and can compute m by multiplication with g^(-1). A similar argument can be made for various cryptosytems based on RSA. The property of the cryptosystems making that form of attack possible is known as *malleability*. From a given ciphertext it is possible to produce another one that corresponds to a known transformation of the plaintext. We now define security against chosen ciphertext attacks formally in the public key setting. The corresponding definition for private keys is analogous. Let A be a ppt adversary and Pi be a public key cryptosystem. The experiment PubK^cca_{A,Pi}(n) is defined as follows. 1. (pk,sk)<-Gen(n) 2. A is given n, the public key pk, and access to a decryption oracle Dec_sk() /* in the private key setting A also gets access to an encryption oracle like in CPA security */ A outputs two messages m0,m1 3. b<-{0,1} 4. c<-Enc_pk(m_b) is given to A 5. A continues to have access to the decryption oracle but may, of course, not request the decryption of c itself. Any other ciphertext is allowed. 6. A outputs a bit b' and the outcome is 1 if b=b', 0 otherwise. The cryptosystem is deemed CCA-secure if the success probability in this experiment is 1/2+negl(n). CCA secure private key cryptography using MACs - - - - - - - - - - - - - - - - - - - - - - - - In the private key case such CCA-secure schemes can be constructed by combining a CPA-secure system PiE with a secure MAC PiM as follows: A key pair (k1,k2) is generated, k1 for PiE and k2 for the MAC PiM. The encryption of m under (k1,k2) is given by (c,t) where c<-Enc_{k1}(m) and t<-Mac_{k2}(c). Thus, we authenticate the ciphertext so as to prevent any modification. To decrypt (c,t) we first check that Mac_{k2}(c)=t and only then return Dec_{k1}(c). If Mac_{k2}(c) is different from t we return a default value _|_. One can now show that---assuming PiE and PiM are themselves secure---this scheme achieves CCA security. The proof idea is that assuming PiM is secure the probability that the adversary makes a query which is answered with a value different from _|_ is negligible. We can therefore assume that all queries are answered with _|_ and in this case the experiment degrades to a PrivK^cpa experiment against which PiE is assumed secure. For details see KL4.8. CCA-secure public key cryptography using hash functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - Building CCA-secure public key systems is more difficult and typically uses hybrid encryption with a CCA-secure private key system and, additionally, a "cryptographic hash function" H(). Here we show one such system using RSA, which at the same time gives an example of a security proof based on the RSA assumption. Assume thus that Pi0=(Enc0,Dec0) is a CCA-secure private key cryptosystem which uses uniformly distributed keys from {0,1}^n (so there is no "Gen0"). Assume further that H:{0,1}^n->{0,1}^n is a "cryptographic hash function". We will later discuss what this should mean formally in this context. We will need stronger properties than just the collision-resistance that we have required of hash functions before. Finally, assume that GenRSA() is an RSA key generator for which the RSA hardness assumption holds. We then construct a public key cryptosystem Pi=(Gen,Enc,Dec) as follows. Gen(n): (N,d,e)<- GenRSA(n) pk = (N,e), sk=(N,d) return (pk,sk) Enc_{(N,e)}(m): 1. r <- {0,1}^n 2. k = H(r) 3. return (r^e mod N, Enc0_k(m)) Dec_{(N,d)}(c1,c2): 1. r = c1^d mod N 2. k = H(r) 3. return Dec0_k(c2) Why should this scheme be CCA secure? Intuitively, given the RSA-assumption, the attacker has no way of recovering r, hence the likelihood that it applies (the publicly known function) H() to r is negligible and as a result all H()-values it sees would look random to it. Thus, the key used in the private key encryption of the actual message is as good as random so that the security assumption on Pi2 applies and the entire scheme is secure. We need the function H to make the key k look random. It may be tempting to just try to use r as the encryption key. But the RSA-assumption gives us only that the attacker cannot recover r fully. It does not rule out that the attacker obtains a few bits of r. If we used r as key, then it would not be fully random and the security assumption on Pi2 could not be used. The random oracle model - - - - - - - - - - - - The question is under what assumptions on H() the above argument can be made rigorous. Assuming H() to be a PRF does not work because those always require a secret key. Once you know the key there is nothing random about a PRF. Taking a PRG also does not work, as its output looks random only if the input is random as well. Assuming that H() is merely a secure hash function does not suffice either because nothing prevents a good hash function from, for example, producing hash strings which contain predictable portions. E.g. if H0() is a secure hash function so is H(x) = 0.....0 H0(x). Unfortunately, there have not been found any plausible assumptions on the function H() under which schemes like the above could be proved secure. A possible way out is the random oracle model which we now describe. In this model, one instantiates H() with a function that is randomly chosen at the beginning of each experiment, i.e., the probabilities are taken over uniform random choices of H() which is therefore called *random oracle*. All participants in the experiment get access to H. It would be possible to actually implement this by using a trusted party, e.g. an internet server which must called for any evaluation of H(). It can either precompute the entire value table of H() (finite for a given security parameter) using random choices or alternatively, fill this table on demand, i.e. regurgitate H-values on previously requested arguments and make random choices for new requests. In practice, however, this approach is unfeasible, and so the random oracle, for which schemes like the above one *can be proved secure*, is replaced with a cryptographic hash function such as SHA-2. Whether or not the security proof in the random oracle model has any bearing on this instantiation remains open. In fact, it has been shown that there exist contrived cryptographic schemes which are secure in the random oracle model but insecure with any concrete instantiation. Nevertheless, a proof in the random oracle model consists strong evidence as to the security of a scheme and perhaps even more importantly, failure to prove a system secure in the random oracle model might point out actual flaws of a scheme. KL contains a longer discussion of the metamathematical or epistemic implications of the random oracle model without however being able to get over the central issue that a proof in the random oracle model does not imply anything---in the rigorous mathematical sense---about the "standard model" which is used in actual practice. Secure hash functions and pseudorandom functions in the random oracle model - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - As a warmup let us see how random oracles can be used to construct secure hash functions and pseudo random functions. We fix a length function l(n), e.g. l(n)=n or l(n)=n/2 and let H : {0,1}^n->{0,1}^{l(n)} be a random function. We now show that H() itself is a secure hash function. Consider a ppt adversary A who gets oracle access to H() and conduct the Hash-coll experiment from Chapter 11 with random H(): 1. Choose H() at random 2. Adversary A gets oracle access to H() and outputs values x,x':{0,1}^n such that x != x'. 3. The outcome of the experiment is 1 iff H(x)=H(x') Theorem: The success probability of this experiment, taken over uniform random choice of H and the random choices that A might make is a negligible function of the security parameter. Proof. Let x1...xk be the sequence of values that A queries of H prior to outputting x,x'. Clearly, k is polynomial in the security parameter n due to the polynomial time bound on the computation time of A. The probability that H(xi)=H(xj) for some i!=j is therefore negligible. Notice that we can assume that H-values are selected "on demand". Thus, when x,x' are among the x1..xk the success probability is negligible. If, however, A chooses at least one of them afresh then, again, the success probability is negligible because only a negligible portion of the possible H-values is favourable. QED Notice that, in a way, the earlier account of hash functions in Chapter 11 was also based on some kind of random oracle in that we assumed a uniformly random selection of the seed which also does not happen in practice. A pseudorandom function from a random oracle - - - - - - - - - - - - - - - - - - - - - - - The random oracle can be used to obtain a PRF as follows: Theorem: Let l(n)=n/2 and define F_k(x) = H(k || x). Notice that for k,x:{0,1}^n we have F_k(x):{0,1}^n. Now, F_k() is a PRF in the following sense. Let D be a ppt distinguisher with oracle access to a function g():{0,1}^n->{0,1}^n *and* the random function H(). We write this as D^{g(),H()}(n). We then have | Pr[ D^{F_k(),H()}(n)=1 ] - Pr[ D^{f(),H()}(n)=1 ] | = negl(n) with probabilities taken over a random function f(), the key k and the random function H() appearing in F_k(). Proof: As before, we can condition on the queries made to H() by D during its computation. So long as these are disjoint from those made as part of evaluations of F_k() we are in the situation of the previous theorem. But then the probability for such a coincidence is negligible by the randomization over k. The details are left as an exercise. QED It is now also possible to show that the abovedescribed hybrid encryption system is CCA secure; however, for simplicity, we only show its CPA security which is already nontrivial. Also, we have not seen any provably secure RSA-based cryptosystem so far. Theorem: Let Pi0 be a CPA-secure private key system and let Pi be the hybrid system as described above. Then for any ppt adversary, which is given additional access to the random oracle H(), one has Pr[ PubK^eav_{Pi,A^{H()}}(n) = 1 ] = 1/2 + negl(n) with probabilities taken over random choice of H(), the random steps taken within A, and those in the PubK^eav experiment. Proof sketch: For convenience we recall the steps of the experiment. 1. H() is chosen at random. 2. (N,e,d) <- GenRSA(n); pk=(N,e); sk=(N,d) 3. A is given access to pk and H() 4. A outputs messages m0, m1. 5. r<-{0,1}^n; k=H(r) 6. b<-{0,1} 7. c1<-r^e mod N 8. c2<-Enc0_k(m_b) 9. A is given (c1,c2) and returns b' 10. Outcome is 1 iff b=b' The encryption and decryption algorithms that A is given oracle access to essentially follow lines 5-9 and are not repeated here. Let us call QUERY the event that A queries H() on one of the random nonces r used either in line 5 or as part of its queries to the encryption oracle. The idea is that since---assuming as we do---GenRSA satisfies the RSA hardness assumption, the probability of QUERY is negligible. This is because we can use A to construct an adversary in the RSA-inv experiment as follows. We use A as a subroutine and run it according to protocol. We monitor all queries of A to H() and for each one of those, call it r_q, check whether r_q^e mod N is correct. If this is the case, then we output r_q. By the RSA hardness assumption, this adversary can output the correct r only with negligible probability, which means that the probability of QUERY must be negligible. Once we know that QUERY is a negligible event we can assume without loss of generality that it does not occur, i.e., oracle access to H() returns random values that are completely unrelated to the experiment. We can thus assume that A does not get access to H() at all and then the CPA-secure scheme Pi0 is secure by assumption. We refer to KL Theorem 13.1 for details and to KL Theorem 13.5 for a proof that the hybrid scheme even yields CCA security assuming Pi0 is CCA secure. We also remark that the deterministic version of RSA used to encrypt the symmetric key can be replaced by an arbitrary so-called key encapsulation mechanism (KEM), without essentially changing the proof. See KL, 2nd ed for details.