12 Number theory and hardness assumptions ----------------------------------------- The next few sections are about *public-key cryptography* (different keys for encryption and decryption, one public, the other private). This area makes heavy use of number theory, in particular arithmetic. In this section we review the necessary mathematics and formulate hardness assumptions on which the security of public key cryptography is based. We assume standard knowledge about - arithmetic: division with remainder, gcd, extended Euclidean algorithm, i.e. for all integers u,v, there exists integers s, t such that gcd(u,v)=su+tv, Chinese remainder theorem, modular arithmetic, in particular inverses modulo. - groups: additive and multiplicative notation; exponentiation; the additive group Z_N (addition modulo N); the multiplicative group (Z_N)^* of order phi(N) (Euler's phi function); the consequence x^(phi(N)) = 1 (mod N) and the special case x^{p-1}=1 (mod p); the group isomorphisms derived from the Chinese remainder theorem (Z_pq =~= Z_p x Z_q, Z_pq^* =~= Z_p^* x Z_q^*), primitive elements in Z_p^* (cyclicity of Z_p^*). The required knowledge is part of "Logik und Diskrete Strukturen" and can also be looked up in KL. Next, we discuss the computational complexity of various arithmetic operations. We always measure runtimes in terms of the *sizes* (lengths in binary representation) of the numbers in question. Notice that the size of a number n is O(log(n)). The basic arithmetic operations +,-,*,/ can be performed in polynomial time (in terms of the sizes) with the standard algorithms taught at school. The same is true for gcd-computations using Euclid's algorithm. Furthermore, computing modular powers, i.e. a^b mod N can also be done in polynomial time using repeated squaring based on the binary expansion of N. Note though that a^b cannot be computed in polynomial time simply because the result is too long. Testing whether a given number n is prime can also be done in polynomial time; the most efficient algorithms for this use probabilistic polynomial time computation, however. This means that one has randomised algorithms which given a number n answer either "composite" in which case n is definitely composite or "prime" in which case n is prime except for a negligible error probability. The simplest of these tests is the Miller-Rabin test (KL 7.2.2). Using such a primality test it is also possible to generate a random prime number of a given size s. To that end one chooses random n of size s and then tests n,n+1,n+2,n+3,... for primality until a prime number has been found. Theorems about the distribution of prime numbers assert that this search is successful after a small (in any case expected polynomial) time. Factoring - - - - - The factoring problem asks the following question. Given a natural number N find a non-trivial factor of N unless N is prime (as usual the trivial factors are 1 and N). If N is even or has some other small divisor then this question is easily solved by trial division (the factoring method taught at school). If, however, N has only large factors, e.g. is the product of two primes of roughly equal size then factoring is believed to be a very hard problem and at any rate no efficient algorithm is known. Notice that trial division is linear in sqrt(N) hence exponential in the size of N. (sqrt(N)=2^{1/2*log(N)}). There do exist better algorithms for factoring, but with today's technology finding out the factors of a product of two ~512bit primes must be considered infeasible with a broad security margin. RSA Cryptosystem - - - - - - - - - It is believed to be hard to reconstruct x from x^e knowing merely N and e so this procedure is used as the basis of a public key cryptosystem: RSA. Intuitively, Enc_e(x)=x^e mod N and Dec_d(y)=y^d mod N. The encryption key e can be publicized whereas the decryption key d must be kept private. Consider the situation where N=pq for primes p,q. The ring Z_N of integers mod N contains phi(N)=(p-1)(q-1) invertible elements. Recall that x:Z_N is invertible if gcd(x,N)=1. The elements with a gcd > 1 are the multiples of p and of q of which there are (q-1) + (p-1) + 1 with the last summand corresponding to 0=pq. Thus, the number of invertible elements is pq-q-p+1 = (p-1)(q-1). Equivalently, with the Chinese Remainder Theorem we have Z_N =~= Z_p x Z_q and the invertible elements correspond to those pairs (x,y):Z_p x Z_q where x=/=0 and y=/=0. The (multiplicative) group Z^*_N of invertible elements (mod N) thus has order /* number of elements */ phi(N)=(p-1)(q-1). For each element x:Z^*_N it holds that x^phi(N) = 1 (mod N) (general result about groups). Now suppose that we have e such that gcd(e,phi(N))=1 and ed=1 (mod phi(N)). Then we can solve the RSA problem, i.e. determine x such that x^e = y mod N for given y, N, e. More precisely, if x:Z^*_N we can recover x from y:=x^e as y^d (all computations done mod N). Indeed, (x^e)^d = x^{ed} = x^{t*phi(N)+1} = x (for any t). Security of RSA relative to factoring - - - - - - - - - - - - - - - - - - - If given e we are able to find d such that x^{ed}=x holds for all invertible (mod N) x then we can find the factors of N as follows: Namely, in this case, we have a value a (=ed-1) such that x^a=1 (mod N) holds for all x:Z^*_N. Such value must necessarily be even as can be seen by taking x=-1. Now we can successively divide a by two until we have found a value b such that x^{2b} = 1 (mod N) holds for all x but x^b =/= 1 for some x. The crucial observation is that once x^b =/= 1 for some x then x^b =/= 1 for at least 50% of all x so we can find such a witness value x by repeated guesses. The reason for the 50% bound is that {x | x^b=1} is a proper subgroup of Z^*_N and its order must therefore be a proper divisor of |Z^*_N|=phi(N). So, summing up, we are in possession of a value b such that x^{2b}=1 for all x, but x^b=1 not for all x. Let us consider u := x^b mod p and v := x^b mod q. Since u^2=v^2=1 we have u,v:{-1,1}. Moreover, by the Chinese Remainder Theorem, we have Z_N^* =~= Z_p^* x Z_q^* so that if x is chosen uniformly at random then u and v are also uniformly at random and independent. As before, if one of them *may* be =/=1 then this happens for at least 50% of them. As a result, under the assumptions, the event u*v=-1 has a a likelihood of at least 50% and in this event gcd(x^b-1,N) yields a nontrivial divisor of N. Unfortunately, there is no known reduction from RSA to factoring in the sense that we would be able to prove that RSA is semantically secure on condition that factoring is hard. Namely, notwithstanding the above, it might be possible to break RSA without actually producing an exponent d such that x^ed=1 mod N. To date, it is not known whether RSA is sec ure relative to factoring, let alone in itself. RSA hardness assumption - - - - - - - - - - - - The computational hardness of RSA is thus not proven and not reducible to factoring but at least to some extent plausible. It is thus justified to formulate the following RSA hardness assumption. There exists a randomised polynomial time algorithm GenRSA() which given security parameter n produces (N,e,d) such that N is the product of two primes N=pq and d = e^{-1} mod phi(N) and such that the following experiment has negligible success property for all probabilistic polynomial time adversaries A: RSA-inv_A(n): 1. Run GenRSA(n) to obtain (N,e,d) 2. Choose x<-Z^*_N 3. A is given N,e, and x^e mod N and outputs x' 4. The output of the experiment is 1 if x=x' and 0 otherwise. It is believed that the RSA assumption holds with GenRSA(n) selecting two random primes of length n, choosing e at random and outputting (pq,e,e^{-1} mod phi(N)). Groups for cryptography - - - - - - - - - - - - A *group generating* algorithm G(n) is a probabilistic polynomial time algorithm which given security parameter n produces a description of a group G_n whose order is polynomial in n. The description of the group comprises generators, a well-formedness test, testing whether a given bitstring encodes a group element or not and algorithms implementing multiplication and inverses on these representations. For example, G(n) might determine a random prime p of length n and then return an algorithmic description of the group Z_p^*. As for a generator, it could produce a primitive element. For some applications, groups of prime order are preferred. One example of those is the following: If q is a prime such that p=2q+1 is also prime (p is then called a "strong prime") then the subgroup of Z^*_p consisting of the elements of the form y^2 for y:Z_^*p (the quadratic residues mod p) form a subgroup of Z^*_p of order q which by assumption is prime. Further examples are furnished by groups of points on an elliptic curve. The elements of such a group are pairs (x,y) of elements of a finite field GF(q) (q prime) such that y^2 = x^3 + Ax + B holds for fixed parameters A, B which thus determine the elliptic groups. In addition to these pairs (x,y) the group contains a special element O. For example, if the field is GF(7) = Z_7 and A=B=3 then the group (the elliptic curve) has elements (1,0), (3,2), (3,5), (4,3), (4,4), O. Indeed, whenever x^3+3x+3 is (zero or) a quadratic residue mod 7 we find (one or) two points and otherwise not. The group operation, i.e. how elements of the group are added (or multiplied, depending on notation), is omitted for now. We come back to it later. Discrete logarithm and Diffie Hellman - - - - - - - - - - - - - - - - - - - Given a cyclic group G with generator g the discrete logarithm problem is the following: given h:G determine e such that h=g^e. Formally, suppose that we are given a group generating algorithm G(n) outputting for each n a cyclic group G_n and a generator g. For adversary A the experiment DLog_{A,G}(n) then is the following: 1. Run G(n) to obtain (G,q,g) where G is a cyclic group of order q with generator g. 2. Choose h<-G /* e.g. by putting h=g^d for d<-Z_q */ 3. A is given G,q,g,h and outputs e:Z_q 4. The output of the experiment is 1 if g^e = h By definition, the discrete logarithm problem is hard for G() if for all adversaries A one has Pr(DLog_{A,G}(n)=1) = negl(n). It is generally believed that discrete logarithm is hard for the aforementioned example groups. Related problems are computational Diffie-Hellman (CDH) and decisional Diffie-Hellman (DDH). Computational Diffie-Hellman: given a cyclic group G of order q and generator g as above and h=g^x, h'=g^y determine (without knowing x,y !) the value g^{xy}. Decisional Diffie-Hellman: given a cyclic group G of order q and generator g as above and h=g^x, h'=g^y and h''=g^z determine (without knowing x,y,z !) whether or not g^{xy}=g^z. Clearly, a solution to the discrete logarithm problem implies a solution to Diffie-Hellman, but the converse is not known. As a result, hardness of discrete log is a weaker assumption than hardness of DDH (and CDH) Diffie-Hellman is important for the eponymous Diffie-Hellman key-exchange protocol: In order to share a common secret key Alice may choose random x:Z_q and send u=g^x to Bob. Bob does the same and thus sends v=g^y to Alice. Now, Alice and Bob can both compute the shared key g^{xy} as u^x (Alice) and v^y (Bob). We will show later that assuming DDH is hard this scheme is secure in a precise sense. Formal definition of hardness of DDH: Given a group-generating algorithm G() producing cyclic groups as above, we say that the decisional Diffie Hellman problem is hard for G() if for all probabilistic polynomial-time adversaries A and all n we have |Pr(A(G,q,g,g^x,g^y,g^z)=1) - Pr(A(G,q,g,g^x,g^y,g^{xy}))| = negl(n) where G(n)=(G,q,g) and x,y,z <- Z_q. Thus, DDH is hard, if g^{xy} looks like a random element of G even if g^x and g^y are known. We close this chapter by remarking that under the assumption that RSA is hard thre exist pseudo random generators, pseudorandom functions, and permutations (see KL6 and 7.4.1). Under the assumption that discrete logarithm is hard for some group then there exist collision resistant hash functions. (KL7.4.2).