14 El Gamal and elliptic curve cryptography ------------------------------------------- The El Gamal cryptosystem is based on the DDH assumption and is provably secure under the assumption of the latter. Just like DH-key exchange it is parametrised by a group (generating algorithm) G(n) and works as follows: Gen(n) : (G,q,g) = G(n) x <- Z_q h <- g^x pk = (G,q,g,h) sk = (G,q,g,x) output (pk,sk) Enc_(G,q,g,h)(m) : check that m:G y <- Z_q c = (g^y,h^y*m) output c Dec_(G,q,g,g,x)(c1,c2) : m = c2 / c1^x output m Let us first see that this scheme is correct: If c1=g^y and c2=h^y*m where h=g^x then c2 / c1^x= g^(xy) * m / g^(xy) = m We also notice that messages and ciphertexts are actually elements of the group G which must therefore by somehow encoded as binary strings. This can be done in various ways for the various groups, see KL for details. Theorem: If the DDH assumptions holds for a group (generating algorithm) G() then the El Gamal cryptosystem basd on G() is semantically secure. Proof: Let A be an adversary against the El Gamal system and let its success probability be 1/2+eps(n). We must show that eps(n) is negligible. We construct from A an adversary A' as in the DDH assumption. Suppose that A' receives g^x, g^y, u where u is either g^(xy) or g^z for random z and A' is to decide which of the two. We forward (G,q,g,g^x) as "public key" to A and wait for it to output m0,m1:G. We choose a random bit b and build c = (g^y,u*m_b) and let b' be the bit that A returns. If u=g^(xy) then A is played with according to the rules of the El Gamal cryptosystem and therefore b'=b with probability 1/2+eps(n). If u=g^z for random z then u*m0 is also uniformly random (since x|->x*m0 is a bijection G->G) and therefore the probability that b'=b is exactly 1/2. The difference between the two probabilities is precisely eps(n) which under the DDH assumption will be negligible. QED Elliptic curve cryptography - - - - - - - - - - - - - - As already mentioned, elliptic curves furnish a large reservoir of cyclic groups for which taking discrete logarithms, CDH, and DDH are believed to be hard. With existing algorithms, solving these problems on elliptic curve groups is consistently harder than doing so for groups of invertible elements mod p or in a finite field. Recall that an elliptic curve over some field F of characteristic =/= 2,3 is defined by two constants a, b and comprises the points (x,y) that solve the equation y^2 = x^3 + ax + b Here, one assumes that the *discriminant* 4a^3+27b^2 =/= 0 for otherwise the curve might be *singular*, i.e. contain anomalies like isolated points, sharp bends ("cusps") or similar. When viewed over the reals such a curve looks roughly as follows: | | | / - | - / / | \ _ - | | --------------------------------------------------------------------- | | - _ \ | / \ -| - \ | | Depending on the parameters the knob to the left may also become detached and form a closed loop sitting to the right. In either case whenever a line intersects the curve in two points it meets it again in a third point except when the line is parallel to the y-axis in which case we say that the third point of intersection is at infinity, here denoted O. This property also holds for curves over other fields, in particular finite ones which are of interest for cryptography. There is an explicit formula for obtaining the third point of intersection. To wit, if (x1,y1) and (x2,y2) lie on the curve then the line through them is y=m(x-x1)+y1 where m=(y2-y1)/(x2-x1). Plugging this into the curve yields (m(x-x1)+y1)^2 = x^3+ax+b The three solutions of this equation are x=x1, x=x2, x=m^2-x1-x2, so the last one is the desired third point of intersection. Now, it is a remarkable fact that the points on the curve together with O form an abelian group, traditionally written additively, where the sum of points P and Q is obtained by first taking the third curve point on the line through P and Q, call it R=(x_R,y_R), and then reflecting R on the x-axis, i.e. P+Q=(x_R,-y_R). This, however, only applies when P=/=Q and when P and Q do not have the same x-coordinate. In the latter case, one has P+Q=O, the special point at infinity. In the former case, i.e. when P=Q, one must take the tangent through P and intersect it with the curve. One can show using implicit differentiation that the coordinates of that point are given by (((3x_1^2+a)/2y_1)^2 - 2x1, -y1+((3x_1^2+a)/2y_1)^2(x1-x2)) and this formula, just as the one above for the general case, also makes sense for finite fields where there are no "tangents". Yet another special case is when P or Q is O. One simply puts O+P=P+O=P. Curves over fields of characteristic 2 or 3 are also possible, but then both the defining equations and the formulas for addition of points become slightly different (notice e.g. that in the formula above we divide by 2). We refer to the literature, but stress that it is very important not to make mistakes with the implementation of the addition formulas and that doing so has lead to security flaws. We have already seen that addition of points is well-defined and commutative. It is also clear that the inverse of a point (x,y) is given by (x,-y), i.e. reflection on the x-axis. Associativity of addition follows from nontrivial results, e.g. the Weierstrass p-function, but can in principle also be validated "by hand" on the basis of the above formulas. For integer n and point P on a curve we use the notation n.P for P+P+...+P with summands. This is the same as the exponent notation in multiplicatively written groups. Therefore, the "repeated squaring" method can be used here as well: 100.P = 4.P+32.P+64.P = 2.(2.(P + 2.(2.(2.(P+2.P))))) Now, the plan is to use the groups coming from elliptic curves for cryptosystems like Diffie-Hellman key exchange or El Gamal. For this to work, a couple more considerations are in order: we need to encode plaintexts as points on the group and we need to ascertain that the group is cyclic or else move to a cyclic subgroup. In order to find a cyclic group one can rely on one of the publicly available and recommended curves. Many of them have a prime number of points and hence are cyclic with any element other than O being a generator. Alternatively, one can try out random coefficients a and b until one finds a curve with a prime number of points. To do this, one needs to invoke Schoof's algorithm which allows one to determine the number of points on a given elliptic curve. If one would like to use a curve with a nonprime number, say N, of points, then one must factor N and it is then important that N has a large prime factor p, i.e. p>sqrt(N). In this case, by the representation theorem for finite(ly generated) abelian groups there is a subgroup of order p and any of its nonzero elements are generators. Moreover, the remaining subgroups have smaller order. To check whether a point P is in the subgroup of order p it suffices to check that p.P = O. To obtain a point in that subgroup one can pick a random point P and if Q := N/p.P =/= O then p.Q is in the subgroup, hence Q can serve as a generator. For the most, however, one uses one of the already publicised elliptic curves such as those recommended by the NIST "for government use". See http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf There is no need to choose a "fresh" curve for every application and on the other hand, some curves are "weak" and should be avoided. In order to encode plaintexts as points we can proceed as follows. First, we note that it suffices to store the x-coordinate of a point and a single bit telling whether the y-coordinate is above or below the x-axis, i.e., which of the two square roots of x^3 +ax+b are meant. This is known as point compression. As for the x-coordinates one fixes a bijection alpha between bitstrings of length l=log(q) and field elements. Now one puts t=l-k where k is chosen sufficiently large. Then, to encode message x of length t one searches for a string z of length k so that alpha(x||z) has a square root y over GF(q), i.e., is a quadratic residue. At that point, (alpha(x||z),y) can be taken as the desired encoding of x as a curve point. For random padding z the chance for alpha(x||z) to be a quadratic residue is about 1/2, so after a short number of iterations one is almost sure to find an appropriate encoding. We also remark that there exist efficient algorithms for testing whether a given field element is a quadratic residue and if so to compute its square root. With all this machinery in place, the Diffie-Hellman and El Gamal cryptosystems can be used with elliptic curve groups without any change, bearing in mind, of course, that instead of exponentiation g^x, we now use multiplication x.P and instead of multiplication we use addition, so that, e.g. the encryption of message m encoded as a point M becomes (y.P,y.Q+M) where P is the generator and Q=x.P We close by remarking that since the early 2000 elliptic curve cryptography (ECC) has become more and more common. Also the bitcoin cryptocurrency uses it.