summaryrefslogtreecommitdiff
path: root/ss2017/cryptography/Notes15.txt
diff options
context:
space:
mode:
Diffstat (limited to 'ss2017/cryptography/Notes15.txt')
-rw-r--r--ss2017/cryptography/Notes15.txt279
1 files changed, 279 insertions, 0 deletions
diff --git a/ss2017/cryptography/Notes15.txt b/ss2017/cryptography/Notes15.txt
new file mode 100644
index 0000000..e76ada5
--- /dev/null
+++ b/ss2017/cryptography/Notes15.txt
@@ -0,0 +1,279 @@
1
2
315 Factoring and computing discrete logarithms
4----------------------------------------------
5
6In this chapter we survey algorithms for factoring large numbers, in
7particular products of equal sized primes, and for computing discrete
8logarithms. These algorithms or rather their runtime provide lower
9bounds on the necessary key lengths for public key cryptography for it
10relies on the difficulty of these problems.
11
12For the sections about factoring we let N=pq be a product of two
13roughly equal sized primes and n = O(log(N)) be the length of N in
14binary. The task is to find p,q from N.
15
16For the sections about discrete logarithms let G be a cyclic group of
17order q and g,y:G with g a generator. The task is to find x:Z_q such
18that g^x=y.
19
20Trial division
21- - - - - - - -
22
23In order to factor N=pq, i.e. to find p or q given N, we can try out
24divisors d=1...sqrt(N). The runtime of this is polynomial in sqrt(N),
25hence takes time exponential in n. Once n>120, hence ||sqrt(N)||=~=60
26this is no longer feasible.
27
28
29Pollard's p-1 method
30- - - - - - - - - - -
31
32Pollard's p-1 method may help when p-1 is a product of powers of small
33primes. We let p_1,...p_k be the first k primes and construct
34
35 B_k = prod_{i=1..k}p_i^{n / ||p_i||}
36
37The exponent n/||p_i|| is chosen so that if some power of p_i divides
38p-1 then that power is at most n/||p_i||. As a result, if p-1 is a
39product of powers of primes from among the set {p_1..p_k} then p-1 |
40B_k. If, in addition, q-1 does *not* divide B_k then we can find out p
41as we show shortly. In order that q-1 does *not* divide B_k we can
42repeatedly execute the method with increasing values for k hoping that
43at some point one of the two prime factors "works" and the other does
44not.
45
46OK so, now let B any number so that p-1 | B and q-1 does not divide
47B. For any x it holds that x^B mod p = 1 whereas with constant
48probability x^B mod q =/= 1. The former holds since the order of Z_p^*
49is p-1 and B is a multiple thereof. The latter holds since when x is a
50generator of Z_q^* then x^B mod q is not 1 and the proportion of those
51generators is about 50% (not hard to see or use Thm B.15 of KL).
52
53Now let y = x^B-1 mod N. We have y mod p = 0 and y mod q =/= 0. Thus,
54p divides y and q does not divide y so gcd(y,N)=p.
55
56The resulting algorithm is now as follows:
57
58for k=1,2,3,... do the following until either a factor of N is found
59or computing resources are exhausted.
60 Put B = B_k
61 for a certain number of times do the following
62 x<-Z_N
63 d = gcd(x^B-1 mod N,N)
64 if d=/=1 then a factor of N has been found.
65
66
67
68Pollard's rho method
69- - - - - - - - - - -
70
71This algorithm is based on the observation that once we have two
72distinct values x,y:Z_N such that x mod p = y mod p then p =
73gcd(x-y,N). Thus, factoring amounts to finding a collision in the
74"hash function" H(x) = x mod p. Notice that while we cannot compute
75this function we can detect whether x,y constitute a collision by gcd
76computation.
77
78Now, by the birthday paradox, a random set of size S := 2*sqrt(p) =
79O(N^{1/4}) contains such a collision with likelihood 50%. Finding it,
80requires S^2 collision tests thus, when implemented naively, this
81method is no better than trial division.
82
83However, the constant space method for collision detection described
84earlier only has a runtime of O(S), but of course, as already
85mentioned, we cannot compute H(). However, we can replace it with any
86function F which has the property that x = y (mod p) implies F(x)=F(y)
87(mod p) and which is such that for random x0 the set
88{x_0,F(x_0),F(F(x_0)),..., F^S(x_0)} is also sufficiently close to a
89random set so that it is likely to contain a collision. The function
90F(x)=x^2 + b for some b =/= 0, -2 mod N is believed to have this
91property and at any rate behaves well in practice. Just as in the
92earlier algorithm we have that F^i(x0) = F^j(x0) (mod p) implies
93F^k(x0) = F^{2k}(x0) (mod p) for some k.
94
95So, the set of the first S iterates of F() starting from random x0 is
96likely to contain a collision. But since once x,y form a collision so
97do F(x), F(y), it suffices to search for a collision among the pairs
98F^k(x0), F^{2k}(x0) which leads to the following algorithm:
99
100b <- Z_N \ {0,-2}
101F(x) = x^2+b mod N
102
103x <- Z_N
104y = x
105for k=1..2*sqrt(N) do
106 x = F(x); y=F(F(x))
107 if gcd(x-y,N) != 1 a factor is found.
108
109repeat the whole procedure until either a factor is found or resources
110are exhausted.
111
112Assuming that the function F() has the required properties (which has
113not been proved to this date!) the runtime of this algorithm is
114proportional to N^{1/4} which is still exponential in n but requires
115one to double n if one wants to stay immune against this
116method.
117
118
119The quadratic sieve method
120- - - - - - - - - - - - - -
121
122The starting point for the most efficient factoring algorithms to this
123date is the following observation made by Pierre de Fermat:
124
125If we succeed in expressing N as the difference of two
126squares N=x^2-y^2 then N=(x+y)(x-y) so we have factored N.
127
128Fermat suggested to try out successive values for x starting at
129sqrt(N) and for each of those checking whether x^2-N is a square.
130
131The first improvement one can make on this observation is that it
132suffices that N divides x^2 - y^2 for if kN = (x+y)(x-y)
133then---assuming that x != (+/-)y mod N---the factors p and q of N
134must straddle the product so that gcd(x-y,N) or gcd(x+y,N) will be
135different from 1.
136
137Thus, we seek x,y:Z_N such that x^2-y^2 = 0 (mod N) and x !=y, x!=-y mod N.
138
139Notice that by the Chinese Remainder Theorem each square has four
140roots modulo N, so given x, two out of the four roots of x^2 are good.
141
142Now, to find such x,y more efficiently we proceed as follows. We
143select a set B of small primes and try out various values x>sqrt(N) so
144that x^2 mod N can be factored (because after reduction mod N it is
145sufficiently small) and its factors are in B. In this way, we find
146numbers x_1 .. x_l such that each x_i^2 mod N is expressed as a
147product of primes in B.
148
149With each of these numbers x_i we associate a B-indexed 0,1 vector
150whose entry at p:B is the exponent of p in the factorisation of x_i
151modulo 2.
152
153For example, if B={2,3,5,11} and x_1^2 mod N is 2^5*3^2*11^2 then the
154associated vector is (1 0 0 0).
155
156If l exceeds the size of k then there is a (mod 2) linear combination
157of these vectors which results in the all zero vector. But multiplying
158the corresponding x_i together then yields a square because all prime
159factors have even exponents.
160
161Continuing the example, if x_2^2 mod N is 2^15*11, i.e. (1 0 0 1), and
162x_3^2 mod N is 11^13, i.e. (0 0 0 1) then, since 1000+1001+0001=0000,
163we obtain the following identity mod N:
164
165 (x_1*x_2*x_3)^2 = (2^10 * 3 * 11^8)^2
166
167and, heuristically, we may assume that there is a 50% chance for this
168identity to be nontrivial in the sense that it yields a good pair of
169square roots.
170
171Summing up, we build up a large basis of values x whose squares mod N
172can be completely factored, then compute a linear combination (mod 2)
173of the zero vector from the corresponding B-indexed vectors and check
174whether the resulting identity of squares mod N is nontrivial which
175then yields the desired factor.
176
177We now study the problem of finding discrete logarithms in a cyclic
178group G as above.
179
180
181The Pohlig-Hellman algorithm
182- - - - - - - - - - - - - - -
183
184If we have a factorisation of the group order q as prod_{i=1}^n q^i
185with the q_i relatively prime (gcd = 1) then, if we have values x_i
186such that
187
188 (g^{q/q_i})^x_i = y^{q/q_i}
189
190we can find x by solving (using the Chinese Remainder Theorem) the
191system of modular equations
192
193 x mod q_i = x_i
194
195Since g^{q/q_i} generates a subgroup of order q_i which is easier if
196the q_i are small. It remains to see why x satisfies the above system
197of equations. Indeed, if g^x=y then g^{q/q_i}^x = y^{q/q_i} so x mod
198q_i solves the discrete log problem of order q_i.
199
200It remains to see how to deal with groups whose order is a prime power
201q=p^f. In this case, one can expand the unknown x as
202
203 x = x_0 + x_1 * p + x_2 * p^2 + ... + x_{f-1} p^{f-1}
204
205Now, by exhaustive search or using any other algorithm we can
206determine x_0:Z_p such that g^{q/p}^{x_0} = y^{q/p} (mod p). Once we
207know x_0 we can then recursively solve (g^p)^{x'} = y/g^{x_0} which
208yields x_1,... The details are left as an exercise.
209
210
211The giant-step baby-step algorithm
212- - - - - - - - - - - - - - - - - -
213
214Here the idea is that the unknown x can be decomposed as x =
215t*[sqrt(q)] + s for some s <sqrt(q). Here [sqrt(q)] stands for the
216greatest integer <= sqrt(q). We then precompute a table of the values
217y_t := g^{t*[sqrt{q}]} for t=0..sqrt(q), this are the giant
218steps. Now, for each s we look for (using binary search or similar)
219for a t such that
220
221 g^{t*[sqrt(q)]} = y*g^{-s}
222
223Once we have found it we can output x = s*[sqrt(q)]+t. The runtime of
224this method is O(sqrt(q)) which again essentially halves the "bits of
225security".
226
227
228The index calculus method
229- - - - - - - - - - - - -
230
231Unlike the earlier methods the index calculus makes special
232assumptions on the group G for which to solve the discrete logarithm
233problem. Namely, it must be the group Z_p^* for some prime p. It can
234be generalised to the group of invertible elements in a finite field,
235but does not work in the case of ellipic curves, for instance.
236
237So, let us assume that G=Z_p^* and that g,y:Z_p^*. As before, we seek
238x such that g^x=y. As before, we write q=p-1 for the group order.
239
240Let us write log_g(z) for the discrete logarithm of some value z:Z_p^*
241so that g^{log_g(z)} = z. The idea is to precompute a table of such
242logarithms. To that end, we choose values x_1..x_l such that g^{x_i}
243can be factored modulo p into small primes from a previously fixed set
244B={p_1,...,p_k}.From
245
246 g^{x_j} = prod_{i=1}^k p_i^{e_{j,i}} (mod p)
247
248we then obtain
249
250 x_j = sum_{i=1}^k e_{j,i} * log_g(p_i) (mod q)
251
252Now, assuming that l in relation to k was chosen large enough this
253system can be olved for the unknowns log_g(p_i). We thus have obtained
254the desired "table of logarithms".
255
256It allows us to easily compute the discrete logarithm of any number
257w:Z_p^* which can be factored into primes from B: if w = prod_{i=1}^k
258p_i^{e_i} then log_g(w) = sum_{i=1}^k e_i * log(p_i) (mod q).
259
260If y itself has that property we are obviously done. Otherwise, we can
261search for a value With the latter at hand we can then --- with luck
262--- take the logarithm of our y as follows: we choose some value z:Z_q
263at random such that y*g^z can be so factored and thus taken a
264logarithm of. This helps, since log_g(y) = log_g(y*g^z)-z.
265
266Using advanced number-theoretic methods one can analyse the
267probabilities that the various random choices in this method succeed
268and thus derive an expected runtime that is subexponential (in
269||q||). Indeed, the index calculus method is currently the best known
270method for computing discrete logarithms.
271
272We also remark that it is the possibility of factoring elements of G
273into primes what makes this algorithm possible. For other groups, in
274particular the multiplicative group of a finite field, analogous
275notions of "primes" can be found, but for groups derived from elliptic
276curves no such notion has been discovered and hence these groups are
277immune against index calculus making them particularly promising for
278cryptography. Having said that, taking discrete logarithms in a group
279Z_p^* where p has 1000+ digits remains unfeasible.