12 Number theory and hardness assumptions
-----------------------------------------

The next few sections are about *public-key cryptography* (different
keys for encryption and decryption, one public, the other
private). This area makes heavy use of number theory, in particular
arithmetic. In this section we review the necessary mathematics and
formulate hardness assumptions on which the security of public key
cryptography is based.

We assume standard knowledge about

- arithmetic: division with remainder, gcd, extended Euclidean
algorithm, i.e. for all integers u,v, there exists integers s, t such
that gcd(u,v)=su+tv, Chinese remainder theorem, modular arithmetic, in
particular inverses modulo.

- groups: additive and multiplicative notation; exponentiation; the
  additive group Z_N (addition modulo N); the multiplicative group
  (Z_N)^* of order phi(N) (Euler's phi function); the consequence
  x^(phi(N)) = 1 (mod N) and the special case x^{p-1}=1 (mod p); the
  group isomorphisms derived from the Chinese remainder theorem (Z_pq
  =~= Z_p x Z_q, Z_pq^* =~= Z_p^* x Z_q^*), primitive elements in
  Z_p^* (cyclicity of Z_p^*). 

The required knowledge is part of "Logik und Diskrete Strukturen" and
can also be looked up in KL.

Next, we discuss the computational complexity of various arithmetic
operations.  We always measure runtimes in terms of the *sizes*
(lengths in binary representation) of the numbers in question. Notice
that the size of a number n is O(log(n)). The basic arithmetic
operations +,-,*,/ can be performed in polynomial time (in terms of
the sizes) with the standard algorithms taught at school. The same is
true for gcd-computations using Euclid's algorithm. Furthermore,
computing modular powers, i.e. a^b mod N can also be done in
polynomial time using repeated squaring based on the binary expansion
of N. Note though that a^b cannot be computed in polynomial time
simply because the result is too long.

Testing whether a given number n is prime can also be done in
polynomial time; the most efficient algorithms for this use
probabilistic polynomial time computation, however. This means that
one has randomised algorithms which given a number n answer either
"composite" in which case n is definitely composite or "prime" in
which case n is prime except for a negligible error probability. The
simplest of these tests is the Miller-Rabin test (KL 7.2.2).

Using such a primality test it is also possible to generate a random
prime number of a given size s. To that end one chooses random n of
size s and then tests n,n+1,n+2,n+3,... for primality until a prime
number has been found. Theorems about the distribution of prime
numbers assert that this search is successful after a small (in any
case expected polynomial) time.


Factoring
- - - - -

The factoring problem asks the following question. Given a natural
number N find a non-trivial factor of N unless N is prime (as usual
the trivial factors are 1 and N).

If N is even or has some other small divisor then this question is
easily solved by trial division (the factoring method taught at
school). If, however, N has only large factors, e.g. is the product of
two primes of roughly equal size then factoring is believed to be a
very hard problem and at any rate no efficient algorithm is
known. Notice that trial division is linear in sqrt(N) hence
exponential in the size of N. (sqrt(N)=2^{1/2*log(N)}). There do exist
better algorithms for factoring, but with today's technology finding
out the factors of a product of two ~512bit primes must be considered
infeasible with a broad security margin. 


RSA Cryptosystem
- - - - - - - - -
It is believed to be hard to reconstruct x from x^e knowing merely N
and e so this procedure is used as the basis of a public key
cryptosystem: RSA. Intuitively, Enc_e(x)=x^e mod N and Dec_d(y)=y^d
mod N. The encryption key e can be publicized whereas the decryption
key d must be kept private.

Consider the situation where N=pq for primes p,q. The ring Z_N of
integers mod N contains phi(N)=(p-1)(q-1) invertible elements. Recall
that x:Z_N is invertible if gcd(x,N)=1. The elements with a gcd > 1
are the multiples of p and of q of which there are (q-1) + (p-1) + 1
with the last summand corresponding to 0=pq. Thus, the number of
invertible elements is pq-q-p+1 = (p-1)(q-1). Equivalently, with the
Chinese Remainder Theorem we have Z_N =~= Z_p x Z_q and the invertible
elements correspond to those pairs (x,y):Z_p x Z_q where x=/=0 and
y=/=0.

The (multiplicative) group Z^*_N of invertible elements (mod N) thus
has order /* number of elements */ phi(N)=(p-1)(q-1). For each element
x:Z^*_N it holds that x^phi(N) = 1 (mod N) (general result about
groups). 

Now suppose that we have e such that gcd(e,phi(N))=1 and ed=1
(mod phi(N)). Then we can solve the RSA problem,
 i.e. determine x such that x^e = y mod N for given y, N, e.
More precisely, if x:Z^*_N we can recover x from y:=x^e as y^d
(all computations done mod N). Indeed, (x^e)^d = x^{ed} =
x^{t*phi(N)+1} = x (for any t).


Security of RSA relative to factoring
- - - - - - - - - - - - - - - - - - -

If given e we are able to find d such that x^{ed}=x holds for all
invertible (mod N) x then we can find the factors of N as follows:
Namely, in this case, we have a value a (=ed-1) such that x^a=1 (mod
N) holds for all x:Z^*_N. Such value must necessarily be even as can
be seen by taking x=-1. Now we can successively divide a by two until
we have found a value b such that x^{2b} = 1 (mod N) holds for all x
but x^b =/= 1 for some x. The crucial observation is that once x^b =/=
1 for some x then x^b =/= 1 for at least 50% of all x so we can find
such a witness value x by repeated guesses. The reason for the 50%
bound is that {x | x^b=1} is a proper subgroup of Z^*_N and its order
must therefore be a proper divisor of |Z^*_N|=phi(N). 

So, summing up, we are in possession of a value b such that x^{2b}=1
for all x, but x^b=1 not for all x. Let us consider u := x^b mod p and
v := x^b mod q. Since u^2=v^2=1 we have u,v:{-1,1}. Moreover, by the
Chinese Remainder Theorem, we have Z_N^* =~= Z_p^* x Z_q^* so that if
x is chosen uniformly at random then u and v are also uniformly at
random and independent. As before, if one of them *may* be =/=1 then
this happens for at least 50% of them. As a result, under the
assumptions, the event u*v=-1 has a a likelihood of at least 50% and
in this event gcd(x^b-1,N) yields a nontrivial divisor of N. 


Unfortunately, there is no known reduction from RSA to factoring in
the sense that we would be able to prove that RSA is semantically
secure on condition that factoring is hard. Namely, notwithstanding
the above, it might be possible to break RSA without actually
producing an exponent d such that x^ed=1 mod N. To date, it is not
known whether RSA is sec ure relative to factoring, let alone in
itself. 


RSA hardness assumption
- - - - - - - - - - - - 

The computational hardness of RSA is thus not proven and not reducible
to factoring but at least to some extent plausible. It is thus
justified to formulate the following RSA hardness assumption.

There exists a randomised polynomial time algorithm GenRSA() which
given security parameter n produces (N,e,d) such that N is the product
of two primes N=pq and d = e^{-1} mod phi(N) and such that the following 
experiment has negligible success property for all probabilistic polynomial time adversaries A:

RSA-inv_A(n):

1. Run GenRSA(n) to obtain (N,e,d)
2. Choose x<-Z^*_N
3. A is given N,e, and x^e mod N and outputs x'
4. The output of the experiment is 1 if x=x' and 0 otherwise.


It is believed that the RSA assumption holds with GenRSA(n) selecting
two random primes of length n, choosing e at random and outputting
(pq,e,e^{-1} mod phi(N)).


Groups for cryptography 
- - - - - - - - - - - - 

A *group generating* algorithm G(n) is a probabilistic polynomial time
algorithm which given security parameter n produces a description of a
group G_n whose order is polynomial in n. The description of the group
comprises generators, a well-formedness test, testing whether a given
bitstring encodes a group element or not and algorithms implementing
multiplication and inverses on these representations.

For example, G(n) might determine a random prime p of length n and
then return an algorithmic description of the group Z_p^*. As for a
generator, it could produce a primitive element.

For some applications, groups of prime order are preferred. One
example of those is the following: If q is a prime such that p=2q+1 is
also prime (p is then called a "strong prime") then the subgroup of
Z^*_p consisting of the elements of the form y^2 for y:Z_^*p (the
quadratic residues mod p) form a subgroup of Z^*_p of order q which by
assumption is prime.

Further examples are furnished by groups of points on an elliptic
curve. The elements of such a group are pairs (x,y) of elements of
a finite field GF(q) (q prime) such that

                   y^2 = x^3 + Ax + B 

holds for fixed parameters A, B which thus determine the elliptic
groups. In addition to these pairs (x,y) the group contains a special
element O. 

For example, if the field is GF(7) = Z_7 and A=B=3 then the group (the
elliptic curve) has elements (1,0), (3,2), (3,5), (4,3), (4,4),
O. Indeed, whenever x^3+3x+3 is (zero or) a quadratic residue mod 7 we find (one
or) two points and otherwise not.

The group operation, i.e. how elements of the group are added (or
multiplied, depending on notation), is omitted for now. 
We come back to it later.


Discrete logarithm and Diffie Hellman
- - - - - - - - - - - - - - - - - - -

Given a cyclic group G with generator g the discrete logarithm problem
is the following: given h:G determine e such that h=g^e. Formally,
suppose that we are given a group generating algorithm G(n) outputting
for each n a cyclic group G_n and a generator g.

For adversary A the experiment DLog_{A,G}(n) then is the following:

1. Run G(n) to obtain (G,q,g) where G is a cyclic group of order q
with generator g.

2. Choose h<-G /* e.g. by putting h=g^d for d<-Z_q */
3. A is given G,q,g,h and outputs e:Z_q
4. The output of the experiment is 1 if g^e = h

By definition, the discrete logarithm problem is hard for G() if for
all adversaries A one has Pr(DLog_{A,G}(n)=1) = negl(n).

It is generally believed that discrete logarithm is hard for the
aforementioned example groups. 

Related problems are computational Diffie-Hellman (CDH) and decisional
Diffie-Hellman (DDH).

Computational Diffie-Hellman: given a cyclic group G of order q and
generator g as above and h=g^x, h'=g^y determine (without knowing x,y !)
the value g^{xy}. 

Decisional Diffie-Hellman: given a cyclic group G of order q and
generator g as above and h=g^x, h'=g^y and h''=g^z determine (without
knowing x,y,z !) whether or not g^{xy}=g^z.

Clearly, a solution to the discrete logarithm problem implies a
solution to Diffie-Hellman, but the converse is not known.  As a
result, hardness of discrete log is a weaker assumption than hardness
of DDH (and CDH)

Diffie-Hellman is important for the eponymous Diffie-Hellman
key-exchange protocol: In order to share a common secret key Alice may
choose random x:Z_q and send u=g^x to Bob. Bob does the same and thus
sends v=g^y to Alice.  Now, Alice and Bob can both compute the shared
key g^{xy} as u^x (Alice) and v^y (Bob). We will show later that
assuming DDH is hard this scheme is secure in a precise sense.

Formal definition of hardness of DDH:

Given a group-generating algorithm G() producing cyclic groups as
above, we say that the decisional Diffie Hellman problem is hard for
G() if for all  probabilistic polynomial-time adversaries A and all n we have

 |Pr(A(G,q,g,g^x,g^y,g^z)=1) - Pr(A(G,q,g,g^x,g^y,g^{xy}))| = negl(n)

where G(n)=(G,q,g) and x,y,z <- Z_q.

Thus, DDH is hard, if g^{xy} looks like a random element of G even if
g^x and g^y are known.

We close this chapter by remarking that under the assumption that RSA
is hard thre exist pseudo random generators, pseudorandom functions,
and permutations (see KL6 and 7.4.1). Under the assumption that
discrete logarithm is hard for some group then there exist collision
resistant hash functions. (KL7.4.2).