ss2017/cryptography/Notes02.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214

2 Principles of  cryptography (KL1.1, KL1.2, KL1.4, KL2)
--------------------------------------------------------

These considerations and similar ones applying to other historical
ciphers have led to the principles for the design of cryptosystems.

Before stating these we describe somewhat more formally what a
(private-key) cryptosystem is. 

A cryptosystem comprises three algorithms:

- A key generation algorithm Gen which outputs a key according to some
  format an probability distribution.

- An encryption algorithm Enc which given plaintext m and key k
  produces a ciphertext Enc_k(m)

- A decryption algorithm Dec which given ciphertext c and key k
  produces a plaintext Dec_k(m).

The set of all keys possibly put out by the key generator Gen is
called keyspace (K).  For every key k produced by Gen it must hold
that Dec_k(Enc_k(m)) = m. There is also a finite set of possible
messages M and ciphertexts C. Clearly, Gen must be randomized,
otherwise it would always return the same key. Sometimes Enc is also
randomized, i.e. ciphertexts may contain some random information and
do not depend functionally on plaintext and key.

One of the most important design principles is Kerckhoff's rule (19th
century): Security of the cryptosystem should rely on secrecy of the
key not on secrecy of the method. In other words, the cryptosystem
should remain secure if everything about the three algorithms involved
is publicly known. "No security through obscurity".

In addition to this, the cryptosystem should withstand some or all of
the following kinds of attack:

Ciphertext only attack: an eavesdropper observes ciphertext and aims
at determining the underlying plaintexts.

Known plaintext attack (KPA): This is when the eavesdropper knows some
ciphertext-plaintext pairs for one and the same key and tries to use
this information to then decrypt additional ciphertexts (again
w.r.t. the same key).

Chosen plaintext attack (CPA): Here the adversary may prepare
plaintexts of its choice and request the corresponding ciphertexts.

Chosen ciphertext attack (CCA): Here the adversary can even request
the decryption of ciphertexts of its choice.

Modern cryptography, say since WW2, holds the following additional
principles:


Principle 1: Formulation of exact definitions

Which notion of security is one aiming at?

No adversary should be able to know the key ... But what if they can
decrypt ciphertexts without knowing the key?

No adversary should be able to decrypt any ciphertext unless they know
the key. But what if they can find out some bits of the ciphertext,
e.g. every second letter, or a block of letters somewhere in the
middle?


Principle 2: Reliance on precise assumptions

Typically, security of a cryptosystem can be guaranteed only by making
assumptions on the computational power of an adversary (with unlimited
resources they could always try out all plaintexts) and also on the
computational difficulty of solving certain algorithmic
problems. Desirable as it would be to have unconditional security; if
one has to work with assumptions one should at least state them
precisely. Sometimes, security can only be proven for idealized
versions of cryptosystems that use components not known to exist but
which are believed to be "approximated" by existing components. Again,
it is then important to clearly state what these idealized components
or versions are.


Principle 3: Rigorous proofs of security 

Once the desired notion of security and the assumptions are clearly
defined one should then establish a rigorous mathematical proof of
conditional security.

"Given that Assumption X is true, Construction Y is secure according
to the given definition."

As an example, we now look at perfect security in the sense of Shannon
and see how Vernam's cipher (patented in 1919) implements it.


In order to get around the fact that an adversary can always try out
all possible plaintexts but that this kind of attack "doesn't count"
Claude Shannon suggested to use probabilities: if one out of two
messages is selected at random and the adversary gets to see the
ciphertexts only it has a chance of no more than 50% of determining
the plaintext correctly from the ciphertext. For cryptosystem Pi and
adversary A we thus define the following

Eavesdropping indistinguishability experiment PrivK^eav_{A,Pi}:

  1. The adversary produces two messages m0, m1 from the message space M.
  2. Gen is used to generate a (random) key k and a bit b <- {0,1} is
     chosen uniformly at random (50/50). The (or rather a) ciphertext
     c = Enc_k(m_b) is then given to A. 
  3. A answers by outputting a bit b'.
  4. The output of the experiment is defined 1 if b=b' and 0 otherwise.

Definition (perfect secrecy): The encryption scheme Pi=(Gen,Enc,Dec)
is perfectly secret if for all adversaries A it holds that

                Pr[PrivK^eav_{A,Pi} = 1] = 0.5

The following cryptosystems are not perfectly secure:

A book cipher which uses a secret code book containing for each *word*
another word to replace it. E,g. "bomb" gets enciphered as "cake" and
"dollars" gets enciphered as "grams". Here the adversary could select
m0="bomb bomb" and m1="dollar bomb". If the received ciphertext has
the form c = x x then it will output b'=0 and if c = x y with x =/= y
it goes for 1.

The Vigenere cipher with key length uniformly chosen at random from
{1,2} (and once the length is fixed keys also chosen at random).  The
adversary can choose m0 = aa and m1=ab. If the ciphertext has the form
xx it goes for b'=0. Otherwise it returns b'=1.

Now, if b=0 then there is a 1/2+1/2*1/26 chance that the ciphertext
has the form xx (key length 1 or key length 2 with equal letters) and thus
the adversary is right with probability 1/2+1/52

If b=1 then the adversary is right with probability 1-1/52 (the only
way to be wrong is key length two and two key letters of distance 1).

Thus, the total probability for A to be right is 1/2*(1/2+1/52) +
1/2*(1-1/52) = 3/4.


Proposition: Fix a cryptosystem Pi and suppose that plaintexts and
keys are drawn according to some, not necessarily uniform
distribution. The sytem Pi is perfectly secret if and only if

               Pr[M=m | C=c] = Pr[M=m]

for every fixed plaintext m and ciphertext c such that Pr[C=c] !=
0.

Proof (sketch): Suppose that Pr[M=m | C=c] != Pr[M=m] for some c and
m. Since sum_m Pr[M=m | C=c] = sum_m Pr[M=m] = 1 (both are probability
measures) there must exist m_0,m_1 (one of them m) such that Pr[M=m0 |
C=c] < Pr[M=m0] and Pr[M=m1 | C=c] > Pr[M=m1].  By Bayes' theorem it
follows that Pr[C=c|M=m0]<Pr[C=c] and Pr[C=c|M=m1]>Pr[C=c], thus in
particular, Pr[C=c|M=m0] < Pr[C=c|M=m1].

Now, consider the adversary who presents these messages m_0, m_1 and
if the ciphertext returned is c
then outputs b'=1, i.e. assume that the plaintext was not
m0 and outputs b'=0 otherwise. Its success probability is 

1/2*(1-Pr[C=c|M=m0]) + 1/2*Pr[C=c|M=m1] =
1/2 + 1/2(Pr[C=c|M=m1] - Pr[C=c|M=m0])

which is strictly above 1/2.

For the converse, one uses the fact that all the adversary knows is
the ciphertext, but according to the assumed equality Pr[M=m | C=c] =
Pr[M=m] this does not help.                                  Q.E.D.

Now, such perfectly secret cryptosystems exist indeed: for fixed n:N
put M=C=K={0,1}^n and let Gen return k:K uniformly at random. The
encryption function is Enc_k(m) = m+k and Dec_k(m) = m+k as well,
where + denotes pointwise addition modulo 2 (XOR). For example, if n=5
and k=11010 and m=11100 then Enc_k(m)=00110.

This system is known as Vernam's cipher or "one time pad" the point
being that its secrecy relies on the key being chosen afresh at random
every time one wants to use the scheme. It can also be seen as a
special case of the Vigenere cipher with the key being as long as the
entire text to be enciphered.

Proposition: The Vernam cipher is perfectly secret.

Proof: This is because for a given plaintext m the ciphertexts are
uniformly distributed, i.e., their distributions are identical and
independent for m0 and m1. As a result, the adversary cannot have an
advantage. Q.E.D.

The obvious disadvantage of the one-time pad is the key length,
nevertheless it is sometimes used, e.g. purportedly during the Cold
War for secure communication between the White House and the Kremlin. 

Unfortunately, this key length is the price of perfect secrecy:

Theorem: If Pi is a perfectly secret cryptosystem with key space K and
message space M then |K|>=|M|, i.e. there have to be at least as many
possible keys as there are possible messages ("the key must be as long
as the message").

Proof: Suppose for a contradiction that |K|<|M| and that ciphertext
c:C occurs with nonzero probability. Let M(c) = {m | Dec_k(c)=m for
some k:K}.  We must have |M(c)|<=|K| since for each m:M(c) there is at
least one k:K such that m=Dec_k(c) and the keys thus corresponding to
distinct messages are distinct. If now |K|<|M| there must exist some
m' not in M(c). But then Pr[M=m'|C=c] = 0 =/= Pr[M=m'], so the scheme
is not perfectly secret.

This result is due to Claude Shannon who gives in fact a more precise
characterisation of perfect secrecy, see (KL2.4).