ss2017/cryptography/Notes06.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216

6 Block ciphers, Modes of operation (KL3.6.3 KL3.6.4)
-----------------------------------------------------

A pseudorandom function F() with the additional property that F_k is
invertible and the inverse F_k^{-1} is computable in polynomial time
is known as *block cipher* or *pseudorandom permutation*. It is not
hard to see that in this case,

Enc_k(m) = F_k(m)
Dec_k(c) = F_k^{-1}(c)

is semantically secure (hence the name "block cipher"). While a simple
PRG already suffices to build a semantically secure cryptosystem those
based on PRF as above enjoy stronger properties as we now explain.

With the help of a block cipher it is possible to design encryption
schemes for messages of arbitrary length using various *modes of
operation* that we now describe. In practice, the underlying block
ciphers (or pseudorandom functions) are heuristically designed
functions like DES or AES which we we will look at later. In some
modes of operation it suffices to have a pseudorandom function rather
than permutation.

In each mode of operation we divide the message to be encrypted into
blocks of size n, where n is the security parameter and hence the size
of the messages the pseudorandom function can work with. If the length
of the whole message is not a multiple of n we must *pad* the last
block in some way, e.g. by adding 1000...0. There are various padding
schemes which describe this in detail. We henceforth assume that
padding has already taken place and that the message comprises l
blocks m_1,...,m_l of length n each.

Remark: it is possible to vary l from message to message and in this
way to encrypt a *stream* of bits rather than a block. One thus
effectively obtains so-called *stream ciphers*. Such stream ciphers
are also constructed and studied in their own right, see
e.g. KL3.4. In this course, we do not consider stream ciphers.


ECB mode (electronic code book)
- - - - - - - - - - - - - - - - 

This is the simplest mode of operation and it is actually not secure
at all, hence obsolete and included only for historical reasons. Let F
be a block cipher. 

The encryption of the plaintext m = m1..ml under ECB-mode is given by

                Enc_k(m) = F_k(m1) F_k(m2) ... F_k(ml)

i.e. the encodings F_k(mi) of the blocks mi are simply
concatenated. This mode receives its name from the fact that it
resembles a code book (which for each short message lists the
corresponding code) albeit in electronic form.

It is easy to see that ECB is not even secure against an
eavesdropper. If, assuming blocklength 3, we choose messages m=000000
and m'=000111 we can tell them apart on the basis of the corresponding
ciphertexts c,c' alone because c has the form xyzxyz whereas c' does
not. This deficiency of ECB shows in practice when it is used to
encode images, see KL (2nd ed) for an example.


CBC mode (cipher block chaining)
- - - - - - - - - - - - - - - - - 

Here we choose a random initialisation vector IV:{0,1}^n and encrypt as

Enc_k(m) = IV c1 c2 c3 ... cl

where c1=F_k(IV+m1), c2=F_k(c1+m2), c3=F_k(c2+m3), ...

I.e. each block is xor-ed with the encoding of the previous block
prior to sending it through F_k.

One can show that CBC-mode is CPA-secure, however it suffers from the
disadvantage that the ith ciphertext is needed to compute the i+1-st
one so the scheme cannot be parallelized.

The following variant of CBC-mode has been used in practice (SSL 3.0, TLS 1.0):


Chained CBC-mode (not CPA secure)
- - - - - - - - - - - - - - - - - 

Rather than choosing a fresh initialisation vector each time it has
been proposed to use instead the last ciphertext, i.e., c_l as the new
initialisation vector.  This reduces the size of the subsequent
message by one block and should intuitively be as good as CBC-mode
proper. However, this scheme is vulnerable to a CPA-attack as follows:
Suppose that the attacker knows that m_1 is one out of m_1^0 and m_1^1
and that m = m_1 m_2 m_3 has already been encrypted as IV c_1 c_2
c_3. The attacker could now request the encryption of another message
starting with the block

                  m_1' := IV + m_1^0 + c_3

The first ciphertext block c_1' then satisfies

                  c_1' = F_k(c_3+m_1') = F_k(IV + m_1^0)


Noticing that c_1 = F_k(IV + m_1) we then have m_1 = m_1^0 iff c_1' =
c_1.


OFB mode (output feedback)
- - - - - - - - - - - - - -

Here we only need a pseudorandom function. From a randomly chosen
initialisation vector IV we first compute pseudorandom strings r1...rl
as follows:

r1 = F_k(IV), r2 = F_k(r1), ...

and then encrypt using xor:

Enc_k = IV m1+r1 m2+r2 m3+r3 ... ml+rl

So, in essence, we use F_k as a pseudorandom generator of expansion
l(n) = l*n and then apply the earlier construction of an EAV-secure
scheme from a PRG. Due to the additional randomization via the
initialisation vector OFB-mode is even CPA-secure.


CTR mode (counter)
- - - - - - - - -

This mode is similar to OFB, but we compute the pseudorandom string by
successively incrementing IV:

r_i = F_k(IV+i)      /* this + is binary addition, not XOR */

Enc_k = IV m1+r1 m2+r2 m3+r3 ... ml+rl

This has the advantage that the ri and hence the ciphertexts can be
computed in parallel. A further advantage that it shares with OFB is
that the pseudorandom string IVr1..rl can be computed ahead of time
and stored until the message to be encrypted arrives. Furthermore,
portions of the ciphertext can be decrypted without computing or
decrypting anything else. This property is known as *random access*.


Theorem: If F is a pseudorandom function then (randomised) counter
mode (as described above) is CPA-secure.

Proof: (Unlike KL we make the simplifying assumption that message
length l is always the same). Let A be an adversary as in the Priv^cpa
experiment. We transform it into a distinguisher D for the underlying
pseudorandom function as we did in the proof of the last Theorem.

Given security parameter n and oracle access to a function h (which is
instantiated as either F_k() or a truly random f()) the distinguisher
begins by passing the security parameter n to the adversary.

Now, whenever the adversary queries its own oracle which it assumes to
be of the form Enc_k() then the distinguisher chooses IV:{0,1}^n at
random, computes the sequence ri=h(IV+i) and submits IV m1+r1
... ml+rl to the adversary.

When the adversary enters the second phase and outputs two messages
m^0, m^1 to distinguish based on their descriptions the distinguisher
then chooses a random bit b:{0,1} and IV:{0,1}^n uniformly at random
and returns the encryption of m^b w.r.t. IV to A. Further oracle
queries from A are answered as before.

When A eventually outputs a bit b' reply 1 if b=b' and 0 otherwise.

Let us see what the adversary can do when h is a truly random
function. Again, the adversary has a certain advantage which we now
try to bound from above.

Let us assume that for any two encryptions requested the sequences of
the counter values do not overlap, i.e. the sets
{IV,IV+1,IV+2,...,IV+l} and {IV',IV'+1,IV'+2,...,IV'+l}, where IV and
IV' are the respective initialisation vecotrs chosen, are disjoint. In
this case, the ciphertexts follow a uniform random distribution and
the adversary cannot glean any information whatsoever, i.e. has
success probability 1/2.  If they do overlap the adversary may or may
not use the extra information but if we can show that the probability
of this happening is negligible we are good. So, this is what we now
try to do.

Let us first consider that the overlap happens between the first and
second oracle question. In this case, once IV has been chosen, at most
2l values for IV' are unfavorable (or favorable depending on
viewpoint): IV, IV-1 ... IV-l, IV+1, IV+2, ... IV+l. Thus, we have a
probability of 2l*2^n. Now, since such a clash can happen anywhere we
multiply this with q(n), the number of questions being asked. Thus, we
get an overall bound of q(n)*2l/2^n on the probability of an
overlap. Even if l is not constant, but polynomial in the security
parameter this is negligible (noting of course that in view of the
time bounds q(n) is polynomially bounded). As a result, the success
probability for the adversary can be bounded by 1/2+negl(n).

The rest of the argument is as before: If h is F_k(), then let p be
the success probability of the adversary A on our new cryptosystem. It
is clear that our distinguisher will output 1 with probability p for
we have used A according to the rules of the
PrivK^cpa-experiment. Since F was assumed to be a PRF we thus know that
|p-(1/2+negl(n))| is itself negligible. It then follows that p is
bounded by 1/2 plus a negligible function.                    Q.E.D.


We close by remarking that none of the schemes presented so far is
secure against chosen ciphertext attacks (CCA) where one may request
the decryption of arbitrary ciphertexts different from the
challenge. Consider, e.g. the encryption of m as r || F_k(r)+m. An
adversary can prepare messages m0=00..0 and m1=11..1. Upon receiving a
ciphertext c= r || s corresponding to one of these two the adversary can
flip the first bit of s and ask for the decryption of the resulting
ciphertext. This must be either 100..0 or 011..1 according to whether
m was 00..0 or 11..1.