diff options
| author | Gregor Kleen <gkleen@yggdrasil.li> | 2017-10-24 15:33:07 +0200 |
|---|---|---|
| committer | Gregor Kleen <gkleen@yggdrasil.li> | 2017-10-24 15:33:07 +0200 |
| commit | 162df3190c4e179ef968b0a467ddd4951faba336 (patch) | |
| tree | 3c9628cf1abf259e1b897ac14b006e2b448cff27 /ss2017/cryptography/Notes09.txt | |
| parent | 92ed41381f40c97b52b0e96f0eaeaa9025091411 (diff) | |
| download | uni-162df3190c4e179ef968b0a467ddd4951faba336.tar uni-162df3190c4e179ef968b0a467ddd4951faba336.tar.gz uni-162df3190c4e179ef968b0a467ddd4951faba336.tar.bz2 uni-162df3190c4e179ef968b0a467ddd4951faba336.tar.xz uni-162df3190c4e179ef968b0a467ddd4951faba336.zip | |
Notes on cryptography
Diffstat (limited to 'ss2017/cryptography/Notes09.txt')
| -rw-r--r-- | ss2017/cryptography/Notes09.txt | 259 |
1 files changed, 259 insertions, 0 deletions
diff --git a/ss2017/cryptography/Notes09.txt b/ss2017/cryptography/Notes09.txt new file mode 100644 index 0000000..7c6f51d --- /dev/null +++ b/ss2017/cryptography/Notes09.txt | |||
| @@ -0,0 +1,259 @@ | |||
| 1 | |||
| 2 | 9 Cryptanalysis | ||
| 3 | --------------- | ||
| 4 | |||
| 5 | Cryptanalysis is aimed at breaking ciphers in the sense of | ||
| 6 | reconstructing a key from given or chosen pairs of plain- and | ||
| 7 | ciphertext. In the first chapter, we saw how various historic ciphers | ||
| 8 | can be broken using regularities in the statistical distribution of | ||
| 9 | ciphertexts or rather using the fact that known properties of the | ||
| 10 | statistical distribution of plaintexts shine through in the | ||
| 11 | statistical distribution of plaintexts. Modern cryptographic schemes | ||
| 12 | avoid such attacks by making sure that the statistical distribution of | ||
| 13 | ciphertext looks close to uniform no matter what the statistical | ||
| 14 | distribution of the plaintexts is. The principles of confusion and | ||
| 15 | diffusion were introduced to ensure this property at least | ||
| 16 | approximately in practice. The notion of pseudo random generator | ||
| 17 | furnishes an ideal yet useful theoretical model. | ||
| 18 | |||
| 19 | Attacks against cryptographic schemes that properly implement | ||
| 20 | confusion and diffusion are much more complicated and in many cases, | ||
| 21 | in particular DES and AES not known to exist except in the sense that | ||
| 22 | the computational time needed to carry them out is close to that of a | ||
| 23 | brute-force attack (trying out all keys) which is always | ||
| 24 | possible. Nevertheless, it is important to know and study principles | ||
| 25 | of such attacks in order to avoid them when designing a new cipher, | ||
| 26 | e.g. a smaller variant of existing ones to be run in | ||
| 27 | resource-constrained environments e.g. "Internet of Things". | ||
| 28 | |||
| 29 | We have already seen that a cipher that only uses linear or affine | ||
| 30 | linear operations is vulnerable to an attack based on solving systems | ||
| 31 | of linear equations. In this chapter we take a brief look at three | ||
| 32 | more principles of cryptanalysis. First, we show why feedback shift | ||
| 33 | registers which are a popular device for generating pseudorandom | ||
| 34 | numbers are not secure. | ||
| 35 | |||
| 36 | Linear Feedback Shift Registers (LFSR): Such a device is given by a | ||
| 37 | sequence of shift registers with some outputs fed back via XOR to the | ||
| 38 | input. For instance: | ||
| 39 | |||
| 40 | >[D]-o->[D]-o->[D]---o-------------> | ||
| 41 | | | | | | ||
| 42 | -(+)-----------(+)---- | ||
| 43 | |||
| 44 | When the shift registers (here noted [D]) are initialised with some | ||
| 45 | bits then a stream of bits is generated at the output. E.g. starting from 100 | ||
| 46 | we obtain the following sequence of states: | ||
| 47 | |||
| 48 | 100,110,111,011,101,010,001,100,... | ||
| 49 | and the following output sequence 00111010... | ||
| 50 | |||
| 51 | Here, from xyz we get to uxy where u=x+z (mod 2). Of course, the | ||
| 52 | sequence repeats after at most 8 steps, but until then looks rather | ||
| 53 | random. With a chain 16 shift registers one can achieve a period | ||
| 54 | length of 65535. In order to encrypt a stream of bits ("stream | ||
| 55 | cipher") one can initialise the shift registers with the secret key | ||
| 56 | and then add (XOR) the generated stream to the data stream. Due to the | ||
| 57 | relative simplicity of implementation this has been used for a long | ||
| 58 | time in esp. military applications. | ||
| 59 | |||
| 60 | However, LFSR are linear (the sequence generated by the sum of two | ||
| 61 | keys equals the sum of the sequences generated by the keys | ||
| 62 | individually) and this can be exploited in various ways. The most | ||
| 63 | interesting way uses the Berlekamp Massey algorithm which from a given | ||
| 64 | sequence of bits computes the shortest LFSR which generates it. If one | ||
| 65 | knows the beginning m of a plaintext and the corresponding ciphertext | ||
| 66 | c, e.g. because the plaintext starts with the date or some formal | ||
| 67 | greeting one can then apply the Berlekamp-Massey algorithm to m+c and | ||
| 68 | use the computed LFSR to decode the remaining ciphertext. | ||
| 69 | |||
| 70 | In order to circumvent this one can use nonlinear functions in the | ||
| 71 | feedback loop, e.g. logical or use nonlinear combinations of several | ||
| 72 | LFSR. The resulting stream ciphers, e.g. A5/1 and A5/2, E0 are still | ||
| 73 | in use in GSM and Bluetooth but are known to have vulnerabilities | ||
| 74 | too. Indeed, the design of secure stream ciphers is very difficult and | ||
| 75 | for high security applications one resorts to block ciphers run, | ||
| 76 | e.g. in CTR mode in order to encrypt bitstreams. | ||
| 77 | |||
| 78 | |||
| 79 | Linear cryptanalysis | ||
| 80 | - - - - - - - - - - - | ||
| 81 | |||
| 82 | Linear cryptanalysis was developed in the 90s by Matsui and apparently | ||
| 83 | was not known to the DES developers. Nevertheless, it can only produce | ||
| 84 | a theoretical attack on the latter and on AES. | ||
| 85 | |||
| 86 | We illustrate linear cryptanalysis with the toy cryptosystem from | ||
| 87 | Michael Heys' tutorial | ||
| 88 | https://www.engr.mun.ca/~howard/PAPERS/ldc_tutorial.pdf. | ||
| 89 | |||
| 90 | It is an S/P network with a block length of 16 and four proper rounds, | ||
| 91 | i.e. four rounds and an additional key mixing step. We regard each | ||
| 92 | block as a 4x4 bit matrix. The substitution step applies the first | ||
| 93 | S-Box of DES S_1, i.e. | ||
| 94 | |||
| 95 | 0 1 2 3 4 5 6 7 8 9 A B C D E F | ||
| 96 | ------------------------------- | ||
| 97 | E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7 | ||
| 98 | |||
| 99 | separately to each of the four rows (regarded as a hexadecimal number). So | ||
| 100 | |||
| 101 | (1 0 1 0) (A) (6) (0 1 1 0) | ||
| 102 | (0 1 1 0) which is (6) becomes (B) which is (1 0 1 1) | ||
| 103 | (1 1 0 1) (D) (9) (1 0 0 1) | ||
| 104 | (0 0 0 1) (1) (4) (0 1 0 0) | ||
| 105 | |||
| 106 | The permutation step is simply matrix transposition, thus our example | ||
| 107 | becomes | ||
| 108 | |||
| 109 | (0 1 1 0) (6) | ||
| 110 | (1 1 0 0) which is (C) | ||
| 111 | (1 0 0 1) (9) | ||
| 112 | (0 1 1 0) (6) | ||
| 113 | |||
| 114 | As usual, each round consists of addition mod2 of the round key, | ||
| 115 | substitution (S), and permutation (P), i.e. transposition. After four | ||
| 116 | rounds we add one more key. The master key simply consists of the | ||
| 117 | concatenation of the five round keys k = k1 k2 k3 k4 k5. | ||
| 118 | |||
| 119 | We can now observe that the S-box S is not completely balanced in the | ||
| 120 | following sense. Let WX1 be (1 0 1 1)^T and WY1 = (0 1 0 0)^T. The ^T | ||
| 121 | suffix means transposition so W1, W2 are actually columns. We now have | ||
| 122 | |||
| 123 | Pr(WX1^T x + WY1^T S(x) = 0) = 3/4 | ||
| 124 | |||
| 125 | where the probability is taken over uniformly distributed | ||
| 126 | x:{0,1}^4. In other words, for 12 out of the 16 possible inputs x=(x1 | ||
| 127 | x2 x3 x4)^T and results S(x)=(y1 y2 y3 y4)^T one has x1+x3+x4=y2 | ||
| 128 | whereas for the remaining four one has x1+x3+x4=/=y2, | ||
| 129 | i.e. x1+x3+x4=y2+1. | ||
| 130 | |||
| 131 | One can find such an imbalance by exhaustive search. Another such | ||
| 132 | imbalance is given by WX2=(0 1 0 0)^T WY2 = (0 1 0 1)^T: | ||
| 133 | |||
| 134 | Pr(WX2^T x + WY2^T S(x) = 0) = 1/4 | ||
| 135 | |||
| 136 | Chaining these imbalances through the whole cipher (using the first | ||
| 137 | one on round 1 and the second one on the other rounds) one obtains a | ||
| 138 | similar linear relationship between the plaintext, the ciphertext just | ||
| 139 | after round four, and the four round keys that have been used until | ||
| 140 | then: Letting P be the plaintext (now written as dim16 vector) and U | ||
| 141 | the ciphertext prior to applying the last S-boxes, permuting, and | ||
| 142 | mixing with the fifth and last round key we find dim16 vectors A, B | ||
| 143 | D1, D2, D3, D4 such that | ||
| 144 | |||
| 145 | Pr(A^T P + B^T U + D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 = 0) = 15/32 | ||
| 146 | |||
| 147 | with keys K1 ... K4 fixed and probability taken over the plaintexts P. | ||
| 148 | |||
| 149 | |||
| 150 | Since D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 is either 0 or 1 we have that | ||
| 151 | |||
| 152 | Prob(A^T P + B^T U = 0) = 15/32 or Prob(A^T P + B^T U = 0) = 17/32 | ||
| 153 | |||
| 154 | irrespective of K1,..,K4. This allows us to work out a likely value | ||
| 155 | for K5 or rather some bits of it. | ||
| 156 | |||
| 157 | Namely, given a list of plaintexts P_1..P_N and corresponding | ||
| 158 | ciphertexts C_1..C_n we can try out all 2^16 possible choices for K_5, | ||
| 159 | for each one compute the corresponding U_i := S^{-1}(P^{-1}(C_i+K_5)), | ||
| 160 | and compute the ratio | ||
| 161 | |||
| 162 | p(K_5) = {i | A^T P_i + B^T U_i = 0} / N | ||
| 163 | |||
| 164 | We choose a value K_5 for which this ratio is close to 15/32 or 17/32 | ||
| 165 | and discard all those where it is close to 1/2. Now, p(K_5) does not | ||
| 166 | actually depend on all bits of K_5 but only on those where the | ||
| 167 | corresponding bit in B^T is set for the other ones only influence | ||
| 168 | components of U_i which are not used in p(K_5). Nevertheless, this | ||
| 169 | allows us to considerably narrow down the possible options for | ||
| 170 | K_5. For each of those, we can then continue in the same fashion and | ||
| 171 | thus peel off the fourth round key etc. | ||
| 172 | |||
| 173 | |||
| 174 | Differential cryptanalysis | ||
| 175 | - - - - - - - - - - - - - - | ||
| 176 | |||
| 177 | Differential cryptanalysis was developed by Biham and Shamir as an | ||
| 178 | attack against DES but apparently already known to the DES | ||
| 179 | developers. It has been used to successfully break other ciphers. It | ||
| 180 | uses the fact that the effect of adding a round key cancels out when | ||
| 181 | we consider differences between two plaintexts. However, this | ||
| 182 | requires the preparation of chosen plaintexts. In more detail, the | ||
| 183 | *differential* between two plaintexts X1 and X2 is simply X1+X2, | ||
| 184 | i.e. the bitvector which has a 1 where X1 and X2 differ and a 0 where | ||
| 185 | they are equal. | ||
| 186 | |||
| 187 | Coming back to our 4bit example S-box for any given differential there | ||
| 188 | then are 2^4=16 different ways of realising it. E.g. the differential | ||
| 189 | 1010 arises for the plaintext pair 0011,1001 and also for | ||
| 190 | 1111,0101. Indeed, for any plaintext X1 there is a unique X2 such that | ||
| 191 | X1, X2 have a given differential. | ||
| 192 | |||
| 193 | Given two differentials X* and Y* we are interested in the following | ||
| 194 | question. For how many of the (16) plaintext pairs X1,X2 with | ||
| 195 | differential X* is it the case that the results of applying the S-box | ||
| 196 | Y1=S(X1), Y2=S(X2) have differential Y*. Tabulating these numbers by | ||
| 197 | exhaustive search yields the following table: | ||
| 198 | |||
| 199 | Output differential | ||
| 200 | 0 1 2 3 4 5 6 7 8 9 A B C D E F | ||
| 201 | I 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | ||
| 202 | n 1 0 0 0 2 0 0 0 2 0 2 4 0 4 2 0 0 | ||
| 203 | p 2 0 0 0 2 0 6 2 2 0 2 0 0 0 0 2 0 | ||
| 204 | u 3 0 0 2 0 2 0 0 0 0 4 2 0 2 0 0 4 | ||
| 205 | t 4 0 0 0 2 0 0 6 0 0 2 0 4 2 0 0 0 | ||
| 206 | 5 0 4 0 0 0 2 2 0 0 0 4 0 2 0 0 2 | ||
| 207 | d 6 0 0 0 4 0 4 0 0 0 0 0 0 2 2 2 2 | ||
| 208 | i 7 0 0 2 2 2 0 2 0 0 2 2 0 0 0 0 4 | ||
| 209 | f 8 0 0 0 0 0 0 2 2 0 0 0 4 0 4 2 2 | ||
| 210 | f 9 0 2 0 0 2 0 0 4 2 0 2 2 2 0 0 0 | ||
| 211 | ' A 0 2 2 0 0 0 0 0 6 0 0 2 0 6 0 0 | ||
| 212 | i B 0 0 8 0 0 2 0 2 0 0 0 0 0 2 0 2 | ||
| 213 | a C 0 2 0 0 2 2 2 0 0 0 0 2 0 6 0 0 | ||
| 214 | l D 0 4 0 0 0 0 0 4 2 0 2 0 2 0 2 0 | ||
| 215 | E 0 0 2 4 2 0 0 0 6 0 0 0 0 0 2 0 | ||
| 216 | F 0 2 0 0 6 0 0 0 0 4 0 2 0 0 2 0 | ||
| 217 | |||
| 218 | |||
| 219 | The high entries 8 and 6 are of particular interest because they | ||
| 220 | constitute a deviation of an intuitive pseudorandom permutation. | ||
| 221 | Let us now see how we can exploit these in order to attack our cipher. | ||
| 222 | |||
| 223 | We feed in a pair of 16bit plaintexts X1 X2 which differ in the second | ||
| 224 | block by B=1011 and are equal everywhere else. With probability 8/16 | ||
| 225 | this will result in differential 2=0010 at the output of the second | ||
| 226 | S-box. The output of the other S-boxes will be equal for X1 and X2. | ||
| 227 | This differential 2=0010 will be propagated through the P network and | ||
| 228 | results in a differential of 4=0100 at the entry of the third S-box of | ||
| 229 | the second round: | ||
| 230 | |||
| 231 | Differential before P Differential after P | ||
| 232 | 0 0 0 0 0 0 0 0 | ||
| 233 | 0 0 0 0 ------P----> 0 0 1 0 | ||
| 234 | 0 1 0 0 0 0 0 0 | ||
| 235 | 0 0 0 0 0 0 0 0 | ||
| 236 | |||
| 237 | |||
| 238 | There, with probability 6/16 it will produce differential 6=0110 at | ||
| 239 | the output. This results in differential 2=0010 at the input of the | ||
| 240 | second and third S-boxes of the third round. With probability 36/256 | ||
| 241 | this will lead to differential 5 at both outputs which propagates to | ||
| 242 | differential 6=0110 at the entries to the second and fourth S-boxes of | ||
| 243 | the final round. | ||
| 244 | |||
| 245 | 0 0 0 0 0 0 0 0 | ||
| 246 | 0 0 1 0 -------P-----> 0 0 0 0 | ||
| 247 | 0 0 1 0 0 1 1 0 | ||
| 248 | 0 0 0 0 0 0 0 0 | ||
| 249 | |||
| 250 | Summing up, when we encipher plaintexts with differential 0B00 we will | ||
| 251 | obtain with probability 1/2*3/8*9/64=27/1024 a differential 0606 at | ||
| 252 | the entry to the last round. We now have to set the relevant | ||
| 253 | components (second and fourth block) of the final key in such a way | ||
| 254 | that this ratio (after inverting the last layer of S-boxes) is | ||
| 255 | actually reproduced. This allows us (with high probability) to guess | ||
| 256 | hone half of the bits of the last round key. Continuing in this way | ||
| 257 | with other differentials combined with exhaustive search allows us to | ||
| 258 | undo the entire last round and then to proceed from there. | ||
| 259 | |||
