9 Cryptanalysis --------------- Cryptanalysis is aimed at breaking ciphers in the sense of reconstructing a key from given or chosen pairs of plain- and ciphertext. In the first chapter, we saw how various historic ciphers can be broken using regularities in the statistical distribution of ciphertexts or rather using the fact that known properties of the statistical distribution of plaintexts shine through in the statistical distribution of plaintexts. Modern cryptographic schemes avoid such attacks by making sure that the statistical distribution of ciphertext looks close to uniform no matter what the statistical distribution of the plaintexts is. The principles of confusion and diffusion were introduced to ensure this property at least approximately in practice. The notion of pseudo random generator furnishes an ideal yet useful theoretical model. Attacks against cryptographic schemes that properly implement confusion and diffusion are much more complicated and in many cases, in particular DES and AES not known to exist except in the sense that the computational time needed to carry them out is close to that of a brute-force attack (trying out all keys) which is always possible. Nevertheless, it is important to know and study principles of such attacks in order to avoid them when designing a new cipher, e.g. a smaller variant of existing ones to be run in resource-constrained environments e.g. "Internet of Things". We have already seen that a cipher that only uses linear or affine linear operations is vulnerable to an attack based on solving systems of linear equations. In this chapter we take a brief look at three more principles of cryptanalysis. First, we show why feedback shift registers which are a popular device for generating pseudorandom numbers are not secure. Linear Feedback Shift Registers (LFSR): Such a device is given by a sequence of shift registers with some outputs fed back via XOR to the input. For instance: >[D]-o->[D]-o->[D]---o-------------> | | | | -(+)-----------(+)---- When the shift registers (here noted [D]) are initialised with some bits then a stream of bits is generated at the output. E.g. starting from 100 we obtain the following sequence of states: 100,110,111,011,101,010,001,100,... and the following output sequence 00111010... Here, from xyz we get to uxy where u=x+z (mod 2). Of course, the sequence repeats after at most 8 steps, but until then looks rather random. With a chain 16 shift registers one can achieve a period length of 65535. In order to encrypt a stream of bits ("stream cipher") one can initialise the shift registers with the secret key and then add (XOR) the generated stream to the data stream. Due to the relative simplicity of implementation this has been used for a long time in esp. military applications. However, LFSR are linear (the sequence generated by the sum of two keys equals the sum of the sequences generated by the keys individually) and this can be exploited in various ways. The most interesting way uses the Berlekamp Massey algorithm which from a given sequence of bits computes the shortest LFSR which generates it. If one knows the beginning m of a plaintext and the corresponding ciphertext c, e.g. because the plaintext starts with the date or some formal greeting one can then apply the Berlekamp-Massey algorithm to m+c and use the computed LFSR to decode the remaining ciphertext. In order to circumvent this one can use nonlinear functions in the feedback loop, e.g. logical or use nonlinear combinations of several LFSR. The resulting stream ciphers, e.g. A5/1 and A5/2, E0 are still in use in GSM and Bluetooth but are known to have vulnerabilities too. Indeed, the design of secure stream ciphers is very difficult and for high security applications one resorts to block ciphers run, e.g. in CTR mode in order to encrypt bitstreams. Linear cryptanalysis - - - - - - - - - - - Linear cryptanalysis was developed in the 90s by Matsui and apparently was not known to the DES developers. Nevertheless, it can only produce a theoretical attack on the latter and on AES. We illustrate linear cryptanalysis with the toy cryptosystem from Michael Heys' tutorial https://www.engr.mun.ca/~howard/PAPERS/ldc_tutorial.pdf. It is an S/P network with a block length of 16 and four proper rounds, i.e. four rounds and an additional key mixing step. We regard each block as a 4x4 bit matrix. The substitution step applies the first S-Box of DES S_1, i.e. 0 1 2 3 4 5 6 7 8 9 A B C D E F ------------------------------- E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7 separately to each of the four rows (regarded as a hexadecimal number). So (1 0 1 0) (A) (6) (0 1 1 0) (0 1 1 0) which is (6) becomes (B) which is (1 0 1 1) (1 1 0 1) (D) (9) (1 0 0 1) (0 0 0 1) (1) (4) (0 1 0 0) The permutation step is simply matrix transposition, thus our example becomes (0 1 1 0) (6) (1 1 0 0) which is (C) (1 0 0 1) (9) (0 1 1 0) (6) As usual, each round consists of addition mod2 of the round key, substitution (S), and permutation (P), i.e. transposition. After four rounds we add one more key. The master key simply consists of the concatenation of the five round keys k = k1 k2 k3 k4 k5. We can now observe that the S-box S is not completely balanced in the following sense. Let WX1 be (1 0 1 1)^T and WY1 = (0 1 0 0)^T. The ^T suffix means transposition so W1, W2 are actually columns. We now have Pr(WX1^T x + WY1^T S(x) = 0) = 3/4 where the probability is taken over uniformly distributed x:{0,1}^4. In other words, for 12 out of the 16 possible inputs x=(x1 x2 x3 x4)^T and results S(x)=(y1 y2 y3 y4)^T one has x1+x3+x4=y2 whereas for the remaining four one has x1+x3+x4=/=y2, i.e. x1+x3+x4=y2+1. One can find such an imbalance by exhaustive search. Another such imbalance is given by WX2=(0 1 0 0)^T WY2 = (0 1 0 1)^T: Pr(WX2^T x + WY2^T S(x) = 0) = 1/4 Chaining these imbalances through the whole cipher (using the first one on round 1 and the second one on the other rounds) one obtains a similar linear relationship between the plaintext, the ciphertext just after round four, and the four round keys that have been used until then: Letting P be the plaintext (now written as dim16 vector) and U the ciphertext prior to applying the last S-boxes, permuting, and mixing with the fifth and last round key we find dim16 vectors A, B D1, D2, D3, D4 such that Pr(A^T P + B^T U + D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 = 0) = 15/32 with keys K1 ... K4 fixed and probability taken over the plaintexts P. Since D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 is either 0 or 1 we have that Prob(A^T P + B^T U = 0) = 15/32 or Prob(A^T P + B^T U = 0) = 17/32 irrespective of K1,..,K4. This allows us to work out a likely value for K5 or rather some bits of it. Namely, given a list of plaintexts P_1..P_N and corresponding ciphertexts C_1..C_n we can try out all 2^16 possible choices for K_5, for each one compute the corresponding U_i := S^{-1}(P^{-1}(C_i+K_5)), and compute the ratio p(K_5) = {i | A^T P_i + B^T U_i = 0} / N We choose a value K_5 for which this ratio is close to 15/32 or 17/32 and discard all those where it is close to 1/2. Now, p(K_5) does not actually depend on all bits of K_5 but only on those where the corresponding bit in B^T is set for the other ones only influence components of U_i which are not used in p(K_5). Nevertheless, this allows us to considerably narrow down the possible options for K_5. For each of those, we can then continue in the same fashion and thus peel off the fourth round key etc. Differential cryptanalysis - - - - - - - - - - - - - - Differential cryptanalysis was developed by Biham and Shamir as an attack against DES but apparently already known to the DES developers. It has been used to successfully break other ciphers. It uses the fact that the effect of adding a round key cancels out when we consider differences between two plaintexts. However, this requires the preparation of chosen plaintexts. In more detail, the *differential* between two plaintexts X1 and X2 is simply X1+X2, i.e. the bitvector which has a 1 where X1 and X2 differ and a 0 where they are equal. Coming back to our 4bit example S-box for any given differential there then are 2^4=16 different ways of realising it. E.g. the differential 1010 arises for the plaintext pair 0011,1001 and also for 1111,0101. Indeed, for any plaintext X1 there is a unique X2 such that X1, X2 have a given differential. Given two differentials X* and Y* we are interested in the following question. For how many of the (16) plaintext pairs X1,X2 with differential X* is it the case that the results of applying the S-box Y1=S(X1), Y2=S(X2) have differential Y*. Tabulating these numbers by exhaustive search yields the following table: Output differential 0 1 2 3 4 5 6 7 8 9 A B C D E F I 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 n 1 0 0 0 2 0 0 0 2 0 2 4 0 4 2 0 0 p 2 0 0 0 2 0 6 2 2 0 2 0 0 0 0 2 0 u 3 0 0 2 0 2 0 0 0 0 4 2 0 2 0 0 4 t 4 0 0 0 2 0 0 6 0 0 2 0 4 2 0 0 0 5 0 4 0 0 0 2 2 0 0 0 4 0 2 0 0 2 d 6 0 0 0 4 0 4 0 0 0 0 0 0 2 2 2 2 i 7 0 0 2 2 2 0 2 0 0 2 2 0 0 0 0 4 f 8 0 0 0 0 0 0 2 2 0 0 0 4 0 4 2 2 f 9 0 2 0 0 2 0 0 4 2 0 2 2 2 0 0 0 ' A 0 2 2 0 0 0 0 0 6 0 0 2 0 6 0 0 i B 0 0 8 0 0 2 0 2 0 0 0 0 0 2 0 2 a C 0 2 0 0 2 2 2 0 0 0 0 2 0 6 0 0 l D 0 4 0 0 0 0 0 4 2 0 2 0 2 0 2 0 E 0 0 2 4 2 0 0 0 6 0 0 0 0 0 2 0 F 0 2 0 0 6 0 0 0 0 4 0 2 0 0 2 0 The high entries 8 and 6 are of particular interest because they constitute a deviation of an intuitive pseudorandom permutation. Let us now see how we can exploit these in order to attack our cipher. We feed in a pair of 16bit plaintexts X1 X2 which differ in the second block by B=1011 and are equal everywhere else. With probability 8/16 this will result in differential 2=0010 at the output of the second S-box. The output of the other S-boxes will be equal for X1 and X2. This differential 2=0010 will be propagated through the P network and results in a differential of 4=0100 at the entry of the third S-box of the second round: Differential before P Differential after P 0 0 0 0 0 0 0 0 0 0 0 0 ------P----> 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 There, with probability 6/16 it will produce differential 6=0110 at the output. This results in differential 2=0010 at the input of the second and third S-boxes of the third round. With probability 36/256 this will lead to differential 5 at both outputs which propagates to differential 6=0110 at the entries to the second and fourth S-boxes of the final round. 0 0 0 0 0 0 0 0 0 0 1 0 -------P-----> 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 Summing up, when we encipher plaintexts with differential 0B00 we will obtain with probability 1/2*3/8*9/64=27/1024 a differential 0606 at the entry to the last round. We now have to set the relevant components (second and fourth block) of the final key in such a way that this ratio (after inverting the last layer of S-boxes) is actually reproduced. This allows us (with high probability) to guess hone half of the bits of the last round key. Continuing in this way with other differentials combined with exhaustive search allows us to undo the entire last round and then to proceed from there.