From 162df3190c4e179ef968b0a467ddd4951faba336 Mon Sep 17 00:00:00 2001 From: Gregor Kleen Date: Tue, 24 Oct 2017 15:33:07 +0200 Subject: Notes on cryptography --- ss2017/cryptography/Notes09.txt | 259 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 259 insertions(+) create mode 100644 ss2017/cryptography/Notes09.txt (limited to 'ss2017/cryptography/Notes09.txt') diff --git a/ss2017/cryptography/Notes09.txt b/ss2017/cryptography/Notes09.txt new file mode 100644 index 0000000..7c6f51d --- /dev/null +++ b/ss2017/cryptography/Notes09.txt @@ -0,0 +1,259 @@ + +9 Cryptanalysis +--------------- + +Cryptanalysis is aimed at breaking ciphers in the sense of +reconstructing a key from given or chosen pairs of plain- and +ciphertext. In the first chapter, we saw how various historic ciphers +can be broken using regularities in the statistical distribution of +ciphertexts or rather using the fact that known properties of the +statistical distribution of plaintexts shine through in the +statistical distribution of plaintexts. Modern cryptographic schemes +avoid such attacks by making sure that the statistical distribution of +ciphertext looks close to uniform no matter what the statistical +distribution of the plaintexts is. The principles of confusion and +diffusion were introduced to ensure this property at least +approximately in practice. The notion of pseudo random generator +furnishes an ideal yet useful theoretical model. + +Attacks against cryptographic schemes that properly implement +confusion and diffusion are much more complicated and in many cases, +in particular DES and AES not known to exist except in the sense that +the computational time needed to carry them out is close to that of a +brute-force attack (trying out all keys) which is always +possible. Nevertheless, it is important to know and study principles +of such attacks in order to avoid them when designing a new cipher, +e.g. a smaller variant of existing ones to be run in +resource-constrained environments e.g. "Internet of Things". + +We have already seen that a cipher that only uses linear or affine +linear operations is vulnerable to an attack based on solving systems +of linear equations. In this chapter we take a brief look at three +more principles of cryptanalysis. First, we show why feedback shift +registers which are a popular device for generating pseudorandom +numbers are not secure. + +Linear Feedback Shift Registers (LFSR): Such a device is given by a +sequence of shift registers with some outputs fed back via XOR to the +input. For instance: + + >[D]-o->[D]-o->[D]---o-------------> + | | | | + -(+)-----------(+)---- + +When the shift registers (here noted [D]) are initialised with some +bits then a stream of bits is generated at the output. E.g. starting from 100 +we obtain the following sequence of states: + +100,110,111,011,101,010,001,100,... +and the following output sequence 00111010... + +Here, from xyz we get to uxy where u=x+z (mod 2). Of course, the +sequence repeats after at most 8 steps, but until then looks rather +random. With a chain 16 shift registers one can achieve a period +length of 65535. In order to encrypt a stream of bits ("stream +cipher") one can initialise the shift registers with the secret key +and then add (XOR) the generated stream to the data stream. Due to the +relative simplicity of implementation this has been used for a long +time in esp. military applications. + +However, LFSR are linear (the sequence generated by the sum of two +keys equals the sum of the sequences generated by the keys +individually) and this can be exploited in various ways. The most +interesting way uses the Berlekamp Massey algorithm which from a given +sequence of bits computes the shortest LFSR which generates it. If one +knows the beginning m of a plaintext and the corresponding ciphertext +c, e.g. because the plaintext starts with the date or some formal +greeting one can then apply the Berlekamp-Massey algorithm to m+c and +use the computed LFSR to decode the remaining ciphertext. + +In order to circumvent this one can use nonlinear functions in the +feedback loop, e.g. logical or use nonlinear combinations of several +LFSR. The resulting stream ciphers, e.g. A5/1 and A5/2, E0 are still +in use in GSM and Bluetooth but are known to have vulnerabilities +too. Indeed, the design of secure stream ciphers is very difficult and +for high security applications one resorts to block ciphers run, +e.g. in CTR mode in order to encrypt bitstreams. + + +Linear cryptanalysis +- - - - - - - - - - - + +Linear cryptanalysis was developed in the 90s by Matsui and apparently +was not known to the DES developers. Nevertheless, it can only produce +a theoretical attack on the latter and on AES. + +We illustrate linear cryptanalysis with the toy cryptosystem from +Michael Heys' tutorial +https://www.engr.mun.ca/~howard/PAPERS/ldc_tutorial.pdf. + +It is an S/P network with a block length of 16 and four proper rounds, +i.e. four rounds and an additional key mixing step. We regard each +block as a 4x4 bit matrix. The substitution step applies the first +S-Box of DES S_1, i.e. + +0 1 2 3 4 5 6 7 8 9 A B C D E F +------------------------------- +E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7 + +separately to each of the four rows (regarded as a hexadecimal number). So + +(1 0 1 0) (A) (6) (0 1 1 0) +(0 1 1 0) which is (6) becomes (B) which is (1 0 1 1) +(1 1 0 1) (D) (9) (1 0 0 1) +(0 0 0 1) (1) (4) (0 1 0 0) + +The permutation step is simply matrix transposition, thus our example +becomes + +(0 1 1 0) (6) +(1 1 0 0) which is (C) +(1 0 0 1) (9) +(0 1 1 0) (6) + +As usual, each round consists of addition mod2 of the round key, +substitution (S), and permutation (P), i.e. transposition. After four +rounds we add one more key. The master key simply consists of the +concatenation of the five round keys k = k1 k2 k3 k4 k5. + +We can now observe that the S-box S is not completely balanced in the +following sense. Let WX1 be (1 0 1 1)^T and WY1 = (0 1 0 0)^T. The ^T +suffix means transposition so W1, W2 are actually columns. We now have + + Pr(WX1^T x + WY1^T S(x) = 0) = 3/4 + +where the probability is taken over uniformly distributed +x:{0,1}^4. In other words, for 12 out of the 16 possible inputs x=(x1 +x2 x3 x4)^T and results S(x)=(y1 y2 y3 y4)^T one has x1+x3+x4=y2 +whereas for the remaining four one has x1+x3+x4=/=y2, +i.e. x1+x3+x4=y2+1. + +One can find such an imbalance by exhaustive search. Another such +imbalance is given by WX2=(0 1 0 0)^T WY2 = (0 1 0 1)^T: + + Pr(WX2^T x + WY2^T S(x) = 0) = 1/4 + +Chaining these imbalances through the whole cipher (using the first +one on round 1 and the second one on the other rounds) one obtains a +similar linear relationship between the plaintext, the ciphertext just +after round four, and the four round keys that have been used until +then: Letting P be the plaintext (now written as dim16 vector) and U +the ciphertext prior to applying the last S-boxes, permuting, and +mixing with the fifth and last round key we find dim16 vectors A, B +D1, D2, D3, D4 such that + + Pr(A^T P + B^T U + D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 = 0) = 15/32 + +with keys K1 ... K4 fixed and probability taken over the plaintexts P. + + +Since D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 is either 0 or 1 we have that + + Prob(A^T P + B^T U = 0) = 15/32 or Prob(A^T P + B^T U = 0) = 17/32 + +irrespective of K1,..,K4. This allows us to work out a likely value +for K5 or rather some bits of it. + +Namely, given a list of plaintexts P_1..P_N and corresponding +ciphertexts C_1..C_n we can try out all 2^16 possible choices for K_5, +for each one compute the corresponding U_i := S^{-1}(P^{-1}(C_i+K_5)), +and compute the ratio + + p(K_5) = {i | A^T P_i + B^T U_i = 0} / N + +We choose a value K_5 for which this ratio is close to 15/32 or 17/32 +and discard all those where it is close to 1/2. Now, p(K_5) does not +actually depend on all bits of K_5 but only on those where the +corresponding bit in B^T is set for the other ones only influence +components of U_i which are not used in p(K_5). Nevertheless, this +allows us to considerably narrow down the possible options for +K_5. For each of those, we can then continue in the same fashion and +thus peel off the fourth round key etc. + + +Differential cryptanalysis +- - - - - - - - - - - - - - + +Differential cryptanalysis was developed by Biham and Shamir as an +attack against DES but apparently already known to the DES +developers. It has been used to successfully break other ciphers. It +uses the fact that the effect of adding a round key cancels out when +we consider differences between two plaintexts. However, this +requires the preparation of chosen plaintexts. In more detail, the +*differential* between two plaintexts X1 and X2 is simply X1+X2, +i.e. the bitvector which has a 1 where X1 and X2 differ and a 0 where +they are equal. + +Coming back to our 4bit example S-box for any given differential there +then are 2^4=16 different ways of realising it. E.g. the differential +1010 arises for the plaintext pair 0011,1001 and also for +1111,0101. Indeed, for any plaintext X1 there is a unique X2 such that +X1, X2 have a given differential. + +Given two differentials X* and Y* we are interested in the following +question. For how many of the (16) plaintext pairs X1,X2 with +differential X* is it the case that the results of applying the S-box +Y1=S(X1), Y2=S(X2) have differential Y*. Tabulating these numbers by +exhaustive search yields the following table: + + Output differential + 0 1 2 3 4 5 6 7 8 9 A B C D E F +I 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 +n 1 0 0 0 2 0 0 0 2 0 2 4 0 4 2 0 0 +p 2 0 0 0 2 0 6 2 2 0 2 0 0 0 0 2 0 +u 3 0 0 2 0 2 0 0 0 0 4 2 0 2 0 0 4 +t 4 0 0 0 2 0 0 6 0 0 2 0 4 2 0 0 0 + 5 0 4 0 0 0 2 2 0 0 0 4 0 2 0 0 2 +d 6 0 0 0 4 0 4 0 0 0 0 0 0 2 2 2 2 +i 7 0 0 2 2 2 0 2 0 0 2 2 0 0 0 0 4 +f 8 0 0 0 0 0 0 2 2 0 0 0 4 0 4 2 2 +f 9 0 2 0 0 2 0 0 4 2 0 2 2 2 0 0 0 +' A 0 2 2 0 0 0 0 0 6 0 0 2 0 6 0 0 +i B 0 0 8 0 0 2 0 2 0 0 0 0 0 2 0 2 +a C 0 2 0 0 2 2 2 0 0 0 0 2 0 6 0 0 +l D 0 4 0 0 0 0 0 4 2 0 2 0 2 0 2 0 + E 0 0 2 4 2 0 0 0 6 0 0 0 0 0 2 0 + F 0 2 0 0 6 0 0 0 0 4 0 2 0 0 2 0 + + +The high entries 8 and 6 are of particular interest because they +constitute a deviation of an intuitive pseudorandom permutation. +Let us now see how we can exploit these in order to attack our cipher. + +We feed in a pair of 16bit plaintexts X1 X2 which differ in the second +block by B=1011 and are equal everywhere else. With probability 8/16 +this will result in differential 2=0010 at the output of the second +S-box. The output of the other S-boxes will be equal for X1 and X2. +This differential 2=0010 will be propagated through the P network and +results in a differential of 4=0100 at the entry of the third S-box of +the second round: + + Differential before P Differential after P + 0 0 0 0 0 0 0 0 + 0 0 0 0 ------P----> 0 0 1 0 + 0 1 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 + + +There, with probability 6/16 it will produce differential 6=0110 at +the output. This results in differential 2=0010 at the input of the +second and third S-boxes of the third round. With probability 36/256 +this will lead to differential 5 at both outputs which propagates to +differential 6=0110 at the entries to the second and fourth S-boxes of +the final round. + + 0 0 0 0 0 0 0 0 + 0 0 1 0 -------P-----> 0 0 0 0 + 0 0 1 0 0 1 1 0 + 0 0 0 0 0 0 0 0 + +Summing up, when we encipher plaintexts with differential 0B00 we will +obtain with probability 1/2*3/8*9/64=27/1024 a differential 0606 at +the entry to the last round. We now have to set the relevant +components (second and fourth block) of the final key in such a way +that this ratio (after inverting the last layer of S-boxes) is +actually reproduced. This allows us (with high probability) to guess +hone half of the bits of the last round key. Continuing in this way +with other differentials combined with exhaustive search allows us to +undo the entire last round and then to proceed from there. + -- cgit v1.2.3