From 162df3190c4e179ef968b0a467ddd4951faba336 Mon Sep 17 00:00:00 2001 From: Gregor Kleen Date: Tue, 24 Oct 2017 15:33:07 +0200 Subject: Notes on cryptography --- ss2017/cryptography/Notes10.txt | 257 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 257 insertions(+) create mode 100644 ss2017/cryptography/Notes10.txt (limited to 'ss2017/cryptography/Notes10.txt') diff --git a/ss2017/cryptography/Notes10.txt b/ss2017/cryptography/Notes10.txt new file mode 100644 index 0000000..5457bf3 --- /dev/null +++ b/ss2017/cryptography/Notes10.txt @@ -0,0 +1,257 @@ + + +10 Message integrity, CBC-MAC (KL4.1-4.5) +----------------------------------------- + +So far, we wanted to prevent an eavesdropper or a more active attacker +from getting to know the content of encrypted messages or at least +partial information about them. + +Another important objective for cryptography is message integrity and +authenticity. We want to prevent the attacker from being able to +manipulate an encrypted message or from fabricating one such. + +It is not the case that encryption automatically guarantees +integrity. For example, with the Vernam cipher and derived systems +flipping one bit in the ciphertext results in the same bit in the +plaintext being flipped. Also, randomly chosen ciphertexts correspond +to plaintexts albeit probably nonsensical ones. + + +Message authentication codes +- - - - - - - - - - - - - - - + +Definition: A message authentication code (MAC) consists of two +probabilistic polynomial time algorithms (Gen,Mac) where + +1. On input n (security parameter) Gen outputs a key k of length >= n +2. Given key k and message m:{0,1}* the tag-generation algorithm Mac outputs + t = Mac_k(m) + +We assume (unlike in KL where this is a special case) that Mac is +deterministic. + +If m is furthermore constrained to be from {0,1}^l(n) for some length +function l(n), e.g. l(n)=n then the MAC is called *fixed length*. EOD + +The intended usage is as follows: Alice and Bob share a secret key k +that has been obtained with Gen(). If Alice wants to send a message m +to Bob she could compute the tag t=Mac_k(m) and transmit (m,t) either +encrypted or not. + +In order to be sure that the received message (m,t) was really from +Alice he can then compute t'=Mac_k(m) and check that t=t'. The MAC +should be designed in such a way that no adversary can fabricate valid +tags unless it knows the key so that if the verification succeeds Bob +can indeed be sure that the message was from Alice. The following +experiment formalizes this. + + +The message authentication experiment Mac-forge_{A,Pi}(n): + +Pi = (Gen,Mac) a MAC, A an adversary, n the security parameter + +1. A random key k is generated with Gen(n). + +2. The adversary is given n and oracle access to Mac_k(), i.e. it can +request arbitrary tags w.r.t k. The adversary eventually outputs a +pair (m,t). + +3. The output of the experiment is defined to be 1 if + (1) Mac_k(m)=t and (2) m has not previously been queried by A. + +Thus, the adversary must prepare a valid tag for a message it has not +previously requested a tag for. + +We define a MAC to be secure if for all adversaries +the probability that Mac-forge yields 1 is negligible. + +While this notion of security is indeed very strong there are some +attacks it cannot withstand: + +1) Replay attacks. The adversary can store an authenticated message +(m,t) and send it at some later point. In order to prevent that one +can extend message with timestamps or similar. + +2) Timing attacks. If the check t=t'? proceeds by comparing bits +starting from the left until the first mismatch is encountered (cf +strcmp() from the C-library) then the time it takes to perform the +check depends on the tags. This can be used to find out the bits of a +valid tag one by one. According to KL this kind of attack has been +used against an early version of the Xbox in order to install +unlicensed games. To prevent the attack one should make sure that the +computation time for the comparison is always the same. + +Construction of a fixed-length MAC from a pseudorandom function: + +Let F_k() be a pseudorandom function. The following then is a secure +fixed-length MAC. + +1. Gen(): Given n choose k:{0,1}^n uniformly at random +2. Mac(): On input k and m output the tag t=F_k(m) + +Theorem: The above MAC is secure. + +Proof sketch: As in the other security proofs it suffices to show this for +truly random F (selected at the beginning of the experiment). The +passage to a pseudorandom F then incurs a negligible change in +probability. In order to produce a valid tag the adversary would have +to correctly guess a random value from {0,1}^n which has indeed +negligible probability. QED + + +CBC-MAC +- - - - + +In order to extend a fixed-length MAC to variable length messages one +can break the message up into equal length blocks m=m_1,...m_l of +length n each possibly padding the last block with 0s and then +authenticate each block separately. In order that this is secure, one +must, however augment each block with + +- the length l, +- its sequence number i:{1..l} +- a random identifier r:{0,1}^n (the same for each block) + +Thus, the blocks to be authenticated have length 4n each (assuming +that l and i fit into n bits) and an appropriate MAC must be +used. Alternatively, one can break m into blocks of size n/4.... + +These additional data prevent the adversary from dropping blocks at +the end (length), changing the order of blocks (sequence numbers), +mixing blocks from two different messages (random identifier). + +One can show that this scheme is indeed secure, but it is rather +inefficient since the tags are 4 times as long as the message itself. + +A more efficient method mimicks the CBC-mode encryption but is subtly +different: + +Rather than an arbitrary MAC we need a pseudorandom function F_k, +i.e. a block cipher. We then decompose the message m into equal sized +blocks + +m = m_1 .. m_l + +as before (using padding if needed). We then *prepend* the message +with another block encoding the length l (which is thus assumed +smaller than 2^n). Then blocks are encrypted and XORed with the +encryption of the previous block prior to encryption. The encryption +of the last block then is the desired MAC: + +t_0 = F_k(l) +t_{i+1} = F_k(t_i + m_i) +t = t_l + +We will prove that this scheme is secure. Notice, however, that +seemingly innocuous modifications thwart its security: + +First, let us see why we need to incorporate the length. Suppose we +would just put: + +t_1 = F_k(m_1) +t_2 = F_k(m_2+t_1) +... +t = t_l + +Then, if t=Mac_k(m_1 m_2) and t'=Mac_k(m_3+t m_4) then t'=Mac_k(m_1 +m_2 m_3 m_4) thus allowing us to forge a previously unqueried tag. + +Now suppose that we would start with a random initialisation vector IV +and transmit it along with the tag: + +t_0 = F_k(l+IV) +... + +This would allow us to modify the length: If (IV,t) is the MAC of l,m +then (IV+l+l',t) is the MAC of l',m. + +It is also not ok to attach the length at the end rather than at the +beginning. Namely, suppose that we obtain the following tags where the +length is added in parenthesis for clarity. + + +ABCDE(5) : t +XYZKL(5) : t' +ABCDE5UVW(9) : t'' + +We then have that the tag of XYZKL5 U+t+t' VW(9) is also t''. + +We will now sketch a proof that CBC-MAC is indeed secure. We will show +the following theorem from which the claim follows by the standard +reasoning used in earlier security proofs. + +Theorem: Fix a security parameter n. For g:{0,1}^n->{0,1}^n and m=m1 +m2 ... mk write CBC_g(m) = CBC_g(m1,m2,...,mk) = + + g(ml+g(m_{k-1}+ ... g(m2 + g(m1)) ... ) + +Also let f:({0,1}^n)* -> {0,1}^n be an arbitrary function. Let D be an +adversary which is given oracle access to a function ({0,1}^n)* -> +{0,1}^n. + +Suppose that D queries its oracle on a prefix-free set, i.e. if X and +Y are both queried then X is not a prefix of Y nor is Y a prefix of X. +One then has + + |Pr(D^{CBC_g()}(n) = 1) - Pr(D^{f()}(n) = 1)| = negl(n) + +where probabilities are taken over g and f chosen uniformly at random. + +To see why this implies the theorem we first consider that the +difference between CBC_g and CBC_{F_k} is negligible since F_k is +assumed to be pseudorandom. Furthermore, notice that in CBC-MAC proper +the function CBC_{F_k} is applied to a subset of P:={|m| m | +m:({0,1}^n)*} which itself is a prefix-free set. Thus, the whole +business with including the length and putting it into the first +position rather than anywhere else serves to ensure prefix-free-ness. + +Proof of Theorem (sketch; for details see KL, 2nd ed): + +Let Q={X1..Xq} be the (prefix-free) set of queries (q = number of +queries) made by the distinguisher D to the oracle. Let l be the +maximum number of blocks in those queries. Notice that both q---the +number of queries---and l---their maximum length are polynomial in n +since the distinguished is assumed to run in polynomial time. If D +queries X and X = m1 m2 ... mk then (in the case where the oracle is +instantiated with CBC_g()) the function g() will be evaluated at m1, +m2+g(m1), m3+g(m2+g(m1)), ... . Write C_g(X) for this sequence (of +length < l). We say that Q exhibits a collision if among the +g-evaluations occurring in {C_g(X) | X:Q} one finds the situation +g(x)=g(x') where x =/= x'. We also consider it a collision if at some +point we have mi+g(x) = mj'+g(x') despite x=/=x' (and then necessarily +mi=/=mj'). + +Under the assumption that g() is random, the probability that such a +collision is exhibited is bounded by q^2 * l^2 * negl(n) for there are +<= q^2 * l^2 possibilities for where such a collision might occur and +the probability for a collision at a given position is negligible. +Since q and l are polynomial in n this probability is therefore +negligible itself. + +Let us assume that no collision is exhibited. Now, for each X:Q let +v(X) be the last value in C_g(X). We have CBC_g(X) = g(v(X)). The +values v(X) are pairwise distinct and do not agree with any of the +other values in {C_g(X) | X:Q}. This is where prefix-freeness is +used. As a result, since g() is random, the answers to the queries +g(v(X)) are uniformly random and thus adhere to the same distribution +as f(X) for the random f. Thus, the only advantage that the +distinguisher has stems from the possibility of a collision being +exhibited which, as we saw above, is negligible. + + + +In conclusion, we note that an already existing implementation of +CBC-mode (for encryption) can be turned into an implementation of +CBC-MAC by + + - using IV=0 + - discarding all but the last blocks of the ciphertext + - prepending the plaintext with its length + +Mistakes in any one of these (e.g. using a random IV and transmitting +it, omitting the length or placing it elsewhere, transmitting other +blocks of the ciphertext, lead to insecurities. See the nice blog +article "Why I hate CBC-MAC"). + +https://blog.cryptographyengineering.com/2013/02/15/why-i-hate-cbc-mac/ -- cgit v1.2.3