18 files changed, 3885 insertions, 0 deletions
diff --git a/ss2017/cryptography/Notes01.txt b/ss2017/cryptography/Notes01.txt
new file mode 100644
index 0000000..75c4cc6
--- /dev/null
+++ b/ss2017/cryptography/Notes01.txt
@@ -0,0 +1,108 @@
+Throughout the notes the acronym KL stands for the book "Introduction
+to Modern Cryptography" by Jonathan Katz and Yehuda Lindell, Chapman &
+Hall, 2008.
+Set braces {} may be used to enclose a subscript consisting of more
+than one letter. E.g x_{10} is "x sub ten". Sometimes we omit the
+underscore vor subscripting and write x1 or xi for x_1 and x_i.
+1 Overview, Historic Ciphers:  Caesar, Vigenere,  (KL1.3)
+---------------------------------------------------------
+The Shift Cipher
+- - - - - - - - -
+Identify the 26 upper case alphabet letters with numbers 0..25,
+e.g. A=0, B=1.  Choose an integer k:{0..25} as "key" and encrypt a
+letter x as x+k mod 26. Encrypt a text as the sequence of the
+encryption of its letter.  For example, if k=3 then the text
+BEGINTHEATTACKNOW is encrypted as EHJLQWKHDWWDFNQRZ.  The decryption
+of a letter x is just x-k mod 26 and the decryption of a text is
+obtained by applying this procedure letterwise.
+The special case k=3 is known as "Caesar Cipher" since purportedly
+Julias Caesar used it to encrypt some of his letters. The special case
+k=13 is known as ROT-13.
+There are two simple attacks on a shift cipher. The first one uses the
+fact that there are only 26 possible keys so one can try them all
+out. This, however, requires that one can recognize the right
+plaintext, e.g. because it is correct English.
+The second is based on frequencies of letters. In a standard English
+text the letters occur with typical frequencies. E.g. E is the most
+frequent letter with likelihood 12.7%. Then follows T with 9.0% and A
+with 8.2% etc. Thus, we can count the relative frequencies of the
+letters in the ciphertext and thus work out the key.
+An improved and more systematic attack based on frequencies goes as
+follows: Let p_i stand for the frequency of letter i, e.g. p_4 = 12.7%
+since E=4. One has
+                         sum_{i=0..25} p_i^2 =~= 0.065
+So, if q_i is the frequency of the letter i *in the ciphertext* then
+we expect q_{i+k} to be close to p_i so that we can recover k by
+choosing j in such a way that the following quantity I_j is as close to
+0.065 as possible. 
+                     I_j := sum_{i=0..25} p_i * q_{i+j mod 26}
+The Vigenere Cipher
+- - - - - - - - - -
+The shift cipher suffers from two problems:
+1) the number of possible keys is too small
+2) the frequency distribution of letters in the plain text shows in
+   the ciphertext
+A famous historic cipher not suffering from the first problem is the
+Vigenere cipher. Here one selects a word or a short text as key,
+writes the key repeatedly under the plaintext and then "adds" letters
+one above another (as numbers mod 26). Here is an example with key BEADS:
+THEMANANDTHEWOMANRETRIEVEDTHELETTERFROMTHEPOSTOFFICE
+BEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBE
+ULEPSOENGLIIWREBRRHLSMEYWEXHHDFXTHJGVOPLIIPRKUSFIADI
+Alas, the frequency distribution of the plaintext still seeps through
+and we can use this information to break the cipher without knowing
+the key:
+If we know the length of the key, e.g. 5, and the ciphertext is x_0
+x_1 x_2 ...  then we can apply the previous frequency analysis (with
+the I_j scores) to the 5 texts
+x_0 x_5 x_10 ...
+x_1 x_6 x_11 ...
+...
+x_4 x_9 x_14 ...
+all of which are obtained by encrypting the corresponding plaintext
+fragments with a shift cipher. Therefore, their frequency spectrum
+should be a shift of the standard one and the corresponding shift,
+hence the individual letters of the key, can be retrieved.
+In order to get the key length we can follow a similar approach. For a
+given shift t let q_i be the relative frequency of letter i (e.g. E
+when i=4) in the sequence x_t, x_2t, x_3t. If the shift t is the
+actual key length then this subsequence will have the frequency
+spectrum as English plaintexts just shifted by t so we expect that
+            J_t := sum_{i=0..25} q_i^2 =~= 0.065
+since the summation is invariant under the any shift. If, however, the
+shift was the wrong one then we expect the letters to appear
+completely random so that in that case the above sum is rather close
+to 1/26 = 0.038. Thus, we compute J_t for t=1,2,3,... and pick the t
+for which J_t is largest (and close to 0.065).
+Other historic ciphers include monoalphabetic substitution (key =
+arbitrary permutation of the 26 letters), Grand Chiffre, the German
+Enigma, etc. All these historic ciphers have been broken. See
+Wikipedia and other online resources.
diff --git a/ss2017/cryptography/Notes02.txt b/ss2017/cryptography/Notes02.txt
new file mode 100644
index 0000000..59da143
--- /dev/null
+++ b/ss2017/cryptography/Notes02.txt
@@ -0,0 +1,214 @@
+2 Principles of  cryptography (KL1.1, KL1.2, KL1.4, KL2)
+--------------------------------------------------------
+These considerations and similar ones applying to other historical
+ciphers have led to the principles for the design of cryptosystems.
+Before stating these we describe somewhat more formally what a
+(private-key) cryptosystem is. 
+A cryptosystem comprises three algorithms:
+- A key generation algorithm Gen which outputs a key according to some
+  format an probability distribution.
+- An encryption algorithm Enc which given plaintext m and key k
+  produces a ciphertext Enc_k(m)
+- A decryption algorithm Dec which given ciphertext c and key k
+  produces a plaintext Dec_k(m).
+The set of all keys possibly put out by the key generator Gen is
+called keyspace (K).  For every key k produced by Gen it must hold
+that Dec_k(Enc_k(m)) = m. There is also a finite set of possible
+messages M and ciphertexts C. Clearly, Gen must be randomized,
+otherwise it would always return the same key. Sometimes Enc is also
+randomized, i.e. ciphertexts may contain some random information and
+do not depend functionally on plaintext and key.
+One of the most important design principles is Kerckhoff's rule (19th
+century): Security of the cryptosystem should rely on secrecy of the
+key not on secrecy of the method. In other words, the cryptosystem
+should remain secure if everything about the three algorithms involved
+is publicly known. "No security through obscurity".
+In addition to this, the cryptosystem should withstand some or all of
+the following kinds of attack:
+Ciphertext only attack: an eavesdropper observes ciphertext and aims
+at determining the underlying plaintexts.
+Known plaintext attack (KPA): This is when the eavesdropper knows some
+ciphertext-plaintext pairs for one and the same key and tries to use
+this information to then decrypt additional ciphertexts (again
+w.r.t. the same key).
+Chosen plaintext attack (CPA): Here the adversary may prepare
+plaintexts of its choice and request the corresponding ciphertexts.
+Chosen ciphertext attack (CCA): Here the adversary can even request
+the decryption of ciphertexts of its choice.
+Modern cryptography, say since WW2, holds the following additional
+principles:
+Principle 1: Formulation of exact definitions
+Which notion of security is one aiming at?
+No adversary should be able to know the key ... But what if they can
+decrypt ciphertexts without knowing the key?
+No adversary should be able to decrypt any ciphertext unless they know
+the key. But what if they can find out some bits of the ciphertext,
+e.g. every second letter, or a block of letters somewhere in the
+middle?
+Principle 2: Reliance on precise assumptions
+Typically, security of a cryptosystem can be guaranteed only by making
+assumptions on the computational power of an adversary (with unlimited
+resources they could always try out all plaintexts) and also on the
+computational difficulty of solving certain algorithmic
+problems. Desirable as it would be to have unconditional security; if
+one has to work with assumptions one should at least state them
+precisely. Sometimes, security can only be proven for idealized
+versions of cryptosystems that use components not known to exist but
+which are believed to be "approximated" by existing components. Again,
+it is then important to clearly state what these idealized components
+or versions are.
+Principle 3: Rigorous proofs of security 
+Once the desired notion of security and the assumptions are clearly
+defined one should then establish a rigorous mathematical proof of
+conditional security.
+"Given that Assumption X is true, Construction Y is secure according
+to the given definition."
+As an example, we now look at perfect security in the sense of Shannon
+and see how Vernam's cipher (patented in 1919) implements it.
+In order to get around the fact that an adversary can always try out
+all possible plaintexts but that this kind of attack "doesn't count"
+Claude Shannon suggested to use probabilities: if one out of two
+messages is selected at random and the adversary gets to see the
+ciphertexts only it has a chance of no more than 50% of determining
+the plaintext correctly from the ciphertext. For cryptosystem Pi and
+adversary A we thus define the following
+Eavesdropping indistinguishability experiment PrivK^eav_{A,Pi}:
+  1. The adversary produces two messages m0, m1 from the message space M.
+  2. Gen is used to generate a (random) key k and a bit b <- {0,1} is
+     chosen uniformly at random (50/50). The (or rather a) ciphertext
+     c = Enc_k(m_b) is then given to A. 
+  3. A answers by outputting a bit b'.
+  4. The output of the experiment is defined 1 if b=b' and 0 otherwise.
+Definition (perfect secrecy): The encryption scheme Pi=(Gen,Enc,Dec)
+is perfectly secret if for all adversaries A it holds that
+                Pr[PrivK^eav_{A,Pi} = 1] = 0.5
+The following cryptosystems are not perfectly secure:
+A book cipher which uses a secret code book containing for each *word*
+another word to replace it. E,g. "bomb" gets enciphered as "cake" and
+"dollars" gets enciphered as "grams". Here the adversary could select
+m0="bomb bomb" and m1="dollar bomb". If the received ciphertext has
+the form c = x x then it will output b'=0 and if c = x y with x =/= y
+it goes for 1.
+The Vigenere cipher with key length uniformly chosen at random from
+{1,2} (and once the length is fixed keys also chosen at random).  The
+adversary can choose m0 = aa and m1=ab. If the ciphertext has the form
+xx it goes for b'=0. Otherwise it returns b'=1.
+Now, if b=0 then there is a 1/2+1/2*1/26 chance that the ciphertext
+has the form xx (key length 1 or key length 2 with equal letters) and thus
+the adversary is right with probability 1/2+1/52
+If b=1 then the adversary is right with probability 1-1/52 (the only
+way to be wrong is key length two and two key letters of distance 1).
+Thus, the total probability for A to be right is 1/2*(1/2+1/52) +
+1/2*(1-1/52) = 3/4.
+Proposition: Fix a cryptosystem Pi and suppose that plaintexts and
+keys are drawn according to some, not necessarily uniform
+distribution. The sytem Pi is perfectly secret if and only if
+               Pr[M=m | C=c] = Pr[M=m]
+for every fixed plaintext m and ciphertext c such that Pr[C=c] !=
+0.
+Proof (sketch): Suppose that Pr[M=m | C=c] != Pr[M=m] for some c and
+m. Since sum_m Pr[M=m | C=c] = sum_m Pr[M=m] = 1 (both are probability
+measures) there must exist m_0,m_1 (one of them m) such that Pr[M=m0 |
+C=c] < Pr[M=m0] and Pr[M=m1 | C=c] > Pr[M=m1].  By Bayes' theorem it
+follows that Pr[C=c|M=m0]<Pr[C=c] and Pr[C=c|M=m1]>Pr[C=c], thus in
+particular, Pr[C=c|M=m0] < Pr[C=c|M=m1].
+Now, consider the adversary who presents these messages m_0, m_1 and
+if the ciphertext returned is c
+then outputs b'=1, i.e. assume that the plaintext was not
+m0 and outputs b'=0 otherwise. Its success probability is 
+1/2*(1-Pr[C=c|M=m0]) + 1/2*Pr[C=c|M=m1] =
+1/2 + 1/2(Pr[C=c|M=m1] - Pr[C=c|M=m0])
+which is strictly above 1/2.
+For the converse, one uses the fact that all the adversary knows is
+the ciphertext, but according to the assumed equality Pr[M=m | C=c] =
+Pr[M=m] this does not help.                                  Q.E.D.
+Now, such perfectly secret cryptosystems exist indeed: for fixed n:N
+put M=C=K={0,1}^n and let Gen return k:K uniformly at random. The
+encryption function is Enc_k(m) = m+k and Dec_k(m) = m+k as well,
+where + denotes pointwise addition modulo 2 (XOR). For example, if n=5
+and k=11010 and m=11100 then Enc_k(m)=00110.
+This system is known as Vernam's cipher or "one time pad" the point
+being that its secrecy relies on the key being chosen afresh at random
+every time one wants to use the scheme. It can also be seen as a
+special case of the Vigenere cipher with the key being as long as the
+entire text to be enciphered.
+Proposition: The Vernam cipher is perfectly secret.
+Proof: This is because for a given plaintext m the ciphertexts are
+uniformly distributed, i.e., their distributions are identical and
+independent for m0 and m1. As a result, the adversary cannot have an
+advantage. Q.E.D.
+The obvious disadvantage of the one-time pad is the key length,
+nevertheless it is sometimes used, e.g. purportedly during the Cold
+War for secure communication between the White House and the Kremlin. 
+Unfortunately, this key length is the price of perfect secrecy:
+Theorem: If Pi is a perfectly secret cryptosystem with key space K and
+message space M then |K|>=|M|, i.e. there have to be at least as many
+possible keys as there are possible messages ("the key must be as long
+as the message").
+Proof: Suppose for a contradiction that |K|<|M| and that ciphertext
+c:C occurs with nonzero probability. Let M(c) = {m | Dec_k(c)=m for
+some k:K}.  We must have |M(c)|<=|K| since for each m:M(c) there is at
+least one k:K such that m=Dec_k(c) and the keys thus corresponding to
+distinct messages are distinct. If now |K|<|M| there must exist some
+m' not in M(c). But then Pr[M=m'|C=c] = 0 =/= Pr[M=m'], so the scheme
+is not perfectly secret.
+This result is due to Claude Shannon who gives in fact a more precise
+characterisation of perfect secrecy, see (KL2.4).
diff --git a/ss2017/cryptography/Notes03.txt b/ss2017/cryptography/Notes03.txt
new file mode 100644
index 0000000..fb94b87
--- /dev/null
+++ b/ss2017/cryptography/Notes03.txt
@@ -0,0 +1,100 @@
+3 Semantic Security (KL3.1, KL3.2)
+----------------------------------
+In order to formally study secrecy of schemes that use shorter keys
+one must somehow incorporate the computational power of an
+adversary. One way of doing this is to use concrete bounds:
+A scheme is (t,epsilon)-secure if every adversary running for time t
+succeeds in breaking the scheme (in the appropriate sense) with
+probability at most epsilon.
+For example, if we use keys of bit length n and the only possible
+attack is by trying out all keys (this happens e.g. with Vernam's
+cipher) then assuming that 10^9 keys can be tried out every second
+(1GHz) trying out 2^60 operations takes 35years thus several years
+even with highly parallel computers. Thus, a 60bit key takes this time
+to be broken with certainty, a 64 bit key takes this time to be broken
+with probability 1/16 etc.  Currently, one recommends key lengths of
+128bit (128 *bits of security*) to make the success probability
+negligible (e.g. lower than that of a meteor crashing into the device)
+and to allow for future technological advances.
+Now, working with concrete runtimes and probabilities is difficult and
+brittle because of its dependency on concrete hardware configuration,
+programming language, implementation details, etc. As a result, one
+uses instead an asymptotic approach where runtimes and probabilities
+as well as sizes of keys and texts are functions of a *security
+parameter* n. One then wants that the success probability of an
+adversary whose runtime is polynomial in the security parameter is a
+*negligible function* of the security parameter, i.e. smaller than n^k
+for any k.
+Definition: A private-key encryption scheme is a tuple of
+probabilistic polynomial-time algorithms (Gen,Enc,Dec) such that:
+1. Gen computes a key (in {0,1}^*) from the security parameter
+(presented in unary, i.e. as 1...1 (n times)). We write this as k <-
+Gen(n) or k <- Gen(1^n) to emphasize the unary notation.
+2. Enc_k(m) and Dec_k(c) are as before, except that m, c are
+bitsequences (in {0,1}^*). In particular, we must have Dec_k(Enc_k(m))=m.
+A fixed-length private-key encryption scheme is defined analogously,
+but there is a function l(n) fixing the length of plaintexts as a
+function of the security parameter. Formally, when k is returned from
+Gen(n) then Dec_k(m) is defined only for m of length exactly l(n).
+For example, the one-time pad can be cast into this framework. Given
+the security parameter n, Gen generates a random bit sequence of
+length n. One has l(n)=n and Enc_k(m) = m+k etc.
+Indistinguishability in the presence of an eavesdropper
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - 
+We now want to adapt the notion of perfect secrecy from above to the
+asymptotic framework. An adversary thus also gets the security
+parameter n and its runtime must be bounded by a polynomial in n. It
+is allowed to use randomization (has access to random bits). 
+Given a scheme Pi=(Gen,Enc,Dec) and an adversary A the experiment
+PrivK^eav_{A,Pi}(n) is now defined as follows:
+  1. The adversary produces two messages m0, m1 of equal length
+     and of length l(n) in the case of fixed length. 
+  2. Key k is generated by Gen(n) and a bit b <- {0,1} is
+     chosen uniformly at random (50/50). The (or rather a) ciphertext
+     c = Enc_k(m_b) as well as the security parameter n are then given to A. 
+  3. A answers by outputting a bit b'.
+  4. The output of the experiment is defined 1 if b=b' and 0 otherwise.
+Definition:  A scheme Pi=(Gen,Enc,Dec) has *indistinguishable
+encryptions in the presence of an eavesdropper* if there is a
+negligible function negl(n) such that for all adversaries A
+      Pr[PrivK^eav_{A,Pi}(n) = 1] <= 1/2 + negl(n)
+Remark: rather than having the adversary supply m0 and m1 they could
+also be universally quantified. Randomizing over m0 and m1 would lead
+to a weaker (and unreasonable) definition though.
+*Semantic security* proper: 
+One can show (see 3.2) that this definition is equivalent to other
+ones that give even more power to the adversary. The most compelling
+one is "semantic security" which assumes an arbitrary probability
+distribution on the plaintexts and also partial information on the
+plaintexts to be given to the adversary.
+We remark that in general proofs of semantic security (in the sense of
+the above definition) are constructive in that the negligible function
+bounding the probability can be read off as an explicit term. From
+this, one can --- possibly after optimizing the constructions used in
+the proof --- read off concrete bounds for fixed security parameter n.
diff --git a/ss2017/cryptography/Notes04.txt b/ss2017/cryptography/Notes04.txt
new file mode 100644
index 0000000..cabf8f9
--- /dev/null
+++ b/ss2017/cryptography/Notes04.txt
@@ -0,0 +1,78 @@
+4 Pseudorandomness (3.3)
+------------------------
+In order to arrive at cryptosystems with keys considerably shorter
+than messages it is natural to use a generator which from a truly
+random seed produces (deterministically!) a sequence of bits that
+looks random. Indeed, such generators are often used in the
+implementation of random in standard libraries e.g. for C or Java.
+The question is whether a so constructed cryptosystem might be while
+not perfectly secret still enjoy semantic security.  The answer is yes
+if the generator is a *pseudorandom generator*, PRG for short, in the
+following sense.
+Definition (pseudorandom generator): Let l(n) be a polynomial and G a
+deterministic polynomial time algorithm that for any bitstring s of
+length n outputs a bitstring G(s) of length l(n). We call G a
+*pseudorandom generator* with expansion l(n) if for any probabilistic
+polynomial time adversary D ("distinguisher") it holds that
+ | Pr[D(r)=1] - Pr[D(G(s))=1] |  = negl(n)
+for some negligible function negl(n) and for all n.  Here s and r are
+bitstrings of lengths n and l(n), respectively, chosen uniformly at
+random.
+We remark that if l(n)=n then G(s)=s is trivially a pseudorandom
+generator and if l(n)>n and G is any function sending length n
+bitstrings to length l(n) bitstrings will be vulnerable against
+distinguishers with exponential computing power. Simply let D
+enumerate all length n bitstrings s' and check for each of them
+whether its input equals G(s'). Return 1 if yes, 0 otherwise. When fed
+G(s) the distinguisher will output 1 with certainty and if fed a
+random string then it will output one with probability 2^{-(l(n)-n)}
+only. The difference between these probabilities is not negligible.
+In summary, pseudorandom strings are as good as truly random ones so
+long as they are used in a probabilistic polynomial time context and
+one accepts negligible probability of failure.
+Unfortunately, one does not know whether there exist PRG with l(n)>n
+at all, but it is generally believed that they do. At least, there do
+exist several such objects for which no (probabilistic, polynomial
+time) distinguisher is currently known despite a lot of research.
+Semantically secure cryptosystems from PRG (KL3.4.1)
+- - - - - - - - - - - - - - - - - - - - - - - - - - 
+Assuming that we have a PRG G with expansion factor l(n). We can use
+it in order to construct a semantically secure fixed-length encryption
+scheme for messages of length l(n) as follows:
+Gen(n): return a random bitstring of length n as key.
+Enc_k(m): return G(k)+m /* + = bitwise XOR */
+Dec_k(c): return G(k)+c
+Notice that Dec_k(Enc_k(m)) = G(k)+G(k)+m = m
+Theorem: If G is a PRG of expansion l then the above construction has
+indistinguishable encryptions in the presence of an eavesdropper.
+Proof: Let Pi stand for the scheme constructed above from G and
+l(n). Assume further, that A is an adversary against Pi. We construct
+from A a distinguisher D as follows:
+Given a string w (either random or of the form G(s)) the distinguisher
+begins by letting the assumed adversary A produce two messages m0 and
+m1. It then selects b:{0,1} at random and sends m_b + w to the adversary A.
+If A outputs b the distinguisher returns 1 and it returns
+0 otherwise. Let p be the success probability of A in the PrivK^eav
+experiment which must be shown to be 0.5+negl(n). If w is truly random
+then so is m_b+w and the chance that A finds out b is exactly 1/2. If,
+on the other hand, w is G(s) for random s, then this likelihood is
+p. The difference between the two, i.e. p-1/2 is negligible since G
+was assumed to be a PRG and as a result p=1/2+negl(n) as required.
+Q.E.D.
diff --git a/ss2017/cryptography/Notes05.txt b/ss2017/cryptography/Notes05.txt
new file mode 100644
index 0000000..beca359
--- /dev/null
+++ b/ss2017/cryptography/Notes05.txt
@@ -0,0 +1,217 @@
+5 Security against chosen plaintext attacks (KL3.5)
+---------------------------------------------------
+We now consider security of cryptosystems against stronger adversaries
+which may request the encryption of arbitrary plaintexts of their
+choice. There are some practical scenarios where this might occur,
+e.g. a smartcard which can be interacted with arbitrarily or by
+preparing some situation whose description will then be transmitted by
+the opposite party in encrpyted form.
+Given a scheme Pi=(Gen,Enc,Dec) and an adversary A the experiment
+PrivK^cpa_{A,Pi}(n) is now defined as follows:
+  1. Key k is generated by Gen(n)
+  2. The adversary A is given the security parameter n and the procedure
+     Enc_k as a black box (oracle access). After some interaction with
+     the oracle it then outputs two messages m0 and m1 from {0,1}^{l(n)}
+     (or of the same length in the case of variable length schemes).
+  3. A  bit b <- {0,1} is chosen uniformly at random (50/50). The
+     (or rather a) ciphertext  c = Enc_k(m_b) is given to A.
+  4. A continues to have oracle access to Enc_k and then answers by
+     outputting a bit b'.
+  4. The output of the experiment is defined 1 if b=b' and 0 otherwise.
+Definition: A scheme Pi=(Gen,Enc,Dec) has *indistinguishable
+encryptions under a chosen plaintext attack* (is cpa-secure for short)
+if there is a negligible function negl(n) such that for all (pr.-polytime)
+adversaries A
+      Pr[PrivK^cpa_{A,Pi}(n) = 1] <= 1/2 + negl(n)
+There is also a "known plaintext version" where the adversary does not
+get full-blown oracle access to Enc_k but can merely request
+encryptions of a previously submitted list of plaintexts. The details
+of this are left to the reader.
+We first notice that no deterministic scheme can possibly be
+cpa-secure. Given c simply compute (using the oracle) c_0=Enc_k(m_0)
+and c_1=Enc_k(m_1) and return b' such that c_b'=c. Thus, any cpa-secure
+scheme must include a random element into the encryption. But naive
+ways of doing so will not be successful either. Suppose for example
+that Pi=(Gen,Enc,Dec,l) is a deterministic scheme, i.e. Enc_k( ) is a
+function. Design a new scheme Pi' as follows:
+For bitstrings u,v we let u || v stand for the concatenation of u and
+v. If w is a bitstring of even length 2n then we let w.1 and w.2 stand
+for its first and second halves.
+Gen' = Gen
+l'(n) = l(n)/2 /* w.l.o.g. l(n) is even */
+Enc'_k(m) = Enc_k(r || m) where r is a random bitstring
+             of length l(n)/2.
+Dec'_k(c) = Dec_k(c).1
+Clearly, Pi' has indistinguishable encryptions in the presence of an
+eavesdropper if Pi has, but Pi' does not in general withstand chosen
+plaintext attacks. Namely, consider that Enc_k(m)=G(k)+m as above. We
+then have Enc'_k(m) = G(k).1 + r || G(k).2 + m.  As a result, the
+adversary can take the first half of the received ciphertext c,
+request encryptions c_0 nd c_1 of m_0 and m_1 and determine b' so that
+c.1 = (c_b').1 .
+In order to achieve the desired security we need a *pseudorandom function*
+Definition: A pseudorandom function (PRF) is a binary function on bitstrings
+                F : {0,1}^* x {0,1}^* -> {0,1}^*
+with application F(k,x) written F_k(x) such that
+ 1) |k|=|x| implies |F_k(x)|=|k|=|x| (and F need not be defined on
+    arguments of different length)
+ 2) F is deterministic, polynomial-time computable
+ 3) For all probabilistic polynomial-time distinguishers D and all n
+    one has
+     | Pr[ D^{F_k()}(n)=1 ] - Pr[D^{f()}(n)=1] | = negl(n)
+    for some negligible function negl() and probabilities taken over
+    k and f(), see below.
+The distinguisher has access to an oracle, i.e. a subroutine of type
+{0,1}^n->{0,1}^n. In the experiment, first k:{0,1}^n is drawn
+uniformly at random and then a function f from {0,1}^n->{0,1}^n is
+drawn uniformly at random (from the 2^{n*2^n} many possibilities). The
+oracle is then instantiated with F_k() on the one hand and
+with f on the other. The probabilities for the distinguisher to output
+1 in either case should differ only negligibly.
+So, where a pseudorandom generator expands a size n key k to a size
+l(n) bitstring that looks random a pseudorandom function expands a
+size n key to a function F_k which could be written as a size n*2^n
+bitstring. Again, this "very large" object should look
+random. However, only a polynomial portion of it can be examined by
+the distinguisher. On the other hand, which portion that is is up to
+the distinguisher to decide and not priori fixed.
+We remark that one often uses variants of PRF where the size of
+argument x and results are functions of a security parameter other
+than the identity function. (The security parameter must then be
+supplied explicitly to F)
+Relationship between PRF and PRG:
+It is not hard to see that every PRF can serve as a PRG. Given k
+simply output the string F(k,0) || F(k,1) || F(k,2) || ... || F(k,N) for
+sufficiently large N and where e.g. 2 stands for the encoding of 2 as
+a length n bitstring. If a distinguisher could tell this from a random
+string one could easily use it to distinguish F_k from a random
+function. Surprisingly, the converse also holds, see (KL6.5). However,
+we still do not know whether PRG exist.  In practice, PRF are
+constructed using heuristic approaches.
+A CPA-secure cryptosystem from a PRF (KL3.6.2)
+- - - - - - - - - - - - - - - - - - - - - - - -
+Here is, how we use a PRF to construct a CPA-secure cryptosystem. Note
+that by our earlier observation the scheme itself must be randomised.
+Gen(n): return a random bitstring of length n as key.
+Enc_k(m): choose r:{0,1}^n uniformly at random and output the ciphertext
+            c := r || F_k(r)+m
+Dec_k(c): given c extract c.1=r and c.2=F_k(r)+m then return c.2+F_k(r)
+Notice that Dec_k(Enc_k(m)) = F(k,r)+F(k,r)+m = m
+Remark: A random element like r added to a piece of data is known as a
+*nonce*.
+Theorem: If F is a PRF then the above construction yields a fixed
+length (l(n)=n) CPA-secure cryptosystem.
+Proof: Let Pi stand for the scheme constructed above from F and
+l(n). Assume further, that A is an adversary against Pi. We construct
+from A a distinguisher D for the PRF F as follows:
+Given security parameter n and oracle access to a function h (which is
+instantiated as either F_k() or a truly random f()) the distinguisher
+begins by passing the security parameter n to the adversary A.
+Now, whenever the adversary queries its own oracle which it assumes to
+be of the form Enc_k() then the distinguisher chooses r:{0,1}^n at
+random and returns r || h(r)+m to A.
+When the adversary enters the second phase and outputs two messages
+m_0, m_1 to distinguish based on their encryptions the distinguisher
+then chooses a random bit b:{0,1} and r:{0,1}^n uniformly at random
+and returns r || h(r)+m_b to A. Further oracle queries from A are
+answered as before.
+When A eventually outputs a bit b' reply 1 if b=b' and 0 otherwise.
+Let us see what the adversary can do when h is a truly random
+function. Unlike in the "eavesdropper" case, A has a certain
+advantage: namely suppose it has already requested encryptions c_0 and
+c_1 of the messages m_0 and m_1 it then forwards (and it would be
+"wise" for A to do so). If it now happens by accident that
+c.1=c_0.1=c_1.1, i.e. the three random nonces used were all equal,
+then since the rest is deterministic, the adversary can pick b' such
+that c.2=c_{b'}.2 and be correct. Since, however, the adversary can
+only make polynomially many queries the likelihood for this to happen
+is negligible. If, however, the random nonce used in the preparation
+of c is different from all the ones used in oracle queries then c
+looks completely random to A and its success probability is exactly
+1/2. Summing up, if h is truly random then A's success probability is
+1/2+negl(n).
+If, on the other hand, h is F_k(), then let p be the success
+probability of the adversary A on our new cryptosystem. It is clear
+that our distinguisher will output 1 with probability p for we have
+used A according to the rules of the PrivK^cpa-experiment. Since F was
+assumed to be a PRF we thus know that |p-(1/2+negl(n))| is itself
+negligible. It then follows that p is bounded by 1/2 plus a negligible
+function.                                                      Q.E.D.
+Notice that this proof newly introduces negligible probabilities
+rather than (or in addition to) just move them around and thus gives
+further evidence as to why allowing negligible deviations from the
+ideal situation is reasonable.
+We also remark that the way the system was presented the length of the
+key equals that of the message. However, in the case of a CPA-secure
+system we can encrypt several messages with the same key without
+compromising security. We generalise this idea in the next chapter.
+Proposition: Let Pi=(Gen,Enc,Dec) be cpa-secure and define
+Pi'=(Gen,Enc',Dec') by Enc'_k(m1 || m2) = Enc_k(m1) || Enc_k(m2). Then
+Pi' is cpa-secure. Here m1, m2, as well as, c1,c2 are assumed to be of
+equal length.
+Proof: From a hypothetical cpa-adversary A' against Pi' we construct a
+cpa-adversary A against the original system Pi as follows:
+A runs A' and answers any oracle requests according to Pi'. E.g. if A'
+asks for an encryption of m1||m2 it requests encryptions c1 and c2 of
+m1, m2, respectively and then provides the answer c1||c2.
+When A' outputs the challenge m1^0||m2^0 vs m1^1||m2^1 A submits the
+challenge m2^0 vs m2^1 and receives a ciphertext c2 (which should be an
+encryption of m2^0 or m2^1). A then requests an encryption c1 of m1^0 and
+passes c1||c2 to A'.
+The answer bit b' supplied by A' is then output.
+Let p be the success probability of A'.
+If the secret bit b is equal to 0 then the
+success probability of A is p as well.
diff --git a/ss2017/cryptography/Notes06.txt b/ss2017/cryptography/Notes06.txt
new file mode 100644
index 0000000..a14e840
--- /dev/null
+++ b/ss2017/cryptography/Notes06.txt
@@ -0,0 +1,216 @@
+6 Block ciphers, Modes of operation (KL3.6.3 KL3.6.4)
+-----------------------------------------------------
+A pseudorandom function F() with the additional property that F_k is
+invertible and the inverse F_k^{-1} is computable in polynomial time
+is known as *block cipher* or *pseudorandom permutation*. It is not
+hard to see that in this case,
+Enc_k(m) = F_k(m)
+Dec_k(c) = F_k^{-1}(c)
+is semantically secure (hence the name "block cipher"). While a simple
+PRG already suffices to build a semantically secure cryptosystem those
+based on PRF as above enjoy stronger properties as we now explain.
+With the help of a block cipher it is possible to design encryption
+schemes for messages of arbitrary length using various *modes of
+operation* that we now describe. In practice, the underlying block
+ciphers (or pseudorandom functions) are heuristically designed
+functions like DES or AES which we we will look at later. In some
+modes of operation it suffices to have a pseudorandom function rather
+than permutation.
+In each mode of operation we divide the message to be encrypted into
+blocks of size n, where n is the security parameter and hence the size
+of the messages the pseudorandom function can work with. If the length
+of the whole message is not a multiple of n we must *pad* the last
+block in some way, e.g. by adding 1000...0. There are various padding
+schemes which describe this in detail. We henceforth assume that
+padding has already taken place and that the message comprises l
+blocks m_1,...,m_l of length n each.
+Remark: it is possible to vary l from message to message and in this
+way to encrypt a *stream* of bits rather than a block. One thus
+effectively obtains so-called *stream ciphers*. Such stream ciphers
+are also constructed and studied in their own right, see
+e.g. KL3.4. In this course, we do not consider stream ciphers.
+ECB mode (electronic code book)
+- - - - - - - - - - - - - - - - 
+This is the simplest mode of operation and it is actually not secure
+at all, hence obsolete and included only for historical reasons. Let F
+be a block cipher. 
+The encryption of the plaintext m = m1..ml under ECB-mode is given by
+                Enc_k(m) = F_k(m1) F_k(m2) ... F_k(ml)
+i.e. the encodings F_k(mi) of the blocks mi are simply
+concatenated. This mode receives its name from the fact that it
+resembles a code book (which for each short message lists the
+corresponding code) albeit in electronic form.
+It is easy to see that ECB is not even secure against an
+eavesdropper. If, assuming blocklength 3, we choose messages m=000000
+and m'=000111 we can tell them apart on the basis of the corresponding
+ciphertexts c,c' alone because c has the form xyzxyz whereas c' does
+not. This deficiency of ECB shows in practice when it is used to
+encode images, see KL (2nd ed) for an example.
+CBC mode (cipher block chaining)
+- - - - - - - - - - - - - - - - - 
+Here we choose a random initialisation vector IV:{0,1}^n and encrypt as
+Enc_k(m) = IV c1 c2 c3 ... cl
+where c1=F_k(IV+m1), c2=F_k(c1+m2), c3=F_k(c2+m3), ...
+I.e. each block is xor-ed with the encoding of the previous block
+prior to sending it through F_k.
+One can show that CBC-mode is CPA-secure, however it suffers from the
+disadvantage that the ith ciphertext is needed to compute the i+1-st
+one so the scheme cannot be parallelized.
+The following variant of CBC-mode has been used in practice (SSL 3.0, TLS 1.0):
+Chained CBC-mode (not CPA secure)
+- - - - - - - - - - - - - - - - - 
+Rather than choosing a fresh initialisation vector each time it has
+been proposed to use instead the last ciphertext, i.e., c_l as the new
+initialisation vector.  This reduces the size of the subsequent
+message by one block and should intuitively be as good as CBC-mode
+proper. However, this scheme is vulnerable to a CPA-attack as follows:
+Suppose that the attacker knows that m_1 is one out of m_1^0 and m_1^1
+and that m = m_1 m_2 m_3 has already been encrypted as IV c_1 c_2
+c_3. The attacker could now request the encryption of another message
+starting with the block
+                  m_1' := IV + m_1^0 + c_3
+The first ciphertext block c_1' then satisfies
+                  c_1' = F_k(c_3+m_1') = F_k(IV + m_1^0)
+Noticing that c_1 = F_k(IV + m_1) we then have m_1 = m_1^0 iff c_1' =
+c_1.
+OFB mode (output feedback)
+- - - - - - - - - - - - - -
+Here we only need a pseudorandom function. From a randomly chosen
+initialisation vector IV we first compute pseudorandom strings r1...rl
+as follows:
+r1 = F_k(IV), r2 = F_k(r1), ...
+and then encrypt using xor:
+Enc_k = IV m1+r1 m2+r2 m3+r3 ... ml+rl
+So, in essence, we use F_k as a pseudorandom generator of expansion
+l(n) = l*n and then apply the earlier construction of an EAV-secure
+scheme from a PRG. Due to the additional randomization via the
+initialisation vector OFB-mode is even CPA-secure.
+CTR mode (counter)
+- - - - - - - - -
+This mode is similar to OFB, but we compute the pseudorandom string by
+successively incrementing IV:
+r_i = F_k(IV+i)      /* this + is binary addition, not XOR */
+Enc_k = IV m1+r1 m2+r2 m3+r3 ... ml+rl
+This has the advantage that the ri and hence the ciphertexts can be
+computed in parallel. A further advantage that it shares with OFB is
+that the pseudorandom string IVr1..rl can be computed ahead of time
+and stored until the message to be encrypted arrives. Furthermore,
+portions of the ciphertext can be decrypted without computing or
+decrypting anything else. This property is known as *random access*.
+Theorem: If F is a pseudorandom function then (randomised) counter
+mode (as described above) is CPA-secure.
+Proof: (Unlike KL we make the simplifying assumption that message
+length l is always the same). Let A be an adversary as in the Priv^cpa
+experiment. We transform it into a distinguisher D for the underlying
+pseudorandom function as we did in the proof of the last Theorem.
+Given security parameter n and oracle access to a function h (which is
+instantiated as either F_k() or a truly random f()) the distinguisher
+begins by passing the security parameter n to the adversary.
+Now, whenever the adversary queries its own oracle which it assumes to
+be of the form Enc_k() then the distinguisher chooses IV:{0,1}^n at
+random, computes the sequence ri=h(IV+i) and submits IV m1+r1
+... ml+rl to the adversary.
+When the adversary enters the second phase and outputs two messages
+m^0, m^1 to distinguish based on their descriptions the distinguisher
+then chooses a random bit b:{0,1} and IV:{0,1}^n uniformly at random
+and returns the encryption of m^b w.r.t. IV to A. Further oracle
+queries from A are answered as before.
+When A eventually outputs a bit b' reply 1 if b=b' and 0 otherwise.
+Let us see what the adversary can do when h is a truly random
+function. Again, the adversary has a certain advantage which we now
+try to bound from above.
+Let us assume that for any two encryptions requested the sequences of
+the counter values do not overlap, i.e. the sets
+{IV,IV+1,IV+2,...,IV+l} and {IV',IV'+1,IV'+2,...,IV'+l}, where IV and
+IV' are the respective initialisation vecotrs chosen, are disjoint. In
+this case, the ciphertexts follow a uniform random distribution and
+the adversary cannot glean any information whatsoever, i.e. has
+success probability 1/2.  If they do overlap the adversary may or may
+not use the extra information but if we can show that the probability
+of this happening is negligible we are good. So, this is what we now
+try to do.
+Let us first consider that the overlap happens between the first and
+second oracle question. In this case, once IV has been chosen, at most
+2l values for IV' are unfavorable (or favorable depending on
+viewpoint): IV, IV-1 ... IV-l, IV+1, IV+2, ... IV+l. Thus, we have a
+probability of 2l*2^n. Now, since such a clash can happen anywhere we
+multiply this with q(n), the number of questions being asked. Thus, we
+get an overall bound of q(n)*2l/2^n on the probability of an
+overlap. Even if l is not constant, but polynomial in the security
+parameter this is negligible (noting of course that in view of the
+time bounds q(n) is polynomially bounded). As a result, the success
+probability for the adversary can be bounded by 1/2+negl(n).
+The rest of the argument is as before: If h is F_k(), then let p be
+the success probability of the adversary A on our new cryptosystem. It
+is clear that our distinguisher will output 1 with probability p for
+we have used A according to the rules of the
+PrivK^cpa-experiment. Since F was assumed to be a PRF we thus know that
+|p-(1/2+negl(n))| is itself negligible. It then follows that p is
+bounded by 1/2 plus a negligible function.                    Q.E.D.
+We close by remarking that none of the schemes presented so far is
+secure against chosen ciphertext attacks (CCA) where one may request
+the decryption of arbitrary ciphertexts different from the
+challenge. Consider, e.g. the encryption of m as r || F_k(r)+m. An
+adversary can prepare messages m0=00..0 and m1=11..1. Upon receiving a
+ciphertext c= r || s corresponding to one of these two the adversary can
+flip the first bit of s and ask for the decryption of the resulting
+ciphertext. This must be either 100..0 or 011..1 according to whether
+m was 00..0 or 11..1.
diff --git a/ss2017/cryptography/Notes06a.txt b/ss2017/cryptography/Notes06a.txt
new file mode 100644
index 0000000..6791ce9
--- /dev/null
+++ b/ss2017/cryptography/Notes06a.txt
@@ -0,0 +1,105 @@
+Multiple encryptions from cpa-secure cryptosystem
+- - - - - - - - - - - - - - - - - - - -- - - - - -
+We have seen that a block cipher can be used repeatedly with the same
+key (using e.g. CTR-mode) without compromising security. We show now
+that this is true for an arbitrary cpa-secure cryptosystem. From a
+practical point of view this is not so important because cpa-secure
+cryptosystems typically come from block ciphers, and a block cipher is
+used more efficiently through one of the modes of operation given.
+However, the proof of the result (cpa->multiple) is interesting at
+this point because it is the first time where we use semantic security
+(here in the cpa sense) as an assumption rather than establishing
+it. Furthermore, the result becomes practically relevant in the
+context of public key cryptography where by definition security
+against eavesdroppers and cpa security coincide.
+Theorem. Let Pi=(Gen,Enc,Dec) be cpa-secure. Build
+Pi'=(Gen,Enc',Dec')
+Enc'_k(m1 || m2) = Enc_k(m1) || Enc_k(m2)
+Then Pi' is cpa-secure, too.
+Proof: Consider an adversary A' against Pi'.  We run A' and answer
+oracle queries honestly using Enc'_k.  Adversary A' eventually
+produces
+m^0 = m1^0 || m2^0
+m^1 = m1^1 || m2^1
+and asks for a ciphertext c1 || c2.  For b1,b2,b':{0,1} define
+P(b1,b2,b') to be the probability that A' outputs b' when
+c1<-Enc_k(m1^b1); c2<-Enc_k(m2^b2).
+The success probability of A' equals  1/2 * (P(0,0,0) + P(1,1,1))
+The probabilities P(0,1,0) and P(0,1,1) are somewhat funny since they
+correspond to replies to the adversary's question that are against
+the rules. However, we still know that their sum is one and this can
+be used as follows:
+Let us design an adversary A against the original system Pi. This
+adversary A produces m2^0 and m2^1 and receives a ciphertext c2 for
+one of the two. It can then obtain c1<-Enc_k(m1^0), forward c1||c2 to
+A' and output the bit b' it eventually receives from A'. The
+probability that this bit is correct then is 1/2 P(0,0,0) + 1/2
+P(0,1,1) by making a case distinction over the equally likely
+possibilities b=0 or b=1. So, assuming that Pi is cpa secure, we
+obtain a negligible function negl(n) so that
+1/2 P(0,0,0) + 1/2 P(0,1,1) <= 1/2+negl(n)
+Similarly, by producing m1^0 and m1^1 instead and using
+c2<-Enc_k(m2^1), we obtain a bound
+1/2 P(0,1,0) + 1/2 P(1,1,1) <= 1/2+negl(n)
+Summing up and using the fact that the sum of negligibles is
+negligible we get
+1/2(P(0,0,0)+P(1,1,1)) + 1/2 <= 1 + negl(n) 
+and so 1/2(P(0,0,0)+P(1,1,1))  <= 1/2 + negl(n)  QED. 
+Let us now consider the more general case where Pi is used to encode a
+sequence of message of some arbitrary length l=l(n), i.e., depending
+on n. Thus, define Pi'=(Gen,Enc',Dec') by
+Enc'_k(m1||...||ml) = Enc_k(m1) || ... || Enc_k(ml) etc
+Theorem: If Pi is cpa-secure so is Pi' as defined above.
+Proof: The proof is similar to the one before. Adversary A' against
+Pi' eventually produces m1^0...ml^0 and m1^1...ml^1 and waits for
+ciphertexts c1..cl. For 0<=j<=l define P_j(b') to be the probability
+that A' outputs b' assuming that
+c1<-Enc_k(m1^0) ... cj<-Enc_k(mj^0),
+c_{j+1}<-Enc_k(m_{j+1}^1) ... cl<-Enc_k(ml^1)
+The success probability for A' is 1/2 * (P_l(0) + P_0(1)) and, as
+before, we should bound this by 1/2+negl(n).
+For the skew probabilities it holds as before P_j(0)+P_j(1)=1.
+For each t=1..l we build an adversary A_t against Pi by outputting
+mt^0 and mt^1, receiving ct, requesting c1<-Enc_k(m1^0),
+c2<-Enc_k(m2^0) ... c_{t-1}<-Enc_k(m_{t-1}^0),
+c_{t+1}<-Enc_k(m_{t+1}^1)...c_l<-Enc_k(m_l^1). Notice the switch from
+^0 to ^1 after we reach t. Adversary A_t then forwards c1..cl to A'
+and returns its answer. The success probability is
+1/2(P_{t}(0)+P_{t-1}(1)) which is hence bounded by 1/2 + negl(n):
+For all t=1..l : 1/2(P_t(0)+P_{t-1}(1)) <= 1/2+negl(n)
+Summing all these inequalities noticing P_t(0)+P_t(1)=1 yields
+1/2(P_l(0) + (l-1)*1 +  P_0(1)) <= 1/2*l + negl(n) /* another
+                                            negligible function */
+and thus 1/2(P_l(0) + P_0(1)) <= 1/2 + negl(n) as required. QED
diff --git a/ss2017/cryptography/Notes07.txt b/ss2017/cryptography/Notes07.txt
new file mode 100644
index 0000000..6d60d56
--- /dev/null
+++ b/ss2017/cryptography/Notes07.txt
@@ -0,0 +1,204 @@
+7 Practical block ciphers. Principles: s/p- and Feistel networks (5.1-5.2)
+- - - - - - - - - - - - - - - - - - - - - -- - - - - - - - - - - - -- - - -
+We now describe how pseudorandom functions and block ciphers are
+practically constructed. The constructions given do not ensure the
+pseudorandom property, and in fact do not even fit into the asymptotic
+paradigm since the blocklength is typically fixed. Their design is
+guided by heuristic principles and the practical evidence that so far
+no attacks have been found.  As already mentioned, there are other,
+less efficient constructions that merely require the existence of a
+PRG, but again, the latter has not yet been shown to exist
+unconditionally.
+In this section, a block cipher (of key length n and block length l)
+thus is simply a function F:{0,1}^n x {0,1}^l -> {0,1}^l such that F_k
+given by F_k(m)=F(k,m) is invertible (a permutation) for each k.
+Claude Shannon formulated two important principles guiding the design
+of block ciphers:
+Confusion: statistical regularities in the plaintexts, e.g. letter
+frequencies or frequencies of n-grams should not show up in the
+ciphertext.
+Diffusion: changing one bit of the plaintext should (on average)
+change a large proportion, e.g. one half of the bits in the
+ciphertext.
+These principles are not rigorously defined and should be seen rather
+as guidelines. The following procedure shows intuitively how they
+could be observed.
+Let us say that the key comprises 16 byte-substitutions f_1, ... f_16
+: {0,1}^8->{0,1}^8, e.g. various shifts or other operations. So k =
+(f_1,..,f_16). A first attempt at constructing a block cipher of
+length 128 would be as follows:
+F_k(x) = f_1(x_1) ... f_16(x_16)
+where x is decomposed into 16 consecutive 8-bit blocks (bytes) x_1
+.. x_16.
+This introduces confusion because each byte is subjected to a
+different permutation. However, the cipher still has insufficient
+diffusion since x1 only influences the first byte of the ciphertext
+etc. This lack of diffusion implies in particular that with chosen
+plaintexts we can learn parts of the value tables of individual key
+substitutions. 
+To remedy this, we subject the individual bits of the ciphertext to
+some global, known permutation P:{1..128}->{1..128} of the positions
+and repeat this entire procedure several times ("rounds"). After a few
+rounds a large degree of diffusion will be reached ("avalanche
+effect") provided that the permutation mixes across block boundaries
+and that the substitutions in themselves already achieve a certain
+degree of diffusion in the sense that each input bit influences
+several output bits and vice versa.
+The substitutions and permutations used in each round need not always
+be the same which then leads to the following general pattern.
+Repeatedly break the text into equal size blocks, e.g. bytes which are
+subjected to individual substitutions. Afterwards, the bits are
+shuffled across the entire text using a permutation of the
+positions. The key determines the substitutions and permutations being
+used. Some of them may be independent of the key.
+S/P-networks
+- - - - - - -
+A substitution-permutation network (S/P-network) implements this idea
+in principle, however, neither the substitutions f_i, nor the
+permutations on positions depend on the key but are rather fixed
+upfront. In this way, confusion and diffusion are maintained, but of
+course key dependency needs to be introduced one way or another. This
+happens by XOR-ing the intermediate results with parts of or a known
+function of the key. The key itself is often called *master key* and
+the portions of it used as XOR-masks in the different rounds are known
+as *key schedule*. The substitution functions being used are commonly
+known as *S-boxes*. The permutations on positions are called *mixing
+permutations*. Thus, in an S/P-network the message is repeatedly
+subjected to the following three steps:
+a) XOR-ing with keys derived from the master key through a key schedule
+b) blockwise application of the S-boxes
+c) global application of a mixing permutation
+There are a number of principles to be followed when designing an
+S/P-network.
+1. The S-boxes used *should be invertible* for otherwise one would not
+obtain an invertible cipher in the end.
+2. The S-boxes *should not be linear* functions in the sense of XOR
+for then the entire cipher would be a linear function which opens it
+up to a variety of attacks based on solving linear equations, see
+below. In fact they should not even be close to linear functions in
+the sense of Hamming distance. 
+3. The S-boxes should implement the *avalanche effect*. Changing one
+bit in the input changes at least two bits in the output no matter
+what the other bits are set to. In addition, the mixing permutations
+should spread the output bits of any given S-box, i.e. in one block to
+different S-boxes, i.e. blocks in the next round.
+It is not hard to see that after a modest (polynomial in the message)
+number of rounds these design principles ensure sufficient diffusion
+because after each round the number of bits affected by a single bit
+flip in the input roughly doubles.
+Attacks against badly designed S/P-networks
+- - - - - - - - - - - - - -- - - - - - - - -
+Suppose we have an S/P-network with only one round. Given a plaintext
+m and a ciphertext c we can then work out the key as follows. Undoing
+the (known!) mixing permutation and the S-boxes yields c' := m+k from
+which we retrieve k by adding m since m+k+m = k.
+Indeed, practical implementations of S/P-networks usually omit the
+application of S-boxes and the mixing permutation in the last round
+since it can be undone anyway as we just saw. (NB in the 2nd edition
+of KL the S/P-networks are set up in this fashion from the beginning;
+however the present approach looks more systematic)
+Suppose now that we have an S/P-network with two rounds. Suppose for
+the sake of concreteness that the cipher uses 16 S-boxes operating on
+4-bit each so that the total message length is 64. Suppose
+furthermore, that the key k consists of 128 bits of which the first
+half is used in the first round and the other half in the
+second. Given m and c we, being the attacker, first undo the
+S/P-operation of the second round as before. Now consider any 4-bit
+block in the output. It depends on at most four 4-bit blocks from the
+input because in the worst case each bit comes from a different
+S-box. We also know from the mixing permutation in the first round
+which these four 4-bit blocks are. Its 16 bits are transformed into
+the 4 bits of the block we're looking at using a known function and 16
+ 4 bits of the key. These can all be tried out in 2^20 =~= 10^6 time
+which is altogether feasible. A similar argument applies to a three
+round S/P-network, however the number of key bits influencing a four
+bit block then goes up to 4+16+64 = 84 which is still a bit shorter
+than 128, but cannot really be feasibly explored by brute force.
+However, even after three rounds the S/P-network can be efficiently
+distinguished from a truly random permutation. Namely, each output bit
+is then not influenced by 44 of the input bits. A distinguisher can
+then toggle these bits and see whether the observed bit changes. If it
+does not for, say 10^9 out of the 2^44 possible combinations then the
+possibility that the distinguisher deals with a truly random function
+is extremely small so it has an advantage.
+Once we use four or more rounds (and the mixing permutations as well
+as the S-boxes) were well designed then the avalanche effect has fully
+kicked in and attacks like the ones sketched are no longer possible.
+Let us also see why linear S-boxes cannot be recommended: if all
+S-boxes f are linear functions in the sense that f(x+y)=f(x)+f(y) then
+for each key k the function Enc_k() is linear too, i.e., we have
+Enc_k(x+y)=Enc_k(x)+Enc_k(y). In this case, there exists for each key
+an nxn matrix A_k with 0,1 entries (n the message length) such that
+                               Enc_k(m) = A_k m 
+If we now have n^2 many pairs of plaintext-ciphertext we obtain n^2
+linear equations which we can solve for the entries of A_k. Once A_k
+is known then even if we do not know k itself we can decrypt arbitrary
+ciphertexts encrypted with the same key using the inverse matrix. This
+kind of attack applies to all linear encryption functions regardless
+whether they are S/P-networks. It also applies to affine linear
+ciphers, where Enc_k(m) = A_k m + b_k.
+Feistel networks
+- - - - - - - -
+Feistel networks are an alternative approach to designing practical
+pseudorandom permutations. It also operates in rounds and each round a
+key-dependent *mixing function* f_i() is applied to the second half of
+the current text which is then XOR-ed with its first half to become
+the new second half. The new first half is the old second half. Thus,
+if L_i and R_i denote the first and second halves of the text after
+round i then we have
+         L_0 R_0 := "plaintext"
+         L_{i+1} := R_i
+         R_{i+1} := L_i + f_i(k_i,R_i)
+Here k_i is the round key used in the i-th round which is derived from
+the master key through the key schedule as usual.
+Notice that,
+        R_i = L_{i+1}
+        L_i = R_{i+1}+f_i(k_i,L_{i+1})
+so that the Feistel network is invertible as desired no matter what
+f_i() does.
+One can show (KL6.4) that if the mixing functions f_i are pseudorandom
+functions then the function defined by a four round Feistel network is
+a pseudorandom permutation.
diff --git a/ss2017/cryptography/Notes08.txt b/ss2017/cryptography/Notes08.txt
new file mode 100644
index 0000000..fd5c54b
--- /dev/null
+++ b/ss2017/cryptography/Notes08.txt
@@ -0,0 +1,382 @@
+8 DES (Data encryption standard) and AES (Advanced encryption standard)
+-----------------------------------------------------------------------
+The Data Encryption Standard (DES) was published 1975 by the
+US-administration after public consultation. It was designed and used
+as a standard for non-military secrte communication e.g. between
+government bodies but also in the private sector. It is a 16-round
+Feistel network of message length 64 and key length 56. Mainly due to
+its relatively short key length it is no longer considered secure.  It
+is nevertheless still widely used in legacy applications and in the form
+of three-fold repetition (3DES) which is considered secure but
+somewhat inefficient.
+The Feistel mixing function of DES uses a round key k_i of length 48
+which is applied to a 32 bit long half block as follows:
+1) The 32 bit half block is expanded to 48 bit by repeating half of
+its bits using an expansion function E() (see table below) 
+2) The round key is XOR-ed with the resulting block
+3) The result is divided into eight 6-bit blocks to which S-boxes
+S_1..S_8 are applied (see table below). Each S-box has 6 inputs and 4
+outputs, so a 32bit block results.
+4) a mixing permutation P is applied to the resulting 32bit. 
+The DES key schedule is as follows: initially the master key is
+subjected to a mixing permutation on 56 bits. Thereafter, the round
+key for the each round is applied by extracting (in a round-dependent
+fashion) 48 bits from these 56 bits in such a way that the first 24
+bits stem from the first half of the (permuted) key and the second 24
+bits from the second half. See below for the exact form of the bit
+extraction functions.
+The DES encryption function is now obtained by enclosing the 16-round
+Feistel network FN_k described above by an initial permutation IP and
+its inverse listed below, thus Enc_k(x) = IP^{-1}(FN_k(IP(x))).
+The initial permutation has no influence on security. It is there only
+for historical reasons having to do with 8-bit hardware.
+Each S-Box can be regarded as a package consisting of four
+*invertible* operations on size 4 bitstrings. In conjunction with the
+expansion function they operate as follows: The current size 32
+bitstring is divided into 8 blocks of size 4. To the ith such block
+the ith S-box (eight in total) is applied in the following way: first
+the two neighbouring bits (modulo 32) are taken, e.g. for the block
+comprising bits 5,6,7,8 these neighbouring bits are 4 and 9. For the
+first block comprising bits 1,2,3,4 the neighbouring bits are 32 and
+5. These two bits then are used to select the invertible substitution
+to be applied to the block. This intuition is reflected in the
+tabulation of the S-boxes below as binary functions of the central
+size four block and the neighbouring two bits. Also, the expansion
+function E() is designed to implement this idea, i.e. it duplicates
+the "neighbouring bits.
+"
+DES supplementary material
+- - - - - - - - - - - - -
+Initial permutation: The first bit of IP(x) is the 58th bit of x; the
+second bit of IP(x) is the 50th bit of x etc (read left to right and
+top to bottom). 
+IP: 
+58      50      42      34      26      18      10      2
+60      52      44      36      28      20      12      4
+62      54      46      38      30      22      14      6
+64      56      48      40      32      24      16      8
+57      49      41      33      25      17      9       1
+59      51      43      35      27      19      11      3
+61      53      45      37      29      21      13      5
+63      55      47      39      31      23      15      7
+Expansion function: to be read left to right, top to bottom. First bit
+of E(x) is the 32th, i.e. last, bit of x. 
+E:
+32      1       2       3       4       5
+4       5       6       7       8       9
+8       9       10      11      12      13
+12      13      14      15      16      17
+16      17      18      19      20      21
+20      21      22      23      24      25
+24      25      26      27      28      29
+28      29      30      31      32      1
+Mixing permutation P:
+16  7 20 21 29 12 28 17
+ 1 15 23 26  5 18 31 10
+ 2  8 24 14 32 27  3  9
+19 13 30  6 22 11  4 25
+The S-boxes given as binary functions:
+S_1: to be applied to the first block
+E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7
+0 F 7 4 E 2 D 1 A 6 C B 9 5 3 8
+4 1 E 8 D 6 2 B F C 9 7 3 A 5 0
+F C 8 2 4 9 1 7 5 B 3 E A 0 6 D
+This is to be read as follows: upon receiving a 6bit block the S-box
+uses the outer 2 bits to select the row and the inner four bits (read
+in decimal). The decimal number found there represents the output. For
+example, on input 011011 we select row #1 (0xxxx1) and the column #13
+(x1101x) (counting from 0 each time). There we find (hexa)decimal
+number 5 representing output 0101. Similarly, on input 101000 we
+select row #2, column #4, finding D=13, hence outputting 1101.
+For the remaining S-boxes S_2,...,S_8 we refer to the Internet. 
+Initial mixing function selecting 56 out of 64 bits. The remaining 8
+bits serve as parity bits (each of the eight bytes comprising the key
+must have even parity). Does not have any influence on security. 
+PC1:
+57      49      41      33      25      17      9
+1       58      50      42      34      26      18
+10      2       59      51      43      35      27
+19      11      3       60      52      44      36
+63      55      47      39      31      23      15
+7       62      54      46      38      30      22
+14      6       61      53      45      37      29
+21      13      5       28      20      12      4
+The ith round key is obtained from the 56bit key by rotating both
+halves i bits to the left and then selecting 48 bits with the
+following function.
+PC2:
+14      17      11      24      1       5
+3       28      15      6       21      10
+23      19      12      4       26      8
+16      7       27      20      13      2
+41      52      31      37      47      55
+30      40      51      45      33      48
+44      49      39      56      34      53
+46      42      50      36      29      32
+Discussion of DES
+- - - - - - - - -
+The S-boxes have the important property that changing one bit of the
+input always changes two bits of the output. For example, the first
+row of S_4 maps 7=0111 to 10=1010. The 1bit alterations of 7 are
+15=1111, 3=0011, 5=0101, 6=0110. These are mapped to 15=1111, 3=0011,
+6=0110, 9=1001. All of these differ from 10=1010 in two bits.
+Furthermore, the mixing permutation is such that any two bits from the
+same 4 bit block end up in *different* 4 bit blocks where they will
+cause more bit differences and so on thus ensuring the desired
+avalanche effect.
+Purportedly, the S-boxes have been designed so as to withstand a
+particular attack called *differential cryptanalysis* which was known
+to the developers but not published. Only 15 years later was this
+discovered again and published by Biham and Shamir. 
+The text KL contains some more detail about attacks on variants of DES
+with fewer rounds which exploit the fact that the avalanche effect has
+not fully unfolded then.
+With today's technology the 2^56 keyspace can be completely searched
+with special hardware in several hours so that DES is no longer secure
+for critical applications. There is a variant 3DES (Triple DES) which
+uses three DES keys k1,k2,k3 (hence has key size 168) and encrypts according to the formula
+  Enc^3DES_{(k1,k2,k3)}(m) = Enc^DES_k3(Dec^DES_k2(Enc^DES_k1(m)))
+Thus, to encrypt a message we encrypt it (using DES) with key k1 then
+decrypt it with key k2 and then encrypt it with key k3.
+The fact that we use decryption rather than encryption in the second
+phase ensures that standard DES arises as the special case of three
+identical keys:
+              Enc^3DES_{(k,k,k)}(m) = Enc^DES_k(m)
+A version with just two phases does not provide the desired increased
+security due to the possibility of a meet-in-the-middle attack:
+suppose we (temprorarily) define 2DES by
+  Enc^2DES_{(k1,k2)}(m) = Enc^DES_k2(Enc^DES_k1(m))
+Then given a pair (m,c) where c=Enc^2DES_{(k1,k2)}(m) we can recover
+k1 and k2 in time ~2^56 as follows:
+Define x_k = Dec^DES_k(c) and y_q = Enc^DES_q(m) and tabulate x_k and
+y_q for any DES keys k and q. This takes 2*2^56 DES operations (and
+also requires this much space to store the tables). One can then sort
+the tables and in this way quickly find k,q so that x_k=y_q. This pair
+then is the desired 2DES key.
+One can also use this attack against 3DES (splitting either after the
+third or after the second phase) which reduces the effective key size
+to 112.
+It is not clear whether "meet-in-the-middle" is really practical
+because, at least when implemented naively, the storage requirements
+for the tables exceed current technology.
+Advanced Encryption Standard (AES)
+- - - - - - - - - - - - - - - - - - 
+The AES which was selected by the US administration (NIST) from
+submissions to a public competition in 2000 is an S/P-network with
+block size 128bit and key sizes of 128, 192, or 256 bit (AES-128,
+AES-192, AES-256). AES-128 being the most popular variant we will only
+describe this one here. The original name of the cipher prior to being
+selected as the AES was "Rijndael" (Dutch for Rhine Valley) and chosen
+by its Belgian authors Joan Daemen and Vincent Rijmen.
+The substitution operation applies a fixed invertible function (the
+AES S-box) on size-8 bitstrings (bytes) to each of the 16 bytes of the
+text.
+The subsequent mixing step is in the case of AES not just a bit
+permutation but in fact an invertible linear transformation.
+AES operates on the byte level and the text is arranged into a 4x4
+matrix whose entries are bytes (4x4x8=128). The bytes are mostly
+understood as elements of the 256-element field GF(256). Thus, there
+is also a multiplication operation b*b' on bytes which together with
+the XOR-operation (bitwise addition mod2) endows the bytes with the
+structure of a field. In particular, each byte b:{00,...,FF} (one
+adopts hexadecimal notation) except 00 has an inverse b^{-1} such that
+b^{-1} * b = 1. Caveat: this multiplication is not multiplication of
+binary numbers mod 256. We describe below how it is defined in
+detail. For now it suffices to know that together with addition (XOR)
+it satisfies the usual algebraic laws (commutativity, associativity,
+distributivity, inverse).
+Now, each round comprises the following four steps: 
+1. AddRoundKey(): Addition mod2 of the 128 bit round key derived from the
+master key according to the key schedule. Can be viewed as the
+addition of a "round key matrix". 
+2. SubBytes(): Application of the S-Box to each of the 16 bytes of the
+text. First, each of the 16 bytes comprising the current text matrix
+is replaced with its inverse in GF(256). Exception: 00 for which there
+is no inverse is left alone (not replaced).
+Thereafter, each byte is transformed by the following affine map:
+       
+(b_7')        (1 1 1 1 1 0 0 0) (b_7)     (0)
+(b_6')        (0 1 1 1 1 1 0 0) (b_6)     (1)
+(b_5')        (0 0 1 1 1 1 1 0) (b_5)     (1)
+(b_4')    =   (0 0 0 1 1 1 1 1) (b_4)  +  (0)
+(b_3')        (1 0 0 0 1 1 1 1) (b_3)     (0)
+(b_2')        (1 1 0 0 0 1 1 1) (b_2)     (0)
+(b_1')        (1 1 1 0 0 0 1 1) (b_1)     (1)
+(b_0')        (1 1 1 1 0 0 0 1) (b_0)     (1)
+These matrix and vector operations are understood in GF(2),
+ie. addition is (as usual) XOR of bits whereas multiplication is
+logical AND.
+The purpose of the inversion part is to introduce a high degree of
+nonlinearity, i.e. to not even be close to a linear function. The
+purpose of the affine transformation (which is linear) is to introduce
+a certain amount of diffusion, i.e. that changes in one bit in the
+input of the S-box result in >1 bit changes at the output. The affine
+transformation also ensures that not the entire operation of AES can
+be regarded in terms of GF(256) which could possibly result in
+vulnerabilities based on algebraic properties.
+Usually, the entire S-box is implemented as a 256 byte lookup table.
+3. ShiftRows(): The ith row is (cyclically) shifted i-1 columns to the
+left (i=1,2,3,4):
+(a b c d)      (a b c d)
+(A B C D)  |-> (B C D A)
+(u v w x)      (w x u v)
+(U V W X)      (X U V W)
+4. MixColumns(): The 4x4 text matrix is multiplied (from the left) by
+the (invertible) matrix
+(02 03 01 01)
+(01 02 03 01)
+(01 01 02 03)
+(03 01 01 02)
+The entries of this matrix are understood as hexadecimal notation for
+bytes, e.g.03 stands for the element 00000011 of GF(256). Thus, we
+have 02+01=03 but 01+01=00 and 02+03=01.
+In the last (16th) round, MixColumns() is replaced with another
+AddRoundKey().  Notice that---as pointed out earlier---key-independent
+steps at either the beginning or the end of the cipher could be easily
+undone by an attacker and thus are worthless.
+The AES key schedule thus must transform the 128bit master key into 17
+equally sized round keys. Let k be the 128bit master key. Just as the
+texts we view it as a 4x4 matrix with entries from GF(256). We now
+generate a sequence k_0,...,k_{16} of round keys (17 in number) as follows:
+k_0 = k                        (01 01 01 01)
+k_{i} = (k_{i-1} + (c'|0|0|0)) (00 01 01 01)
+                               (00 00 01 01)  
+                               (00 00 00 01)
+                              \______ ______/
+                                     V
+                                     M
+where the column c' is obtained from the last column c of k_{i-1} as follows:
+       (00 00 00 01)               (02^i)
+c' :=  (01 00 00 00) SubBytes(c) + ( 00 )
+       (00 01 00 00)               ( 00 )
+       (00 00 01 00)               ( 00 )
+      \______ _____/
+             V
+             R
+             
+Here SubBytes(c) denotes the result of the application of the AES
+S-box to each byte in the column c. The byte 02^i is 02*02*...*02 (i
+times) in GF(256). The matrix R cyclically rotates the column
+downwards. To then obtain k_i we add c' to the first column of k_{i-1}
+to obtain the first column of k_i. The next three columns are obtained
+by adding the previous one (of k_i) to the corresponding one of
+k_{i-1}. The matrix M does just that.
+We see that the next round key is obtained from the previous one by
+some simple linear operations and one nonlinear function of the last
+column.
+Decryption: We notice that all the functions used are invertible so
+the decryption just performs these inverse operations in the opposite
+order. We omit the details.
+Arithmetic in GF(256)
+- - - - - - - - - - -
+In order to understand how multiplication in GF(256) works one
+represents a byte b=b0..b7 as a polynomial with coefficients in
+GF(2)={0,1}
+b(X) = b7 * X^7 + b6 * X^6 + ... + b1 * X + b0
+In order to multiply two bytes one multiplies them qua polynomials and
+then takes the remainder (using polynomial division) modulo the
+polynomial
+p(X) = X^8 + X^4 + X^3 + X + 1
+For example, 03 is X+1, so 03*03 = X^2+1 = 05 (the coefficients are
+calculated modulo 2).  The inverse of 03 is F6=11111010 because
+(X^7+X^6+X^5+X^4+X^2+X)*(X+1) = X^8+X^7+X^6+X^5+X^3+X^2+X^7+X^6+X^5+X^4+X^2+X =
+\___________ _________/ \_ _/ = X^8+X^4+X^3+X = 1 (mod p) 
+            V             V
+           F6            03
+One finds such inverses using the extended Euclidean algorithm or
+using a table of logarithms. Indeed, each element of GF(256) \ {0} can
+be written as a power of 03 (some other elements can be used in place
+of 03). One may write log(x) for the number t:{0..254} such that x =
+03^t. One then has x^{-1} = 03^{255-t}. The logarithms can also be
+used to reduce multiplication to addition mod 255.
diff --git a/ss2017/cryptography/Notes09.txt b/ss2017/cryptography/Notes09.txt
new file mode 100644
index 0000000..7c6f51d
--- /dev/null
+++ b/ss2017/cryptography/Notes09.txt
@@ -0,0 +1,259 @@
+9 Cryptanalysis
+---------------
+Cryptanalysis is aimed at breaking ciphers in the sense of
+reconstructing a key from given or chosen pairs of plain- and
+ciphertext. In the first chapter, we saw how various historic ciphers
+can be broken using regularities in the statistical distribution of
+ciphertexts or rather using the fact that known properties of the
+statistical distribution of plaintexts shine through in the
+statistical distribution of plaintexts. Modern cryptographic schemes
+avoid such attacks by making sure that the statistical distribution of
+ciphertext looks close to uniform no matter what the statistical
+distribution of the plaintexts is. The principles of confusion and
+diffusion were introduced to ensure this property at least
+approximately in practice. The notion of pseudo random generator
+furnishes an ideal yet useful theoretical model.
+Attacks against cryptographic schemes that properly implement
+confusion and diffusion are much more complicated and in many cases,
+in particular DES and AES not known to exist except in the sense that
+the computational time needed to carry them out is close to that of a
+brute-force attack (trying out all keys) which is always
+possible. Nevertheless, it is important to know and study principles
+of such attacks in order to avoid them when designing a new cipher,
+e.g. a smaller variant of existing ones to be run in
+resource-constrained environments e.g. "Internet of Things".
+We have already seen that a cipher that only uses linear or affine
+linear operations is vulnerable to an attack based on solving systems
+of linear equations. In this chapter we take a brief look at three
+more principles of cryptanalysis. First, we show why feedback shift
+registers which are a popular device for generating pseudorandom
+numbers are not secure.
+Linear Feedback Shift Registers (LFSR): Such a device is given by a
+sequence of shift registers with some outputs fed back via XOR to the
+input. For instance:
+       >[D]-o->[D]-o->[D]---o------------->
+      |  |             |    |
+       -(+)-----------(+)----
+When the shift registers (here noted [D]) are initialised with some
+bits then a stream of bits is generated at the output. E.g. starting from 100
+we obtain the following sequence of states:
+100,110,111,011,101,010,001,100,...
+and the following output sequence 00111010...
+Here, from xyz we get to uxy where u=x+z (mod 2).  Of course, the
+sequence repeats after at most 8 steps, but until then looks rather
+random. With a chain 16 shift registers one can achieve a period
+length of 65535. In order to encrypt a stream of bits ("stream
+cipher") one can initialise the shift registers with the secret key
+and then add (XOR) the generated stream to the data stream. Due to the
+relative simplicity of implementation this has been used for a long
+time in esp. military applications.
+However, LFSR are linear (the sequence generated by the sum of two
+keys equals the sum of the sequences generated by the keys
+individually) and this can be exploited in various ways. The most
+interesting way uses the Berlekamp Massey algorithm which from a given
+sequence of bits computes the shortest LFSR which generates it. If one
+knows the beginning m of a plaintext and the corresponding ciphertext
+c, e.g. because the plaintext starts with the date or some formal
+greeting one can then apply the Berlekamp-Massey algorithm to m+c and
+use the computed LFSR to decode the remaining ciphertext.
+In order to circumvent this one can use nonlinear functions in the
+feedback loop, e.g. logical or use nonlinear combinations of several
+LFSR. The resulting stream ciphers, e.g. A5/1 and A5/2, E0 are still
+in use in GSM and Bluetooth but are known to have vulnerabilities
+too. Indeed, the design of secure stream ciphers is very difficult and
+for high security applications one resorts to block ciphers run,
+e.g. in CTR mode in order to encrypt bitstreams.
+Linear cryptanalysis
+- - - - - - - - - - - 
+Linear cryptanalysis was developed in the 90s by Matsui and apparently
+was not known to the DES developers. Nevertheless, it can only produce
+a theoretical attack on the latter and on AES.
+We illustrate linear cryptanalysis with the toy cryptosystem from
+Michael Heys' tutorial
+https://www.engr.mun.ca/~howard/PAPERS/ldc_tutorial.pdf.
+It is an S/P network with a block length of 16 and four proper rounds,
+i.e. four rounds and an additional key mixing step. We regard each
+block as a 4x4 bit matrix. The substitution step applies the first
+S-Box of DES S_1, i.e.
+0 1 2 3 4 5 6 7 8 9 A B C D E F
+-------------------------------
+E 4 D 1 2 F B 8 3 A 6 C 5 9 0 7
+separately to each of the four rows (regarded as a hexadecimal number). So
+  
+(1 0 1 0)             (A)         (6)          (0 1 1 0)
+(0 1 1 0)   which is  (6) becomes (B) which is (1 0 1 1) 
+(1 1 0 1)             (D)         (9)          (1 0 0 1)
+(0 0 0 1)             (1)         (4)          (0 1 0 0)
+The permutation step is simply matrix transposition, thus our example
+becomes
+(0 1 1 0)           (6)
+(1 1 0 0)  which is (C)
+(1 0 0 1)           (9)
+(0 1 1 0)           (6)
+As usual, each round consists of addition mod2 of the round key,
+substitution (S), and permutation (P), i.e. transposition. After four
+rounds we add one more key.  The master key simply consists of the
+concatenation of the five round keys k = k1 k2 k3 k4 k5.
+We can now observe that the S-box S is not completely balanced in the
+following sense. Let WX1 be (1 0 1 1)^T and WY1 = (0 1 0 0)^T. The ^T
+suffix means transposition so W1, W2 are actually columns. We now have
+              Pr(WX1^T x + WY1^T S(x) = 0) = 3/4
+where the probability is taken over uniformly distributed
+x:{0,1}^4. In other words, for 12 out of the 16 possible inputs x=(x1
+x2 x3 x4)^T and results S(x)=(y1 y2 y3 y4)^T one has x1+x3+x4=y2
+whereas for the remaining four one has x1+x3+x4=/=y2,
+i.e. x1+x3+x4=y2+1.
+One can find such an imbalance by exhaustive search. Another such
+imbalance is given by WX2=(0 1 0 0)^T WY2 = (0 1 0 1)^T:
+              Pr(WX2^T x + WY2^T S(x) = 0) = 1/4
+Chaining these imbalances through the whole cipher (using the first
+one on round 1 and the second one on the other rounds) one obtains a
+similar linear relationship between the plaintext, the ciphertext just
+after round four, and the four round keys that have been used until
+then: Letting P be the plaintext (now written as dim16 vector) and U
+the ciphertext prior to applying the last S-boxes, permuting, and
+mixing with the fifth and last round key we find dim16 vectors A, B
+D1, D2, D3, D4 such that
+    Pr(A^T P + B^T U + D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 = 0) = 15/32
+with keys K1 ... K4 fixed and probability taken over the plaintexts P. 
+Since D1^T K1 + D2^T K2 + D3^T K3 + D4^T K4 is either 0 or 1 we have that
+     Prob(A^T P + B^T U = 0) = 15/32 or Prob(A^T P + B^T U = 0) = 17/32
+irrespective of K1,..,K4. This allows us to work out a likely value
+for K5 or rather some bits of it.
+Namely, given a list of plaintexts P_1..P_N and corresponding
+ciphertexts C_1..C_n we can try out all 2^16 possible choices for K_5,
+for each one compute the corresponding U_i := S^{-1}(P^{-1}(C_i+K_5)),
+and compute the ratio
+   p(K_5) = {i | A^T P_i + B^T U_i = 0} / N 
+We choose a value K_5 for which this ratio is close to 15/32 or 17/32
+and discard all those where it is close to 1/2. Now, p(K_5) does not
+actually depend on all bits of K_5 but only on those where the
+corresponding bit in B^T is set for the other ones only influence
+components of U_i which are not used in p(K_5). Nevertheless, this
+allows us to considerably narrow down the possible options for
+K_5. For each of those, we can then continue in the same fashion and
+thus peel off the fourth round key etc.
+Differential cryptanalysis
+- - - - - - - - - - - - - -
+Differential cryptanalysis was developed by Biham and Shamir as an
+attack against DES but apparently already known to the DES
+developers. It has been used to successfully break other ciphers. It
+uses the fact that the effect of adding a round key cancels out when
+we consider differences between two plaintexts. However, this
+requires the preparation of chosen plaintexts. In more detail, the
+*differential* between two plaintexts X1 and X2 is simply X1+X2,
+i.e. the bitvector which has a 1 where X1 and X2 differ and a 0 where
+they are equal.
+Coming back to our 4bit example S-box for any given differential there
+then are 2^4=16 different ways of realising it. E.g. the differential
+1010 arises for the plaintext pair 0011,1001 and also for
+1111,0101. Indeed, for any plaintext X1 there is a unique X2 such that
+X1, X2 have a given differential.
+Given two differentials X* and Y* we are interested in the following
+question. For how many of the (16) plaintext pairs X1,X2 with
+differential X* is it the case that the results of applying the S-box
+Y1=S(X1), Y2=S(X2) have differential Y*. Tabulating these numbers by
+exhaustive search yields the following table:
+           Output differential
+     0 1 2 3 4 5 6 7 8 9 A B C D E F 
+I 0 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
+n 1  0 0 0 2 0 0 0 2 0 2 4 0 4 2 0 0 
+p 2  0 0 0 2 0 6 2 2 0 2 0 0 0 0 2 0
+u 3  0 0 2 0 2 0 0 0 0 4 2 0 2 0 0 4
+t 4  0 0 0 2 0 0 6 0 0 2 0 4 2 0 0 0 
+  5  0 4 0 0 0 2 2 0 0 0 4 0 2 0 0 2
+d 6  0 0 0 4 0 4 0 0 0 0 0 0 2 2 2 2 
+i 7  0 0 2 2 2 0 2 0 0 2 2 0 0 0 0 4
+f 8  0 0 0 0 0 0 2 2 0 0 0 4 0 4 2 2 
+f 9  0 2 0 0 2 0 0 4 2 0 2 2 2 0 0 0 
+' A  0 2 2 0 0 0 0 0 6 0 0 2 0 6 0 0 
+i B  0 0 8 0 0 2 0 2 0 0 0 0 0 2 0 2 
+a C  0 2 0 0 2 2 2 0 0 0 0 2 0 6 0 0 
+l D  0 4 0 0 0 0 0 4 2 0 2 0 2 0 2 0 
+  E  0 0 2 4 2 0 0 0 6 0 0 0 0 0 2 0   
+  F  0 2 0 0 6 0 0 0 0 4 0 2 0 0 2 0
+The high entries 8 and 6 are of particular interest because they
+constitute a deviation of an intuitive pseudorandom permutation.
+Let us now see how we can exploit these in order to attack our cipher.
+We feed in a pair of 16bit plaintexts X1 X2 which differ in the second
+block by B=1011 and are equal everywhere else.  With probability 8/16
+this will result in differential 2=0010 at the output of the second
+S-box. The output of the other S-boxes will be equal for X1 and X2.
+This differential 2=0010 will be propagated through the P network and
+results in a differential of 4=0100 at the entry of the third S-box of
+the second round:
+              Differential before P   Differential after P
+                    0 0 0 0                     0 0 0 0 
+                    0 0 0 0   ------P---->      0 0 1 0 
+                    0 1 0 0                     0 0 0 0 
+                    0 0 0 0                     0 0 0 0
+There, with probability 6/16 it will produce differential 6=0110 at
+the output. This results in differential 2=0010 at the input of the
+second and third S-boxes of the third round. With probability 36/256
+this will lead to differential 5 at both outputs which propagates to
+differential 6=0110 at the entries to the second and fourth S-boxes of
+the final round.
+                     0 0 0 0                    0 0 0 0 
+                     0 0 1 0  -------P----->    0 0 0 0 
+                     0 0 1 0                    0 1 1 0 
+                     0 0 0 0                    0 0 0 0
+Summing up, when we encipher plaintexts with differential 0B00 we will
+obtain with probability 1/2*3/8*9/64=27/1024 a differential 0606 at
+the entry to the last round. We now have to set the relevant
+components (second and fourth block) of the final key in such a way
+that this ratio (after inverting the last layer of S-boxes) is
+actually reproduced. This allows us (with high probability) to guess
+hone half of the bits of the last round key. Continuing in this way
+with other differentials combined with exhaustive search allows us to
+undo the entire last round and then to proceed from there.
diff --git a/ss2017/cryptography/Notes10.txt b/ss2017/cryptography/Notes10.txt
new file mode 100644
index 0000000..5457bf3
--- /dev/null
+++ b/ss2017/cryptography/Notes10.txt
@@ -0,0 +1,257 @@
+10 Message integrity, CBC-MAC (KL4.1-4.5)
+-----------------------------------------
+So far, we wanted to prevent an eavesdropper or a more active attacker
+from getting to know the content of encrypted messages or at least
+partial information about them.
+Another important objective for cryptography is message integrity and
+authenticity. We want to prevent the attacker from being able to
+manipulate an encrypted message or from fabricating one such.
+It is not the case that encryption automatically guarantees
+integrity. For example, with the Vernam cipher and derived systems
+flipping one bit in the ciphertext results in the same bit in the
+plaintext being flipped. Also, randomly chosen ciphertexts correspond
+to plaintexts albeit probably nonsensical ones. 
+Message authentication codes
+- - - - - - - - - - - - - - -
+Definition: A message authentication code (MAC) consists of two
+probabilistic polynomial time algorithms (Gen,Mac) where 
+1. On input n (security parameter) Gen outputs a key k of length >= n 
+2. Given key k and message m:{0,1}* the tag-generation algorithm Mac outputs
+   t = Mac_k(m)
+We assume (unlike in KL where this is a special case) that Mac is
+deterministic.
+If m is furthermore constrained to be from {0,1}^l(n) for some length
+function l(n), e.g. l(n)=n then the MAC is called *fixed length*.    EOD
+The intended usage is as follows: Alice and Bob share a secret key k
+that has been obtained with Gen(). If Alice wants to send a message m
+to Bob she could compute the tag t=Mac_k(m) and transmit (m,t) either
+encrypted or not.
+In order to be sure that the received message (m,t) was really from
+Alice he can then compute t'=Mac_k(m) and check that t=t'. The MAC
+should be designed in such a way that no adversary can fabricate valid
+tags unless it knows the key so that if the verification succeeds Bob
+can indeed be sure that the message was from Alice. The following
+experiment formalizes this.
+The message authentication experiment Mac-forge_{A,Pi}(n):
+Pi = (Gen,Mac) a MAC, A an adversary, n the security parameter
+1. A random key k is generated with Gen(n). 
+2. The adversary is given n and oracle access to Mac_k(), i.e. it can
+request arbitrary tags w.r.t k. The adversary eventually outputs a
+pair (m,t). 
+3. The output of the experiment is defined to be 1 if
+  (1) Mac_k(m)=t and (2) m has not previously been queried by A.
+Thus, the adversary must prepare a valid tag for a message it has not
+previously requested a tag for. 
+We define a MAC to be secure if for all adversaries 
+the probability that Mac-forge yields 1 is negligible.
+While this notion of security is indeed very strong there are some
+attacks it cannot withstand:
+1) Replay attacks. The adversary can store an authenticated message
+(m,t) and send it at some later point. In order to prevent that one
+can extend message with timestamps or similar.
+2) Timing attacks. If the check t=t'? proceeds by comparing bits
+starting from the left until the first mismatch is encountered (cf
+strcmp() from the C-library) then the time it takes to perform the
+check depends on the tags. This can be used to find out the bits of a
+valid tag one by one. According to KL this kind of attack has been
+used against an early version of the Xbox in order to install
+unlicensed games. To prevent the attack one should make sure that the
+computation time for the comparison is always the same.
+Construction of a fixed-length MAC from a pseudorandom function:
+Let F_k() be a pseudorandom function. The following then is a secure
+fixed-length MAC.
+1. Gen(): Given n choose k:{0,1}^n uniformly at random
+2. Mac(): On input k and m output the tag t=F_k(m)
+Theorem: The above MAC is secure.
+Proof sketch: As in the other security proofs it suffices to show this for
+truly random F (selected at the beginning of the experiment). The
+passage to a pseudorandom F then incurs a negligible change in
+probability. In order to produce a valid tag the adversary would have
+to correctly guess a random value from {0,1}^n which has indeed
+negligible probability.  QED
+CBC-MAC
+- - - - 
+In order to extend a fixed-length MAC to variable length messages one
+can break the message up into equal length blocks m=m_1,...m_l of
+length n each possibly padding the last block with 0s and then
+authenticate each block separately. In order that this is secure, one
+must, however augment each block with
+- the length l, 
+- its sequence number i:{1..l}
+- a random identifier r:{0,1}^n (the same for each block)
+Thus, the blocks to be authenticated have length 4n each (assuming
+that l and i fit into n bits) and an appropriate MAC must be
+used. Alternatively, one can break m into blocks of size n/4....
+These additional data prevent the adversary from dropping blocks at
+the end (length), changing the order of blocks (sequence numbers),
+mixing blocks from two different messages (random identifier).
+One can show that this scheme is indeed secure, but it is rather
+inefficient since the tags are 4 times as long as the message itself.
+A more efficient method mimicks the CBC-mode encryption but is subtly
+different:
+Rather than an arbitrary MAC we need a pseudorandom function F_k,
+i.e. a block cipher. We then decompose the message m into equal sized
+blocks
+m = m_1 .. m_l
+as before (using padding if needed). We then *prepend* the message
+with another block encoding the length l (which is thus assumed
+smaller than 2^n). Then blocks are encrypted and XORed with the
+encryption of the previous block prior to encryption. The encryption
+of the last block then is the desired MAC:
+t_0 = F_k(l)
+t_{i+1} = F_k(t_i + m_i)
+t = t_l
+We will prove that this scheme is secure. Notice, however, that
+seemingly innocuous modifications thwart its security:
+First, let us see why we need to incorporate the length. Suppose we
+would just put: 
+t_1 = F_k(m_1)
+t_2 = F_k(m_2+t_1)
+...
+t = t_l
+Then, if t=Mac_k(m_1 m_2) and t'=Mac_k(m_3+t m_4) then t'=Mac_k(m_1
+m_2 m_3 m_4) thus allowing us to forge a previously unqueried tag.
+Now suppose that we would start with a random initialisation vector IV
+and transmit it along with the tag:
+t_0 = F_k(l+IV)
+...
+This would allow us to modify the length: If (IV,t) is the MAC of l,m
+then (IV+l+l',t) is the MAC of l',m. 
+It is also not ok to attach the length at the end rather than at the
+beginning. Namely, suppose that we obtain the following tags where the
+length is added in parenthesis for clarity.
+ABCDE(5)  : t
+XYZKL(5)  : t'
+ABCDE5UVW(9) : t''
+We then have that the tag of XYZKL5 U+t+t' VW(9) is also t''. 
+We will now sketch a proof that CBC-MAC is indeed secure. We will show
+the following theorem from which the claim follows by the standard
+reasoning used in earlier security proofs.
+Theorem: Fix a security parameter n. For g:{0,1}^n->{0,1}^n and m=m1
+m2 ... mk write CBC_g(m) = CBC_g(m1,m2,...,mk) =
+   g(ml+g(m_{k-1}+ ... g(m2 + g(m1)) ... )
+Also let f:({0,1}^n)* -> {0,1}^n be an arbitrary function. Let D be an
+adversary which is given oracle access to a function ({0,1}^n)* ->
+{0,1}^n.
+Suppose that D queries its oracle on a prefix-free set, i.e. if X and
+Y are both queried then X is not a prefix of Y nor is Y a prefix of X.
+One then has
+   |Pr(D^{CBC_g()}(n) = 1) - Pr(D^{f()}(n) = 1)| = negl(n)
+where probabilities are taken over g and f chosen uniformly at random.
+To see why this implies the theorem we first consider that the
+difference between CBC_g and CBC_{F_k} is negligible since F_k is
+assumed to be pseudorandom. Furthermore, notice that in CBC-MAC proper
+the function CBC_{F_k} is applied to a subset of P:={|m| m |
+m:({0,1}^n)*} which itself is a prefix-free set. Thus, the whole
+business with including the length and putting it into the first
+position rather than anywhere else serves to ensure prefix-free-ness.
+Proof of Theorem (sketch; for details see KL, 2nd ed): 
+Let Q={X1..Xq} be the (prefix-free) set of queries (q = number of
+queries) made by the distinguisher D to the oracle. Let l be the
+maximum number of blocks in those queries. Notice that both q---the
+number of queries---and l---their maximum length are polynomial in n
+since the distinguished is assumed to run in polynomial time.  If D
+queries X and X = m1 m2 ... mk then (in the case where the oracle is
+instantiated with CBC_g()) the function g() will be evaluated at m1,
+m2+g(m1), m3+g(m2+g(m1)), ...  . Write C_g(X) for this sequence (of
+length < l). We say that Q exhibits a collision if among the
+g-evaluations occurring in {C_g(X) | X:Q} one finds the situation
+g(x)=g(x') where x =/= x'. We also consider it a collision if at some
+point we have mi+g(x) = mj'+g(x') despite x=/=x' (and then necessarily
+mi=/=mj').
+Under the assumption that g() is random, the probability that such a
+collision is exhibited is bounded by q^2 * l^2 * negl(n) for there are
+<= q^2 * l^2 possibilities for where such a collision might occur and
+the probability for a collision at a given position is negligible.
+Since q and l are polynomial in n this probability is therefore
+negligible itself.
+Let us assume that no collision is exhibited. Now, for each X:Q let
+v(X) be the last value in C_g(X). We have CBC_g(X) = g(v(X)).  The
+values v(X) are pairwise distinct and do not agree with any of the
+other values in {C_g(X) | X:Q}. This is where prefix-freeness is
+used. As a result, since g() is random, the answers to the queries
+g(v(X)) are uniformly random and thus adhere to the same distribution
+as f(X) for the random f. Thus, the only advantage that the
+distinguisher has stems from the possibility of a collision being
+exhibited which, as we saw above, is negligible.
+In conclusion, we note that an already existing implementation of
+CBC-mode (for encryption) can be turned into an implementation of
+CBC-MAC by
+ - using IV=0
+ - discarding all but the last blocks of the ciphertext
+ - prepending the plaintext with its length
+Mistakes in any one of these (e.g. using a random IV and transmitting
+it, omitting the length or placing it elsewhere, transmitting other
+blocks of the ciphertext, lead to insecurities. See the nice blog
+article "Why I hate CBC-MAC").
+https://blog.cryptographyengineering.com/2013/02/15/why-i-hate-cbc-mac/
diff --git a/ss2017/cryptography/Notes11.txt b/ss2017/cryptography/Notes11.txt
new file mode 100644
index 0000000..839d24b
--- /dev/null
+++ b/ss2017/cryptography/Notes11.txt
@@ -0,0 +1,310 @@
+11 Hashing, Merkle-Damgard transform, SHA, HMAC
+-----------------------------------------------
+Hash functions serve a related but slightly different purpose to that
+of MACs. Unlike those (and private key encryption) they are not
+immediately useful on their own but rather are building blocks and
+indeed are "a workhorse of cryptography" like e.g. solving systems of
+linear equations is to engineering.
+Intuitively, a hash function H() transforms a long string m into a
+short digest ("hash") t=H(m) such that it is extremely difficult to
+produce two texts x =/= x' with H(x)=H(x'). Such pair x,x' is known as
+a *collision*.
+As such, the notion cannot really be made sense of in the context of
+computational complexity for once we have x=/=x' with H(x)=H(x') then
+we can just hardwire x,x' into a program. In order to get around this,
+we introduce as before randomly chosen keys and security parameters.
+This will allow us to formulate meaningful security goals and
+meaningful security proofs for various constructions involving hash
+functions. 
+Definition: A *hash function* consists of a (probabilistic polytime)
+key generation algorithm Gen(n) as before and a polytime function H
+taking as input a key s ("seed") and m:{0,1}^* then
+H_s(m):{0,1}^{l(n)} where l(n) is a fixed, and typically simple,
+length function, e.g. l(n)=n.
+There may also be another length function l'(n)>l(n) and
+H_s:{0,1}^{l'(n)} -> {0,1}^{l(n)}. In that case, we have a
+fixed-length hash function.
+Definition: A hash function Pi=(Gen,H) as above is *collision resistant* if
+for all probabilistic polytime adversaries A one has
+ 
+            Pr[Hash-coll_{A,Pi}(n)=1] = negl(n)
+for some negligible function negl(n). Here, Hash-coll_{A,Pi}(n) is the
+following experiment:
+1) s <-Gen(n) /* s = "seed" */
+2) A is given s and then outputs x,x'
+3) The result of the experiment is 1 if x=/=x' and H_s(x)=H_s(x') and
+0 otherwise.
+Thus, the adversary must provide a collision given a key s and the
+probability that it succeeds in doing so should be negligible. In
+contrast, in the case of MACs the adversary does not get access to the
+key.
+Weaker notions of security
+- - - - - - - - - - - - - -
+In this context one may ask why being able to produce a collision
+should be considered dangerous. Firstly, if a hash function is
+collision resistant then it follows that it is also not possible to
+forge a message x' =/=x such that H(x)=H(x') given x. So collision
+resistance is certainly a "good enough" notion. However, one could
+instead ask for the weaker property of "second preimage resistance":
+it should be difficult to, given x, produce x'=/=x such that
+H(x)=(x').
+A typical application of hash functions is message integrity: If we
+send someone a message x via some untrusted channel (which can be
+modified by an adversary) and the hash value t=H(x) via a trusted
+channel then the receiver can check that the message has not been
+tampered with by checking that t=H(x') with x' being the received
+(possibly modified) message.
+While being able to produce an arbitrary collision may be harmless,
+being able to find a collision in a given set of texts of size ~
+2^{l(n)} can be very dangerous and is not covered by "second preimage
+resistance": Consider the following scenario: we produce a letter L,
+have its hash signed by a responsible person, and then send it to some
+recipient. Now, it is very easy to produce functions
+f,g:{0,1}^{l(n)}->{0,1}^* such that for each z the string f(z) is,
+say, a different rejection letter, and g(z) is a different acceptance
+letter. E.g. by introducing various switches dear/esteemed, Sir/Madam
+or pleased/delighted etc. Or, if one is dealing with images rather
+than letters by slightly altering intensity values. Then if, one can
+produce z, z' such that H(f(z))=H(g(z')) one can have the hash
+t=H(f(z)) signed and communicate along with g(z'), for which the hash
+is also valid due to the collision. While this still requires
+collisions of a special kind it is sufficiently close to arbitrary
+collisions and for this reason one considers collision resistance as
+the "gold standard" for hash functions.
+Birthday paradox
+- - - - - - - - -
+Let us see, how many steps a brute force attack requires. In order to
+do that, we (simplifyingly) assume that H_s is a random function. Let
+us write H(x)=H_s(x). So we assume that H is drawn uniformly at random
+from {0,1}^*->{0,1}^{l} where l=l(n). 
+Clearly, if we fix x then the likelihood for x' drawn uniformly at
+random to satisfy H(x)=H(x') is 2^{-l(n)} and thus, we need to try out
+ca N/2 values (N=2^l) to get a 50% success probability. This also
+holds, if x has been drawn at random. Things are different, if we
+randomly choose a set of values S and then *search* for a collision
+within S. Of course, if |S|>N then we find a collision with certainty,
+but perhaps surprisingly |S|>=2*sqrt(N) ensures that the likelihood of
+a collision within S is above 50%.
+So, to find a collision one can choose a set S of size 2*sqrt(N), sort
+it according to hash values and then search (in linear time) for the
+desired collision. If one does not find one, repeat with new data.
+Thus, for example if l(n)=64 then trying out 2^63 values will (with
+probability >=50%) lead to a collision with a fixed left hand side,
+but trying out 2^33 values will suffice to get an arbitrary collision
+using the birthday paradox. While 2^63 steps can only be carried out
+with special hardware resources, carrying out ~2^33 steps is trivial on
+today's computers. One must bear this in mind when trying to estimate
+"bits of security" of a hash function. Even assuming that the hash
+function does not contain systematic security holes we will only get
+one half of the hash size (size of the digest) many bits of security.
+Improved birthday attack
+- - - - - - - - - - - - -  
+The implementation of the "birthday attack" sketched above takes not
+only roughly O(sqrt(N)) steps but also O(sqrt(N)) memory which is a
+more expensive resource. Indeed, storing 2^33 thirty-two byte words
+requires 64TB. We now describe a version of the birthday attack that
+runs in constant space.
+If we choose x0:{0,1}^l uniformly at random then, assuming as we did,
+that H is a random function then
+S={x0,H(x0),H(H(x0)),H(H(H(x0))),...,H^{2*sqrt(N)-1}(x0)}
+is also random and thus contains a collision with >=50%
+likelihood. Let us write xi:=H^i(x0). Suppose that i,j are such that
+xi=xj, but x_{i-1} =/= x_{j-1}, so x_{i-1},x_{j-1} form a
+collision. Note that xi=H(x_{i-1}). Assuming i<j and writing j=i+d we
+then also have x_i=x_{i+kd} for all k>=0 and thus x_{i+z}=x_{i+kd+z}
+for all k>=0 and z>=0. If we choose k such that kd>=i then
+x_{kd}=x_{2kd} follows with z=kd-i.
+This suggests the following
+algorithm for finding a collision with high likelihood:
+x0 <- {0,1}^l /* uniformly at random */
+x = x0;y = x0; 
+for (r=1..4*sqrt(N)) {
+   x = H(x); /* x=H^r(x0) */
+   y = H(H(y)); /* y=H^{2r}(x0) */
+   if (x==y) break;}
+if (x!=y) start over with new x0; else
+y=x;x=x0; /* y=H^{kd}(x0) */
+if (x==x0) start over with new x0; else /* very unlikely */ 
+for (i=0..r) do {
+   x = H(x); /* x=H^i(x0) */
+   y = H(y); /* y=H^{i+kd}(x0) */
+   if (x==y) break; /* must happen before loop runs out */}
+return H^{i-1}(x0),H^{i+kd-1}(x0); /* can optimize recomputation */
+Merkle-Damgard Transform
+- - - - - - - - - - - - -
+The Merkle-Damgard Transform constructs a collision-resistant hash
+function (Gen,H) from a given *fixed length* collision-resistant hash
+function (Gen,h) with length 2*l(n), i.e. the h-hash is half as long
+as the input. The key generation remains unchanged and H is as follows.
+1) Given key s and input x:{0,1}^* divide message into blocks x1,x2,...,x_B
+of length l=l(n) possibly padding the last block with zeroes. Set x_{B+1}=L. 
+2) Set z0=0...0 /* l zeroes */ 
+3) for i=1..B+1 do zi = h_s(z_{i-1} || x_i)
+4) Output z_{B+1}
+  x1           x2              x_B          L  
+0 |  _____     |  _____         |  _____    |  _____
+| |_|     |    |_|     |        |_|     |   |_|     |
+|___| h_s |______| h_s |_...._____| h_s |_____| h_s |-----> H_s(x)
+    |_____|      |_____|          |_____|     |_____|
+Theorem: If (Gen,h) is a collision-resistant fixed length hash
+function of length l'(n)=2*l(n) then (Gen,H) is a collision-resistant
+hash function.
+Proof: We first show that any collision in H_s yields a collision in
+h_s.  Suppose that H_s(x)=H_s(x') and write x_1..x_B,L for the blocks
+and length of x and x'_1..x'_B' L' for x'. Also write z_1..z_{B+1},
+z'_1..z'_B' for the intermediate results. We have H_s(x)=h_s(z_B||L)
+and H_s(x')=h_s(z'_B'||L'), so, if L=/=L' we already have produced the
+desired collision in h_s. This also holds, if L=L' and z_B =/= z'_B
+(notice that B=B' in this case). If L=L' and z_B=z'_B then, since
+z_B=h_s(z_{B-1}||x_B) we get a collision unless x_B=x'_B. Continuing
+in this way, we eventually find our collision since x_i =/= x'_i for
+at least one i.
+Now, suppose that A is an adversary against H_s. We construct an
+adversary A' against h_s as follows. If A outputs x,x' we check
+whether H_s(x)=H_s(x') and if so, construct the corresponding
+collision h_s(y)=h_s(y') in h_s as described above and output
+y,y'. Otherwise, we output 0,0. The likelihood for succeeding, i.e.,
+for the output to be a h_s-collision is negligible by assumption and
+thus the probability that A output a collision is negligible as
+well. Notice that the computations we carried out as A' were clearly
+in polynomial time. QED
+The well-known and standardized cryptographic hash functions MD5,
+SHA-1, SHA-2 use the Merkle-Damgard transform. The underlying
+fixed-length hash function (called compression function) in this
+context is built using the heuristic principles that were described in
+Section 7 (Practical block ciphers. Principles: s/p- and Feistel
+networks).  We remark here that MD5 and SHA-1 are no longer consider
+sufficiently secure whereas SHA-2 is.
+The latest standard SHA-3 which is not yet widely used but might
+eventually supersede SHA-2 does not use Merkle-Damgard but instead
+relies on the so-called *sponge construction*, see Wikipedia. 
+MACs from hash functions
+- - - - - - - - - - - - - 
+A natural question is whether one can use hash functions also for the
+purpose of authentication, i.e. MAC. First, we note that naive
+attempts (which according to KL have been actually used in practice!)
+do not work. Consider for example the definition Mac_k(x) = H_s(k||x)
+where s is some public seed. If H_s was constructed with the
+Merkle-Damgard transform from h_s then we can easily forge MACs as
+follows: if t is the MAC of x then t'=h_s(t||L') is the
+MAC of (x || |x|) (glossing over the padding).
+Suppose that we are given a *fixed-length* MAC (Gen,Mac) capable of
+authenticating messages of length l(n). Suppose furthermore, that we
+have a collision-resistant hash function (Gen',H) which produces
+hashes of length l(n). We can then build a MAC (Gen'',Mac'') for
+arbitrary length messages as follows:
+Gen''(n): k<-Gen(n); s<-Gen'(n);return (k,s) /* use some encoding of pairs */
+Mac''_{(k,s)}(m): return Mac_k(H_s(m))
+Theorem: Assuming that (Gen,Mac) is secure and (Gen',H) is
+collision-resistant then (Gen'',Mac'') is secure.
+Proof: Suppose that an adversary A asks questions Q and then succeeds
+in forging a MAC t=Mac''_{(k,s)}(x), i.e., x is not in Q. If H_s(x) is
+different from all the H_s(y) for y:Q then A has succeeded in forging
+a MAC for Mac_k. Namely then t=Mac_k(H_s(x)) and H_s(x) was not
+previously asked. If, however, H_s(x) is contained in {H_s(y) | y:Q}
+then A has produced a collision. Since both events occur with
+negligible probability the probability that A can forge a MAC is also
+negligible. QED 
+Suppose that H_s() above has been constructed using the Merkle-Damgard
+construction from the fixed-length hash function h_s(). For seed s
+define a fixed-length MAC by
+                    Mac_k(m) = h_s(k||m)
+We now make the assumption that h_s() is such that this is a secure
+fixed-length MAC. Notice that the abovedescribed extension attack on
+MACs of this sort does not apply due to the fixed length.
+If we now apply the above hash-and-MAC construction then it turns out
+that the arbitrary-length MAC obtained satisfies "essentially"
+Mac''_{(k,s)}(m) = H_s(k||H_s(m))
+because the last H_s degenerates to h_s due to the shortness of
+H_s(m). The quoted "essentially" refers to the fact that for this
+equation to hold exactly one needs to assume some conventions about
+how exactly the padding in the Merkle-Damgard construction is
+performed. We want to ignore these details here and refer to the
+literature.
+The HMAC construction
+- - - - - - - - - - -
+The widely used and standardised HMAC construction is essentially the
+above except that another key is added to the message m prior to
+invoking the inner hash function and that the two keys that are thus
+needed are derived from a single one by XOR-ing with certain fixed
+words ipad and opad:
+HMAC_{(k,s)}(m) = H_s( (k+opad) || H_s((k+ipad) || m))
+In practice, ipad and opad are chosen to be 00110110 and 01011100 each
+repeated as often as necessary.
+The additional inner keying with k+ipad allows one to weaken the
+assumptions on the collision freeness of H_s(). The details of these
+assumptions and their practical relevance are still subject to debate
+(see a 2012 paper by Koblitz and Menezes); however, no practical
+attack has been found to date and HMAC is considered secure in
+practice.
diff --git a/ss2017/cryptography/Notes12.txt b/ss2017/cryptography/Notes12.txt
new file mode 100644
index 0000000..a1299d7
--- /dev/null
+++ b/ss2017/cryptography/Notes12.txt
@@ -0,0 +1,274 @@
+12 Number theory and hardness assumptions
+-----------------------------------------
+The next few sections are about *public-key cryptography* (different
+keys for encryption and decryption, one public, the other
+private). This area makes heavy use of number theory, in particular
+arithmetic. In this section we review the necessary mathematics and
+formulate hardness assumptions on which the security of public key
+cryptography is based.
+We assume standard knowledge about
+- arithmetic: division with remainder, gcd, extended Euclidean
+algorithm, i.e. for all integers u,v, there exists integers s, t such
+that gcd(u,v)=su+tv, Chinese remainder theorem, modular arithmetic, in
+particular inverses modulo.
+- groups: additive and multiplicative notation; exponentiation; the
+  additive group Z_N (addition modulo N); the multiplicative group
+  (Z_N)^* of order phi(N) (Euler's phi function); the consequence
+  x^(phi(N)) = 1 (mod N) and the special case x^{p-1}=1 (mod p); the
+  group isomorphisms derived from the Chinese remainder theorem (Z_pq
+  =~= Z_p x Z_q, Z_pq^* =~= Z_p^* x Z_q^*), primitive elements in
+  Z_p^* (cyclicity of Z_p^*). 
+The required knowledge is part of "Logik und Diskrete Strukturen" and
+can also be looked up in KL.
+Next, we discuss the computational complexity of various arithmetic
+operations.  We always measure runtimes in terms of the *sizes*
+(lengths in binary representation) of the numbers in question. Notice
+that the size of a number n is O(log(n)). The basic arithmetic
+operations +,-,*,/ can be performed in polynomial time (in terms of
+the sizes) with the standard algorithms taught at school. The same is
+true for gcd-computations using Euclid's algorithm. Furthermore,
+computing modular powers, i.e. a^b mod N can also be done in
+polynomial time using repeated squaring based on the binary expansion
+of N. Note though that a^b cannot be computed in polynomial time
+simply because the result is too long.
+Testing whether a given number n is prime can also be done in
+polynomial time; the most efficient algorithms for this use
+probabilistic polynomial time computation, however. This means that
+one has randomised algorithms which given a number n answer either
+"composite" in which case n is definitely composite or "prime" in
+which case n is prime except for a negligible error probability. The
+simplest of these tests is the Miller-Rabin test (KL 7.2.2).
+Using such a primality test it is also possible to generate a random
+prime number of a given size s. To that end one chooses random n of
+size s and then tests n,n+1,n+2,n+3,... for primality until a prime
+number has been found. Theorems about the distribution of prime
+numbers assert that this search is successful after a small (in any
+case expected polynomial) time.
+Factoring
+- - - - -
+The factoring problem asks the following question. Given a natural
+number N find a non-trivial factor of N unless N is prime (as usual
+the trivial factors are 1 and N).
+If N is even or has some other small divisor then this question is
+easily solved by trial division (the factoring method taught at
+school). If, however, N has only large factors, e.g. is the product of
+two primes of roughly equal size then factoring is believed to be a
+very hard problem and at any rate no efficient algorithm is
+known. Notice that trial division is linear in sqrt(N) hence
+exponential in the size of N. (sqrt(N)=2^{1/2*log(N)}). There do exist
+better algorithms for factoring, but with today's technology finding
+out the factors of a product of two ~512bit primes must be considered
+infeasible with a broad security margin. 
+RSA Cryptosystem
+- - - - - - - - -
+It is believed to be hard to reconstruct x from x^e knowing merely N
+and e so this procedure is used as the basis of a public key
+cryptosystem: RSA. Intuitively, Enc_e(x)=x^e mod N and Dec_d(y)=y^d
+mod N. The encryption key e can be publicized whereas the decryption
+key d must be kept private.
+Consider the situation where N=pq for primes p,q. The ring Z_N of
+integers mod N contains phi(N)=(p-1)(q-1) invertible elements. Recall
+that x:Z_N is invertible if gcd(x,N)=1. The elements with a gcd > 1
+are the multiples of p and of q of which there are (q-1) + (p-1) + 1
+with the last summand corresponding to 0=pq. Thus, the number of
+invertible elements is pq-q-p+1 = (p-1)(q-1). Equivalently, with the
+Chinese Remainder Theorem we have Z_N =~= Z_p x Z_q and the invertible
+elements correspond to those pairs (x,y):Z_p x Z_q where x=/=0 and
+y=/=0.
+The (multiplicative) group Z^*_N of invertible elements (mod N) thus
+has order /* number of elements */ phi(N)=(p-1)(q-1). For each element
+x:Z^*_N it holds that x^phi(N) = 1 (mod N) (general result about
+groups). 
+Now suppose that we have e such that gcd(e,phi(N))=1 and ed=1
+(mod phi(N)). Then we can solve the RSA problem,
+ i.e. determine x such that x^e = y mod N for given y, N, e.
+More precisely, if x:Z^*_N we can recover x from y:=x^e as y^d
+(all computations done mod N). Indeed, (x^e)^d = x^{ed} =
+x^{t*phi(N)+1} = x (for any t).
+Security of RSA relative to factoring
+- - - - - - - - - - - - - - - - - - -
+If given e we are able to find d such that x^{ed}=x holds for all
+invertible (mod N) x then we can find the factors of N as follows:
+Namely, in this case, we have a value a (=ed-1) such that x^a=1 (mod
+N) holds for all x:Z^*_N. Such value must necessarily be even as can
+be seen by taking x=-1. Now we can successively divide a by two until
+we have found a value b such that x^{2b} = 1 (mod N) holds for all x
+but x^b =/= 1 for some x. The crucial observation is that once x^b =/=
+1 for some x then x^b =/= 1 for at least 50% of all x so we can find
+such a witness value x by repeated guesses. The reason for the 50%
+bound is that {x | x^b=1} is a proper subgroup of Z^*_N and its order
+must therefore be a proper divisor of |Z^*_N|=phi(N). 
+So, summing up, we are in possession of a value b such that x^{2b}=1
+for all x, but x^b=1 not for all x. Let us consider u := x^b mod p and
+v := x^b mod q. Since u^2=v^2=1 we have u,v:{-1,1}. Moreover, by the
+Chinese Remainder Theorem, we have Z_N^* =~= Z_p^* x Z_q^* so that if
+x is chosen uniformly at random then u and v are also uniformly at
+random and independent. As before, if one of them *may* be =/=1 then
+this happens for at least 50% of them. As a result, under the
+assumptions, the event u*v=-1 has a a likelihood of at least 50% and
+in this event gcd(x^b-1,N) yields a nontrivial divisor of N. 
+Unfortunately, there is no known reduction from RSA to factoring in
+the sense that we would be able to prove that RSA is semantically
+secure on condition that factoring is hard. Namely, notwithstanding
+the above, it might be possible to break RSA without actually
+producing an exponent d such that x^ed=1 mod N. To date, it is not
+known whether RSA is sec ure relative to factoring, let alone in
+itself. 
+RSA hardness assumption
+- - - - - - - - - - - - 
+The computational hardness of RSA is thus not proven and not reducible
+to factoring but at least to some extent plausible. It is thus
+justified to formulate the following RSA hardness assumption.
+There exists a randomised polynomial time algorithm GenRSA() which
+given security parameter n produces (N,e,d) such that N is the product
+of two primes N=pq and d = e^{-1} mod phi(N) and such that the following 
+experiment has negligible success property for all probabilistic polynomial time adversaries A:
+RSA-inv_A(n):
+1. Run GenRSA(n) to obtain (N,e,d)
+2. Choose x<-Z^*_N
+3. A is given N,e, and x^e mod N and outputs x'
+4. The output of the experiment is 1 if x=x' and 0 otherwise.
+It is believed that the RSA assumption holds with GenRSA(n) selecting
+two random primes of length n, choosing e at random and outputting
+(pq,e,e^{-1} mod phi(N)).
+Groups for cryptography 
+- - - - - - - - - - - - 
+A *group generating* algorithm G(n) is a probabilistic polynomial time
+algorithm which given security parameter n produces a description of a
+group G_n whose order is polynomial in n. The description of the group
+comprises generators, a well-formedness test, testing whether a given
+bitstring encodes a group element or not and algorithms implementing
+multiplication and inverses on these representations.
+For example, G(n) might determine a random prime p of length n and
+then return an algorithmic description of the group Z_p^*. As for a
+generator, it could produce a primitive element.
+For some applications, groups of prime order are preferred. One
+example of those is the following: If q is a prime such that p=2q+1 is
+also prime (p is then called a "strong prime") then the subgroup of
+Z^*_p consisting of the elements of the form y^2 for y:Z_^*p (the
+quadratic residues mod p) form a subgroup of Z^*_p of order q which by
+assumption is prime.
+Further examples are furnished by groups of points on an elliptic
+curve. The elements of such a group are pairs (x,y) of elements of
+a finite field GF(q) (q prime) such that
+                   y^2 = x^3 + Ax + B 
+holds for fixed parameters A, B which thus determine the elliptic
+groups. In addition to these pairs (x,y) the group contains a special
+element O. 
+For example, if the field is GF(7) = Z_7 and A=B=3 then the group (the
+elliptic curve) has elements (1,0), (3,2), (3,5), (4,3), (4,4),
+O. Indeed, whenever x^3+3x+3 is (zero or) a quadratic residue mod 7 we find (one
+or) two points and otherwise not.
+The group operation, i.e. how elements of the group are added (or
+multiplied, depending on notation), is omitted for now. 
+We come back to it later.
+Discrete logarithm and Diffie Hellman
+- - - - - - - - - - - - - - - - - - -
+Given a cyclic group G with generator g the discrete logarithm problem
+is the following: given h:G determine e such that h=g^e. Formally,
+suppose that we are given a group generating algorithm G(n) outputting
+for each n a cyclic group G_n and a generator g.
+For adversary A the experiment DLog_{A,G}(n) then is the following:
+1. Run G(n) to obtain (G,q,g) where G is a cyclic group of order q
+with generator g.
+2. Choose h<-G /* e.g. by putting h=g^d for d<-Z_q */
+3. A is given G,q,g,h and outputs e:Z_q
+4. The output of the experiment is 1 if g^e = h
+By definition, the discrete logarithm problem is hard for G() if for
+all adversaries A one has Pr(DLog_{A,G}(n)=1) = negl(n).
+It is generally believed that discrete logarithm is hard for the
+aforementioned example groups. 
+Related problems are computational Diffie-Hellman (CDH) and decisional
+Diffie-Hellman (DDH).
+Computational Diffie-Hellman: given a cyclic group G of order q and
+generator g as above and h=g^x, h'=g^y determine (without knowing x,y !)
+the value g^{xy}. 
+Decisional Diffie-Hellman: given a cyclic group G of order q and
+generator g as above and h=g^x, h'=g^y and h''=g^z determine (without
+knowing x,y,z !) whether or not g^{xy}=g^z.
+Clearly, a solution to the discrete logarithm problem implies a
+solution to Diffie-Hellman, but the converse is not known.  As a
+result, hardness of discrete log is a weaker assumption than hardness
+of DDH (and CDH)
+Diffie-Hellman is important for the eponymous Diffie-Hellman
+key-exchange protocol: In order to share a common secret key Alice may
+choose random x:Z_q and send u=g^x to Bob. Bob does the same and thus
+sends v=g^y to Alice.  Now, Alice and Bob can both compute the shared
+key g^{xy} as u^x (Alice) and v^y (Bob). We will show later that
+assuming DDH is hard this scheme is secure in a precise sense.
+Formal definition of hardness of DDH:
+Given a group-generating algorithm G() producing cyclic groups as
+above, we say that the decisional Diffie Hellman problem is hard for
+G() if for all  probabilistic polynomial-time adversaries A and all n we have
+ |Pr(A(G,q,g,g^x,g^y,g^z)=1) - Pr(A(G,q,g,g^x,g^y,g^{xy}))| = negl(n)
+where G(n)=(G,q,g) and x,y,z <- Z_q.
+Thus, DDH is hard, if g^{xy} looks like a random element of G even if
+g^x and g^y are known.
+We close this chapter by remarking that under the assumption that RSA
+is hard thre exist pseudo random generators, pseudorandom functions,
+and permutations (see KL6 and 7.4.1). Under the assumption that
+discrete logarithm is hard for some group then there exist collision
+resistant hash functions. (KL7.4.2). 
diff --git a/ss2017/cryptography/Notes13.txt b/ss2017/cryptography/Notes13.txt
new file mode 100644
index 0000000..b8e5f0a
--- /dev/null
+++ b/ss2017/cryptography/Notes13.txt
@@ -0,0 +1,162 @@
+13 Diffie-Hellman key exchange and RSA
+--------------------------------------
+Diffie Hellman
+- - - - - - - -
+As already mentioned, the Diffie-Hellman key exchange system allows
+two people, say Alice and Bob, who do not share any secret information
+to obtain a shared secret key (which can then for example be used for
+subsequent private key cryptography). It thus avoids the need for the
+sharing of secret keys which is difficult in practice and involves
+trusted third parties or similar (see KL9.1-3 for a discussion).
+We now formalize what should be expected from such a key exchange protocol: 
+The key exchange experiment KE^eav_{A,Pi}(n):
+1. Two parties both holding n execute the protocol Pi. This results in
+   a transcript trans containing all messages exchanged and a string
+   ("key") k:{0,1}^n that is output by both parties. Thus, both
+   parties must output the same key.
+2. b<-{0,1}. if b then k':=k else k'<-{0,1}^n
+3. Adversary A is given trans and k' and outputs a bit b'.
+4. Output = 1 iff b'=b
+The protocol Pi is deemed secure if the success probability for any
+prob.polytime adversary A is bounded by 1/2+negl(n).
+The Diffie-Hellmann protocol Pi for some cyclic group (generating
+algorithm) G() now works as follows:
+1. Alice runs (G,q,g) := G(n) and sends (G,q,g) to Bob. 
+2. Alice chooses x<-Z_q and computes h1:=g^x
+3. Alice sends (G,q,g,h1) to Bob
+4. Bob chooses y<-Z_q and sends h2:=g^y to Alice. Bob then outputs k_B:=h1^x
+5. Alice outputs k_A:=h2^x /* Note that k_A=k_B=g^{xy} */
+We will now show that the Diffie-Hellman protocol is secure under the
+assumption that DDH is hard for the group in question and the further
+assumption that the random key k' chosen in the KE^eav experiment is a
+random element of G rather than an arbitrary random string. Let us
+call KE'^eav the thus modified experiment. We note that a random group
+element can be taken of the form g^z for z<-Z_q. 
+We now have
+Pr(KE'^eav_{A,Pi}(n)=1) = 1/2*Pr(KE'^eav(n)=1 | b=1) + 1/2*Pr(KE'^eav(n)=1 | b=0) =
+1/2*(Pr(A(G,g,q,g^x,g^y,g^{xy})=1)+ Pr(A(G,g,q,g^x,g^y,g^z)=0))  =
+1/2+1/2*(Pr(A(G,g,q,g^x,g^y,g^{xy})=1)- Pr(A(G,g,q,g^x,g^y,g^z)=1))  <=
+1/2+1/2*|Pr(A(G,g,q,g^x,g^y,g^{xy})=1)- Pr(A(G,g,q,g^x,g^y,g^z)=1)|  <=
+1/2 + 1/2*negl(n) /* assuming hardness of DDH for G() */
+We remark that while (under assumptions) Diffie-Hellman is secure
+against passive eavesdroppers it is vulnerable against more active
+adversaries that may also send, modify, and swallow messages during
+the run of the protocol. Achieving security against those requires
+some means of authenticating messages sent between the parties; a
+precise formulation of the additional assumptions needed for this is
+beyond the scope of this book.
+Public key cryptosystems
+- - - - - - - - - - - - - 
+We give a formal definition of a public key cryptosystem and its
+security.
+Definition: a public key cryptosystem Pi comprises
+-) a (probabilistic) key generation algorithm Gen() which given
+ security parameter n outputs a pair of keys (pk,sk) /* public, secret
+ */
+-) a (probabilistic) encryption algorithm which given plaintext m from
+some message space and the public key pk produces a ciphertext c:
+c<-nc_pk(m)
+-) a deterministic decryption algorithm which given ciphertext c and
+the secret key sk produces a plaintext m<-Dec_sk(c) in such a way that
+if c<-Enc_pk(m) then Dec_pk(End_sk(m)) = m.
+Definition: A public key scheme Pi is semantically secure in the
+presence of an eavesdropper if the for all (prob. polytime)
+adversaries A the success probability in the following experiment is
+bounded by 1/2+negl(n). 
+Experiment PubK^eav_{A,Pi}(n):
+1. (pk,sk)<-Gen(n)
+2. Adversary A is given pk and outputs messages m0 m1
+3. b <- {0,1}; c <- Enc_pk(m_b); c is given to A
+4. A outputs a bit b'
+5. Outcome is 1 if b=b' and 0 otherwise    End of definition
+We notice that this is just like semantic security in the private key
+setting except that the adversary has access to the public key which
+after all is supposed to be public. This means that, since by
+Kerckhoff's principle, the encryption algorithm is also public, the
+adversary has access to an encryption oracle and thus---in the public
+key setting---semantic security in the presence of an eavesdropper
+(the one above) is as powerful as cpa security.
+This has the consequence that no public key cryptosystem in which
+Enc_pk() is deterministic can be semantically secure. Indeed, all the
+adversary has to do in this case is to precompute the encryptions of
+its chosen messages m0 and m1. 
+We remark that a semantically secure public key cryptosystem can be
+combined with any cps-secure private key system by first using the
+public key to transmit a secret key. This is known as *hybrid
+encryption*. One can show (Theorem 10.13 of KL) that hybrid encryption
+yields semantically secure public-key encryption schemes.
+RSA cryptosystem
+- - - - - - - - -
+For this reason, the "textbook RSA" cryptosystem, as indicated in the
+last chapter, is actually insecure. In order to make it work, it must
+be enhanced with randomization for example as follows.
+Fix a length function l(n).
+Gen(): On input n run GenRSA(n) to obtain (N,e,d). Output
+       pk=(N,e) and sk=(N,d)
+Enc(): On input m:{0,1}^l(n) and pk=(N,e) choose
+       r<-{0,1}^{||N||-l(n)-1} and output c=(r||m)^e mod N
+       /* the -1 is there to make sure that r||m is in Z_N^* */
+Dec(): On input c and sk=(N,d) compute m'=c^d mod N and output the l(n)
+       low-order, i.e. rightmost, bits of m'
+Now, if the "padding" r is too short, then this does not help because
+an adversary could try out all possible values for r. It is believed
+that when r has a length comparable to |m|, then the above scheme is
+secure (relative to the RSA assumption), but unfortunately, no proof
+has been found. According to KL (Theorem 10.19) one can show that the
+scheme is secure when l(n)=O(log(n)), again under the RSA
+assumption. Note that this means that in order to increase the message
+size by just one bit one has to double the security parameter and
+hence the length of the modulus N and the ciphertexts. 
+In practice, one does not want to waste so much bandwidth on the
+padding and uses heuristic schemes like e.g. RSA-PKCS#1v1.5 which uses
+a random padding string of length at least 64bit.
+Another, more modern, method is RSA-OAEP which mixes the padding with
+the actual plaintext by way of a Feistel network of cryptographic hash
+functions. Intuitively, this makes the input to the actual RSA
+encryption look like a random string which then makes the RSA
+assumption applicable. Viewed more practically, it prevents attacks
+which only recover parts of the plaintext. In the random oracle model
+(Lecture 16) RSA-OAEP can be proved semantically secure.
diff --git a/ss2017/cryptography/Notes14.txt b/ss2017/cryptography/Notes14.txt
new file mode 100644
index 0000000..e6075ff
--- /dev/null
+++ b/ss2017/cryptography/Notes14.txt
@@ -0,0 +1,208 @@
+14 El Gamal and elliptic curve cryptography
+-------------------------------------------
+The El Gamal cryptosystem is based on the DDH assumption and is
+provably secure under the assumption of the latter. Just like DH-key
+exchange it is parametrised by a group (generating algorithm) G(n) and
+works as follows:
+Gen(n) :
+  (G,q,g) = G(n)
+  x <- Z_q
+  h <- g^x
+  pk = (G,q,g,h)
+  sk = (G,q,g,x)
+  output (pk,sk)
+Enc_(G,q,g,h)(m) :
+  check that m:G
+  y <- Z_q
+  c = (g^y,h^y*m)
+  output c
+Dec_(G,q,g,g,x)(c1,c2) :
+  m = c2 / c1^x
+  output m
+Let us first see that this scheme is correct: If c1=g^y and c2=h^y*m
+where h=g^x then c2 / c1^x= g^(xy) * m / g^(xy) = m
+We also notice that messages and ciphertexts are actually elements of
+the group G which must therefore by somehow encoded as binary
+strings. This can be done in various ways for the various groups, see
+KL for details.
+Theorem: If the DDH assumptions holds for a group (generating
+algorithm) G() then the El Gamal cryptosystem basd on G() is
+semantically secure.
+Proof: Let A be an adversary against the El Gamal system and let its
+success probability be 1/2+eps(n). We must show that eps(n) is
+negligible.  We construct from A an adversary A' as in the DDH
+assumption.
+Suppose that A' receives g^x, g^y, u where u is either g^(xy) or g^z
+for random z and A' is to decide which of the two.
+We forward (G,q,g,g^x) as "public key" to A and wait for it to output
+m0,m1:G. We choose a random bit b and build c = (g^y,u*m_b) and let b'
+be the bit that A returns.
+If u=g^(xy) then A is played with according to the rules of the El
+Gamal cryptosystem and therefore b'=b with probability 1/2+eps(n). If
+u=g^z for random z then u*m0 is also uniformly random (since x|->x*m0
+is a bijection G->G) and therefore the probability that b'=b is
+exactly 1/2. The difference between the two probabilities is precisely
+eps(n) which under the DDH assumption will be negligible. QED
+Elliptic curve cryptography
+- - - - - - - - - - - - - -
+As already mentioned, elliptic curves furnish a large reservoir of
+cyclic groups for which taking discrete logarithms, CDH, and DDH are
+believed to be hard. With existing algorithms, solving these problems
+on elliptic curve groups is consistently harder than doing so for
+groups of invertible elements mod p or in a finite field.
+Recall that an elliptic curve over some field F of characteristic =/=
+2,3 is defined by two constants a, b and comprises the points (x,y)
+that solve the equation
+                        y^2 = x^3 + ax + b
+Here, one assumes that the *discriminant* 4a^3+27b^2 =/= 0 for
+otherwise the curve might be *singular*, i.e. contain anomalies like
+isolated points, sharp bends ("cusps") or similar. When viewed over
+the reals such a curve looks roughly as follows:
+                                     |           |
+                                     |          /
+                                   - | -       /
+                                  /  |   \ _ -
+                                 |   |
+---------------------------------------------------------------------
+                                 |   |     - _
+                                  \  |   /     \ 
+                                    -| -        \
+                                     |           |
+Depending on the parameters the knob to the left may also become
+detached and form a closed loop sitting to the right. In either case
+whenever a line intersects the curve in two points it meets it again
+in a third point except when the line is parallel to the y-axis in
+which case we say that the third point of intersection is at infinity,
+here denoted O.
+This property also holds for curves over other fields, in particular
+finite ones which are of interest for cryptography. There is an
+explicit formula for obtaining the third point of intersection. To
+wit, if (x1,y1) and (x2,y2) lie on the curve then the line through
+them is y=m(x-x1)+y1 where m=(y2-y1)/(x2-x1). Plugging this into the
+curve yields
+                    (m(x-x1)+y1)^2 = x^3+ax+b
+The three solutions of this equation are x=x1, x=x2, x=m^2-x1-x2, so
+the last one is the desired third point of intersection.
+Now, it is a remarkable fact that the points on the curve together
+with O form an abelian group, traditionally written additively, where
+the sum of points P and Q is obtained by first taking the third curve
+point on the line through P and Q, call it R=(x_R,y_R), and then
+reflecting R on the x-axis, i.e. P+Q=(x_R,-y_R).
+This, however, only applies when P=/=Q and when P and Q do not have
+the same x-coordinate. In the latter case, one has P+Q=O, the special
+point at infinity. In the former case, i.e. when P=Q, one must take
+the tangent through P and intersect it with the curve.
+One can show using implicit differentiation that the coordinates of
+that point are given by
+       (((3x_1^2+a)/2y_1)^2 - 2x1, -y1+((3x_1^2+a)/2y_1)^2(x1-x2))
+and this formula, just as the one above for the general case, also
+makes sense for finite fields where there are no "tangents". Yet
+another special case is when P or Q is O. One simply puts O+P=P+O=P.
+Curves over fields of characteristic 2 or 3 are also possible, but
+then both the defining equations and the formulas for addition of
+points become slightly different (notice e.g. that in the formula
+above we divide by 2). We refer to the literature, but stress that it
+is very important not to make mistakes with the implementation of the
+addition formulas and that doing so has lead to security flaws.
+We have already seen that addition of points is well-defined and
+commutative. It is also clear that the inverse of a point (x,y) is
+given by (x,-y), i.e. reflection on the x-axis. Associativity of
+addition follows from nontrivial results, e.g. the Weierstrass
+p-function, but can in principle also be validated "by hand" on the
+basis of the above formulas.
+For integer n and point P on a curve we use the notation n.P for
+P+P+...+P with summands. This is the same as the exponent notation in
+multiplicatively written groups. Therefore, the "repeated squaring"
+method can be used here as well:
+              100.P = 4.P+32.P+64.P = 2.(2.(P + 2.(2.(2.(P+2.P)))))
+Now, the plan is to use the groups coming from elliptic curves for
+cryptosystems like Diffie-Hellman key exchange or El Gamal. For this
+to work, a couple more considerations are in order: we need to encode
+plaintexts as points on the group and we need to ascertain that the
+group is cyclic or else move to a cyclic subgroup.
+In order to find a cyclic group one can rely on one of the publicly
+available and recommended curves. Many of them have a prime number of
+points and hence are cyclic with any element other than O being a
+generator. Alternatively, one can try out random coefficients a and b
+until one finds a curve with a prime number of points. To do this, one
+needs to invoke Schoof's algorithm which allows one to determine the
+number of points on a given elliptic curve.
+If one would like to use a curve with a nonprime number, say N, of
+points, then one must factor N and it is then important that N has a
+large prime factor p, i.e. p>sqrt(N). In this case, by the
+representation theorem for finite(ly generated) abelian groups there
+is a subgroup of order p and any of its nonzero elements are
+generators. Moreover, the remaining subgroups have smaller order.  To
+check whether a point P is in the subgroup of order p it suffices to
+check that p.P = O. To obtain a point in that subgroup one can pick a
+random point P and if Q := N/p.P =/= O then p.Q is in the subgroup,
+hence Q can serve as a generator.
+For the most, however, one uses one of the already publicised elliptic
+curves such as those recommended by the NIST "for government use".
+See http://csrc.nist.gov/groups/ST/toolkit/documents/dss/NISTReCur.pdf
+There is no need to choose a "fresh" curve for every application and
+on the other hand, some curves are "weak" and should be avoided.
+In order to encode plaintexts as points we can proceed as
+follows. First, we note that it suffices to store the x-coordinate of
+a point and a single bit telling whether the y-coordinate is above or
+below the x-axis, i.e., which of the two square roots of x^3 +ax+b are
+meant. This is known as point compression. As for the x-coordinates
+one fixes a bijection alpha between bitstrings of length l=log(q) and
+field elements. Now one puts t=l-k where k is chosen sufficiently
+large. Then, to encode message x of length t one searches for a string
+z of length k so that alpha(x||z) has a square root y over GF(q),
+i.e., is a quadratic residue. At that point, (alpha(x||z),y) can be
+taken as the desired encoding of x as a curve point.  For random
+padding z the chance for alpha(x||z) to be a quadratic residue is
+about 1/2, so after a short number of iterations one is almost sure to
+find an appropriate encoding. We also remark that there exist
+efficient algorithms for testing whether a given field element is a
+quadratic residue and if so to compute its square root.
+With all this machinery in place, the Diffie-Hellman and El Gamal
+cryptosystems can be used with elliptic curve groups without any
+change, bearing in mind, of course, that instead of exponentiation
+g^x, we now use multiplication x.P and instead of multiplication we
+use addition, so that, e.g. the encryption of message m encoded as a
+point M becomes (y.P,y.Q+M) where P is the generator and Q=x.P
+We close by remarking that since the early 2000 elliptic curve
+cryptography (ECC) has become more and more common. Also the bitcoin
+cryptocurrency uses it.
diff --git a/ss2017/cryptography/Notes15.txt b/ss2017/cryptography/Notes15.txt
new file mode 100644
index 0000000..e76ada5
--- /dev/null
+++ b/ss2017/cryptography/Notes15.txt
@@ -0,0 +1,279 @@
+15 Factoring and computing discrete logarithms
+----------------------------------------------
+In this chapter we survey algorithms for factoring large numbers, in
+particular products of equal sized primes, and for computing discrete
+logarithms. These algorithms or rather their runtime provide lower
+bounds on the necessary key lengths for public key cryptography for it
+relies on the difficulty of these problems.
+For the sections about factoring we let N=pq be a product of two
+roughly equal sized primes and n = O(log(N)) be the length of N in
+binary. The task is to find p,q from N. 
+For the sections about discrete logarithms let G be a cyclic group of
+order q and g,y:G with g a generator. The task is to find x:Z_q such
+that g^x=y. 
+Trial division
+- - - - - - - -
+In order to factor N=pq, i.e. to find p or q given N, we can try out
+divisors d=1...sqrt(N). The runtime of this is polynomial in sqrt(N),
+hence takes time exponential in n. Once n>120, hence ||sqrt(N)||=~=60
+this is no longer feasible.
+Pollard's p-1 method
+- - - - - - - - - - -
+Pollard's p-1 method may help when p-1 is a product of powers of small
+primes. We let p_1,...p_k be the first k primes and construct
+              B_k = prod_{i=1..k}p_i^{n / ||p_i||} 
+The exponent n/||p_i|| is chosen so that if some power of p_i divides
+p-1 then that power is at most n/||p_i||. As a result, if p-1 is a
+product of powers of primes from among the set {p_1..p_k} then p-1 |
+B_k. If, in addition, q-1 does *not* divide B_k then we can find out p
+as we show shortly. In order that q-1 does *not* divide B_k we can
+repeatedly execute the method with increasing values for k hoping that
+at some point one of the two prime factors "works" and the other does
+not.
+OK so, now let B any number so that p-1 | B and q-1 does not divide
+B. For any x it holds that x^B mod p = 1 whereas with constant
+probability x^B mod q =/= 1. The former holds since the order of Z_p^*
+is p-1 and B is a multiple thereof. The latter holds since when x is a
+generator of Z_q^* then x^B mod q is not 1 and the proportion of those
+generators is about 50% (not hard to see or use Thm B.15 of KL).
+Now let y = x^B-1 mod N. We have y mod p = 0 and y mod q =/= 0. Thus,
+p divides y and q does not divide y so gcd(y,N)=p.
+The resulting algorithm is now as follows:
+for k=1,2,3,... do the following until either a factor of N is found
+or computing resources are exhausted.
+   Put B = B_k
+   for a certain number of times do the following
+        x<-Z_N
+        d = gcd(x^B-1 mod N,N)
+        if d=/=1 then a factor of N has been found.
+Pollard's rho method
+- - - - - - - - - - -
+This algorithm is based on the observation that once we have two
+distinct values x,y:Z_N such that x mod p = y mod p then p =
+gcd(x-y,N). Thus, factoring amounts to finding a collision in the
+"hash function" H(x) = x mod p. Notice that while we cannot compute
+this function we can detect whether x,y constitute a collision by gcd
+computation.
+Now, by the birthday paradox, a random set of size S := 2*sqrt(p) =
+O(N^{1/4}) contains such a collision with likelihood 50%. Finding it,
+requires S^2 collision tests thus, when implemented naively, this
+method is no better than trial division.
+However, the constant space method for collision detection described
+earlier only has a runtime of O(S), but of course, as already
+mentioned, we cannot compute H(). However, we can replace it with any
+function F which has the property that x = y (mod p) implies F(x)=F(y)
+(mod p) and which is such that for random x0 the set
+{x_0,F(x_0),F(F(x_0)),..., F^S(x_0)} is also sufficiently close to a
+random set so that it is likely to contain a collision. The function
+F(x)=x^2 + b for some b =/= 0, -2 mod N is believed to have this
+property and at any rate behaves well in practice. Just as in the
+earlier algorithm we have that F^i(x0) = F^j(x0) (mod p) implies
+F^k(x0) = F^{2k}(x0) (mod p) for some k.
+So, the set of the first S iterates of F() starting from random x0 is
+likely to contain a collision. But since once x,y form a collision so
+do F(x), F(y), it suffices to search for a collision among the pairs
+F^k(x0), F^{2k}(x0) which leads to the following algorithm:
+b <- Z_N \ {0,-2}
+F(x) = x^2+b mod N
+x <- Z_N
+y = x
+for k=1..2*sqrt(N) do
+  x = F(x); y=F(F(x))
+  if gcd(x-y,N) != 1 a factor is found. 
+repeat the whole procedure until either a factor is found or resources
+are exhausted.
+Assuming that the function F() has the required properties (which has
+not been proved to this date!) the runtime of this algorithm is
+proportional to N^{1/4} which is still exponential in n but requires
+one to double n if one wants to stay immune against this
+method.
+The quadratic sieve method
+- - - - - - - - - - - - - -
+The starting point for the most efficient factoring algorithms to this
+date is the following observation made by Pierre de Fermat:
+If we succeed in expressing N as the difference of two
+squares N=x^2-y^2 then N=(x+y)(x-y) so we have factored N.
+Fermat suggested to try out successive values for x starting at
+sqrt(N) and for each of those checking whether x^2-N is a square.
+The first improvement one can make on this observation is that it
+suffices that N divides x^2 - y^2 for if kN = (x+y)(x-y)
+then---assuming that x != (+/-)y mod N---the factors p and q of N
+must straddle the product so that gcd(x-y,N) or gcd(x+y,N) will be
+different from 1.
+Thus, we seek x,y:Z_N such that x^2-y^2 = 0 (mod N) and x !=y, x!=-y mod N. 
+Notice that by the Chinese Remainder Theorem each square has four
+roots modulo N, so given x, two out of the four roots of x^2 are good.
+Now, to find such x,y more efficiently we proceed as follows. We
+select a set B of small primes and try out various values x>sqrt(N) so
+that x^2 mod N can be factored (because after reduction mod N it is
+sufficiently small) and its factors are in B. In this way, we find
+numbers x_1 .. x_l such that each x_i^2 mod N is expressed as a
+product of primes in B.
+With each of these numbers x_i we associate a B-indexed 0,1 vector
+whose entry at p:B is the exponent of p in the factorisation of x_i
+modulo 2.
+For example, if B={2,3,5,11} and x_1^2 mod N is 2^5*3^2*11^2 then the
+associated vector is (1 0 0 0).
+If l exceeds the size of k then there is a (mod 2) linear combination
+of these vectors which results in the all zero vector. But multiplying
+the corresponding x_i together then yields a square because all prime
+factors have even exponents.
+Continuing the example, if x_2^2 mod N is 2^15*11, i.e. (1 0 0 1), and
+x_3^2 mod N is 11^13, i.e. (0 0 0 1) then, since 1000+1001+0001=0000,
+we obtain the following identity mod N:
+                      (x_1*x_2*x_3)^2 = (2^10 * 3 * 11^8)^2
+and, heuristically, we may assume that there is a 50% chance for this
+identity to be nontrivial in the sense that it yields a good pair of
+square roots.
+Summing up, we build up a large basis of values x whose squares mod N
+can be completely factored, then compute a linear combination (mod 2)
+of the zero vector from the corresponding B-indexed vectors and check
+whether the resulting identity of squares mod N is nontrivial which
+then yields the desired factor.
+We now study the problem of finding discrete logarithms in a cyclic
+group G as above. 
+The Pohlig-Hellman algorithm
+- - - - - - - - - - - - - - -
+If we have a factorisation of the group order q as prod_{i=1}^n q^i
+with the q_i relatively prime (gcd = 1) then, if we have values x_i
+such that
+                 (g^{q/q_i})^x_i = y^{q/q_i}
+we can find x by solving (using the Chinese Remainder Theorem) the
+system of modular equations
+                          x mod q_i = x_i
+Since g^{q/q_i} generates a subgroup of order q_i which is easier if
+the q_i are small. It remains to see why x satisfies the above system
+of equations. Indeed, if g^x=y then g^{q/q_i}^x = y^{q/q_i} so x mod
+q_i solves the discrete log problem of order q_i.
+It remains to see how to deal with groups whose order is a prime power
+q=p^f. In this case, one can expand the unknown x as
+     x = x_0 + x_1 * p + x_2 * p^2 + ... + x_{f-1} p^{f-1}
+Now, by exhaustive search or using any other algorithm we can
+determine x_0:Z_p such that g^{q/p}^{x_0} = y^{q/p} (mod p). Once we
+know x_0 we can then recursively solve (g^p)^{x'} = y/g^{x_0} which
+yields x_1,... The details are left as an exercise.
+The giant-step baby-step algorithm
+- - - - - - - - - - - - - - - - - - 
+Here the idea is that the unknown x can be decomposed as x =
+t*[sqrt(q)] + s for some s <sqrt(q).  Here [sqrt(q)] stands for the
+greatest integer <= sqrt(q). We then precompute a table of the values
+y_t := g^{t*[sqrt{q}]} for t=0..sqrt(q), this are the giant
+steps. Now, for each s we look for (using binary search or similar)
+for a t such that
+         g^{t*[sqrt(q)]} = y*g^{-s}
+Once we have found it we can output x = s*[sqrt(q)]+t. The runtime of
+this method is O(sqrt(q)) which again essentially halves the "bits of
+security".
+The index calculus method
+- - - - - - - - - - - - - 
+Unlike the earlier methods the index calculus makes special
+assumptions on the group G for which to solve the discrete logarithm
+problem. Namely, it must be the group Z_p^* for some prime p.  It can
+be generalised to the group of invertible elements in a finite field,
+but does not work in the case of ellipic curves, for instance.
+So, let us assume that G=Z_p^* and that g,y:Z_p^*. As before, we seek
+x such that g^x=y. As before, we write q=p-1 for the group order. 
+Let us write log_g(z) for the discrete logarithm of some value z:Z_p^*
+so that g^{log_g(z)} = z. The idea is to precompute a table of such
+logarithms. To that end, we choose values x_1..x_l such that g^{x_i}
+can be factored modulo p into small primes from a previously fixed set
+B={p_1,...,p_k}.From
+        g^{x_j} = prod_{i=1}^k p_i^{e_{j,i}} (mod p)
+we then obtain
+          x_j = sum_{i=1}^k e_{j,i} * log_g(p_i) (mod q)
+Now, assuming that l in relation to k was chosen large enough this
+system can be olved for the unknowns log_g(p_i). We thus have obtained
+the desired "table of logarithms".
+It allows us to easily compute the discrete logarithm of any number
+w:Z_p^* which can be factored into primes from B: if w = prod_{i=1}^k
+p_i^{e_i} then log_g(w) = sum_{i=1}^k e_i * log(p_i) (mod q).
+If y itself has that property we are obviously done. Otherwise, we can
+search for a value With the latter at hand we can then --- with luck
+--- take the logarithm of our y as follows: we choose some value z:Z_q
+at random such that y*g^z can be so factored and thus taken a
+logarithm of. This helps, since log_g(y) = log_g(y*g^z)-z.
+Using advanced number-theoretic methods one can analyse the
+probabilities that the various random choices in this method succeed
+and thus derive an expected runtime that is subexponential (in
+||q||). Indeed, the index calculus method is currently the best known
+method for computing discrete logarithms.
+We also remark that it is the possibility of factoring elements of G
+into primes what makes this algorithm possible. For other groups, in
+particular the multiplicative group of a finite field, analogous
+notions of "primes" can be found, but for groups derived from elliptic
+curves no such notion has been discovered and hence these groups are
+immune against index calculus making them particularly promising for
+cryptography. Having said that, taking discrete logarithms in a group
+Z_p^* where p has 1000+ digits remains unfeasible.
diff --git a/ss2017/cryptography/Notes16.txt b/ss2017/cryptography/Notes16.txt
new file mode 100644
index 0000000..f75607d
--- /dev/null
+++ b/ss2017/cryptography/Notes16.txt
@@ -0,0 +1,318 @@
+16 Chosen ciphertext and random oracles
+- - - - - - - - - - - - - - - - - - - -
+Many public key schemes including the ones using the RSA assumption
+presented earlier make use of an unkeyed hash function to scramble
+ciphertext and thus to enhance security, thus in particular preventing
+chosen-ciphertext attacks. In this chapter, we will formulate these
+attacks, remedies against them, and a semi-rigorous method for proving
+there relative security.
+Chosen ciphertext attacks
+- - - - - - - - - - - - -
+Chosen ciphertext attacks (CCA) model scenarios where the attacker can
+request the decryption of ciphertexts of its choice. This is relevant
+in a number of practical scenarios, where cryptosystems are used as
+building blocks in larger protocols. Here a "man in the middle" can
+intercept message and assume the role of one of the participants. The
+messages it fabricates will then---at least up to a point---be
+decrypted by the honest parties.
+None of the schemes presented to far is secure against CCA. Consider
+for example a symmetric key encryption given by Enc_k(m) =
+(r,F_k(r)+m) where r is a random nonce and F_k() is a PRF.
+Consider an attacker who wishes to distinguishes encryptions of m0 =
+000...0 and m1 = 1111..1 as in the PrivK^eav experiment. If the
+attacker is given (r,c) being the encryption of either m0 or m1 all it
+has to do is to request the decryption of the chosen ciphertext (r,c')
+where c' is c with the first bit flipped. Clearly, this will be
+1000..000 or 0111...111 according to whether m0 or m1 has been
+encrypted and hence the attacker wins.
+The situation is similar in the public key setting. Suppose for
+instance that (c1,c2) is an ElGamal encryption of some message
+m. Then, as we know, c1=g^y and c2=h^y*m where y is random but private
+and h=g^x is public. Now, for any m' we have that (c1,c2*m') is a
+valid encryption of m*m'. It is clear that this allows an adversary to
+win the PubK^eav experiment: Given (c1,c2), the adversary can request
+the decryption of (c1,c2*g). The adversary thus obtains m*g and can compute
+m by multiplication with g^(-1). A similar argument can be made for
+various cryptosytems based on RSA.
+The property of the cryptosystems making that form of attack possible
+is known as *malleability*. From a given ciphertext it is possible to
+produce another one that corresponds to a known transformation of the
+plaintext.
+We now define security against chosen ciphertext attacks formally in
+the public key setting. The corresponding definition for private keys
+is analogous.
+Let A be a ppt adversary and Pi be a public key cryptosystem. The experiment
+PubK^cca_{A,Pi}(n) is defined as follows.
+1. (pk,sk)<-Gen(n)
+2. A is given n, the public key pk, and access to a decryption oracle Dec_sk()
+/* in the private key setting A also gets access to an encryption oracle like in CPA security */
+   A outputs two messages m0,m1
+3. b<-{0,1}
+4. c<-Enc_pk(m_b) is given to A
+5. A continues to have access to the decryption oracle but may, of course, not
+   request the decryption of c itself. Any other ciphertext is allowed.
+6. A outputs a bit b' and the outcome is 1 if b=b', 0 otherwise.
+The cryptosystem is deemed CCA-secure if the success probability in
+this experiment is 1/2+negl(n).
+CCA secure private key cryptography using MACs
+- - - - - - - - - - - - - - - - - - - - - - - -
+In the private key case such CCA-secure schemes can be constructed by
+combining a CPA-secure system PiE with a secure MAC PiM as follows:
+A key pair (k1,k2) is generated, k1 for PiE and k2 for the MAC PiM.
+The encryption of m under (k1,k2) is given by (c,t) where
+c<-Enc_{k1}(m) and t<-Mac_{k2}(c).  Thus, we authenticate the
+ciphertext so as to prevent any modification.
+To decrypt (c,t) we first check that Mac_{k2}(c)=t and only then
+return Dec_{k1}(c). If Mac_{k2}(c) is different from t we return a
+default value _|_.
+One can now show that---assuming PiE and PiM are themselves
+secure---this scheme achieves CCA security. The proof idea is that
+assuming PiM is secure the probability that the adversary makes a
+query which is answered with a value different from _|_ is
+negligible. We can therefore assume that all queries are answered with
+_|_ and in this case the experiment degrades to a PrivK^cpa experiment
+against which PiE is assumed secure. For details see KL4.8.
+CCA-secure public key cryptography using hash functions
+- - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Building CCA-secure public key systems is more difficult and typically
+uses hybrid encryption with a CCA-secure private key system and,
+additionally, a "cryptographic hash function" H(). Here we show one such
+system using RSA, which at the same time gives an example of a
+security proof based on the RSA assumption.
+Assume thus that Pi0=(Enc0,Dec0) is a CCA-secure private key
+cryptosystem which uses uniformly distributed keys from {0,1}^n (so
+there is no "Gen0"). Assume further that H:{0,1}^n->{0,1}^n is a
+"cryptographic hash function". We will later discuss what this should
+mean formally in this context. We will need stronger properties than
+just the collision-resistance that we have required of hash functions
+before.
+Finally, assume that GenRSA() is an RSA key generator for which the
+RSA hardness assumption holds.
+We then construct a public key cryptosystem Pi=(Gen,Enc,Dec) as follows.
+Gen(n):
+  (N,d,e)<- GenRSA(n)
+  pk = (N,e), sk=(N,d)
+  return (pk,sk)
+Enc_{(N,e)}(m):
+   1. r <- {0,1}^n
+   2. k = H(r)
+   3. return (r^e mod N, Enc0_k(m))
+Dec_{(N,d)}(c1,c2):
+   1. r = c1^d mod N
+   2. k = H(r)
+   3. return Dec0_k(c2)
+Why should this scheme be CCA secure? Intuitively, given the RSA-assumption,
+the attacker has no way of recovering r, hence the likelihood
+that it applies (the publicly known function) H() to r is negligible
+and as a result all H()-values it sees would look random to it. Thus,
+the key used in the private key encryption of the actual message is as
+good as random so that the security assumption on Pi2 applies and the
+entire scheme is secure.
+We need the function H to make the key k look random. It may be tempting
+to just try to use r as the encryption key. But the RSA-assumption gives
+us only that the attacker cannot recover r fully. It does not rule out
+that the attacker obtains a few bits of r. If we used r as key, then
+it would not be fully random and the security assumption on Pi2 could
+not be used.
+The random oracle model
+- - - - - - - - - - - -
+The question is under what assumptions on H() the above argument can
+be made rigorous. Assuming H() to be a PRF does not work because those
+always require a secret key. Once you know the key there is nothing
+random about a PRF. Taking a PRG also does not work, as its output
+looks random only if the input is random as well.
+Assuming that H() is merely a secure hash function does not suffice
+either because nothing prevents a good hash function from, for example,
+producing hash strings which contain predictable portions.
+E.g. if H0() is a secure hash function so is H(x) = 0.....0 H0(x).
+Unfortunately, there have not been found any plausible assumptions on
+the function H() under which schemes like the above could be proved
+secure. A possible way out is the random oracle model which we now describe.
+In this model, one instantiates H() with a function that is randomly
+chosen at the beginning of each experiment, i.e., the probabilities
+are taken over uniform random choices of H() which is therefore called
+*random oracle*. All participants in the experiment get access to H.
+It would be possible to actually implement this by using a trusted
+party, e.g. an internet server which must called for any evaluation of
+H().  It can either precompute the entire value table of H() (finite
+for a given security parameter) using random choices or alternatively,
+fill this table on demand, i.e. regurgitate H-values on previously
+requested arguments and make random choices for new requests.
+In practice, however, this approach is unfeasible, and so the random
+oracle, for which schemes like the above one *can be proved secure*,
+is replaced with a cryptographic hash function such as SHA-2. Whether
+or not the security proof in the random oracle model has any bearing
+on this instantiation remains open.
+In fact, it has been shown that there exist contrived cryptographic
+schemes which are secure in the random oracle model but insecure with
+any concrete instantiation. Nevertheless, a proof in the random oracle
+model consists strong evidence as to the security of a scheme and
+perhaps even more importantly, failure to prove a system secure in the
+random oracle model might point out actual flaws of a scheme. KL
+contains a longer discussion of the metamathematical or epistemic
+implications of the random oracle model without however being able to
+get over the central issue that a proof in the random oracle model
+does not imply anything---in the rigorous mathematical sense---about
+the "standard model" which is used in actual practice.
+Secure hash functions and pseudorandom functions in the random oracle model
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+As a warmup let us see how random oracles can be used to construct
+secure hash functions and pseudo random functions.
+We fix a length function l(n), e.g. l(n)=n or l(n)=n/2 and let H :
+{0,1}^n->{0,1}^{l(n)} be a random function.
+We now show that H() itself is a secure hash function. Consider a ppt
+adversary A who gets oracle access to H() and conduct the Hash-coll
+experiment from Chapter 11 with random H():
+1. Choose H() at random
+2. Adversary A gets oracle access to H() and outputs values x,x':{0,1}^n
+   such that x != x'.
+3. The outcome of the experiment is 1 iff H(x)=H(x')
+Theorem:
+The success probability of this experiment, taken over uniform random
+choice of H and the random choices that A might make is a negligible
+function of the security parameter.
+Proof. Let x1...xk be the sequence of values that A queries of H prior to
+outputting x,x'. Clearly, k is polynomial in the security parameter n
+due to the polynomial time bound on the computation time of A. The
+probability that H(xi)=H(xj) for some i!=j is therefore
+negligible. Notice that we can assume that H-values are selected "on
+demand". Thus, when x,x' are among the x1..xk the success probability
+is negligible. If, however, A chooses at least one of them afresh
+then, again, the success probability is negligible because only a
+negligible portion of the possible H-values is favourable. QED
+Notice that, in a way, the earlier account of hash functions in
+Chapter 11 was also based on some kind of random oracle in that we
+assumed a uniformly random selection of the seed which also does not
+happen in practice.
+A pseudorandom function from a random oracle
+- - - - - - - - - - - - - - - - - - - - - - -
+The random oracle can be used to obtain a PRF as follows:
+Theorem: Let l(n)=n/2 and define F_k(x) = H(k || x). Notice that for
+k,x:{0,1}^n we have F_k(x):{0,1}^n. Now, F_k() is a PRF in the
+following sense. Let D be a ppt distinguisher with oracle access to a
+function g():{0,1}^n->{0,1}^n *and* the random function H(). We write
+this as D^{g(),H()}(n). We then have
+| Pr[ D^{F_k(),H()}(n)=1 ] - Pr[ D^{f(),H()}(n)=1 ] | = negl(n)
+with probabilities taken over a random function f(), the key k and the
+random function H() appearing in F_k().
+Proof: As before, we can condition on the queries made to H() by D
+during its computation. So long as these are disjoint from those made
+as part of evaluations of F_k() we are in the situation of the
+previous theorem. But then the probability for such a coincidence is
+negligible by the randomization over k. The details are left as an
+exercise.  QED
+It is now also possible to show that the abovedescribed hybrid
+encryption system is CCA secure; however, for simplicity, we only show
+its CPA security which is already nontrivial. Also, we have not
+seen any provably secure RSA-based cryptosystem so far.
+Theorem: Let Pi0 be a CPA-secure private key system and let
+Pi be the hybrid system as described above. Then for any ppt adversary, which
+is given additional access to the random oracle H(), one has
+Pr[ PubK^eav_{Pi,A^{H()}}(n) = 1 ] = 1/2 + negl(n)
+with probabilities taken over random choice of H(), the random steps
+taken within A, and those in the PubK^eav experiment.
+Proof sketch: For convenience we recall the steps of the experiment.
+1. H() is chosen at random.
+2. (N,e,d) <- GenRSA(n); pk=(N,e); sk=(N,d)
+3. A is given access to pk and H()
+4. A outputs messages m0, m1.
+5. r<-{0,1}^n; k=H(r)
+6. b<-{0,1}
+7. c1<-r^e mod N
+8. c2<-Enc0_k(m_b)
+9. A is given (c1,c2) and returns b'
+10. Outcome is 1 iff b=b'
+The encryption and decryption algorithms that A is given oracle access
+to essentially follow lines 5-9 and are not repeated here.
+Let us call QUERY the event that A queries H() on one of the random
+nonces r used either in line 5 or as part of its queries to the
+encryption oracle.
+The idea is that since---assuming as we do---GenRSA satisfies the RSA
+hardness assumption, the probability of QUERY is negligible.
+This is because we can use A to construct an adversary in the RSA-inv
+experiment as follows. We use A as a subroutine and run it according
+to protocol. We monitor all queries of A to H() and for each one of
+those, call it r_q, check whether r_q^e mod N is correct. If this is
+the case, then we output r_q. By the RSA hardness assumption, this
+adversary can output the correct r only with negligible probability,
+which means that the probability of QUERY must be negligible.
+Once we know that QUERY is a negligible event we can assume without
+loss of generality that it does not occur, i.e., oracle access to H()
+returns random values that are completely unrelated to the experiment.
+We can thus assume that A does not get access to H() at all and then
+the CPA-secure scheme Pi0 is secure by assumption.
+We refer to KL Theorem 13.1 for details and to KL Theorem 13.5 for a
+proof that the hybrid scheme even yields CCA security assuming Pi0 is
+CCA secure.
+We also remark that the deterministic version of RSA used to encrypt
+the symmetric key can be replaced by an arbitrary so-called key
+encapsulation mechanism (KEM), without essentially changing the
+proof. See KL, 2nd ed for details.
diff --git a/ss2017/cryptography/Notes17.txt b/ss2017/cryptography/Notes17.txt
new file mode 100644
index 0000000..6a15827
--- /dev/null
+++ b/ss2017/cryptography/Notes17.txt
@@ -0,0 +1,194 @@
+17 Digital signatures
+---------------------
+Digital signatures are the public key analogues of message
+authentication schemes (MAC). They allow a *signer* to authenticate a
+message with their *private key* by producing a tag that can
+subsequently be verified by anyone who has the corresponding public
+key. Just as with MACs the tags are specific to the message being
+signed. Changing the message in any way should invalidate the tag,
+thus the "signature". Also no-one who is not in the possession of the
+private key corresponding to the public key used for verification
+should be able to produce a valid signature.
+Definition: a signature scheme is a triple Pi=(Gen,Sign,Vrfy) of ppt
+algorithms where
+1. Gen(n) produces a public/private key-pair (pk,sk) 
+2. The signing algorithm Sign takes a secret key and a message m and
+produces a signature sigma<-Sign_sk(m).
+3. The (deterministic) verification algorithm Vrfy takes a public key,
+a message m, and a signature sigma, and answers 1 or 0. If
+sigma<-Sign_sk(m) then Vrfy_pk(m,sigma) must return 1.
+If Sign_pk is defined only for messages of length l(n) for some
+function l() then we speak of a *fixed length* signature scheme.
+The signature scheme Pi=(Gen,Sign,Vrfy) is *secure* if the
+success probability of the following experiment is negligible for any
+ppt adversary A:
+Experiment Sig-forge_{A,Pi}(n):
+1. (pk,sk)<-Gen(n)
+2. A gets n, pk, and oracle access to Sign_sk(). The adversary then
+outputs (sigma,m). Let Q be the sequence of questions to the oracle
+posed so far, i.e. messages for which a signature has been requested.
+3. The outcome of the experiment is 1 iff Vrfy_pk(m,sigma)=1 and m is not in Q.
+End of definition.
+In KL our "secure" is called "existentially unforgeable under an adaptive
+chosen-message attack" to differentiate from other, weaker notions of
+security.
+Insecure signature based on textbook RSA
+- - - - - - - - - - - - - - - - - - - - -
+A naive attempt at defining a signature scheme based on RSA goes as
+follows:
+Gen(n): (N,e,d)<-GenRSA(n);pk=(N,e);sk=(N,d)
+   output (pk,sk)
+Sign_{(N,d)}(m):
+   output m^d mod N
+Vrfy_{(N,e)}(m,sigma):
+   output 1 iff sigma^e = m (mod N)
+Clearly, verification of a valid signature succeeds but the scheme is
+insecure for several reasons. First, if an adversary can choose an
+arbitrary signature sigma first and then fabricate a corresponding
+"message" as m=sigma^e mod N. Notice that Vrfy_{(N,e)}(m,sigma)
+succeeds, so this adversary will win the experiment with certainty. 
+Second, and perhaps more realistically, if the adversary would like to
+get a signature on a previously fixed message m it could request
+signatures sigma1 for random m1 and sigma2 for m2:=m/m1 (mod N). It is
+then clear that sigma1*sigma2 mod N is a valid signature for the
+original m.
+Secure signatures with random oracles
+- - - - - - - - - - - - - - - - - - -
+It is possible, but rather inefficient, to construct secure signature
+schemes based on the RSA assumption and a collision-resistant hash
+function, see KL12.6. In practice, one uses signature schemes based on
+cryptographic hash functions which can only be proved secure in the
+random oracle model. We thus modify our definition of secure signature
+scheme to the effect that both parties now have access to a function
+H:{0,1}^n->{0,1}^n which is selected uniformly at random at the
+beginning of the experiment.
+In this situation we have the following signature scheme RSA-FDH
+("full domain hash").
+RSA-FDH:
+Gen(n): (N,e,d)<-GenRSA(n);pk=(N,e);sk=(N,d)
+   output (pk,sk)
+Sign_{(N,d)}(m):
+   output H(m)^d mod N
+Vrfy_{(N,e)}(m,sigma):
+   output 1 iff sigma^e = H(m) (mod N)
+In practice, the random oracle H() will be replaced by a cryptographic
+hash function such as SHA-2 and the scheme is just like the
+abovedescribed insecure one with the exception that the message is
+hashed prior to signing.
+Let us see how this prevents the previous two attacks. Firstly, if the
+attacker starts with arbitrary sigma then in order to fabricate a
+corresponding message it would have to find m such that sigma^e=H(m)
+(mod N) and thus for a given H()-value which it *can* compute, namely
+sigma^e, find an inverse image, which is infeasible.
+Similarly, in order to perpetrate the other attack it would have to
+fabricate for given m two values x and y so that H(x)*H(y)=H(m).
+While this is infeasible for truly random H() and not possible with
+current knowledge for SHA-2 it is not prevented by mere
+collision-resistance.
+Theorem: If GenRSA() fulfills the RSA assumption then RSA-FDA is
+secure in the random oracle model.
+Proof: Let A be an attacker in the Sig-forge_{A,RSA-FDA}
+experiment. We may fix a polynomial q(n) bounding the queries of A to
+H() and also assume that A first queries H(m) for any message for
+which it requests a signature or which it finally outputs. We also
+assume that A makes exactly q(n) queries. All this is done in order to
+save a few case distinctions in the argument.  We now build from A an
+attacker A' against GenRSA() in the RSA-inv(n) experiment from Chapter
+12 as follows.
+Attacker A' against GenRSA:
+Given N, e, y* = x^e mod N we (assuming the role of A') choose
+j<-{1..q(n)} and forward the public key pk=(N,e) to A.
+When A makes its i-th query m_i to H() we answer as follows:
+If i=j we return y*.  If m_i has been asked previously, i.e., if
+m_i=m_j for some j<i, then we answer consistently using a table to
+store earlier answers.  Otherwise, we choose sigma_i <-Z_N^* uniformly
+at random, answer y_i=sigma_i^e mod N and remember both y_i and
+sigma_i in our table.
+As a result, for all the H()-values that A has access to we know their
+e-th root modulo N.
+When A requests a signature on message m then, since H(m) has been
+queried before, we have stored in our table sigma so that sigma^e=H(m)
+and can thus return that very sigma which clearly is a valid
+signature.
+The only one message for which we do not have an e-th root is the j-th
+one. Should the adversary A request a signature for the latter we
+abort the experiment, i.e. return an arbitrary guess.
+Finally, A will output a pair (m,sigma). If m=m_j and sigma^e=y* (mod
+N) then we can output sigma (and we will win the RSA-inv experiment).
+This concludes the description of the adversary A' against GenRSA().
+We first note that even though we prepare the answers to A's
+H()-queries using a table and case distinctions the resulting function
+looks uniformly random from A's perspective. So, it, i.e. A, will work
+just as in the standard Sign-forge experiment.
+Now, A' wins the RSA-inv experiment if A produces a valid forgery
+(m,sigma) and m = m_j.  If epsilon(n) is A's winning probability in
+the Sig-forge experiment then this happens with probability
+epsilon(n)*1/q(n). But by the RSA hardness assumption the latter
+quantity is negligible and therefore, since q(n) is polynomial in n,
+so is epsilon(n). QED 
+We remark that a similar scheme based on discrete logarithms is known
+as the Schnorr signature. 
+Hash-and-sign
+- - - - - - -
+We note that if the hash function used here is arbitrary length then
+the above scheme allows for the signing of messages of arbitrary
+length. The security proof can be generalised to this, but becomes
+somewhat awkward since arbitrary length random oracles have not been
+properly defined.
+Alternatively, one can turn any secure fixed length signature scheme
+into an arbitrary length one by first hashing the message to be signed
+with an arbitrary length collision-resistant hash function and then
+signing the hash. The proof of security is analogous to the one for
+the Hash-and-MAC paradigm given in Chapter 11. This is known as
+"Hash-and-sign".