Throughout the notes the acronym KL stands for the book "Introduction to Modern Cryptography" by Jonathan Katz and Yehuda Lindell, Chapman & Hall, 2008. Set braces {} may be used to enclose a subscript consisting of more than one letter. E.g x_{10} is "x sub ten". Sometimes we omit the underscore vor subscripting and write x1 or xi for x_1 and x_i. 1 Overview, Historic Ciphers: Caesar, Vigenere, (KL1.3) --------------------------------------------------------- The Shift Cipher - - - - - - - - - Identify the 26 upper case alphabet letters with numbers 0..25, e.g. A=0, B=1. Choose an integer k:{0..25} as "key" and encrypt a letter x as x+k mod 26. Encrypt a text as the sequence of the encryption of its letter. For example, if k=3 then the text BEGINTHEATTACKNOW is encrypted as EHJLQWKHDWWDFNQRZ. The decryption of a letter x is just x-k mod 26 and the decryption of a text is obtained by applying this procedure letterwise. The special case k=3 is known as "Caesar Cipher" since purportedly Julias Caesar used it to encrypt some of his letters. The special case k=13 is known as ROT-13. There are two simple attacks on a shift cipher. The first one uses the fact that there are only 26 possible keys so one can try them all out. This, however, requires that one can recognize the right plaintext, e.g. because it is correct English. The second is based on frequencies of letters. In a standard English text the letters occur with typical frequencies. E.g. E is the most frequent letter with likelihood 12.7%. Then follows T with 9.0% and A with 8.2% etc. Thus, we can count the relative frequencies of the letters in the ciphertext and thus work out the key. An improved and more systematic attack based on frequencies goes as follows: Let p_i stand for the frequency of letter i, e.g. p_4 = 12.7% since E=4. One has sum_{i=0..25} p_i^2 =~= 0.065 So, if q_i is the frequency of the letter i *in the ciphertext* then we expect q_{i+k} to be close to p_i so that we can recover k by choosing j in such a way that the following quantity I_j is as close to 0.065 as possible. I_j := sum_{i=0..25} p_i * q_{i+j mod 26} The Vigenere Cipher - - - - - - - - - - The shift cipher suffers from two problems: 1) the number of possible keys is too small 2) the frequency distribution of letters in the plain text shows in the ciphertext A famous historic cipher not suffering from the first problem is the Vigenere cipher. Here one selects a word or a short text as key, writes the key repeatedly under the plaintext and then "adds" letters one above another (as numbers mod 26). Here is an example with key BEADS: THEMANANDTHEWOMANRETRIEVEDTHELETTERFROMTHEPOSTOFFICE BEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBEADSBE ULEPSOENGLIIWREBRRHLSMEYWEXHHDFXTHJGVOPLIIPRKUSFIADI Alas, the frequency distribution of the plaintext still seeps through and we can use this information to break the cipher without knowing the key: If we know the length of the key, e.g. 5, and the ciphertext is x_0 x_1 x_2 ... then we can apply the previous frequency analysis (with the I_j scores) to the 5 texts x_0 x_5 x_10 ... x_1 x_6 x_11 ... ... x_4 x_9 x_14 ... all of which are obtained by encrypting the corresponding plaintext fragments with a shift cipher. Therefore, their frequency spectrum should be a shift of the standard one and the corresponding shift, hence the individual letters of the key, can be retrieved. In order to get the key length we can follow a similar approach. For a given shift t let q_i be the relative frequency of letter i (e.g. E when i=4) in the sequence x_t, x_2t, x_3t. If the shift t is the actual key length then this subsequence will have the frequency spectrum as English plaintexts just shifted by t so we expect that J_t := sum_{i=0..25} q_i^2 =~= 0.065 since the summation is invariant under the any shift. If, however, the shift was the wrong one then we expect the letters to appear completely random so that in that case the above sum is rather close to 1/26 = 0.038. Thus, we compute J_t for t=1,2,3,... and pick the t for which J_t is largest (and close to 0.065). Other historic ciphers include monoalphabetic substitution (key = arbitrary permutation of the 26 letters), Grand Chiffre, the German Enigma, etc. All these historic ciphers have been broken. See Wikipedia and other online resources.