1.2. CODE BREAKING

The Enigma Machine (Disclaimer)

One of the first uses of computers dealt with text processing: breaking cryptographic codes during World War II. You are probably familiar with substitution codes where each letter is replaced by another, for example:

THE DOG ATE MY HOMEWORK
ZRP OTQ LZP GD RTGPJTXU

Such codes are easy to break by looking at how often letters appear, so that, in this example, you may guess that "P" may correspond to "E". The correspondence between letters and their encrypted form is called an encoding table. Figure 1.2.1 shows parts of several such tables.

We can make a code harder to break by using several tables and select a different table for each position in the text. If we change the table after each letter and we use the tables of Figure 1.2.1 the word ERROR will be encoded as TTMZS, taking E from table (a), the first R from table (b), the second R from table (c), etc.

A => G
B => S
...
E => T
...
O => F
...
R => K

A => M
B => Z
...
E => F
...
O => L
...
R => T
A => Q
B => E
...
E => S
...
O => A
...
R => M
A => J
B => R
...
E => V
...
O => Z
...
R => A
A => N
B => F
...
E => Q
...
O => B
...
R => S
(a)
(b)
(c)
(d)
(e)
(f)
Figure 1.2.1: Encoding using several tables.

Figure 1.2.1f shows a possible way to implement such a method by using two wheels on the same axis and each having "openings" corresponding to letters. In one of the wheels the openings are connected to the keyboard and on the other wheel are connected to a board of letters with lights. We may add a mechanical way to rotate one of the wheels when we type a letter so we are going to use a different table for each letter. For the table of Figure 1.2.1a, when you type E, the light next to T goes on. But when you type the next letter, say R, the light next to T will go on again because the table of Figure 1.2.1b will be in effect.

As long as both parties have similar devices, they can communicate without a problem and it will be very hard for a third party to figure out their encryption strategy. The German Enigma machine was based on this principle although it implemented a more sophisticate scheme than the one described. The machine was used during World War II for wireless communications and the Germans were confident that no one could figure their encryption strategy.

One way to try to break any code is to find a part of the message where we know the non-encrypted (plaintext) version and from that try to guess the encoding scheme. Can you guess the scheme for the text below?

THE DOG ATE MY HOMEWORK
YNL MYR NHT NA LTSLEXBV

Can you guess the encoding scheme? (if not, click here or turn this page upside down).

Therefore the key step to decryption is to find the plaintext version of a piece of a message. One way is to try all possible combinations of words and see which one of them suggests a plausible encoding scheme. (Note that in a real encrypted message there will be no visible spaces.) How many combinations do we need to try? A natural language has over 100,000 words [2] but we may wish to consider only those likely to appear in a message, so let us say we will try 1,000 words. If we need a sentence of ten words to try to infer the encoding scheme we need to consider 100010 possible sentences. Not all sequences will make sense so let us say that on each position we need consider only 10 possible words, so that the number of sentences is 1010 or 1 followed by 10 zeros, that is 10 billion. (The fact that we can get a huge number out of multiplying a relatively small number by itself a few times is called combinatorial explosion.) To find a human yardstick for a number we start with the number of seconds in a year.

A Number to Remember: There are 3600 seconds in an hour or 86,400 seconds in a day or 31,536,000 seconds in a year. We can write the last number approximately as 3X107and we will use it several times in this document as a humanly meaningful yardstick.

A human looking at one possible sequence per second would need 1010 divided by 3X107 or about 300 years (without stopping to eat or sleep). Allowing for some sleep the time would be closer to 500 years. Assembling a team of 500 people for the task would still require a full year. Therefore the Germans may have been justified in thinking that the Enigma code could not be broken (they changed the settings of the machine every few months). But suppose we have a device that can look at a 1,000 strings a second rather than only one. Then we would need only 107 seconds or about four months! By using a few such devices we could reduce the time to days. Suddenly, the "unbreakable" code can be broken.

This is, roughly, what happened during World War II when Alan Turing supervised the building of a computer that was used to break the Enigma code. It took a few months to break the code the first time and because the Germans changed periodically their settings, the process had to be repeated. This way the British were able to decipher the orders that the German command gave to its U-boats in the Atlantic. As a result convoys bringing supplies to Britain from the United States would be re-routed away from the areas patrolled by the U-boats. This had a significant effect on the war effort [1].

While the above account is an oversimplified version of the code breaking story, it conveys its essence. There was one unexpected gift to the British. The Germans often started their messages with the phrase "I have the honor to inform your excellency", and that helped the initial search.

The Germans knew that the British had found the location of the U-boats because the U-boats could find no convoys to attack. But because they were sure the Enigma code was unbreakable, they thought that spies around the U-boat ports were finding the information from the crews.

We may be tempted to score:

Computers: 1 — Humans: 0

but this would not be fair. First, the computer was designed and programmed by humans. Second, the computers were not smarter than humans. They could do very fast a lot of routine operations.

What Happened to Turing

You would think that at the end of the war Turing would be showered with honors. Not only he had made a fundamental theoretical contribution to the theory of computation, he had also applied his insights into actually building a machine that contributed significantly to the war effort. Unfortunately, Turing was gay and he got in trouble with the law because homosexuality was illegal in Britain at that time. He was hounded and committed suicide in 1954 (at age 41). In 1966 the Association for Computing Machinery (ACM) established the Turing Award that is given annually for outstanding contributions to the field of computer technology. It is considered the equivalent of the Nobel prize for Computing.

Notes

[1] This section has made use of the following excellent biography of Turing:
     Andrew Hodges, Alan Turing: the enigma, Touchtone, 1983

[2] According to http://www.oxforddictionaries.com/page/howmanywords English has over
     175,000 words.

Back to Contents --- Previous Section --- Next Section