Randomness
Part of
22C:169, Computer Security Notes

A classic encryption method involves what is called a onetime pad, sometimes called a Vernam Mauborgne cypher. The modern history of onetime pads begins in the 20th century when Joseph Mauborgne recognized that Gilbert Vernam's cryptographic machine produced cyphertext that was truly unbreakable if the key was truly random. This is actually a reinvention  Steven Bellovin has recently discovered that a California banker named Frank Miller published the same basic idea in his Telegraphic Code to Insure Privacy and Security in the Transmission of Telegrams in 1882. Variants of this idea may be older, since book cyphers where you never reuse any page of the book achieve a similar effect.
In a book cypher, the key is some book that both the sender and recipient of the message happen to own. Various editions of the Bible are perennially popular here. To send a message, you use some page of the book, where the sender and receiver have some way of agreeing on the page to use. A typical scheme is to pick the page depending on the date  for example, for messages encrypted on the 53rd day of the year, use page 53.
In addition to agreeing on a starting page, the sender and reciever need to agree on how to use that page. The oldest book cyphers were actually codes (in that words were transmitted, not letters) in which a search for the word to be sent was made by looking forward from the designated starting point.
So, if this sentence was being used, the message "sentence used twice" would be encoded as 4712. If we have a code wheel, as used for classic Caesar cyphers, we could as easily use the consecutive letters on the page, starting from the first letter, as keys to set the code wheel. This is called a running key cypher.
ABCDEFGHIJKLMNOPQRSTUVWXYZ Key = S SRQPONMLKJIHGFEDCBAZYXWVUT Plaintext = S ^ Cyphertext = A ABCDEFGHIJKLMNOPQRSTUVWXYZ Key = O ONMLKJIHGFEDCBAZYXWVUTSRQP Plaintext = E ^ ^ Cyphertext = K ABCDEFGHIJKLMNOPQRSTUVWXYZ Key = I IHGFEDCBAZYXWVUTSRQPONMLKJ Plaintext = N ^ ^ Cyphertext = V ABCDEFGHIJKLMNOPQRSTUVWXYZ Key = F FEDCBAZYXWVUTSRQPONMLKJIHG Plaintext = T ^ ^ Cyphertext = M
In modern runningkey cyphers, we use an exclusive or on the bit patterns representing each character with the bit pattern of the key instead of using a code wheel. That is what Vernam's machine was the first to do.
A cryptanalyst who suspects that a book cypher is being used will, of course, begin by trying to figure out what book might be in use. This is why common bible editions are such a bad bet. The cryptanalyst will also need to work out how the pages of the book are being used.
Running key cyphers are subject to cryptanalytic attack if the key is not random. Knowing that the key is an English text, for example, and knowing that e is the most common letter, then English text encrypted with this key will encrypt e with e more frequently than any other encryption. As a result, we can use letter frequency distributions to attack such a code.
If the same page of the book is ever reused, whether for is a running key cypher or a classic wordbased book code, frequency distributions of common plaintext letters or words will begin to reveal the structure of the text. So, book codes where you use one page of the book for every message sent in that day are weak. Furthermore, once you begin to discover properties of the encrypted text, you can also begin to discover properties of the text used as a key, and that helps you hunt up the right book in the library.
Running key cyphers where the key is never reused and is entirely random are unbreakable using cryptanalytic attacks. This leads naturally to the idea of printing random text in the book instead of fixed text, and destroying each page of the book as it is used. That is how a onetime pad works. Nowdays, onetime pads are distributed on media such as CDRW disks instead of paper  RW so that your crypto software can erase each segment of the key as it is used.
Generating a onetime pad by hand is entirely possible. Just use a roulette wheel, dice or coin tosses to generate random numbers, and take down two identical copies of the sequence of numbers generated. Carbon paper can be used to avoid making clerical errors in the result.
Automatic generation of streams of random numbers is not easy! We usually build machines to have predictable behavior. Most genuinely effective hardware random number generators rely on measuring quantum phenomona, thermal noise or radioactive decay. On computers, we can also use arrival times of network packets and key presses as effectively random events, although we can only extract a few bits of randomness from each such event, since the average rate of events is extremely predictable  the randomness is in the deviation of each event from the average rate.
It is very tempting to use algorithmic methods to generate random numbers. The problem with this, of course, is that the very fact that something is computed by an algorithm implies an entirely deterministic process. We call an algorithm that generates apparently random numbers a pseudorandom number generator, and we refer to the output as pseudorandom to emphasize that it is actually predictable.
"The generation of random numbers is too important to be left to chance." This famous warning about the difficulty of designing pseudorandom number generators is attributed to Robert R. Coveyou of Oak Ridge National Laboratory. He's right.
Most pseudorandom number generators are not very good.
In general, a pseudorandom number generator with a finite internal state must generate a periodic sequence. For an n bit state, the period can be no longer than 2^{n}. Unfortunately, for many algorithms, the period is much shorter. Thus, if the sequence of outputs is the vector V v_{i} = v_{i+p}, where p is the period of the generator.
We say that the seed for a pseudorandom number generator is the initial value v_{0} from which the sequence is computed.
Linear congruential pseudorandom number generators are the most common family of generators, but even the best of these are a bad choice for cryptography.
v_{i+1} = (a × v_{i} + b) mod m
For example:
v_{i+1} = 65539 × v_{i} mod 2^{31}  the infamous RANDU
v_{i+1} = 16807 × v_{i} mod (2^{32}1)  Park and Miller's minimal standard
Bad selection of the constants a, b and m can lead to very bad generators. Such generators are unfortunately common (the worst offender is an ancient generator known as RANDU which, unfortunately, made it into the Unix/C world as rand. The good and bad linear congruential generators use essentially the same algorithm. All that differ are the "magic numbers" used. If you find such a generator being used, you can find essentially all of the research done on its performance by doing a web search on these numbers, since the combination of values is unlikely to be used anywhere else.
Nobody should ever use a linear congruential generator for cryptography. Cryptographically secure pseudorandom number generators exist, and they have significant applications. Note the following properties of an ideal pseudorandom number generator:
It is noteworthy that a good encryption function should have the exact same properties for any particular encryption key! If the encryption function does not map the entire domain to the entire domain, it is not useful. If the mapping creates multiple small cycles instead of one big cycle, then the job of the codebreaker is simplified. Therefore, good encryption functions ought to be closely related to good pseudorandom number generators!
If we take the output of a pseudorandom number generator and exclusive or it with our text to encrypt or decrypt it, then the seed used for pseudorandom number generation is our encryption key.
Note that if we use an encryption function as a pseudorandom number generator, we have now demonstrated a new way to use that function for encryption!
The Wikipedia is a good source:
For onetime pads, see: http://en.wikipedia.org/wiki/Onetime_pad
For book cyphers, see: http://en.wikipedia.org/wiki/Book_cipher
For runningkey cyphers, see: http://en.wikipedia.org/wiki/Running_key_cipher
For hardware random number generation, see: http://en.wikipedia.org/wiki/Hardware_random_number_generator
For pseudorandom number generation, see: http://en.wikipedia.org/wiki/Pseudorandom_number_generator
For cryptographically secure pseudorandom number generation, see: http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator