Diffie Hellman Key Exchange
Part of
22C:169, Computer Security Notes

Conventional or symmetric key cryptography rests on the use of a secret key that is known both to the sender and receiver of the message and nobody else.
cyphertext = encrypt( plaintext, key )
plaintext = decrypt( cyphertext, key )
Public key or assymetric key cryptography emerged in the 1970's, and it represents a new model. Instead of a single key, there are two keys, one for encryption and one for decryption.
cyphertext = encrypt( plaintext, key_{1} )
plaintext = decrypt( cyphertext, key_{2} )
To be interesting, it must be practically impossible to compute one key from the other, so given key_{2}, the only way to find key_{1} should be trial and error, and visa versa.
It's easy to outline the idea, as above, but hard to come up with algorithms that implement this idea. DiffieHellman key exchange was the first scheme to demonstrate that something like this was possible. This is a technique allowing two users of a public communications channel to synthesize a shared secret key without allowing an eavesdropper to know the key. Generally, symmetric key cyphers are computationally easier to use, so assymetric key techniques are frequently used for key exchange, to this day.
DiffieHellman key exchange allowing Alice and Bob to share a key works as follows:
Step 1: Alice and Bob agree on a large prime number p and an integer g that is a primitive root modulo p  that is, an integer less than p such that every value from 0 to p1 is equal to g^{i} mod p for some value of i. (The general definiton of primitive root is messier, because it allows for a nonprime modulus). They disclose, in public, the values of p and g, so it doesn't matter who first proposes these values.
Step 2: Alice and Bob each select secret integers less than p. They never disclose these integers. Alice selects a and Bob selects b. These integers determine the key they will end up sharing.
Step 3: Alice tells Bob A = g^{a} mod p and Bob tells Alice B = g^{b} mod p; A and B are disclosed in public. Note that all a member of the public has to do in order to reconstruct, for example, a from A is to invert the function f(x) = g^{x} mod p. We rely on the fact that inverting this function is hard  this is a trapdoor function.
Alice and Bob now compute their shared secret key. Alice computes it as K = B^{a} mod p while Bob computes it as K = A^{b} mod p
In effect, a and b are the secret keys used in this exchange, while A and B are the public keys.
The proof is straightforward:
B^{a} mod p ?=? A^{b} mod p
Substitute definitions of A and B:
(g^{b} mod p)^{a} mod p ?=? (g^{a} mod p)^{b} mod p
Note that modular arithmetic can be simplified:
g^{ba} mod p = g^{ab} mod p
The equality is proven because multiplication is commutative.
This is how ssh (the Unix secure shell protocol) works. The client uses /dev/rand to pick its random number (a hardware random number generator), and the serveer uses either /dev/rand or a cryptographically secure pseudorandom number generator to start the process. The use of pseudorandom numbers on the server side may be required because a busy server may need to generate many keys per second in order to handle multiple clients, but the hardware random number generators of most Unix systems can only generate a few tens of bits per second  they use things like the least significant bits of the realtime clock at the time of events like keypresses, arrival of internet packets and completion of disk transfers as a source of randomness.
The wikipedia articles are good: