However, as with many other technologies (e.g. medicine <-> bioterrorism; nuclear power <-> nuclear weapons; etc.), encryption can be used by criminals just as easily as by anyone else. In this lecture we look at how encryption creates a serious impediment to the forensic analysis of retrieved information.
A ciphertext is the output of an encryption algorithm after you input some plaintext.
A very simple example of how we turn a plaintext message into a ciphertext is the Caesar cipher, purportedly invented by Julius Caesar to keep his message to his generals secret from his enemies. This cipher involves shifting every letter in the message "forward" by three places in the alphabet. For letters at the very end of the alphabet, you go back to the beginning.
a -> D
b -> E
c -> F
w -> Z
x -> A
y -> B
z -> C
So for example if we want to encrypt "hello world" we would get the ciphertext "KHOOR ZRUOG"
To decrypt a message with the key is easy - all you have to do is apply the algorithm with the appropriate decryption key. In the above example, the key is 3, because 3 is what we add to every letter to get the ciphertext, so to get the plaintext back again, we subtract 3 from every letter.
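The shift cipher described above can be sketched in a few lines of Python (a minimal illustration; the function names are ours, not from any library):

```python
def shift(text, key):
    """Shift every letter 'key' places through the alphabet, wrapping z -> a."""
    out = []
    for ch in text.lower():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord('a') + key) % 26 + ord('a')))
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return ''.join(out)

def encrypt(plaintext, key=3):
    return shift(plaintext, key).upper()   # ciphertext in capitals, by convention

def decrypt(ciphertext, key=3):
    return shift(ciphertext, -key)         # subtract the key to recover the plaintext

print(encrypt("hello world"))   # -> KHOOR ZRUOG
print(decrypt("KHOOR ZRUOG"))   # -> hello world
```

Note that encryption and decryption use the same machinery; the decryption key is simply the negative of the encryption key.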
However, to decipher the message without the key requires a little more thought. We need to know, firstly, what encryption algorithm is being used and, secondly, what the decryption key is.
It is generally assumed (following the so-called Kerckhoffs's principle) that the algorithm used to encrypt a message is known to the cryptanalyst. It is also obvious that the ciphertext is going to be known to the cryptanalyst. Hence the only real secrets are the plaintext and the encrypting/decrypting key(s). The science of recovering the plaintext and/or the key is called cryptanalysis.
In every single case, it is theoretically possible to find the key to decrypt a ciphertext. Note however that we say theoretically possible - to actually find an encryption key in practice is designed to be as difficult as possible.
So, how is it always theoretically possible to find a decryption key?
The most obvious way to find a key is a brute-force search: try every possible key and see which one generates a sensible message. This method can be used against every encryption technique. In the above example, if we knew we were up against a shift cipher, we would have to work out what the shift was (in this case, 3). So we would go through all the possibilities (26 in all, counting the trivial shift of 0) until we found the right one.
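A brute-force attack on the shift cipher is short enough to write out in full: try all 26 keys and let a human (or a dictionary check) pick the candidate that reads as sensible English.

```python
def unshift(ciphertext, key):
    """Shift every letter 'key' places backwards, i.e. attempt a decryption."""
    return ''.join(
        chr((ord(c) - ord('a') - key) % 26 + ord('a')) if c.isalpha() else c
        for c in ciphertext.lower()
    )

ciphertext = "KHOOR ZRUOG"
for key in range(26):                  # the entire key space is only 26 keys
    print(key, unshift(ciphertext, key))
# key 3 yields "hello world"
```

With a key space this small, the search is instantaneous; the next sections explain why modern ciphers make the same attack infeasible.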
So, given it is possible to find the key, how does encryption make it difficult to do so?
Encryption algorithms aim to make it computationally infeasible either to find the decryption key or to recover the plaintext message without it. This is done by having such an enormous number of possible keys that it would take far too long to try them all, with current technology and processing speeds, or at least with a desktop computer. Of course, it is theoretically possible that you could still guess the key, say if you were extremely lucky and guessed it first time. However, the probability of this happening is vanishingly small, and it is not considered a threat to modern encryption.
(A related approach is to make it not worth the effort to decrypt a ciphertext. For example, I was a consultant for a Cambridge telecoms company developing a hierarchical encryption system. The lowest-level encryption was actually relatively simple and could be broken with enough desktop computing time; however, the benefit of doing so was very small, as it only decrypted a very small amount of data, which it would probably be more efficient to purchase legitimately. Note, however, that this approach does not address the technical difficulty but rather the motivation for trying to cryptanalyse a ciphertext.)
So encryption algorithms generally rely on having a vast number of possible keys, which is determined by the number of bits in the key (unless you use a passphrase to generate it!).
So generally, encryption algorithms cannot guarantee that nobody can decipher the ciphertext without having the key, but we can give a reasonably accurate prediction of how long it would probably take someone to guess the key, using complexity theory.
For example, if we did a brute-force search to guess the key for an algorithm where the key is n bits in length, it would take us 2^n guesses to try all the possibilities (note that it is of course possible to guess the key well before trying all of them; on average, about half that many guesses suffice). However, if the person using the encryption decided to use a bigger key and added a single bit to the key length, we would immediately have twice as many possible keys to try: brute force is an O(2^n) algorithm. As you can see, a small increase in key length results in a massive increase (a doubling) in the number of possible keys, making the cryptanalyst's job much harder.
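The exponential growth is easy to make concrete. In the sketch below, the guesses-per-second figure is an illustrative assumption, not a benchmark of any real machine:

```python
guesses_per_second = 10 ** 9        # assumed: one billion keys tried per second

for n in (40, 56, 128):
    keys = 2 ** n                   # size of an n-bit key space
    seconds = keys / guesses_per_second
    years = seconds / (60 * 60 * 24 * 365)
    print(f"{n}-bit key: 2^{n} keys, ~{years:.2e} years to try them all")

# Adding one bit doubles the key space:
print(2 ** 57 // 2 ** 56)           # -> 2
```

At this assumed speed, a 40-bit key falls in minutes while a 128-bit key would take on the order of 10^22 years, which is why key length matters so much.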
A brute-force search is not the only way of finding the secret key, although it is the one attack that applies to every encryption algorithm (for the one-time pad, though, it is ineffectual, since every possible plaintext of the right length corresponds to some key). An alternative is to look at the way the algorithm is defined, because sometimes there are characteristics that allow one (theoretically at least) to derive the decrypting key from the known information.
An example is the RSA encryption system where there are two keys, the public key and the private key, which are mathematically related. A person's public key is published and anyone wanting to send a secret message will use the public key to encrypt the message for that person. However, the public and private keys are mathematically related, and it is theoretically possible to derive the private (decrypting) key from the public key.
The keys in RSA are derived by choosing two secret primes p and q and multiplying them together to get N, then calculating phi(N) = (p - 1)(q - 1), the Euler totient function of N. We choose a public key e, then calculate the multiplicative inverse of e modulo phi(N), and this is our decrypting key d. The encrypting and decrypting algorithms are the same: raise the input to the power of the key, modulo N.
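The key-generation steps above can be walked through with deliberately tiny primes (real keys use primes hundreds of digits long; the numbers here are purely illustrative, and `pow(e, -1, phi)` for the modular inverse needs Python 3.8+):

```python
p, q = 61, 53                   # the two secret primes
N = p * q                       # 3233, the public modulus
phi = (p - 1) * (q - 1)         # Euler totient of N: 3120
e = 17                          # public (encrypting) exponent, coprime to phi
d = pow(e, -1, phi)             # private (decrypting) exponent: 2753

m = 65                          # a message, encoded as a number smaller than N
c = pow(m, e, N)                # encrypt: m^e mod N
assert pow(c, d, N) == m        # decrypt: c^d mod N recovers the message
print(N, e, d, c)
```

Both operations really are the same algorithm (modular exponentiation); only the exponent differs.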
Now, if we wanted to cryptanalyse RSA, we would try to calculate the decrypting key d from the public information e and N. To do this, we need to know p or q, which lets us find phi(N), which in turn lets us calculate d from e. This is where complexity hinders us: to find p or q from N (which is p x q, remember), we need to factorise N, yet there are no known efficient classical algorithms for factorising large numbers; the best known run in subexponential time.
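For the toy key above, the attack actually works, because N is tiny enough to factorise by brute-force trial division. For a 2048-bit N, the same loop would not finish in any realistic time:

```python
def smallest_factor(N):
    """Trial division: return the smallest prime factor of N."""
    f = 2
    while f * f <= N:
        if N % f == 0:
            return f
        f += 1
    return N                    # N itself is prime

N, e = 3233, 17                 # the public key only
p = smallest_factor(N)          # 53
q = N // p                      # 61
phi = (p - 1) * (q - 1)         # 3120
d = pow(e, -1, phi)             # the recovered private key: 2753
print(p, q, d)
```

This makes the point concrete: the secrecy of d rests entirely on the infeasibility of that factorisation step.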
Note that this applies only to ordinary digital computers: there is in fact a fast factorisation algorithm, developed by Peter Shor, that runs in polynomial time, i.e. is relatively efficient. The drawback is that it requires a quantum computer, and so far these are very much at the experimental stage. However, if a large quantum computer is ever built, RSA may no longer be secure. See the Wikipedia article on this algorithm.
These days, most people use encryption algorithms where finding the key, or calculating the decryption key from other information, is computationally infeasible. These are either secret-key systems like Twofish, which have very large keys used in very complex ways, or public-key systems like RSA and the discrete-logarithm systems, whose security is based on the difficulty of finding logarithms in finite fields.
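The asymmetry behind discrete-logarithm systems can be shown in miniature: modular exponentiation is fast (one line, via square-and-multiply), but inverting it, i.e. recovering the exponent, has no known efficient classical algorithm. The parameters below are illustrative toys, not from any real system:

```python
p, g = 1019, 2                  # a tiny prime modulus and base, for demonstration only
x = 123                         # the secret exponent
y = pow(g, x, p)                # forward direction: fast even for 2048-bit numbers

# The generic way to invert this is exhaustive search, which scales with p:
recovered = next(k for k in range(p) if pow(g, k, p) == y)
print(y, recovered)             # recovered == 123
```

With p this small the search is trivial, but for a real 2048-bit modulus the same search (and every known classical improvement on it) is hopeless, which is exactly the gap these cryptosystems exploit.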
There is a famous quote from one of the most reputable cryptologists working today, Hendrik Lenstra, who said:
"Suppose that the cleaning lady gives p and q by mistake to the garbage collector, but that the product pq is saved. How to recover p and q? It must be felt as a defeat for mathematics that the most promising techniques are searching the garbage dump and applying memo-hypnotic techniques." (from Ian Stewart's book, previewed at Google)
Cryptography is used in many ways that pose problems for forensic analysts.
What this means is that forensics may only catch the less smart or well-resourced criminals. Those who either have some knowledge or hire experts will be able to use encryption well enough to hide their activities.
This leaves governments and agencies with the need to deal with encryption by other means than technological. Usually they opt for a legislative approach, with varying success.
Given that the secrecy of strong encryption algorithms cannot be guaranteed, governments have attempted to legislate aspects of their use, so that users of encryption are in breach of the law if they use strong encryption or do not hand over the key to a government agency upon request. This has not been entirely successful, as the following cases show:
DES was developed by IBM from its Lucifer cipher, which was designed with 64- and 128-bit keys; the standardised DES, however, had an effective key length of only 56 bits. DES has now been retired and, after an open competition, a new government-approved standard, the Advanced Encryption Standard (AES), has replaced it.
"We think that markets, not governments, should be the primary determinants of technology solutions", which is somewhat reminiscent of the storyline of the 1953 classic sci-fi novel The Space Merchants.