Monday, July 2, 2012

How to fail at cryptography

In my last post, I discussed the number 2128 and explained why it’s not possible to brute-force 2128 possible keys.  Does this mean that we can use 128-bit cipher like AES with confidence?  Not quite.  Brute-force against AES with 128-bit or larger keys is impossible with any non-quantum computer we will build for the foreseeable future, but that’s only one avenue of attack.  In practice, cryptosystems are broken in a variety of ways.  Sometimes, the algorithm is flawed.  Other times, the algorithm is sound but the implementation is bad.
This post attempts to explain, at a high level, some of the technical vulnerabilities that exist in real-world cryptosystems.  I hope that it will help developers, IT and security people gain a basic understanding of the difficulties that exist and give them some ideas of what to look for in code reviews, testing, or product selection.  I also hope to make clear why writing your own implementation is usually a bad idea.  For more information, check out the book Cryptography Engineering and Matthew Green’s blog.  For a look at management/business failures, check out Ross Anderson's  Why Cryptosystems Fail.

Too little entropy

Generating a secret, random, 128-bit key requires you to have at least 128-bits of entropy (randomness) that are unknown to any potential attacker.  If your key is based on a password or passphrase, you almost certainly do not have enough entropy.  Using the “correct horse battery staple” method of password selection, you would need to pick at least 12 words at random from a wordlist of 2,000 words or more.  This would produce a passphrase of more than 60 characters.  A randomly selected password could be as short as 22 characters, but it needs to be random: something like “%iYxG(3aPB^” not “P@ssw0rd”.  Good luck remembering that.  

Many implementations generate keys for you, but this may not be any better.  A good implementation will use several potential sources of entropy such as drive seek timings, mouse movements, time between keystrokes, etc.  If a program requires you to type random keys or move your mouse around the screen while it generates keys, this is a good sign.  A bad implementation might only use the current date and time.  For instance, take a look at the salt generation here (salt generation is more forgiving that key generation since the salt doesn’t have to stay secret, but this is still bad).  

What this means in practice is that an attacker does not have to guess your 128-bit key, he only needs to guess the input used to create the key.  For instance, to recover a key based on a password, we only need to know the hashing algorithm that was used to generate the key in order to begin a password guessing attack.  Then, we hash possible passwords and use the hash as a key to try to decrypt a file/message that we know is encrypted with that key.   If we recover a meaningful plaintext (e.g. a readable message, or correctly formatted file) then we’ve found the right password and key.  This attack also applies if the password is used to create a second key that is used to encrypt the user's randomly generated key.

If you must generate keys from passwords, use a key derivation function (KDF).  These functions, e.g. PBKDF2, can also be used as password hashing algorithms and slow down the process of turning a password into a key.  This adds to the difficulty of an attack against the passwords which brings it closer to the difficulty of brute-forcing the keys themselves.  Unfortunately, you probably can’t add more than about 20-bits of difficulty.

To attack a system that generates keys using the system time, we can guess when the key was created and use that time to generate possible keys just as we did with password-based key generation.  If we have information about the key, that’s even better.  There are only 86,400 seconds in a day which gives us about 16 bits of entropy if we know the day that a key was created, but not the time.  If we can only guess the key creation time to within a year’s time-frame, there are about 25 bits of entropy.  If microsecond time resolution is used, that adds 20 more bits of entropy, but most systems don't provide actual microsecond resolution.  You may get 1/10 of a millisecond which is about 13 bits.  That's 38 bits of entropy at best, not even close to the 128-bits of entropy that should be used to generate a key.      


Even if there is enough entropy, a system can still fail if the pseudo-random number generator (PRNG) that converts the entropy pool into a stream of output bits is poorly designed.  For instance a PRNG may reuse output allowing an attacker to discover some of the bits used in someone else's key.  It may also be possible to generate future bits of the output stream of a badly designed PRNG given the current output.  

Most operating systems have an random number generator built in.  Use that if you can.  If not, use an existing library.  Do not design your own algorithm.  NIST has a list of approved algorithms with links to more information.

Poor key storage

Some implementations store keys in the clear alongside the data.  In other situations, the administrator may store the keys where an attacker can access them (e.g. on a file server).  All the attacker has to do is find the key.  Encrypting the key using another static key or some sort of reversible “encryption” won’t help.

Transparent Data Encryption

Many databases use “transparent data encryption” (TDE).  The purpose of TDE is to “prevent unauthorized access to the data by restoring files to another server.”  This does not protect data from legitimate users.  It only protects stored data from an attacker who is able to steal a backup tape but not able to gain access to a live system.  This isn’t necessarily bad, but you have to understand what TDE does and does not do.

If an attacker is able to get access to a legitimate account, TDE no longer protects against him.

Wrong mode of operation

Block ciphers typically encrypt block of 8 or 16 bytes at a time.  How this is handled depends on the mode of operation.  In electronic codebook (ECB) mode, every block is encrypted independently.  This leads to attacks where one block of ciphertext can be replaced with another by an attacker, where an attacker can supply his own messages for encryption, and where the lack of uniqueness in the plaintext data leads to information leaks.  

Block swapping attacks are useful when the format of the data is known to the attacker even if the specific content is not.  For instance, an attacker who can intercept a wire transfer might replace the block containing the destination account number with a ciphertext block from a previous transmission that the attacker knows contains the number of an account he controls.  This is partially fixed by using a mode of operation such as cipher-block chaining (CBC) where the result of one round of encryption affects the next.  This prevents the attacker from naively swapping blocks of ciphertext around, but it is only part of the solution; some active attacks are still possible with CBC mode.  Why should we let an attacker tamper with messages at all?

If we let an attacker tamper with messages, bad thing happen.  For example, CBC-PAD (that is, CBC mode with padding) is subject to a Padding Oracle Attack where an attacker can figure out the plaintext content of a block by repeatedly changing the previous block of ciphertext and submitting it to a system to see if it results in a message with valid padding.  The attacker does not need to see the decryption by the system; it’s sufficient for the system to return an error when the padding is invalid.  To avoid this, we should use an authenticated mode of operation such as GCM or use a message authentication code (MAC) such as HMAC on our ciphertext whenever active attacks are possible.  

Broken algorithms

Sometimes, an algorithm is just broken.  In cryptography, an algorithm is broken if it can be defeated with less than brute force effort.  This causes some confusion because many of the published attacks are not practical or only work against some weakened variant of the real algorithm.  Does it matter if I can break 128-bit AES with 2126 effort?  Not really.  The sun will still burn out long before I finish; it’s completely impractical.  The problem is that attacks get better over time.

The hash algorithm MD5 is a good example of this.  MD5 was released in 1992 to improve on the security of MD4 by adding another round of compression.  It was (or maybe still is) the most popular cryptographic hash function in use.  Weaknesses started to appear after only a few years.  The initial attacks were not practical breaks, but by 1996 cryptographers suggested that we might need to move away from MD5.  Over time, the attacks kept getting better.  In 2005, a team of researchers constructed two different digital certificates with the same hash.  In 2008, researchers improved on this by demonstrating a new way to create collisions with fewer restraints.  All the while, people kept using MD5 even though better alternatives such as SHA-1 and later SHA-256 and SHA-512 were available.

In 2012, the Flame malware was discovered.  Flame used a collision attack on MD5 to sign its own code with a forged certificate from Microsoft.  Why was Microsoft still using MD5? 

Check the current standards/recommendations (from NIST if you're in the U.S.) and re-evaluate your choice every few years to see if the algorithms you picked are still what you should be using.  You can also check the status of an algorithm on Wikipedia.  Most current algorithms have a section on their page that lists currently known attacks, e.g. AES.  If you see attacks listed that sound anywhere close to practical, find out if there are better alternatives; you're probably already behind the curve.
Side-channel attacks

Side channel attacks are based on physical or timing information rather than the contents of a message or channel.  An attacker can, for instance, determine a key by monitoring the power usage of a device or by analyzing the time required to perform an encryption or decryption.  Some attacks require local access.  Others, such as power analysis require physical access.
Some of the problems with timing attacks can be solved using the existing libraries, but if you need tamper resistance you should hire a consultant with expertise in the area.  Like these guys

1 comment:

  1. I updated the numbers in the entropy section dealing with microsecond time resolution. Originally, I wrote microsecond but did the math for milliseconds.