Password Hashing Done Right
After reading a very informative piece by Coda Hale about the proper way to store a password, I thought i’d provide a bit of information on the topic to help drive his point home.
There are at least two important things to think about when generating and storing passwords for any application. The first is general password strength, and the other is how you store that password.
Strong Passwords
The first thing to consider when dealing with passwords is enforcing some general measure of password strength. All of the salting and encryption in the world won’t do you any good if you have users with passwords that are too short, easily guessable, etc. By enforcing a reasonable minimum length, and some rules for the composition of the password, you can introduce your first line of defence against brute force and/or dictionary attacks.
My personal stance is that the most important rule to enforce is password length. The simple act of adding a character to a password makes a brute force attack take an order of magnitude longer. 8 characters takes longer to brute force than 7, and 9 takes longer than 8, etc. I often default to 8 characters, so that’s the example I will use here.
For the purpose of this example, I’m going to build on the assumption that all of my passwords are made up of the following characters:
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789
!@#$%^&*()_+-=?
It is important to note that you may have more available characters, and you may have less. I’m going to base my example around these 77 characters. Again, the more characters available, the stronger a potential password is.
Your passwords are now made up of a minimum of 8 characters with 77 possible characters in each position. To calculate the number of total possibilities for this schema, the math is simple. Number of characters raised to the power of length of password, or more simply:
77 * 77 * 77 * 77 * 77 * 77 * 77 * 77 = 1,235,736,291,547,681
In plain english, that’s 1.2 quadrillion. Seems like a lot, doesn’t it? If you were a human trying to guess the password through brute force 00000000, 00000001, 00000002 etc and you could type passwords at a rate of 1 per second, it would take you just under 40,000,000 years (without stopping for coffee or bathroom breaks!)
If your computer, on the other hand, was trying to guess your password by means of brute force, it would take substantially less time. My laptop, as of May 2011, can generate roughly 1,300,000 guesses per second, and that’s based on unoptimized node.js code running on an already busy macbook pro. As Coda pointed out , there are methods available currently (for around $2,000 USD) that will allow for the generation of 5,600,000,000 hashes (not just attempts, full hashes) per second. Your 40 million human years is now broken down to 3 days before your 8 character password can be guessed.
The problem with relying on strength alone is that these numbers are going to change quickly, and leaving the onus on your users to come up with more and more unique/strong passwords is a horrible solution. Strong passwords increase the number of required guesses, but now you need to slow down the guessing.
Generating and Storing Password Hashes
Modern computers can generate hashes incredibly fast. While the example of 5,600,000,000 per second was a very specialized (albeit cheap) solution, your own computer can generate a standard MD5 hash very quickly. When you loaded this article, some simple code was run in the background, and determined that your web browser was able to generate 0 hashes per second. You would have made 0 attempts at an md5 hashed password in the time you spent reading this. As technology improves, so will the number of attempts per second. The same specialized application that attained 5,600,000,000 attempts per second a year ago may be able to do 10,000,000,000 next year.
What about SHA1, SHA256, or even SHA512
While methods like SHA512 are slower than MD5, they are not the answer. SHA512 will increase in speed on the same path that MD5 does, eventually allowing for billions or even trillions of attempts per second.
Enter bcrypt
Coda explains why to use bcrypt best with the following:
How? Basically, it’s slow as hell. It uses a variant of the Blowfish encryption algorithm’s keying schedule, and introduces a work factor, which allows you to determine how expensive the hash function will be. Because of this, bcrypt can keep up with Moore’s law. As computers get faster you can increase the work factor and the hash will get slower.
How much slower is bcrypt than, say, MD5? Depends on the work factor. Using a work factor of 12, bcrypt hashes the password yaaa in about 0.3 seconds on my laptop. MD5, on the other hand, takes less than a microsecond.
So we’re talking about 5 or so orders of magnitude. Instead of cracking a password every 40 seconds, I’d be cracking them every 12 years or so. Your passwords might not need that kind of security and you might need a faster comparison algorithm, but bcrypt
Basically, you can specify the amount of work bcrypt performs to generate your hash. As technology improves, you can increase the amount of work you do to derive your hashes, keeping up with the modern technology curve.
Until there is a massively parallel version of bcrypt that can benefit from CUDA/OpenCL, this method moves brute forcing your hashes back into the thousands/millions of years range. I’d say that is about as “difficult to brute force” as anyone can currently ask for.
Closing Thoughts
While this focuses primarily on the generation and hashing of passwords, and is meant to deal with ensuring the security of your passwords should your password database be compromised, it is not all you need to focus on when securing your password data. There are a host of other methods that can and should be used to ensure proper password handling such as lockouts, resets, etc. Other security measures can be covered in a separate post. This one covers the important thing:
Use bcrypt
Comments