People throw around a lot of advice about password security, but we don't often talk about what we're trying to accomplish. The purpose of this post is to flesh out some of the questions we should be asking to determine who the potential attackers are and what attacks we are trying to prevent. This allows us to consider if and why we would want to implement defenses such as password salting and stretching, two-factor authentication, time delay, password expiration, and account lockouts.
At a basic level, a password is used to secure access to an account, so let's assume there is a corresponding threat that someone would try to get into that account. But we need to get more detailed than this. Who would try to get into the account (i.e. who is the threat actor)? Why? Would they be happy with any account or just a specific one like root or Administrator? Does the attacker need to crack a large number of accounts or will he be satisfied with just a few? Would the attacker be likely to use an online attack or an offline attack?
Attacks are used to carry out or realize a threat. In this post, I'm only considering three attacks: online password cracking, offline password cracking, and denial of service via password authentication. My goal is to explore the reasoning behind the advice given for password selection and hashing so it makes sense to focus on the attacks directly affected by this advice.
What types of attackers are we worried about?
To begin, let's ask what data or services our site has that are valuable. Does our site store personal information? Can accounts at our site be used to access other sites (e.g. due to password resets)? Does our site perform financial transactions? Does our site contain valuable content unrelated to the users of the site (e.g. music)?
Who are the data or services valuable to? Would the information be valuable to a third party? Is the ability to change the information valuable (e.g. a student's grade or a person's account balance)? Does our site have the ability to process or initiate financial transactions?
These questions will lead us to identify potential attackers. For each possibility, we need to get more detailed about the motivations and capabilities of the potential attacker. What does the potential attacker want to accomplish? What access does he have? Is he an insider or outsider? What are his motivations?
As I mentioned previously, the basic threat we're dealing with here is unauthorized access to an account, but an attacker will have additional or more specific goals. The attacker's goal might be to gain unauthorized access to personal information or to initiate an unauthorized wire transfer. Once we have some idea of our potential attackers, we can consider the goals they might have and the attacks that can accomplish those goals. My description here is very informal and is meant to demonstrate some of the things we should be thinking about with regard to passwords specifically. For a detailed introduction to attack modeling using attack trees, go here.
Note that what is a goal for an attacker is a threat to us. We're just looking at this from the attacker's perspective instead of our own. For our purposes, we'll assume that the attacker has some interest in cracking passwords via an online or offline attack. If he fully realizes his goal some other way, then the passwords aren't relevant.
One possible goal for an attacker is to carry out a fraudulent financial transaction. An attacker who targets a bank or a site like PayPal may use compromised accounts to transfer money into other accounts that he can withdraw from (e.g. by using money mules or pre-paid debit cards). In this case, the attacker wants to gain access to as many accounts as possible, but only up to a point. He might be able to use tens or hundreds of accounts, but not thousands or millions. For this attacker, either an online or offline attack will suffice. If our site has a weak password policy, he might use a script that attempts to log in to thousands of different accounts by guessing a few common passwords for each. If he can grab the hashes via SQL injection or some other vulnerability, he can use an offline cracking attack. Note that in this case, the attacker ultimately wants to gain access to individual user accounts. If the attacker can get administrator access he may use it to retrieve password hashes or even to change them to correspond with a password that he knows.
Consider a variation of the previous example: an attacker who wants to sell access to bank accounts rather than carry out transactions himself. In this case the attacker wants access to thousands or millions of accounts. He plans to sell the accounts at a discount rate and will not be able to make much if he only has ten accounts to sell. Online password cracking will not help him because it's highly unlikely that he can gain access to thousands of accounts via online password guessing. It's too noisy and the passwords would have to be extremely weak for his guesses to succeed often enough to make this worthwhile. This attacker will want to get the hashes so that he can carry out an offline attack. The attacker will only want administrator access if it helps him to get to the hashes/account information. Changing user passwords to gain access is not a reasonable option because he wants to sell access in bulk.
The goal of an attacker interested in large-scale fraud or identity theft will be to gain access to personal information. This attacker will want to gain information about as many accounts as possible. The attacker may plan to sell or trade the information and not use it himself. This is similar to the previous example except that the attacker needs information, not access. This attacker may also not need to carry out an offline attack at all. If the information the attacker needs is in the same database as the password hashes, what's the point of cracking passwords? He can just grab the information directly.
An attacker may be interested primarily in gaining root or administrator access. He may wish to install a rootkit/backdoor, use the system as a springboard into other systems, or sniff network traffic. In this case, he only wants access to one or a small number of accounts. If he's able to get access to a user account, that may allow him to use a local privilege escalation attack, but he won't be interested in cracking the passwords for more than a few accounts. An online password cracking attack may work here for initial user-level access, but if the admins are halfway competent, it won't be an option for gaining root. If the attacker can get access to the hashes, he can launch an offline attack against the passwords for the administrator accounts. This is different from previous examples because the attacker wants access to specific accounts. This attacker won't care, for instance, if the passwords are salted since he doesn't need his efforts to scale.
An attacker might wish to cause a denial of service attack against the site or a specific user. This is potentially possible if the site has an account lockout policy or if the site uses a strong password hashing algorithm (which requires a lot of CPU time) but does not have corresponding controls to limit or delay login attempts.
An attacker might wish to use information from the site to gain access to more valuable accounts somewhere else. This would be a possible motivation for an attack on a site such as LinkedIn (although I doubt this was the case with the recent actual attack since the attacker posted publicly). The data stored on LinkedIn is mostly available for free. The data or accounts would have some direct value for spam or phishing attacks, but that's about it. But, people often reuse passwords so if a user has the same password for LinkedIn as for their bank, email account, PayPal, etc. the attacker may be able to gain access to something more valuable. In 2011, Troy Hunt analyzed dumps from Sony and Gawker and found that 67% of the users with accounts at both sites reused their password.
An attacker may also wish to crack passwords in order to ensure that he keeps access to a system. This may be true of an attacker who currently has a working exploit but fears that it may be patched in the future. He could also be an insider who plans to leave the organization (perhaps not willingly). He can perform an offline password cracking attack so that he has a backup route into the system. This attacker would strongly prefer to crack an administrator account, but a user account may suffice. It's possible that one or more of the passwords will change, especially if the attacker tries to use them a month or more down the road.
Salting is useful in every case where an attacker will want to attack multiple accounts. Password salts also prevent the use of precomputed rainbow tables completely. See my previous post here. Since the attacker has to try each guess separately for every user, salting greatly increases the effort per recovered password when he is attacking many accounts at once. Many people reported cracking over half of the 6.5M password hashes that were leaked from LinkedIn in a matter of days. With password salts in place, an attacker could have been limited to cracking between a few hundred and a few thousand accounts in the same time frame. Since the attacker cannot attack multiple users at once, his best approach is probably to use a dictionary attack (possibly with permutations) so that he can spend only a few seconds or less per user. This will allow him to crack many of the weakest passwords without wasting time on passwords that are harder for him to crack.
Other than preventing the use of rainbow tables, salts offer no benefit when the attacker is only interested in a single account and only a small benefit if the attacker is interested in a small number of accounts (e.g. only the administrator accounts). The reason is that salts only prevent the attacker from scaling his efforts across multiple users. So, it's twice as hard to crack two accounts as one and ten times harder to crack ten, but it's exactly the same effort to crack one user with or without a salt. Salting does not affect online password guessing attacks. For more discussion of salts, check out Dan Kaminsky's blog.
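To make the scaling argument concrete, here is a minimal sketch of per-user salting in Python. The function names, the 16-byte salt length, and the use of a single round of SHA-256 are my assumptions for illustration; a single fast hash like this is not what you would deploy.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_with_salt(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)  # assumed salt length, illustrative only
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users who pick the same password end up with different stored hashes,
# so one guess has to be hashed separately for each of them.
alice_salt, alice_hash = hash_with_salt("password1")
bob_salt, bob_hash = hash_with_salt("password1")
print(alice_hash != bob_hash)
print(verify("password1", alice_salt, alice_hash))
```

The salt is stored next to the hash and is not a secret. All it does is force the attacker to repeat the hashing work for every user he targets.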
Stretching increases the time that it takes to perform a password hash and works well whether the attacker is interested in a single account or many. Because stretching can slow down password hashing drastically (1,000 times or more), it can also prevent the use of rainbow tables. A rainbow table that would have taken days to build previously would take years with password stretching in place. Per Thorsheim tells me that some of the tables at freerainbowtables.com took months to build even with many participants. With password stretching, these would take centuries to make. In addition, lookups in the table would take hours or days instead of minutes. Stretching is arguably stronger than salting since it affects all offline cracking efforts, but there is no reason not to use both. Scrypt, bcrypt, and PBKDF2 all use salting and stretching. Stretching could potentially slow online password guessing but that's not really the goal and sites should probably limit the rate at which users can attempt to login anyway.
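As a rough sketch of salting and stretching together, the following uses PBKDF2 from Python's standard library. The iteration count of 100,000 and the helper names are assumptions on my part, not a recommendation; the right cost factor depends on your hardware and login volume.

```python
import hashlib
import hmac
import os
import time

ITERATIONS = 100_000  # assumed cost factor; tune for your own hardware


def stretch(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a stretched hash using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def check(password: str, salt: bytes, expected: bytes, iterations: int = ITERATIONS) -> bool:
    """Recompute the stretched hash and compare in constant time."""
    return hmac.compare_digest(stretch(password, salt, iterations), expected)


salt = os.urandom(16)
stored = stretch("correct horse", salt)

# Time one stretched hash; the attacker pays this cost for every guess.
start = time.perf_counter()
stretch("some guess", salt)
print(f"one stretched hash took {time.perf_counter() - start:.4f} s")
print(check("correct horse", salt, stored))
```

The legitimate site pays the cost once per login, while an attacker pays it once per guess per user, which is where the asymmetry comes from.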
When salting and stretching are used together, an attacker will have little success with offline password guessing. He may still recover some of the weakest passwords, but not much more than that. If a single password hash takes 1/1000th of a second to compute, an attacker would need almost two hours to try a single password for each of the 6.5M users in the LinkedIn password dump. Trying a wordlist of 100 common passwords would take about a week. A wordlist of 1000 common passwords would take about two and a half months. The users with the very weakest passwords would have their accounts cracked but everyone else would be safe.
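The arithmetic behind these estimates is simple enough to check. This sketch assumes a single attacker machine computing exactly 1,000 hashes per second; the function name and constants are mine, and real cracking hardware would shift the numbers.

```python
def seconds_to_try(users: int, guesses_per_user: int, hashes_per_second: int) -> float:
    """Time to hash every guess separately for every user (salted hashes)."""
    return users * guesses_per_user / hashes_per_second


USERS = 6_500_000  # size of the LinkedIn dump
RATE = 1_000       # one hash per 1/1000th of a second (assumed)

one_guess_hours = seconds_to_try(USERS, 1, RATE) / 3600           # about 1.8 hours
hundred_guess_days = seconds_to_try(USERS, 100, RATE) / 86400     # about 7.5 days
thousand_guess_days = seconds_to_try(USERS, 1000, RATE) / 86400   # about 75 days

# Without salts, one hash per guess can be compared against every user at once,
# so the same 1000-word list costs only 1000 hash computations in total.
unsalted_seconds = 1000 / RATE  # 1 second

print(one_guess_hours, hundred_guess_days, thousand_guess_days, unsalted_seconds)
```

The cost grows linearly in both the number of users and the size of the wordlist, which is why the combination pushes the attacker toward very short lists of very common passwords.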
Another strong measure that can be used is two-factor authentication. Unfortunately, there are costs involved and implementation is more complicated than just telling the developers "use bcrypt." The decision to use two-factor or not has to be based on an assessment of the risk. What is the cost of implementing two-factor authentication? What is the potential cost of a compromise? How much will two-factor authentication do to prevent that? My guess is that most organizations can handle a small number of customer accounts being cracked, but will be less tolerant of internal accounts being compromised. The cost of implementing two-factor authentication internally should be a lot lower than for customers, but the disruption caused by lost or forgotten authenticators will be higher. In practice, a few companies have made two-factor authentication an option for customers, but I don't know of any that require it. I'd love to see some good analyses of the costs and benefits of implementing two-factor internally. Post links in the comments if you have them.
Password stretching can allow an attacker to launch a denial of service attack by attempting to log in hundreds or thousands of times per second. The reason is that a single password hash might take between 1/1000th and 1/10th of a second depending on the cost factor that you use (i.e. how slow you make it). To prevent this, sites should use some combination of time delay, CAPTCHA, and account lockouts. I don't like account lockouts, however, so I suggest using the other two. Many sites employ CAPTCHA only after a few failed logins; this seems reasonable and limits the impact to users in most cases. Time delays are much better than lockouts and only force users to wait 1-5 seconds before attempting to log in a second time.
Account lockouts in response to failed login attempts help to prevent online password cracking attempts but can also be used to deny service to a user. If you lock out an account after X attempts, then an attacker can methodically lock accounts by attempting to log in X times with a random password. Again, I think time delay is a better solution. It also limits online password guessing but doesn't allow an easy denial of service and won't allow users to lock themselves out. If lockouts are used, I recommend setting a high number of maximum attempts (perhaps 20 or more) so that users won't ever trigger the lockout accidentally.
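One way a time delay could work is sketched below. The class name, the in-memory dictionary, and the rule of one extra second per consecutive failure (capped at 5 seconds) are all assumptions for illustration; a real site would need shared storage across servers and probably a per-IP limit as well.

```python
import time
from typing import Callable, Dict, Tuple


class LoginThrottle:
    """Per-account delay that grows with consecutive failures, capped at max_delay."""

    def __init__(self, max_delay: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_delay = max_delay
        self.clock = clock
        # account -> (consecutive failures, time of the last failure)
        self._failures: Dict[str, Tuple[int, float]] = {}

    def seconds_until_allowed(self, account: str) -> float:
        """How long the caller must still wait before the next attempt is checked."""
        count, last = self._failures.get(account, (0, 0.0))
        if count == 0:
            return 0.0
        delay = min(float(count), self.max_delay)
        return max(0.0, last + delay - self.clock())

    def record_failure(self, account: str) -> None:
        count, _ = self._failures.get(account, (0, 0.0))
        self._failures[account] = (count + 1, self.clock())

    def record_success(self, account: str) -> None:
        self._failures.pop(account, None)


throttle = LoginThrottle()
throttle.record_failure("alice")
# Reject the next attempt without hashing anything until this reaches zero.
print(throttle.seconds_until_allowed("alice"))
```

Because the check happens before any password hash is computed, it also caps how much CPU time repeated attempts against one account can consume. An attacker can still add a few seconds to a victim's login by failing on purpose, but he cannot lock the account outright.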
Password expiration is a common recommendation, but in many cases I don't see the benefit of forcing users to change their passwords every 60/120/180 days. What are we trying to accomplish with password expiration? How does this prevent an attacker from achieving his goals? I do think it makes sense to change passwords when an organization is hacked or when an administrator leaves his position. In these cases, we have reason to suspect (or know) that a person who is not currently authorized to have access to the system has knowledge of the hashes so we're responding to a specific threat. But many of the attackers I described above do not need long-term access to the system, so expiration would have almost no effect on them. Password expiration probably does more to limit account sharing than to prevent attacks from outsiders.
Obviously, password selection affects offline and online password guessing. If users pick strong passwords, it will limit the effectiveness of both offline and online password guessing attacks. Unfortunately, getting users to pick strong passwords is an open problem. Password stretching and salting complement good passwords since they limit the number of guesses that an attacker can make and the scalability of his efforts. Without these measures, users need to pick passwords that are very long to avoid falling prey to an offline attack.