A couple of different friends sent me over a link to an article about The Usability of Passwords this weekend, clearly thinking it would strike a chord. Well, let’s just say I was enthralled before I even finished the second line:
Security companies and IT people constantly tells us that we should use complex and difficult passwords. This is bad advice
The crux of the article (and subsequent FAQ) is that so long as a password is sufficiently long – the example used is “this is fun” – you’re pretty damn secure (apparently eleven characters is just right). Actually, the term used was secure forever. Wow, two pretty absolute terms.
This actually sounded alarmingly familiar:
Eleven characters are probably above average as far as password length goes, no arguing there. Of course the choice of all lowercase characters and a couple of spaces is problematic, but I’ll get back to that. What I found most interesting though was the basis on which the conclusion was formed and I thought that could do with some clarifying. So let’s take a look at these and apply a bit of objective analysis to see if they hold water.
- Does a brute force attack really only run at 100 attempts per second?
- Is "this is fun" really 10 times more secure than "J4fS<2"?
- Do rainbow tables really work by an attacker copying and pasting a hash into a website?
- Are bad password management practices on the server really not your problem?
And perhaps most interesting of all – and the whole crux of this post – is a simple lowercase password you can easily remember really more secure than a shorter but more complex version? Let’s find out.
When is a password “secure”? Easy – never
The thing about the word “secure” is that it tends to get thrown around in a very binary fashion; you’re either secure or you’re not. The reality is there are a whole bunch of shades of grey. Security is all about mitigating risk, nothing more. Certainly being as emphatic as saying “secure forever” is a misguided statement at best.
It’s a little bit like saying a Volvo is “safe”. Sure, you get lots of airbags and intrusion bars and stability controls and whatnot but I’ll tell you what; if you hit that semi-trailer head on at 100kph it’s all over red rover. Of course the Volvo is very safe compared to, say, a Chery, but it’s certainly not an absolute.
When it comes to password security, the accepted measure of strength is entropy. To achieve greater entropy we need passwords consisting of more possible symbols and of greater length with more randomness. When you limit yourself to one character set – lowercase letters in this case – you create very low entropy passwords.
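To put some rough numbers on that, here's a minimal sketch in Python. It assumes every character is picked uniformly at random from the character set, which is the best case and not how people actually choose passwords. The function name and the two character set sizes (27 for lowercase plus space, 95 for printable ASCII) are illustrative choices rather than anything from the article.

```python
import math


def entropy_bits(symbol_count: int, length: int) -> float:
    """Upper bound on entropy, in bits, for a password of `length`
    symbols drawn uniformly at random from `symbol_count` options."""
    return length * math.log2(symbol_count)


# Assumed character set sizes, for illustration only
LOWER_AND_SPACE = 27   # a-z plus the space character
PRINTABLE_ASCII = 95   # upper, lower, digits, punctuation and space

if __name__ == "__main__":
    examples = [
        ("lowercase + space, 11 chars", LOWER_AND_SPACE, 11),
        ("printable ASCII, 11 chars", PRINTABLE_ASCII, 11),
        ("printable ASCII, 6 chars", PRINTABLE_ASCII, 6),
    ]
    for label, symbols, length in examples:
        print(f"{label}: {entropy_bits(symbols, length):.1f} bits")
```

Keep in mind this figure is only a ceiling. A password built from a handful of common dictionary words has far less real-world unpredictability than the formula suggests, because the characters are anything but independent.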
The speed of brute-forcing passwords
The term brute force is used a little liberally and can encompass a variety of different attack vectors (although strictly speaking, it’s an attack against a cryptography system), but in short, the one constant is that it relates to attempting to break a password scheme at the fastest rate possible. In reality, this rate varies enormously depending on the nature of the attack. For example, a dictionary attack where passwords are hashed on the fly is a completely different beast to a rainbow table which is pre-optimised for speed. More on that a little later.
There’s a suggestion in the article that “most web applications would not be capable of handling more than 100 sign-in requests per second” and in fact “Google maxes out at only 25 attempts per seconds”. Now, I don’t know exactly how capable Google is of handling simultaneous authentication requests, but when Gmail alone has a couple of hundred million users, it’s a safe bet that it’s more than 25 a second. Rather a lot more.
There’s a couple of other problems with the assertion about rate: Firstly, there’s a fundamental difference between being able to issue a certain number of login requests to an application from a remote environment and the application’s ability to simultaneously handle a high volume of authentication requests. All sorts of things get in your way as a remote user including bandwidth, latency (in particular) and any unknown variables on the server side such as intrusion protection systems throttling request rates. Of course many applications also lock out an account after a series of failed logon attempts so your “brute force” may only get three or five goes anyway (not much “brute” in there!)
The other problem you have is that very frequently, hashed passwords are disclosed as part of an entire database disclosure (note: hashing is not encryption, they’re two different beasts). We saw that happen recently with Gawker and then again a couple of months later with rootkit.com and any number of other examples before and since. Now, once you have your hands on the database, all the rules change as you’re no longer at the mercy of the previously mentioned constraints. Suddenly your passwords are local and even if a cryptography mechanism has been applied (not necessarily encryption, but I’ll get back to that), it may still be vulnerable.
How vulnerable? Well it depends on what scheme is being hacked with what attack mechanism, but that 25 per second could well jump up a little, perhaps up to 400,000 passwords per second for a dictionary attack. When the entire basis of your password entropy criteria is dependent on an assumption that’s wrong by a factor of 16,000, you have a problem.
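Here's a quick back-of-envelope sketch of what that difference in rate means. The two rates are the ones quoted above; the target keyspace (every 8 character lowercase password) is purely a hypothetical picked to make the arithmetic easy to follow.

```python
def seconds_to_exhaust(keyspace: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate in the keyspace."""
    return keyspace / guesses_per_second


ONLINE_RATE = 25          # guesses/sec, the figure the article quotes for Google
OFFLINE_RATE = 400_000    # guesses/sec, the dictionary attack figure above

# Hypothetical target: all 8 character passwords made of a-z only
keyspace = 26 ** 8

online = seconds_to_exhaust(keyspace, ONLINE_RATE)
offline = seconds_to_exhaust(keyspace, OFFLINE_RATE)

print(f"speed-up factor: {OFFLINE_RATE / ONLINE_RATE:,.0f}x")
print(f"at the online rate:  {online / (365 * 24 * 3600):,.0f} years")
print(f"at the offline rate: {offline / (24 * 3600):,.1f} days")
```

Same keyspace, same passwords; the only thing that changed is where the attack runs.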
Whilst this example is about cracking the password of a WPA-PSK encrypted wireless network, the point is that brute force is about working at a serious scale. Particularly now in the age of very cheap commoditised compute resources of potentially enormous scale (such as the 400 CPU cluster), the accessibility of the service and low cost of the process ($1.68 sounds pretty reasonable), means a whole new world of cracking opportunities.
Serenity to accept, courage to change, wisdom to know the difference
This little phrase often pops up in situations where things may be a bit beyond your control (and is frequently accompanied by the mention of a deity). In the world of software, we as end users unfortunately don’t have much say as to what goes on under the covers. In fact about the only thing we can control is our password and how we use it (and even then, often within tight constraints).
Software developers use dodgy practices. Frequently. You’ll find that in the previously mentioned examples of Gawker and rootkit.com (the latter didn’t even salt their password hashes) and you’ll observe it firsthand every time you log on with no HTTPS or are sent your password in clear text. So what do you do about it? Well, as an end user you have absolutely zero control over it. Incidentally, this is the same level of control you have over allowable password retry attempts, delays between retries and account lockouts. Nada. Zip.
Look, if you’re a software developer and you want to put some effort into security, go read the OWASP series. If you’re someone who creates passwords in other people’s systems then go read the passwords series.
What you can’t do is get up and bleat about “well it’s not my problem, the developers should have implemented it securely” because it is your problem and the only thing you can do about it is to construct and protect your own passwords. There’s one thing for sure; when that dodgy MD5 hashed database with no salt gets disclosed and yours is the lowercase password, you’re going to be first in line for exploitation. And remember, you often have absolutely no visibility as to how a website stores your password.
Don’t think you can always just create any password you like
There’s another little problem with devising password schemes like “this is fun” or just about any other pattern-based structure, for that matter. The problem is simply that you may well not be able to use some of those characters in particular websites. If this sounds a bit odd, go back and take a look at my earlier article about Who’s who of bad password practices – banks, airlines and more.
For example, if you bank at St. George you can forget about using spaces and as we’ve already been told, “thisisfun” is absolutely unacceptable!
Well, you could always just drop the spaces and add another couple of letters and achieve the same degree of statistical randomness (assuming a scheme of letters and spaces). Oh, hang on, that won’t work if you have a TPG ADSL account:
Ah well, maybe just concede a little length and try for “thisisok”. Unfortunately Singapore Air has different ideas:
The point of these examples is that you simply can’t apply a single password scheme consistently across all your accounts. I’m not going to forgo the great interest rate St. George gives me because I can’t use spaces, or drop TPG who offers one of the best ADSL plans going because I can’t type the phrase I want, or stop flying on SIA who has the best damn seats going on the A380 because they’ll only take digits on their frequent flyer program. Ultimately this goes back to the previous section about accepting what you cannot change; I’m sorry, but you simply have no control over these password policies and all you can do is maximise your entropy within their (frankly ridiculous) constraints.
Rainbow tables are not about copying and pasting individual passwords into a website
Rainbow tables are magical little things (and I use the term “little” very, very loosely!) The basic theory is that rainbow tables are a form of time-memory trade-off in that they consume large amounts of capacity once generated (and let me tell you, generation ain’t fast either!), but they’re then very fast to process against large password lists.
Speaking of password lists, the specific purpose of a rainbow table is to break hashing algorithms. I’ve got part 7 of the OWASP Top 10 for .NET Developers about to roll out and in that I go into a lot more detail (including compromising an MD5 hashed table of passwords with Rainbow Crack), but in short, storing passwords that are only hashed and not salted is a bit of a no-no. But as we covered in the last couple of sections, we poor end users really have no control over that.
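A small sketch of why the missing salt matters. This is an illustration only: the helper names, the two hypothetical users and the 16 byte salt length are all assumptions, and MD5 appears purely because it's the algorithm under discussion.

```python
import hashlib
import os


def md5_unsalted(password: str) -> str:
    """Hash the password on its own, as a saltless scheme would."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def md5_salted(password: str, salt: bytes) -> str:
    """Prefix a per-user random salt before hashing."""
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()


# Two hypothetical users who happened to pick the same password
alice = md5_unsalted("password")
bob = md5_unsalted("password")
# Without a salt the stored hashes are identical, so a single
# precomputed table entry exposes both accounts at once.
print("unsalted hashes match:", alice == bob)

# With a random per-user salt the stored values differ, so a table
# precomputed for plain MD5 no longer lines up with either of them.
alice_salt, bob_salt = os.urandom(16), os.urandom(16)
print(
    "salted hashes match:",
    md5_salted("password", alice_salt) == md5_salted("password", bob_salt),
)
```

Salting takes precomputed tables off the menu but does nothing about the speed of the hash itself, which is why deliberately slow functions such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt` are the usual choice for password storage.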
Anyway, the thing about rainbow tables is that they churn through huge volumes of hashed passwords in next to no time. They do this by applying some very clever mathematics which means it’s not a process of simply enumerating a hash file password by password. It’s certainly not a case of “simply going to a website and pasting in a single hash” which implies it’s a slow, laborious process.
Let’s see this in practice. In the following video (link to .wmv), Rainbow Crack is used against a set of ten MD5 password hashes based on the ASCII character encoding scheme which contains 95 common symbols (upper and lower letters, numbers, punctuation and a few other bits). The character set has been used to generate a rainbow table of passwords ranging in length from 1 to 7 characters which subsequently occupies a key space of 70,576,641,626,495 different possible combinations. Yes, that’s over 70 trillion. Based on the logic in the article in question, at a maxed out brute force rate of 100 attempts per second, we could need more than 22,000 years to crack these passwords. Let’s see if that holds true (I promise the video is shorter than that!):
Hmmm, 220 seconds, not quite 22,000 years! Rainbow tables are simply a very efficient brute force mechanism.
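For anyone wanting to check the key space and the 22,000 year figure, the arithmetic can be sketched in a few lines of Python. The function name is made up for the example, and a year is assumed to be 365 days.

```python
def keyspace(symbols: int, min_len: int, max_len: int) -> int:
    """Count every possible string from min_len to max_len characters."""
    return sum(symbols ** n for n in range(min_len, max_len + 1))


SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365 day year

# 95 printable ASCII symbols, passwords of 1 to 7 characters
total = keyspace(95, 1, 7)

# Worst case at the article's claimed ceiling of 100 guesses per second
years_at_100 = total / 100 / SECONDS_PER_YEAR

print(f"{total:,} combinations")
print(f"{years_at_100:,.0f} years at 100 guesses per second")
```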
Low entropy = low hanging fruit
When you take a look at commonly available rainbow tables (Rainbow Crack has a bunch, as does Free Rainbow Tables), you’ll notice a common trend; most of the tables deal with very basic password entropy schemes. What you’ll typically find is tables with hashed versions of all lowercase passwords, lowercase and numeric or numeric only passwords.
Why? Well it’s partly because the total possible combinations are kept manageable, but it’s also because people are notoriously bad at creating strong passwords. For “strong” passwords we need three factors:
- Length (which “this is fun” doesn't do too badly at)
- Breadth of character set (all lowercase and a couple of spaces is far from this)
- Randomness (concatenated dictionary words aren’t great)
Now compare that back to these 25 passwords:
123456, password, 12345678, qwerty, abc123, 12345, monkey, 111111, consumer, letmein, 1234, dragon, trustno1, baseball, gizmodo, whatever, superman, 1234567, sunshine, iloveyou, fuckyou, starwars, shadow, princess, cheese
Why these? Because these 25 passwords were used a total of 13,411 times by people with Gawker accounts. They are the 25 most commonly used passwords and they’re not too different in structure to the 25 most common passwords discovered in the rootkit.com breach:
123456, password, rootkit, 111111, 12345678, qwerty, 123456789, 123123, qwertyui, letmein, 12345, 1234, abc123, dvcfghyt, 0, r00tk1t, ìîñêâà, 1234567, 1234567890, 123, fuckyou, 11111111, master, aaaaaa, 1qaz2wsx
Mathematically, “this is fun” is equivalent to “01234 56789” in terms of randomness in an alphanumeric plus spaces password scheme. But is it as likely to be compromised as “8p<+bf82{WA”? You don’t need mathematics to work that one out.
Don’t forget the dictionaries (and that they aren’t dictionaries)
In previous posts I’ve linked off to a typical password dictionary. The first thing to understand about a password “dictionary” is that it’s not necessarily a digital version of your Oxford. In fact in the example above you’ll find entries as varied as “1” to “lily@1982”. You’ll also find these guys:
- this is cool
- this is dvd
- this is good
- this is great
- this is just a test
- this is me
- this is my password
- this is my world
No instance of “this is fun” in this particular dictionary, but it’s also less than 17MB and only about 1.7 million records. Hash that with your favourite algorithms and compare to a compromised (saltless) database and you’re starting to make some real progress.
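As a rough sketch of what "hash the dictionary and compare" looks like, here's a toy version in Python. The five word list and the three "leaked" hashes are stand-ins invented for the example (a real run would read the 1.7 million entries from a file), and MD5 is assumed as the saltless scheme.

```python
import hashlib


def md5_hex(word: str) -> str:
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def crack(leaked_hashes, wordlist):
    """Hash every dictionary entry once, then look each leaked hash up."""
    lookup = {md5_hex(word): word for word in wordlist}
    return {h: lookup[h] for h in leaked_hashes if h in lookup}


# Stand-in for a password dictionary (a few entries of the kind shown above)
wordlist = ["1", "this is cool", "this is me", "this is my password", "lily@1982"]

# Stand-in for a compromised saltless table: two weak passwords, one random one
leaked = [md5_hex("this is me"), md5_hex("lily@1982"), md5_hex("8p<+bf82{WA")]

recovered = crack(leaked, wordlist)
print(f"recovered {len(recovered)} of {len(leaked)} passwords")
for digest, word in recovered.items():
    print(digest, "->", word)
```

Because there's no salt, the dictionary only has to be hashed once and can then be compared against every account in the database.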
Now obviously the article wraps up by saying that you shouldn’t actually use the text “this is fun” as your password, rather you should choose another combination of words instead and you’ll be safe. Here’s an interesting dictionary related angle on that; take a look at John the Ripper (JtR). This is a bit of a favourite for breaking passwords as it’s sort of an all-rounder and one of its strengths is the ability to use dictionary attacks.
What’s really intriguing about JtR in the context of this post is the information on the wordlist rules syntax page. There are a few interesting parameters on this page; let me draw attention to some of the key ones:
- matches whitespace: the ability to break or concatenate words separated by a space.
- reverse: spells the word back to front (e.g. “fun” becomes “nuf”).
- rotate the word left: this turns "jsmith" into "smithj".
- pluralize: turns “crack” into “cracks”.
- shift case: converts "Crack96" into "cRACK(^" (the last two characters are the equivalent of holding the shift key while typing the digits).
- shift each character right, by keyboard: Makes "Crack96" become "Vtsvl07" (offset every keystroke one to the right).
These are half a dozen examples from a significantly large set of possible language substitutions. Why is this important? Because some people seem to think they can outsmart the computers by applying logic to password creation. ‘fraid not folks, the best you can do is use these practices to increase entropy but at the end of the day, computers are alarmingly good at recognising patterns and what we see above is a perfect example of where cracking software is designed to do precisely this.
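To show how little effort these transformations take, here's a sketch of five of the rules above re-created in plain Python. This is not JtR's code or its rule syntax; the function names, the naive "just add an s" pluralisation and the US keyboard layout are all assumptions made for the illustration, and the whitespace rule is left out.

```python
def reverse(word: str) -> str:
    return word[::-1]


def rotate_left(word: str) -> str:
    # Move the first character to the end
    return word[1:] + word[:1]


def pluralize(word: str) -> str:
    # Deliberately naive: real pluralisation has more cases than this
    return word + "s"


# Digits mapped to the symbols above them on an assumed US keyboard
_SHIFTED_DIGITS = str.maketrans("1234567890", "!@#$%^&*()")


def shift_case(word: str) -> str:
    # Swap letter case and "hold shift" on the digits
    return word.swapcase().translate(_SHIFTED_DIGITS)


# Keyboard rows (assumed US layout) used to find the key one to the right
_ROWS = ["1234567890-=", "qwertyuiop[]", "asdfghjkl;'", "zxcvbnm,./"]
_RIGHT = {}
for row in _ROWS:
    for current, neighbour in zip(row, row[1:]):
        _RIGHT[current] = neighbour
        _RIGHT[current.upper()] = neighbour.upper()


def shift_right(word: str) -> str:
    # Characters with no right-hand neighbour are left unchanged
    return "".join(_RIGHT.get(ch, ch) for ch in word)


def candidates(word: str) -> list:
    """Apply every rule to one dictionary word."""
    rules = (reverse, rotate_left, pluralize, shift_case, shift_right)
    return [rule(word) for rule in rules]


if __name__ == "__main__":
    for base in ["fun", "jsmith", "Crack96"]:
        print(base, "->", candidates(base))
```

Each rule is a line or two, so running the lot over every entry in a multi-million word dictionary costs an attacker next to nothing.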
The only secure password is the one you can’t remember
Finally, let me draw back to the crux of my recent article about The only secure password is the one you can’t remember. Regardless of how simple you make your passwords, you simply cannot make them both unique and memorable across all your accounts. Fortunately I’m not seeing any advice in the article about reusing passwords (although the omission of this fundamental principle was conspicuous), but the simple fact is that the average person won’t remember a dozen different passwords and their associated sites, let alone a serious number. Here’s how my password list is looking these days:
The “this is fun” example is about as practical as the “iloves@nDwich3s” suggestion from Google I referenced in that previous post; you might remember them across a few sites but the practicality of this approach is far exceeded by the simple logistics of the number of sites we maintain accounts for these days.
Frankly, this really trumps all prior arguments anyway. Forgetting all about the entropy issue, once you accept that uniqueness is important then the only practical choice is a password manager which consequently makes prior arguments in favour of simple passwords redundant.
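For a sense of what a password manager is doing when it generates a password for you, here's a minimal sketch using Python's standard `secrets` module. It isn't how any particular product works; the alphabet and the default length of 20 are assumptions, and in practice both would need trimming to fit whatever constraints a given site imposes.

```python
import secrets
import string

# Assumed alphabet: letters, digits and punctuation (94 symbols, no space)
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Pick each character independently using a cryptographically
    secure random source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # A fresh, unrelated password per site; nothing here is memorable
    for site in ["bank", "isp", "airline"]:
        print(site, generate_password())
```

Nobody is going to remember output like this, which is exactly the point: the manager remembers it so you don't have to.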
Summary
I get what the guy is on about in the article, I really do. Continually creating strong passwords and trying to remember them is not fun, particularly when you’re forced into frequently changing them. Clearly the IT overlords in this particular case have put their foot down and the frustration has bubbled up.
By most measures, an eleven character password is not too bad, even in the absence of a greater array of character types. In fact if the article was simply “11 char alphas plus space passwords are cool”, I probably would have ignored it. The only reason I responded was the frequent inaccuracies in the rationale for this password scheme.
There’s a very good reason why systems define minimum password criteria and there’s also a very easy way to comply – and exceed – using a decent password manager. In fact once you let go of the mindset that you need to remember all your passwords, everything gets extremely liberating. Sure, you’ll still need to memorise a few (your PC logon, for example), but it’s a rare exception.
There’s no secret pact by those of us in the software and security industries to convince everyone to make confusing passwords just for the sake of it and your IT department is not out to simply make your life miserable for no discernible gain. The concepts discussed above have been honed over many, many years of evidence-based analysis and refinement by the security industry and the conclusion that password entropy is important simply isn’t debatable.
Look, at the end of the day all this discussion about passwords is good stuff. It brings the issues out into the open and there are enough quantifiable examples out there to discuss it pretty objectively. And here’s the reality: systems are frequently compromised, weaker passwords are the first to go and the only thing you really have in your control is how strong you make them.
Update, 22 April 2011: There’s an interesting addendum to this story by way of TWiT (This Week in Tech), just yesterday. If you’re not already familiar with TWiT, it’s an extremely popular netcast network hosted by Leo Laporte. Within the TWiT network there’s a series known as Security Now which is co-hosted by Steve Gibson who has had a long and illustrious career in the software world including a great deal of experience in the security space.
Anyway, just yesterday Security Now featured a discussion about the original article this post was in response to. You’re best off listening to it yourself (scroll through to about the 60 minute mark), but in short, Steve talks about “a number of logical mistakes” which are “contradictory”. He then explains the role of rainbow tables and the vulnerability of the “this is fun” password in a similar fashion to what you’ve seen above.
I think the phrase which best sums up the situation was “The danger of using something that you can remember is that it’s going to be too simple”. So we’re back at password management of high entropy strings again (LastPass gains a mention), which is precisely Steve’s conclusion. It’s also my conclusion in the summary above.
Despite a resounding chorus from security experts disputing the assertion that “this is fun” is ever an acceptable password, the author posted another update in defence of the original position. I’ve got to give him credit for steadfastly maintaining his position in the face of overwhelming objection from those who are software professionals (and support from those who aren’t), but the writing is now (even more) clearly on the wall.