Password Safety & Security: Passwords vs Passphrases

Update, 12:36 pm PT: Corrected error in some base-2 maths. Thanks to Twitter user @tchwojko for keeping us honest.

Today, May 5th 2022, is World Password Day: a great time to reflect on password safety. Much has been written about password security over recent years. Many of the old traditions have been challenged due to a better understanding of the methods and tools used by attackers to crack passwords. Most people now realize that simply swapping numbers for letters and adding digits to the end (we’re looking at you P@ssw0rd123) does nothing to delay brute force attacks on passwords. In addition, NIST¹ and the UK’s NCSC² now actively recommend that organizations do not enforce frequent password changes and, instead, advocate the use of longer passphrases over highly complex, but shorter, passwords.

Let’s take a look at the logic and the math behind the recommendation to use passphrases and determine whether they really offer better security for us as users, and for the organizations we work for. We’ll then offer some password best practices to help you remain safe.

Password Cracking

Traditionally, attackers have attempted to crack passwords by, essentially, guessing them one character at a time. The general consensus, therefore, is that the longer a password is, the harder it is for computers to perform an exhaustive search (also known as a brute force attack). Passphrases, then, seem to be an easy way to achieve long passwords:

they are longer than passwords which, in theory, offers better protection against attackers
they are easier to remember which should prevent things such as password re-use or having them written on a post-it note

By combining a small number of everyday words we can create passwords which are many times longer than the 8-9 character passwords many of us still use today.³

But is it really that simple? Let's take a look at the math.

What are Password Hashes and How Do They Protect Our Passwords?

Organizations keep passwords safe by storing them as a ‘hash’. A hash is the output of a function which converts data of any length into a fixed length string. Hashes are theoretically impossible to reverse so if an attacker steals a hashed password for their intended victim they have no choice but to try to send many different words through the same function to see if they obtain the same hash.

SHA-256 is a well-established hash function, although not recommended for use with passwords. It is used here purely for illustrative purposes. Using SHA-256 we can convert a password of P@ssword123 into a hashed output (see Figure 1). The SHA256 function can be run any number of times on any device and will always result in the same output when given the same input. However, it’s impossible to take the hash and run it ‘backwards’ to obtain the original password.

Figure 1: Using the SHA256 function to hash the string 'P@ssword123'
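
The properties described above can be checked with a few lines of Python. This is a minimal sketch using the standard library's hashlib module; the helper name sha256_hex is ours, chosen for illustration, and SHA-256 is used only because the article uses it as its example, not because it is suitable for storing passwords.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of a password as a 64-character hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


first = sha256_hex("P@ssword123")
second = sha256_hex("P@ssword123")
other = sha256_hex("P@ssword124")

print(first)            # always 64 hex characters, whatever the input length
print(first == second)  # the same input always produces the same hash
print(first == other)   # a one-character change produces a different hash
```

Nothing in hashlib offers a way to turn the digest back into the original string, which is why an attacker has to resort to guessing.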

To find the correct password, attackers must check word after word until they find one which outputs the same hash value as the one they have stolen. While this sounds tedious, password cracking tools, such as Hashcat, are capable of calculating billions of hashes per second on a single computer.⁴ Renting cloud computing services allows security researchers and threat actors alike to perform these operations without needing to build specialist computers themselves. Virtual computing in the cloud is capable of calculating tens of billions of hashes per second.⁵
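
The guess-and-compare loop is simple enough to sketch in Python. This is only an illustration of the idea, not how Hashcat works internally: the function name find_password and the five-entry word list are hypothetical, and a real tool would run the comparison on GPUs against lists of millions of candidates.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(text: str) -> str:
    """Hash a candidate the same way the (illustrative) target was hashed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_password(stolen_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash each candidate in turn and return the first one that matches."""
    for candidate in candidates:
        if sha256_hex(candidate) == stolen_hash:
            return candidate
    return None  # no candidate produced the stolen hash


# Stand-in for a hash obtained from a breach (illustrative only).
stolen_hash = sha256_hex("P@ssword123")

# Hypothetical word list; real lists contain millions of entries.
wordlist = ["password", "letmein", "P@ssw0rd123", "P@ssword123", "qwerty"]

recovered = find_password(stolen_hash, wordlist)
print(recovered)
```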

Exhaustive Searches

From malware used to capture keystrokes, formjacking to capture website passwords, and social engineering techniques such as phishing, threat actors have a vast array of tools and techniques available to them to obtain a user’s password. But if these all fail, good old-fashioned password guessing, using exhaustive searches, may still get them access to the victim’s account.

While ‘dictionary attacks’ speed up modern password cracking, the most complete way to perform a brute force attack is to try every possible combination of characters in the password alphabet. If a user’s password used only lowercase letters a-z and was 8 characters long, the password cracking tool would need to start with aaaaaaaa, then try aaaaaaab, aaaaaaac, etc, and work its way all the way up to zzzzzzzz. In total, there are 26^8 (or 208,827,064,576) possible combinations.

Remembering that computers work in binary, this is roughly equivalent to 2^38. This means it would take at most 2^38 operations to search every possible combination of letters. (Probability tells us that in fact the password will be found, on average, in only half that time, i.e. 2^37.)
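
The search order and the size of the search space can be reproduced with Python's standard library. This is a sketch for illustration: the generator name candidates is ours, and only the first three guesses are actually produced, since walking the full space in pure Python would be impractical.

```python
import itertools
import math
import string


def candidates(alphabet: str, length: int):
    """Yield every string of the given length in exhaustive-search order."""
    for combo in itertools.product(alphabet, repeat=length):
        yield "".join(combo)


lowercase = string.ascii_lowercase  # 'a' to 'z', 26 characters

# Take only the first three guesses instead of enumerating everything.
first_three = list(itertools.islice(candidates(lowercase, 8), 3))
keyspace = len(lowercase) ** 8
bits = math.log2(keyspace)

print(first_three)     # ['aaaaaaaa', 'aaaaaaab', 'aaaaaaac']
print(keyspace)        # 208827064576
print(round(bits, 1))  # 37.6
```

The exact figure is about 37.6 bits, which the article rounds up to 38.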

Making Exhaustive Searches More Difficult

We can increase the complexity of the password, making it harder to guess or perform an exhaustive search, by increasing the available characters from which to pick a password. Instead of limiting ourselves to just lowercase letters let's use uppercase, numbers and some special characters, too. Using a-z, A-Z, 0-9 and about 10 special characters our password alphabet is now 72 instead of just 26. The password complexity of an 8-character password jumps from 26^8 to 72^8 (or 722,204,136,308,736). This seems like a pretty significant jump!

In base-2, 72^8 is roughly equivalent to 2^49.

Considering the speed with which modern computers can guess passwords, it is believed that we need a complexity of around 2^128 to be secure since, as mentioned above, exhaustive searches will on average find the correct password after covering around half of the search space (or, 2^127 operations), and birthday attacks are expected to find hash collisions in 2^64.

Clearly, even with an expanded alphabet, our 8-character password still isn't enough.

Password Length

What else can we do to increase the difficulty of finding the password? As well as increasing the available characters to choose from we can also increase the length. With a slight increase up to just 12 characters the complexity of performing an exhaustive search now jumps to 72^12 (or 19,408,409,961,765,342,806,016). That seems much better! Well, the binary equivalent is around 2^74. This is pretty good, but still a long way away from the 2^128 we are aiming for.
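
To put these numbers in perspective, the sketch below converts each search space into a worst-case search time. The rate of 10 billion hashes per second is our assumption, picked as a round figure at the low end of the "tens of billions" mentioned earlier for cloud hardware; the function names are also ours.

```python
import math

# Assumption: a round figure for a fast, unsalted hash on rented hardware.
ASSUMED_HASHES_PER_SECOND = 10_000_000_000


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of this length over this alphabet."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size: int, length: int,
                       rate: int = ASSUMED_HASHES_PER_SECOND) -> float:
    """Seconds needed to try every combination at the assumed rate."""
    return keyspace(alphabet_size, length) / rate


for size, length in [(26, 8), (72, 8), (72, 12)]:
    total = keyspace(size, length)
    bits = math.log2(total)
    seconds = worst_case_seconds(size, length)
    print(f"{size}^{length}: {total:,} combinations, "
          f"about 2^{bits:.0f}, {seconds:,.0f} s worst case")
```

Under that assumed rate, the 8-character lowercase password falls in about 21 seconds and the 8-character mixed password in about 20 hours, while the 12-character version would take on the order of 60,000 years. A deliberately slow password hash would stretch all three figures considerably.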

Passphrases vs Passwords

Clearly, we need to do something significant to increase the complexity of our passwords. And it's here that passphrases seem to make sense. The idea is that we use 3-5 real words chosen at random to form a sentence that makes no sense but is easier for people to remember.

Let’s consider a 4-word passphrase: correct horse battery staple (readers of the online comic XKCD will be familiar with this).⁶

What kind of difficulty does this give to hackers trying to perform an exhaustive search? To keep things easier for the user to remember, let’s assume we are using only lowercase letters, a-z. The size of our character set is once again only 26. Including the space as a character we have a total of 27 possible characters and a passphrase length of 28. This gives us 27^28 (or 11,972,515,182,562,019,788,602,740,026,717,047,105,681). Now THIS is a serious looking number! The base-2 equivalent is around 2^133. Finally, we have the complexity we need from our password!

Or do we...?

Passphrase Cracking

While password attacks are far more common, a number of tools exist which allow cyber criminals to discover passphrases. The simplest approach is to provide the password cracking tool ‘Hashcat’ with a list of predefined passphrases. Such a list, comprising almost 22 million known phrases, is available on GitHub.⁷

The passphrase “better late than never” has 22 characters which, according to traditional password advice, implies a strong password due to the length. However, this is one of the most common idioms in the English language and is on line 18,636,796 of the passphrase list.⁸

correct horse battery staple appears on line 1,976,239.

Using known phrases is a simple way to attempt passphrase cracking, but is limited, even with 22 million entries. An alternative approach is to use the PRINCE algorithm to create passphrases which are then sent to Hashcat to calculate the hashes.⁹ At its core, this allows attackers to supply a list of single words which the algorithm then combines in a variety of ways. For example, a word list containing ‘correct’, ‘horse’, ‘battery’ and ‘staple’ will result in the following one, two, three and four-word combinations:

correct horse battery staple
correct horse battery
correct horse
correct
correct battery staple horse
correct battery staple
correct battery
…and so on
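
The combinations listed above can be generated with a short Python sketch. Note the assumption: this is not the PRINCE algorithm itself, which builds and orders its candidates differently; it simply enumerates every ordering of one to four distinct words from the list, which is enough to reproduce the examples shown. The function name word_combinations is ours.

```python
from itertools import permutations


def word_combinations(words, max_words):
    """Yield every ordering of 1..max_words distinct words, space separated."""
    for count in range(1, max_words + 1):
        for ordering in permutations(words, count):
            yield " ".join(ordering)


words = ["correct", "horse", "battery", "staple"]
guesses = list(word_combinations(words, 4))

print(len(guesses))  # 64 = 4 + 12 + 24 + 24
print(guesses[:5])
```

Four words produce only 64 candidates, but the count grows quickly as the word list grows, which is the point of the sections that follow.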

Thanks to Zipf’s Law we know that all human languages roughly follow the “80/20 rule”.¹⁰ In fact, the most common 18% of words in the English language form 80% of everything we say. The top 100 words form about half. If we consider a password cracking tool that is programmed with the top 1000 words in the English language it is feasible that instead of trying every possible combination of characters it, instead, tries every possible combination of 3-5 word phrases in its 1000-word dictionary.

Consider our available password alphabet. It is no longer all letters (upper and lowercase), numbers and some special characters. Instead, we have only 1000 words to choose from. And how long will our passphrase be? Character counts are irrelevant since we're not guessing character by character. Our passphrase 'length' is only the number of words we chose. Four, in the example of correct horse battery staple.

So, mathematically, our complexity is now only 1000^4 (or 1,000,000,000,000). This is roughly 2^40.

The English language has around 171,000 words in active use today. Applying Zipf’s Law we can estimate that most people would pick a word from a list of only 30,780.¹¹ Even using this expanded word list, we have a maximum number of combinations of 30,780^4, which is equivalent to less than 2^60. That's worse than a 12-character random password.
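
The comparison comes down to one formula: bits = positions × log2(choices per position). The sketch below applies it to the cases discussed in this article and, as an extra step of our own, works out how many random words would be needed to reach the 2^128 target. The function names are ours.

```python
import math


def entropy_bits(choices_per_position: int, positions: int) -> float:
    """Bits of entropy when each position is picked uniformly at random."""
    return positions * math.log2(choices_per_position)


def positions_needed(choices_per_position: int, target_bits: int = 128) -> int:
    """Smallest number of positions that reaches the target entropy."""
    return math.ceil(target_bits / math.log2(choices_per_position))


scenarios = {
    "28 characters from a 27-character alphabet": entropy_bits(27, 28),
    "4 words from a 1,000-word list": entropy_bits(1000, 4),
    "4 words from a 30,780-word list": entropy_bits(30780, 4),
    "12 characters from a 72-character alphabet": entropy_bits(72, 12),
}

for label, bits in scenarios.items():
    print(f"{label}: {bits:.1f} bits")

print(positions_needed(30780))  # words needed from the 30,780-word list
print(positions_needed(72))     # characters needed from the 72-character set
```

The same four-word phrase is worth about 133 bits when counted character by character but under 60 bits when counted word by word, and an attacker will use whichever view is cheaper.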

The math suggests that the use of passphrases alone is, at best, no better than complex passwords and, at worst, may actually be less secure.

Password and Passphrase Recommendations

This has been a simplified look at the theoretical strength of passwords and passphrases. The key takeaway is that the strength of a given password or passphrase depends not only on its length, but also on its entropy (randomness).

In reality, websites and services should be storing per-user 'salts' and per-site 'pepper' values to make offline brute force attacks virtually impossible. This isn’t always the case, however, so here are some general guidelines to help keep you safe when trying to choose a strong password:

Don’t choose your own password
Make use of password managers to create truly random strings of characters for your password. If creating passphrases, make use of websites which can suggest completely random words for you.
Don’t re-use passwords
The data breach of one site should not risk the security of all your online accounts. Ensure that every site and service has a unique password. Password managers are the only realistic way that these can be remembered.
Create long and random passwords or passphrases
Credentials should be either 16+ character passwords chosen completely at random (ideally by a password manager), or 4-5 word passphrases which make use of either character substitution (l33t-speak) and/or non-word breaks, e.g. corr ectho rse batter ysta ple
Use multi-factor authentication (MFA)
Sometimes the theft or brute-force guessing of a password is inevitable. Having a second factor of authentication, such as a time-based code on a mobile phone app, can prevent attackers from gaining access to your account even if they obtain your password.
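
For readers curious what "chosen completely at random" looks like in practice, here is a minimal Python sketch using the standard library's secrets module, which draws from the operating system's cryptographically secure random source. The function names, the choice of special characters and the tiny word list are all our own illustrative assumptions; a password manager does the same job with less effort, and a real passphrase generator would draw from a list of thousands of words.

```python
import secrets
import string

# 26 lowercase + 26 uppercase + 10 digits + 10 special characters = 72.
# The specific special characters are an arbitrary choice for illustration.
PASSWORD_ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*-_"

# Hypothetical stand-in word list; far too small for real use.
EXAMPLE_WORDS = ["correct", "horse", "battery", "staple",
                 "orbit", "velvet", "cactus", "lantern"]


def random_password(length: int = 16) -> str:
    """Pick each character independently and uniformly at random."""
    return "".join(secrets.choice(PASSWORD_ALPHABET) for _ in range(length))


def random_passphrase(wordlist, word_count: int = 5) -> str:
    """Pick each word independently and uniformly at random."""
    return " ".join(secrets.choice(wordlist) for _ in range(word_count))


password = random_password()
passphrase = random_passphrase(EXAMPLE_WORDS)

print(password)
print(passphrase)
```

The important design choice is that the computer, not the user, makes every selection. This sketch does not apply the character substitutions or non-word breaks suggested above; those would need to be added on top.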