I’m sure you’ve seen indicators that measure the strength of a password when signing up for a new account online. According to an essay by MSR (Microsoft Research), this UI element has proven effective for increasing security by encouraging passwords that cannot be easily guessed.
Major services such as Google, Facebook and Twitter use password indicators for their sign-up process. Our Nulab Account also uses a password strength meter to ensure users’ confidential information remains secure by reducing the risk of a password being guessed.
In this article, I’ll share the inner workings of a password strength meter based on my experience implementing one.
What is Password Strength?
Password strength is a numerical measure of how hard a password is to crack, based on its length and complexity. Our Nulab Accounts use a scale of 1 to 5.
Is an 8-character Password with Letters, Numbers and Symbols Safe?
Since password strength is measured by length and complexity, would it be safe to simply follow the generally recommended guidelines: use more than 8 characters and mix numbers, symbols, and upper and lowercase letters?
Altogether, there are 96 possible characters when choosing from A to Z in both upper and lowercase, 0-9, and all available keyboard symbols. Each of the 8 positions in a password can hold any one of these 96, so the number of possible passwords is 96 to the eighth power. That means over 7,200 trillion password options! Not even a computer could easily crack a password with that kind of complexity!
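To double-check that arithmetic, here is a small Java sketch. The class and method names (KeyspaceSize, combinations) are only mine for illustration, and the 96-character alphabet is simply the figure used above.

```java
import java.math.BigInteger;

final class KeyspaceSize {
    // Number of distinct passwords of exactly `length` characters
    // when each position can hold any of `alphabetSize` symbols.
    static BigInteger combinations(int alphabetSize, int length) {
        return BigInteger.valueOf(alphabetSize).pow(length);
    }
}
```

Calling KeyspaceSize.combinations(96, 8) returns 7,213,895,789,838,336, which is where the "over 7,200 trillion" figure comes from.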
In short, it seems the recommended guidelines are enough when it comes to password strength.
We as humans, however, usually choose simple combinations that we can easily remember or that have some meaning to us. If someone tries to crack such an 8-character password with a dictionary, trying combinations of likely words first, then the massive 7,200 trillion potential options are narrowed down and the chance of cracking it increases.
The following quote will help put things into perspective:
Some hackers are obsessed with ”tendency analysis from past leaked passwords” to develop an ”efficient dictionary that hits the target well.” There are even cases where hackers compete with each other on whose dictionary is more brilliant. [Wikipedia]
Isn’t that concerning? Hackers have specific dictionaries of the most used passwords. This should make us look at how password strength must also consider the meaning associated with the characters being used.
Is Strength Measurement a “Black Box?”
When I was implementing our strength meter, I researched the criteria other services used for measurement. I discovered that each service was different and each was a ‘black box.’
A password rated “strong” by one service could easily be rated “weak” in another, which can be confusing for users.
An Alternative: zxcvbn
We have seen that measuring a password’s strength requires more than general rules about types and length of characters; it requires algorithms to verify the strength from various angles. If we create these detailed verification points from scratch, however, it would definitely create another black box particular to our web service and would still be confusing to users. Considering this, I searched for open documents and libraries to solve this issue. What I found was amazing.
For our Nulab Account, we wanted to measure strength through an API and process it on the server side. The Nulab Account is also Java-based. Many of my team members wanted to use Dropbox’s zxcvbn in Java applications, so in the end we took the plunge and ported it to Java, publishing it as a Java library.
Thus, we got: zxcvbn4j.
The Real Password Strength Algorithm
In the process of porting zxcvbn to Java, I learned the algorithms. They are challenging to learn, but I found them so interesting that I will explain how they work below:
1. Dictionary – Popular Word Matching
zxcvbn has a dictionary of over a hundred thousand words that are commonly used for passwords. It evaluates a password by matching it against all combinations of these words.
A generic password is a string of letters often used in passwords, like “password”, “admin”, “root” etc.
In addition, according to “Worst Passwords of 2014” published by American company SplashData in 2014, the password most commonly used in the world is “123456.”
Name and Surname
This refers to using common first names like “Mary” and “Peter,” as well as common surnames such as “Smith,” “Yamada,” etc.
Surnames include Japanese names, as above.
Commonly used English Words
These are English words that are often used in American TV, movies and Wikipedia such as “story,” “social,” “together,” etc.
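To make the idea concrete, here is a minimal sketch of dictionary matching in Java. It is not the code from zxcvbn or zxcvbn4j: the class name, the method name and the six-word list are hypothetical stand-ins, and a real word list is far larger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class DictionarySketch {
    // Tiny stand-in word list (assumption); a real one holds many thousands of entries.
    static final Set<String> WORDS =
            Set.of("password", "admin", "root", "mary", "smith", "story");

    // Returns every substring of the password that appears in WORDS, ignoring case.
    static List<String> findMatches(String password) {
        String lower = password.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (int start = 0; start < lower.length(); start++) {
            for (int end = start + 1; end <= lower.length(); end++) {
                String candidate = lower.substring(start, end);
                if (WORDS.contains(candidate)) {
                    matches.add(candidate);
                }
            }
        }
        return matches;
    }
}
```

With this sketch, DictionarySketch.findMatches("MaryAdmin1") finds both "mary" and "admin", even though the password mixes cases and ends with a digit.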
Reversed words are checked as well. For example, “drwossap” can be reversed to “password.” The password in question should be reversed and then checked for dictionary matches.
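A reversed-word check can be sketched in a few lines of Java. The names here (ReverseSketch, isReversedWord) are hypothetical, and the word list is passed in by the caller, so this is only an illustration of the idea rather than the library's implementation.

```java
import java.util.Locale;
import java.util.Set;

final class ReverseSketch {
    // True when the password, read backwards and lowercased, is a known word.
    static boolean isReversedWord(String password, Set<String> words) {
        String reversed = new StringBuilder(password).reverse().toString();
        return words.contains(reversed.toLowerCase(Locale.ROOT));
    }
}
```

For instance, ReverseSketch.isReversedWord("drwossap", Set.of("password")) is true.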
L33T is the substitution of numbers or symbols for letters.
Wikipedia explains L33T:
Leet (1337, l33t) is mainly used in English speaking countries on computer communication networks, online BBS etc, written in Latin letters. Also called leet-speak. For example, ”Warez” can be written as ”W@rez” or ”W4r3z” by partially replacing alphabet letters with numbers or symbols that have similar form. In other cases, ”for” and ”to” can be replaced with ”4” and ”2” and similarly ”you” with ”u” and plural ”s” with ”z”. Words with an ending of ”cks” and ”ks” can be replaced with ”x”. And there are more varieties sometimes with deliberate spelling mistakes, mixture of uppercase and lowercase letters and so on. [Wikipedia]
For example, words below can be replaced as:
The head image of this blog has “P@$$w0rd,” which meets the recommended requirements of “use more than 8 characters and contain numbers, symbols, uppercase letters and lowercase letters.” But once the L33T substitutions are undone, it matches the very dangerous string “password.”
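Here is a rough Java sketch of undoing L33T substitutions before a dictionary lookup. The substitution table is my own small assumption, not the table used by zxcvbn, and it maps each symbol to a single letter, whereas a character like "1" could stand for either "l" or "i" and a thorough matcher would need to try both.

```java
import java.util.Locale;
import java.util.Map;

final class LeetSketch {
    // Illustrative table only (assumption): one plain letter per L33T character.
    static final Map<Character, Character> SUBSTITUTIONS = Map.of(
            '@', 'a', '4', 'a', '$', 's', '5', 's',
            '0', 'o', '3', 'e', '1', 'l', '7', 't');

    // Lowercases the password and replaces each known L33T character with its letter.
    static String unleet(String password) {
        StringBuilder plain = new StringBuilder(password.length());
        for (char c : password.toLowerCase(Locale.ROOT).toCharArray()) {
            plain.append(SUBSTITUTIONS.getOrDefault(c, c));
        }
        return plain.toString();
    }
}
```

LeetSketch.unleet("P@$$w0rd") gives "password", which can then be checked against the dictionary like any other word.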
2. Spatial – Close-key Matching
These are letter strings constructed using keys that are close to each other on the keyboard/keypad such as “qwertyuiop,” “asdfghjkl” or “zxcvbn.”
zxcvbn recognizes the following layouts:
- qwerty order
- dvorak order
- common keypad
- mac keypad
When porting zxcvbn to Java as zxcvbn4j, we also made it recognize “jis order,” which is frequently used in Japan.
The library name, zxcvbn, is itself an example of a close-key string, which makes for a clever name.
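As a simplified illustration of close-key matching, the Java sketch below treats a QWERTY keyboard as a plain grid and measures the longest run of neighboring keys. This is my own approximation: it ignores the physical stagger between rows, covers only letters and digits, and all names (SpatialSketch, areNeighbors, longestAdjacentRun) are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SpatialSketch {
    static final String[] QWERTY_ROWS = {"1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"};
    // Maps each key to its {row, column} position on the simplified grid.
    static final Map<Character, int[]> POSITIONS = new HashMap<>();
    static {
        for (int row = 0; row < QWERTY_ROWS.length; row++) {
            for (int col = 0; col < QWERTY_ROWS[row].length(); col++) {
                POSITIONS.put(QWERTY_ROWS[row].charAt(col), new int[] {row, col});
            }
        }
    }

    // Two different keys count as neighbors when they are at most one row and one column apart.
    static boolean areNeighbors(char a, char b) {
        int[] p = POSITIONS.get(a);
        int[] q = POSITIONS.get(b);
        if (p == null || q == null || a == b) {
            return false;
        }
        return Math.abs(p[0] - q[0]) <= 1 && Math.abs(p[1] - q[1]) <= 1;
    }

    // Length of the longest run of consecutive characters that are keyboard neighbors.
    static int longestAdjacentRun(String password) {
        String lower = password.toLowerCase(Locale.ROOT);
        if (lower.isEmpty()) {
            return 0;
        }
        int best = 1;
        int current = 1;
        for (int i = 1; i < lower.length(); i++) {
            current = areNeighbors(lower.charAt(i - 1), lower.charAt(i)) ? current + 1 : 1;
            best = Math.max(best, current);
        }
        return best;
    }
}
```

In this sketch, "zxcvbn" scores a run of 6 and "qwertyuiop" a run of 10, while "password" never gets past 2.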
3. Repeat – Repeat Matching
We must also check for repetition of the same character, such as “aaaaaaaa” and “11111111,” as well as repetition of strings like “abcabcabc.”
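One way to sketch repeat detection in Java is a regular expression with a back-reference. The class and method names are hypothetical, and this shows the general technique rather than the exact code in zxcvbn4j.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RepeatSketch {
    // The lazy group finds the shortest unit that is immediately repeated at least once.
    private static final Pattern REPEAT = Pattern.compile("(.+?)\\1+");

    // Returns the repeated unit of the first repetition found, or null if there is none.
    static String firstRepeatedUnit(String password) {
        Matcher matcher = REPEAT.matcher(password);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

For "aaaaaaaa" the repeated unit is "a", and for "abcabcabc" it is "abc".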
4. Sequence – Sequence Matching
This is about strings that are in alphabetical or numeric order such as “abcdefghij,” “fghijklmno,” “0123456789” or “6789012345.”
According to “Worst Passwords of 2014”, many people make passwords in numerical order.
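The Java sketch below shows one possible way to find such sequences. It is an assumption of mine rather than the library's code: it only looks at digits and lowercase letters, and it lets a sequence wrap around (so "9" is followed by "0") in order to cover an example like "6789012345". All names are hypothetical.

```java
import java.util.Locale;

final class SequenceSketch {
    private static boolean isBetween(char c, char low, char high) {
        return c >= low && c <= high;
    }

    // Returns +1 or -1 if b directly follows or precedes a within the digits or the
    // lowercase letters (wrapping around, so '9' to '0' counts as +1); 0 otherwise.
    static int step(char a, char b) {
        int size;
        if (isBetween(a, '0', '9') && isBetween(b, '0', '9')) {
            size = 10;
        } else if (isBetween(a, 'a', 'z') && isBetween(b, 'a', 'z')) {
            size = 26;
        } else {
            return 0;
        }
        int delta = Math.floorMod(b - a, size);
        if (delta == 1) {
            return 1;
        }
        return delta == size - 1 ? -1 : 0;
    }

    // Length of the longest run that keeps stepping in the same direction.
    static int longestSequence(String password) {
        String lower = password.toLowerCase(Locale.ROOT);
        if (lower.isEmpty()) {
            return 0;
        }
        int best = 1;
        int current = 1;
        int previousStep = 0;
        for (int i = 1; i < lower.length(); i++) {
            int s = step(lower.charAt(i - 1), lower.charAt(i));
            if (s != 0 && s == previousStep) {
                current++;
            } else {
                current = (s != 0) ? 2 : 1;
            }
            previousStep = s;
            best = Math.max(best, current);
        }
        return best;
    }
}
```

Both "abcdefghij" and "6789012345" come out as one sequence of length 10 here, while "password" contains no sequence at all.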
5. Date – Year, Month, Day Matching
These are character strings recognized as dates, such as “20151101” and “20151224.”
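For the 8-digit form shown above, a date check can be sketched with the JDK's own date parser. This is a minimal illustration with hypothetical names (DateSketch, parseCompactDate); it only handles the yyyyMMdd layout, and a full matcher would need to consider other layouts as well.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

final class DateSketch {
    // Returns the date if the password is exactly 8 digits forming a valid yyyyMMdd date.
    static Optional<LocalDate> parseCompactDate(String password) {
        if (!password.matches("\\d{8}")) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(password, DateTimeFormatter.BASIC_ISO_DATE));
        } catch (DateTimeParseException e) {
            // Eight digits, but not a real calendar date (for example month 13).
            return Optional.empty();
        }
    }
}
```

DateSketch.parseCompactDate("20151224") yields December 24, 2015, while "20151340" is rejected because there is no 13th month.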
Although I found other great password measurement tools besides zxcvbn, most of them measure password strength only by the number of characters and the use of a combination of numbers, symbols, and uppercase and lowercase letters. In that sense, zxcvbn is outstanding. Personally, I think matching against L33T substitutions and close-key strings is what earns zxcvbn high marks. You can also add entries to the dictionary. It’s designed to receive external lists of letter strings, and is therefore easy to customize.
Recently, authentication methods other than passwords are becoming common. For example, the Nulab Account presently offers 2-step authentication. As technology continues to develop, passwords may be used less frequently, but for now they are still at the core of security. We will continue to use passwords as long as they provide our users with a high level of security.
Check out our zxcvbn4j, published on GitHub: nulab/zxcvbn4j. Please consider using it if you would like to implement an API for password strength measurement in Java, or if you want to process strength measurement within Android apps.
If you are interested in learning more about Dropbox’s zxcvbn, please read their official documentation: zxcvbn: realistic password strength estimation.
Happy password hunting!