PassGAN: A Deep Learning Approach for Password Guessing

tbiehn · on Sept 19, 2017

I'm a bit confused as to why they cite https://www.usenix.org/conference/usenixsecurity16/technical..., but then don't evaluate their performance relative. The selection of JTR and HashCat rule eval is troubling - 'Best64' isn't the way a skilled attacker uses those tools. It would be cool to see these teams participating in cracking event; http://contest.korelogic.com/ or tasked against un-recovered corpus; https://hashes.org/

Of course, maybe a more accurate view was that the paper isn't actually seeking to advance to the state of the art in password cracking, and has other motivations.

matrix2596 · on Sept 19, 2017

PaGAN would have been a better name

IncRnd · on Sept 19, 2017

Pagan is the name for a password generator.

dark_silicon · on Sept 19, 2017

This makes me wonder: is it possible to iterate through the generated outputs of a GAN rather than generate random outputs?

kdoherty · on Sept 19, 2017

I'm assuming by iterate you mean "generate outputs of a GAN which preserve some order that you can iterate through."

The answer is yes, but not in the way that one might think. Despite the fact that we seed the GAN with input noise, there is no guarantee that the GAN makes use of this at all. This is a theme with GANs: we often want to imbue them with prior knowledge that we think is important, but is easily ignored by the GAN. In this case, we want to generate samples from p(x|z), where x is in the space of our data (often images, in this case passwords), but provided it gets good results according the the loss function, your GAN may learn p(x|z)=p(x). This is fine if you don't care about enforcing some relationship between input and generated samples, but here we do.

One solution is to use InfoGAN (https://arxiv.org/abs/1606.03657), which adds a term to the loss function that the mutual information between a latent code and the generator output must be high. Your latent code might be drawn from a uniform distribution on [-1,1], and the generator output will be conditioned on this code. This being continuous, it's questionable what "iterate" might mean. On a computer, maybe you iterate through every possible float (as someone mentioned), but if you want to generate N different samples, you could also discretize this distribution to N values on the given interval, each with probability 1/N and sample from this PMF.

Cacti · on Sept 19, 2017

The generator is passed random noise from a distribution, so iterating through all the samples is the same as just sampling from that distribution a ton.

If your noise is uniform on [0, 1], then, sure, you could just iterate through every possible float, I guess. Though in practice you're talking an astronomical number of possible combinations.

Another option is to take two samples from the distribution and interpolate in-between them, which might give you an idea of the distribution of samples. Maybe, it depends on the GAN and data.

Eridrus · on Sept 19, 2017

A GAN effectively takes a seed to generate output, normally people provide random inputs, but you could iterate on the seed. Not sure how good your output would be though.

nicklovescode · on Sept 19, 2017

One interesting direction would be to see if you could use various parts of an email as input features.

You could likely probabilistically guess gender or age from many addresses, and I'm sure domain (yahoo, gmail) changes the distribution as well.

dithering · on Sept 19, 2017

I've used email addresses as source dictionaries. Turns out people see both choosing an email address and choosing a password as a kind of personality test, meaning there's a lot of overlap.

nl · on Sept 19, 2017

Chargrams from the email address would be an excellent feature.

yosito · on Sept 19, 2017

I wonder if this type of approach could pick up on patterns in "random" password generators.

Paul-ish · on Sept 19, 2017

I don't think you can do better than brute force search for cryptographic randomness, but it would be interesting to see if this does well on personal heuristics people use. Eg the XKCD approach.

febin · on Sept 19, 2017

Here is a simplified version of the same https://medium.com/@heyfebin/how-artificial-intelligence-can...

dark_silicon · on Sept 19, 2017

Is there any sort of metaheuristic a good password cracker can follow for juggling between rounds of dictionary iteration, character substitution, brute force suffixing, etc?

In other words, how do you most effectively combine different strategies?

dsfyu404ed · on Sept 19, 2017

Target someone. Wget their LinkedIn and their friend's LinkedIns. Wget the wikipedia pages for their hobbies and skills on said linkedIn pages.

Remove duplicate words and code.

Repeat with other sources of information (FB profiles, etc, etc).

Remove duplicate words.

Add it to your dictionary and then use that to see the generated dictionary.

Password cracking is always a compromise between speed and thoroughness.

tempVariable · on Sept 19, 2017

This is very close.

I used to provide crypto coin wallet password recovery service. The first couple of days would be spent on building a knowledge database with the help of the wallet owner. Plugging that data into the algorithm, even if the password was random and long allowed me to usually get the correct password. Exploring the whole search space without pruning is a fools errand

zeveb · on Sept 19, 2017

That's why I love passwords like PNe1KaC5gZGF5hlonE1k7g: they're just completely unguessable.

I don't have a guessable password on any remote account I have. A remote attacker simply cannot guess passwords; he'd have to use some other method, e.g. taking over my phone number or email account.

yosito · on Sept 19, 2017

Depending on how those passwords are generated, they could be less than truly random and thus easier for an algorithm to guess.

IncRnd · on Sept 19, 2017

Thanks a lot! Now I need to change my password at all sites I have a login...

htrp · on Sept 19, 2017

how do you remember something like that across all of your accounts? (or do you just use lastpass)?

lucb1e · on Sept 19, 2017

There are a few you need to remember, for example if you have a password manager then it would be worth memorizing one of those over the course of a few days. All the rest: use a password manager. I'd expect that not even the most brilliant 1% of this planet to have unique, 14-character random password for each account they own.

For special accounts, such as bank accounts, you could have a second password manager database with a unique password, if you are concerned that a password db which gets unlocked (almost) daily is not secure enough. Or remember unique password for those special accounts.

zeveb · on Sept 19, 2017

I store them in a password store[0], encrypted to my GPG key. That store is itself a git repo, which is encrypted to the same key using git-remote-gcrypt and synced with a remote server.

Thus I can run something like:

    $ pass foobar.example
    jDHQxFTkPjLkvbLNRQe5Ad

Or:

    $ pass -c bazquux.example
    Copied bazquux.example to clipboard. Will clear in 45 seconds.

[0] https://www.passwordstore.org/

pwg · on Sept 19, 2017

One does not try to "remember" those. One uses a password manager, it does the remembering, and you only have to remember the one long password that unlocks the manager.

enzanki_ars · on Sept 19, 2017

There is a chicken and egg problem. How does one make a secure password manager password...? There still is the need for users to learn how to generate secure but memorizable passwords.

pwg · on Sept 20, 2017

With one big exception. There is only one (not 30 or 50 or 100) to have to remember (the master one) and it is one that someone will have to enter repeatedly (until they decide to change it) so they will eventually have enough practice to actually remember it.

ape4 · on Sept 19, 2017

Oh no you guessed my password "brakecaliper"

IncRnd · on Sept 19, 2017

Wow. My password is "caliperbrake." What a coincidence. I feel really safe about this, since I never tell people if the period or quotes are included in the actual password.

hehe - I actually change that per site for extra strong security. Site 1 = caliperbrake. Site 2 = "caliperbrake"

It's really been very reliable.

visarga · on Sept 19, 2017

The moral of the story is that we will have to check our passwords against a password GAN before using them, even if they are unique.

zokier · on Sept 19, 2017

Better moral is that humans are bad at generating randomness, use CSPRNG.

Kequc · on Sept 19, 2017

I suppose this is a good place to ask this question.

How much truth is there to the fairly famous XKCD comic "correct horse battery staple" in this scenario. Isn't it possible that random word combinations, if they were common in passwords, could be guessed relatively early in a password hash brute force attack?

Especially considering we are talking about apparently billions of attempts per second.

Second question, is it true that placing restrictions on the password such as that it must include a capital letter, a special character, and be 8 characters long. Also reduces the time it takes to crack because the algorithm can simply dump a huge amount of possible answers?

rbanffy · on Sept 19, 2017

> correct horse battery staple

Assuming a vocabulary of 16K words, it's 15 bits per word for a total of 60 bits for the four word password. Letters plus punctuation will give you about 60 symbols or 6 bits per position. At 60 bits, the four word password is as good as the 10 characters you generated by smashing your elbows on the keyboard and, hopefully, easier to remember.

> Also reduces the time it takes to crack because the algorithm can simply dump a huge amount of possible answers?

Yes. Knowing the password rules limits the space that'd need to be bruteforced.

FabHK · on Sept 19, 2017

zxcvbn [1], the best simple password strength estimator I'm aware of, gives "correct horse battery staple" around 62 bits, and "Tr0ub4dour&3" around 30 bits (cracked in a day). ("ILoveTacoAndBurgersWhatever1984", suggested below, 53 bits).

> Yes. Knowing the password rules limits the space that'd need to be bruteforced.

Yes, but not that much, really:

1. Giving away the length of your password doesn't help the attacker much. For realistic scenarios, testing all passwords with length < N takes less than 2% of the time of testing all passwords with length N.

(The proportion of passwords with length < N to passwords with length N is approximately 1/M, where M is the number of distinct symbols (here about 60). Exactly it's (q-q^N)/(1-q), I think, where q=1/M.) So, even if you use only numbers, telling the attacker the length of the password gives them only a 10% edge.

2. Knowing that a 10 letter password contains at least one number excludes about 1/6 of passwords ((50/60)^10). So, that's less than one bit. Similarly with special characters etc.

TL;DR: Telling an adversary the length of your password doesn't really help them. Telling them password rules (contains a number, etc.) helps them more, but adding just one more character to your password increases the difficulty more than knowing the password rules decreases it.

[1] https://blogs.dropbox.com/tech/2012/04/zxcvbn-realistic-pass...

https://www.bennish.net/password-strength-checker/

lucb1e · on Sept 19, 2017

> the best simple password strength estimator

What one is really trying to estimate with a "strength estimator" is how much entropy needs to be used to crack it when the generation method is known (Kerckhoff's principle, sort of). So what one really needs to look at is the generation method, not the resulting password.

thesz · on Sept 19, 2017

Typical ASR vocabulary size for English is about 64K words.

If you are unsatisfied with these, use Finnish or Turkish. The vocabulary there is unlimited, for all practical purposes.

Even English can do better than 64K words - use suffixes, as Turkish people do. "Correctless horse batteryness staplenesslessness".

Now you are in the 80 bits for 4 words territory.

zokier · on Sept 19, 2017

> The vocabulary there is unlimited, for all practical purposes

Don't know about Turkish, but for practical purposes Finnish dictionary is very much limited. Source: I'm a native (bilingual) Finnish speaker.

thesz · on Sept 20, 2017

I am working with language modeling lately. For practical purposes (computing perplexity, etc, which is close to of password guessing), Finnish vocabulary is pretty much unlimited.

yosito · on Sept 19, 2017

> Assuming a vocabulary of 16K words

I would assume that including words not likely to be in any dictionary is a good way to increase the difficulty of guessing a password; made up words, slang in non-english languages, rare names, etc.

rbanffy · on Sept 19, 2017

Yes, but you need to keep it random. If rare names occur more in passwords than a random choice would create, you are reducing the search space.

yeukhon · on Sept 19, 2017

Limiting character is wrong because the space is totally limited. Plenty of "important" websites do that and I go sigh. I use different password for different types of services for that reason.

If we don't limit number of characters (but seriously I'd say on average password lengh would be 8-10; I go a lot beyond that). In practice forcing to have combinations of various character sets increase search space, but in practice I bet most users are likely to use @ for a, l for 1, or captialize either first initial, or first initial of every word.

Or worse, just append, preappend, or modify one or two characters because some other websites decided to have a more unique password policy, and users hate to reinvent a whole new password. Then imagine one website got hacked, the next is easy..

So essentially if we can gather sufficent data about a target, then the brute force search space is now hammered and reduced.

In any case, I am a big believer of no restriction, because of what I said above. The behavior of choosing password is not too random so we can predict and infer from whatever we know. Password and passphrase are the same shit because "ILoveTacoAndBurgersWhatever1984" is a legitmate password. Good luck guessing that because I am not a huge fan of Taco, and I bet you no machine brute force this in any reasonable time.

The whole password vs passphrase is a campaign to get rid of the sophicated password policy, so people came up with a new name.

In the end, I believe having a 2-auth and allowing users to freely choose whatever passwors they want is better than forcing them to choose whatever we think is best. Education is the key, both on social engineering and on choosing passwords. We as software technologists have the responsibility to make dangerous / "i am not sure if that's a good thing to do" warning more obvious.

Of course I am aware there are other alternative proposals to replace pwd but for now password is not going anywhere soon.

IncRnd · on Sept 19, 2017

Limiting the character length of a password is an indication that the password is getting stored.

yeukhon · on Sept 19, 2017

I am referring to limiting password to a small number. Of course no one should take in a password > 32 characters. bcrypt has limitation itself. A reasonable and practical limit is 32, so we can prevent DoS attack because encrypt/decrypt takes up a chunk of CPU resource. I'd be surprise anyone go beyond that (for a "passphrase").

IncRnd · on Sept 20, 2017

You are giving bad advice.

bcrypt's maximum password length is not 32 bytes. In fact the maximum has been debated due to discrepancies from the paper describing it and the first implementation, which actually accepted longer passwords due to how it handled words internally. In either case the max length is at least 20 characters more than you state.

Your own words above: "because encrypt/decrypt takes up" is 32 bytes. By most measures a five word passphrase fitting into 32 bytes doesn't even contain 80 bits of entropy.

zokier · on Sept 20, 2017

32 characters is very low limit. I checked and the average word length is about 7 chars. So 5 word passphrase would be 39 chars. And that is not very extreme example

sytelus · on Sept 19, 2017

I'm not an expert in this area but my general perception is that all those restrictions make system overall much less secure. This is simply because average users can't remember lengthy passwords and so they tend to use same password on many website or write it down somewhere insecurely. A much better approach may be just to have user use two or three words as password.

JohnStrange · on Sept 19, 2017

First question: If the size of the (known) dictionary is n words and the number of words in your passphrase is k, and if the choice of each word is made by a random number generator, then the entropy of the passphrase is k x log2(n). So for example a passphrase with 8 words from a dictionary containing 200,000 words has entropy 8 x log2(200000)=140.9 bits, which is very secure.

If you don't choose the words randomly, the security may decrease drastically, but I doubt it will be less for a passphrase of 8 words than a non-randomly chosen short 'password'. Generally, XKCD's advice is sound, as long as people are aware that dedicated attackers can and probably will guess common phrases and book codes (passages from a book).

Second question: Yes, any restrictions on password length or allowed characters drastically reduces password security. For example, 8 characters of random Latin1 is too short, it only has an entropy of 56.87 bit. Things get way worse once passords are not randomly generated. A user-chosen password limited to 8 characters of Latin1 is ridiculous and anyone can crack it.

Generally speaking, humanly generated passphrases are no longer secure.

teolandon · on Sept 19, 2017

The xkcd comic doesn't claim that "correct horse battery staple" is a completely secure password, it just shows that it's so much better than weird passwords that people tend to generate by themselves, like "k@tApU1t2143", and also how much easier it is to remember. So the suggestion is, if you're gonna use a password that you're going to remember, it's better to use a long one than a "complex" one. But yes, if the attacker knows that you're using English words for your passwords, the English language has about 170.000 words according to Oxford English Dict, and if you consider 1/10 of them to be known enough to be used in a password by you, if you use 4 words you have about 80.000.000.000.000.000 combinations, or about a day of brute-forcing with JTR. (Someone correct me if I'm wrong).

Five words makes it a lot more secure, or using more obscure words. Coming up with a "random" method of choosing some 5-6 words out of a dictionary, using dice or coin flips, can go a long way to make your "phrase" password more secure against these kinds of attacks.

Of course, the best approach is to use such a password, and use a password manager.

As for your second question, I believe it doesn't reduce the time it takes to guess some user's password. The restriction to include capital letters or special characters forces users to introduce more entropy in their password, and thus prevents a lot of common password usage. An attacker knowing that the site's password must be at least 8 characters long is not that helpful. Checking all the 1-7 character long passwords is trivial compared to checking all the 8-10 character long passwords. So it is helpful, since it prevents a lot of users from using simple, guessable passwords. In short, the restrictions introduce more complexity than they subtract for most users.

Edit: Having read the other replies, yes, it does technically reduce the search space, and it might be a bad thing (due to users adding simple common symbols, which are again easily guessed). So it probably isn't good practice.

rrobukef · on Sept 19, 2017

4 words take 835 x 10^18. You're low by x10000. The biggest number of the JTR benchmark page [1] is 80981K for single core (Microsoft LanMan) or 6200 kc/s for 1 CPU (DES).

All combinations take 1x10^9 cpu-days or 14 x 10^9 cpu-days. Amazon pricing ranges from a lot to a whole lot more (at least 39 million $).

[1] http://openwall.info/wiki/john/benchmarks

zokier · on Sept 19, 2017

Its bit annoying how much misinformation there is on the web about diceware-style passwords (that correct horse batter staple is an example of). There are so many people (luckily not in this thread!) saying that because it is using dictionary words it is vulnerable to dictionary attacks. That is utterly false. The numbers simply do not lie:

    Set             Number of symbols    Bits per symbol    80 bit string length
    20k word dict               20000              14.29                       6
    Printable ASCII                95               6.57                      13
    [A-Za-z0-9]                    62               5.95                      14
    [a-zA-Z]                       52               5.70                      15
    [a-z0-9]                       36               5.17                      16
    [a-z]                          26               4.70                      18
    [0-9]                          10               3.32                      25
    ????                            n            log2(n)        ceil(80/log2(n))

Bigger question is what is reasonable security level for common use. 80 bit security (like in that table) is probably good enough for most people, 40ish bits like in the XKCD is on the low side. Of course those bits should be always be generated in a secure manner, e.g. CSPRNG, no matter how they are then transformed into a password/phrse/whatever.

edit: did some further quick math: From random internet source I got that single Nvidia 1080 GPU can do about 25 GH/s for MD5 (and much less for something sane). With that 80 bit password takes about 60000 GPU years to crack, 64 bits 1 GPU year, 40 bits less than a single GPU minute.

triangleman · on Sept 19, 2017

So in your example here, we are talking about a password composed of 6 random English words, compared to, for instance, a 13-character password composed of random upper/lowercase letters and numbers?

zokier · on Sept 19, 2017

Yes, 6 randomly picked words from 20k word dictionary.

Cacti · on Sept 19, 2017

Even 60,000 GPU years sounds well within the capabilities of a large nation-state. Expensive, probably not worth it, probably better ways, maybe not practical, but certainly possible.

zokier · on Sept 19, 2017

Practically speaking you'd hope that something that anyone would be willing to spend 60k GPU years on cracking would not rely on plain MD5.

If you still are feeling bit iffy, then you can easily throw another word in. Or use bigger dictionary; 6 words from 40k word dictionary gets you about 90 bits of entropy, which takes in the order of 10^9 GPU years to crack. That should certainly be enough, even if you account few orders of magnitude for ASICs and other general near-future improvements in crack rate.

Kequc · on Sept 19, 2017

Practically speaking all you should be worried about is your password being discovered within a stolen password hash table.

Nobody processing a whole table is going to spend that much time on your password specifically. If the government wants your data, or someone is willing to spend that many GPU cycles to get at your data, they probably have better ways of getting it.

user5994461 · on Sept 19, 2017

There is a similar technique using markov chain. The author don't seem to be aware.

man-and-laptop · on Sept 19, 2017

There are 6 mentions of the word Markov in the paper, not counting citations.

dithering · on Sept 19, 2017

Has anyone proven that the recent spate of "apply neural networks to a text corpus" projects produce output any "better" than a markov chain with the same inputs would?

bitL · on Sept 19, 2017

In speech recognition they fare additional 10-15% better than Markov models and are state-of-art.

lucb1e · on Sept 19, 2017

Isn't that fundamentally different, though?

With password guessing, one wants to generate as many possible options as possible in order of most likely to least likely. If the generation method takes more than a microsecond it might already be faster to just go the brute force way.

With speech recognition one is also time-constrained, but much less so. If it takes 0.2 seconds to come up with the best match, the user is barely done pronouncing the next word.

Neural stuff is slower than Markov chains if I'm not mistaken, but while for speech recognition that might work great, it could be fatal for password guessing.

Cacti · on Sept 19, 2017

You could say that about practically anything.

reviseddamage · on Sept 19, 2017

what's the difference between "There is a similar technique using markov chain. The author don't seem to be aware." AND "There is a similar technique using markov chain! :) "

user5994461 · on Sept 19, 2017

When you announce an improvement of 18-24% over the state of the art, what you think is the state of the art makes a lot of difference.

thisisdave · on Sept 19, 2017

The authors state that HashCat uses Markov models and that they outperform it.

reviseddamage · on Sept 19, 2017

He doesn't get it. Someone else replied to his comment stating the same thing you did, i think he is selectively blind to the word "Markov", he missed them in the article and in the replies.

TheArcane · on Sept 19, 2017

Could you link that?

Oras · on Sept 19, 2017

It would be interesting to create sets from auto-generated passwords that these tools provide (like LastPass, 1Password, KeyPass) and browser generated passwords.

FabHK · on Sept 19, 2017

I don't think so: as they're generated randomly, there's no structure you can "deep learn". (You might recover password rules that are imposed by the password manager, such as at least one number, one capital letter, etc. But that doesn't give you that much).

The point is that humans are quite bad at generating "random" strings, and you can extract regularities there.

Oras · on Sept 19, 2017

Your argument is based on the context that every single user is writing his/her own password rather than using generators. How do you know that all users are writing their own passwords? I was suggesting creating a dataset using these tools not reverse engineering their algorithm.

roywiggins · on Sept 19, 2017

Outputs from those are just noise, and completely unguessable at a rate better than chance. You can't use a GAN to find structure where there wasn't any to begin with.