GCHQ – Not So Secure? (danfarrall.com)
113 points by sdoering on March 26, 2013 | 74 comments



This is just GCHQ's way of saying that they already know how to reverse bcrypt.

On a more serious note http://www.gchq-careers.co.uk does not appear to be run by GCHQ. The Terms page says that it's run by TMP Worldwide (http://www.gchq-careers.co.uk/terms-and-conditions/).


"This is just GCHQ's way of saying that they already know how to reverse bcrypt."

That comment made my morning.

Surely this is a case of

http://xkcd.com/932/

Embarrassing but surely those nice people in Cheltenham are about more serious work.


This is a microsite run by TMP. Since most Applicant Tracking Systems (the software used to list/fill jobs) do not list jobs in an SEO friendly way, companies like Jobs2Web, Optijob, and TMP built microsite products like the one being used here by the GCHQ.

TMP is an ad agency, not a technology company - can't say I'm too surprised they missed the ball on this one. There are lots of US gov't agencies who use TMP also.


Because you mentioned bcrypt, it might be a good time for everybody to read this:

http://yorickpeterse.com/articles/use-bcrypt-fool/

(sidenote: GCHQ cannot reverse bcrypt. Jgrahamc was making a joke.)


That's not a good article to make people read.

I find it strange that it doesn't mention lack of salting among the most common mistakes. Also, I didn't think SHA1 was broken in any way that makes breaking password hashes easier than with e.g. the SHA2 family? I might be wrong, though.

(PS: I'm not advocating using anything other than a good PBKDF for hashing passwords.)

Edit: Re-reading the article it seems like lots of BS in there:

Example 1, regarding hashing something several times: "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.

Example 2, regarding the same thing: "The first reason is pretty easy to bust: simply add more hardware (or better hardware) and you're good to go." This applies for bcrypt as well.

And for his "attack" on "Hashing a password N times in the form of hash( hash(password) ) * N" you would need a working preimage attack for the hashing function used.

EditN: Rewrite


I think you're looking for something more complex than that post brings to the table. As a community, we're trying to circle our wagons around a simple piece of advice about code that stores passwords: do not write code that stores passwords. Even if your algorithm is secure, your code is likely not. Include your language's best-supported secure password library (meaning one of bcrypt, scrypt and PBKDF2) and ship it.

So that post may be incomplete regarding the technical details, but the critical information is there: Just use bcrypt. (...and use the recommended work factor.) I know hackers hate that sort of thing, but this is really one of those things we just have to drill.
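
For anyone wondering what "include a library and ship it" looks like in practice, here is a minimal sketch using PBKDF2 from Python's standard library (bcrypt and scrypt bindings are usually third-party packages, so PBKDF2 stands in here). The function names, the 16-byte salt and the iteration count are my own illustrative assumptions, not something taken from the linked post:

```python
import hashlib
import hmac
import os

# Assumed iteration count, for illustration only; tune it to your hardware.
ITERATIONS = 200_000


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple:
    """Return (salt, iterations, derived_key), i.e. everything to store."""
    salt = os.urandom(16)  # fresh random salt for every password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key


def verify_password(password: str, salt: bytes, iterations: int, key: bytes) -> bool:
    """Recompute the key from the candidate password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)
```

The point is less the specific primitive than the shape: the application only ever calls two small functions and never implements the hashing loop itself.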


His advice is good, but that's still no excuse for making invalid arguments for what he is advocating.

Edit: In fact, if I hadn't heard of bcrypt before and saw that article, I would probably not trust his advice either.


Bcrypt has a tuneable "cost function", so you get to decide how hard to make it. It's effectively designed to be slow and hard to do in parallel.

The SHA family, on the other hand, is designed to be fast (for checksums etc.), so it's possible that later SHA algorithms are actually worse than earlier ones for password hashing.

Modern computers can do a lot of MD5/SHA1 every second so even with a salt, one round of SHA1 is likely to be not very good at all.

You can probably find a sufficiently large X and do SHA1 X times to make it slow enough today, but for future-proofing you are better off just using an algorithm that is actually designed for such purposes.


I'm not saying that bcrypt isn't a better choice, it is, but some of the "flaws" he is pointing out in that article are just ridiculous. If he's going to argue for something, he should be using correct arguments.


    I find it strange that it doesn't mention lack of salting among the most common mistakes.
True. I believe bcrypt requires a salt, so you can't forget it. Bringing that up would have strengthened the case.

    In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.
The article is using that as an incorrect argument for repeated hashing, and goes on to detail why it isn't necessarily more time-consuming because it increases the probability of finding a collision.

    This applies for bcrypt as well.
You can tune the work factor, so in a few years when computers are nearing fast enough to brute force your hashes, you only need to increase the work factor, not re-write all your code. You can't do that with something like SHA. The article could probably be clearer on that.
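
A rough sketch of how that kind of upgrade tends to work, assuming the cost parameter is stored next to each hash. PBKDF2 from the standard library stands in for bcrypt here, and the record layout, `login` helper and `CURRENT_ITERATIONS` value are hypothetical:

```python
import hashlib
import hmac
import os

CURRENT_ITERATIONS = 200_000  # assumed target; raise it as hardware gets faster


def derive(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a key with the cost that was stored alongside the hash."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def login(password: str, record: dict) -> bool:
    """Verify against the stored record and upgrade it in place if it is stale."""
    expected = derive(password, record["salt"], record["iterations"])
    if not hmac.compare_digest(expected, record["key"]):
        return False
    if record["iterations"] < CURRENT_ITERATIONS:
        # The plaintext is only available at login, so re-derive now.
        record["salt"] = os.urandom(16)
        record["iterations"] = CURRENT_ITERATIONS
        record["key"] = derive(password, record["salt"], CURRENT_ITERATIONS)
    return True
```

Old records keep verifying with their old cost until the user next logs in, at which point they are silently re-hashed with the new one.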

    And for his "attack" on "Hashing a password N times in the form of hash( hash(password) ) * N" you would need a working preimage attack for the hashing function used.
I don't see why. If there is a probability of a collision existing for a hash, repeated hashing will increase that probability, turning something that has a low number of collisions into a high number of collisions. The more collisions, the easier it will be to find one.


You can increase the number of rounds of hashing as well, without rewriting your code.

I can't argue with your last point, simply because I don't understand it. How exactly does this "turn something that has a low number of collisions into a high number of collisions?"

In my mind, what we're doing is hashing "mypasswordmysalt" n times, and storing the resulting hash, the salt and n in a user table. If the user table is leaked, and n is 3 (for simplicity's sake, it would normally be _much_ higher), can you explain how this could be worse than doing one round of hashing?
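
To make the scheme I have in mind concrete, here is a small sketch. SHA-256 and the dictionary standing in for a user-table row are assumptions chosen for illustration:

```python
import hashlib
import os


def iterated_sha256(password: bytes, salt: bytes, rounds: int) -> bytes:
    """Hash password+salt once, then re-hash the digest rounds-1 more times."""
    digest = hashlib.sha256(password + salt).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def make_record(password: bytes, rounds: int = 3) -> dict:
    """Build the hypothetical user-table row: salt, n and the final hash."""
    salt = os.urandom(16)
    return {"salt": salt, "rounds": rounds, "hash": iterated_sha256(password, salt, rounds)}
```

Anyone holding the leaked row has to run the same n rounds for every guess, which is the only effect I am claiming for it.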


Every hash function has a probability of collisions. For illustration, let's imagine a really bad one where every output has another input that results in the same value.

With a single round of hashing, there are two possible inputs A1 and A2 that can produce the final output O. With a sufficiently large number of potential inputs, it will take you a while to brute force and enumerate all possible inputs before you hit on either A1 or A2.

With two rounds of hashing, there are two possible inputs A1 and A2 that can produce the final output O, and two possible inputs B1 and B2 that can produce the intermediate hash A1, and two possible inputs B3 and B4 that can produce the intermediate hash A2.

With three rounds of hashing you end up with something like this:

    C1  C2  C3  C4 C5  C6  C7  C8 
     \  /    \  /   \  /   \  /
      B1      B2     B3      B4
       \      /       \      /
        \    /         \    /
          A1             A2
           \             /
            ------O------
So with each round of hashing you are increasing the number of collisions, meaning you're likely to brute force an input that will hash to O much quicker.

[Edit] Of course, with each round you're also increasing the amount of time to compute O, but given most hashing algorithms are designed to be fast I'd say it's probably not enough to counter it. Not sure though, I've not actually looked at the maths.
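
A toy experiment along the lines of the tree above, assuming a deliberately weak hash (SHA-256 truncated to 16 bits, a made-up stand-in) so that collisions are common enough to count. It tracks how many distinct outputs survive each extra round over a fixed input domain:

```python
import hashlib


def toy_hash(value: int) -> int:
    """Deliberately weak 16-bit hash: the first two bytes of SHA-256."""
    digest = hashlib.sha256(value.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")


def distinct_outputs_per_round(rounds: int) -> list:
    """Count the distinct values left after 1..rounds applications of toy_hash."""
    values = set(range(2 ** 16))  # every possible 16-bit input
    counts = []
    for _ in range(rounds):
        values = {toy_hash(v) for v in values}
        counts.append(len(values))
    return counts


counts = distinct_outputs_per_round(5)
print(counts)
```

The counts can never grow from one round to the next, because each round maps the surviving set through the same function. Fewer distinct outputs means more inputs per output, which is the effect in the diagram; how much that matters for a full-width hash is a separate question.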


[Replying to tryeng]

    No, there is an infinite number of inputs that produce the final output O.
Only with an infinite number of inputs. If we restrict our domain to things that are likely to be passwords, say strings under 1000 characters in length, then we're increasing the number of inputs in our domain that can produce O.

    you have to find something that generates O after exactly n rounds of hashing
I was using an increasing number of iterations to demonstrate how each iteration potentially increases the number of collisions. Taking n=3, you don't need to know any of the intermediate states A1, B1, B2, B3 or B4 to take advantage of the fact that in our domain of strings under 1000 characters we only have to find one of 8 possible inputs rather than one of 2.

    I said will be true for all relevant hashing functions.
All relevant hashing functions have a probability of collisions in any useful input domain. Okay the tree won't be as dense as the one illustrated, but you're still increasing that probability by repeated hashing.

You need to find some way to mitigate the collisions, at which point you've basically got bcrypt.


[Replying to ajanuary]

Thank you, now I actually do see your point. I would still not think of it as a considerable weakness. To find such a collision would take more time than bruteforcing any likely password.

There doesn't yet exist a single example of any SHA1 or SHA2 collision, and if we use SHA256 as an example, we could probably not find one in the next few years by bruteforcing even if we used all the world's current computing power and storage.

Edit: Actually, that whole argument falls to pieces, because if we can search through enough possibilities to find any collision, the output size of the hashing function is too small for the hashing function to be secure.
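
To put rough numbers on that: a generic (birthday) collision search on an n-bit hash needs on the order of 2^(n/2) evaluations. The sketch below assumes a purely hypothetical rate of 10^18 hashes per second, picked only to make the arithmetic tangible:

```python
def birthday_work(bits: int) -> int:
    """Order-of-magnitude hash evaluations for a generic collision: 2^(bits/2)."""
    return 2 ** (bits // 2)


ASSUMED_RATE = 10 ** 18  # hypothetical hashes per second, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600

for name, bits in [("SHA-1", 160), ("SHA-256", 256)]:
    years = birthday_work(bits) / ASSUMED_RATE / SECONDS_PER_YEAR
    print(f"{name}: ~2^{bits // 2} hashes, ~{years:.1e} years at the assumed rate")
```

This only covers brute-force collision search; it says nothing about cryptanalytic shortcuts or about guessing weak passwords, which is a much smaller search.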


Indeed. This is where I agree with you that the article is a bit weak. It overstates the problem of repeated hashing and doesn't explain how bcrypt solves that problem at all. It makes it sound like a completely different and magical solution rather than repeated hashing with collision mitigation.

It's more a case of "hey, here's a potential problem you might not have thought of, here's an algorithm that addresses it."


Then I guess we might agree.

The only advantage I know of with bcrypt over multiple SHA2 is that GPUs are very bad at it compared to most hashing functions, so the CPU cost (on my server) and the GPU cost (the crackers' cost) are not too different. (Anyone, please correct me if I'm wrong.)

Off-topic: This exponential reply delay is really annoying.


you guys can click on the [link] right above the post and you'll be able to reply without a delay.


"With a single round of hashing, there are two possible inputs A1 and A2 that can produce the final output O."

No, there is an infinite number of inputs that produce the final output O. And you have to find something that produces O after exactly n rounds of hashing, it doesn't help to find something that produces O after one or two rounds.

Edit: Sorry, didn't see your assumption when I first posted, but I guess what I said will be true for all relevant hashing functions.


Hi, said author here. The particular article was written around 2 years ago (April 2011 to be exact) when crypto was still a fairly new concept to me. As a result there indeed are some flaws with the article. Having said that, some extra explanation can't hurt.

> Example 1, regarding hashing something several times: "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." Nah, guessing is only more time-consuming.

I'm not entirely sure what you're trying to say with this example. The particular list item was meant as one of the examples why I think people would do it that way. It's not too uncommon that I read some article about a developer doing that because it is supposedly more secure.

> Example 2, regarding the same thing: "The first reason is pretty easy to bust: simply add more hardware (or better hardware) and you're good to go." This applies for bcrypt as well.

Bcrypt introduces a weight/cost (whatever you'd call it) factor whose sole purpose is to prevent this. The higher the factor, the longer the process takes. The nice bit about it is that with a weight of N the hashing process always takes the same (due to some bcrypt voodoo that is beyond my knowledge) amount of time. You also can't change the weight since that will result in a different hash being produced (it would be fairly useless otherwise).

Having said that, I agree that the article could've been written in a better way but it will remain as is, simply because I try not to edit articles after I've published them.


Hi, I see that I misread the "In order to retrieve the original password a hacker has to crack multiple hashes instead of only one." as your argument, not an example of a false argument. I stand corrected.

Regarding the cost, bcrypt only increases the number of iterations (exponentially) with increased cost. The operation will take the same amount of time on one specific CPU, but go faster on another. However, because of higher memory usage than SHA variants, GPU implementations of bcrypt don't benefit as much compared to CPU implementations.
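
As a quick illustration of "exponentially": bcrypt runs its expensive key setup loop 2^cost times, so each +1 on the cost doubles the work. The helper name below is just for this example:

```python
def bcrypt_rounds(cost: int) -> int:
    """Number of key-setup iterations bcrypt performs for a given cost."""
    return 2 ** cost


baseline = bcrypt_rounds(10)
for cost in (10, 12, 14):
    rounds = bcrypt_rounds(cost)
    print(f"cost={cost}: {rounds} iterations, {rounds // baseline}x the work of cost 10")
```

How long a given cost takes in wall-clock time still depends entirely on the machine doing the hashing.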

We still agree on your advice to use bcrypt, though.


Maybe it's on purpose and it's question #1 of the interview, "what did you think of our recruitment website?"


“We took the liberty of trying your email and password on some popular websites and have rejected your application”


That doesn't make it any less egregious. GCHQ (of all organisations!) have failed to carry out even a minimum standard of due diligence in selecting a supplier that handles sensitive personal information of prospective employees.


They have finite resources so there is an opportunity cost to everything.

More time auditing their public website means less time auditing military systems etc.


The real question is why anyone would sign up to such things using a "real" password. I've lost count of the number of jobsites and so on that I registered for with a random string, knowing that a) Firefox would remember it and b) on the off chance it doesn't, and I get through to a stage where I ever need the password, I can just recover/reset it.

Signing up for websites that have limited/no repeat value is part of everyday life and I don't have high expectations of them
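
For what it's worth, generating that kind of throwaway string takes a couple of lines. This sketch uses Python's `secrets` module; the function name and the default length are arbitrary choices of mine:

```python
import secrets


def throwaway_password(num_bytes: int = 18) -> str:
    """Return a URL-safe random string; 18 random bytes encode to 24 characters."""
    return secrets.token_urlsafe(num_bytes)


print(throwaway_password())
```

Sites with odd character or length rules may still reject the result, in which case it needs trimming or adjusting by hand.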


To be fair, I doubt that the GCHQ website is developed by the same people who are doing cyber intelligence work or whatever it is they do.

More likely it was just developed by whichever company was picked off a list of government contractors. I'm sure that whatever internal systems they have are completely separate from the website.

GCHQ probably consider arguing with a contractor about the password hashing on the jobs section of their website as a waste of their time.


If you're reusing passwords on websites you care about, your behavior is risky no matter what hashing algorithm gets used or doesn't at some specific website.

If you're not reusing passwords it doesn't matter to you how they store your password. If they have broken in far enough to dump the auth table they almost inevitably can access your data stored there.


http://www.ucas.ac.uk/

Another website which stores passwords in cleartext - just raising awareness.

We have received a reminder request for your login details.

Please use the details exactly as written below to access the UCAS Apply 2013 service:

Password: ThisWasMyClearTextPassword


It was a while before they even had password reset. I believe you had to ring up. Think this will be fixed next year.


I forgot my username, because they don't allow email address login, so I had to recover my username. For this they required some kind of ID, but it wasn't present in my emails. It took me a while to figure out there were different recovery pages for different types of accounts, so putting in my email just returned "invalid email address".

Took me about 45 minutes just to recover my information, it's a terrible user experience:

First off, they require an uppercase letter, which I've never used, so I now need to remember both lowercasepassword and Uppercasepassword. Then they changed my username to something built out of my name and age, with capitals in it, like FirstLast92, instead of just my email.

I hope I never have to use this website again.


> both lowercasepassword and Uppercasepassword

When I was applying to uni both of those would be invalid passwords too, as they're more than 8 characters. I emailed them to complain, and was told that this is to enforce easy-to-remember passwords, because they didn't want to deal with the hassle of people asking for password resets...


The University of York have a similar length requirement. When I enquired about it, I was told it was because some of the older systems have a maximum length, and they keep your password the same everywhere.


Only 45 minutes? They've improved that since I had to use it (~6 years ago). I think it took me the best part of a week to get my account reset :\


Yes, I was disappointed when I got my password back in plaintext. The user experience overall is pretty dismal.


I raised this with them when I went through the process about 7 years ago. Seems not much has changed :(


The argument that this matters holds water. If you can steal the identities of all the applicants then, in particular, you can steal the identities of the successful ones.


GCHQ, along with various other agencies, out-source some of their recruitment, mainly to sift. Perhaps you could steal the identities of the candidates who had passed the initial few sifts...but I really doubt that things like developed vetting status are going through this system.


Nope - having been there as a contractor, it's still done on paper believe it or not.


It's not obvious that they are storing them in plain text (they could easily be encrypting them), but what they aren't doing is using a one way hash.


That doesn't really make it any better. If someone gets their database, they almost certainly also have the key they hypothetically encrypted the passwords with.

If they'd used pbkdf/bcrypt or even better, scrypt, this would be a non-issue.


I'm curious - what makes scrypt superior to bcrypt?


The space-hardness is important when people are throwing 10000 GPU cores at the problem. Bcrypt is more susceptible as it was designed before the million-core world came about; scrypt will continue to thwart such attacks due to memory constraints.


In addition to being time-hard (like pbkdf2 and bcrypt), it is also space-hard.
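
A small sketch of where the space cost comes from, using `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or later). The parameter values are assumptions for illustration; the working memory is roughly 128 * n * r bytes:

```python
import hashlib
import os

# Assumed parameters for illustration; memory use is roughly 128 * n * r bytes.
N, R, P = 2 ** 14, 8, 1

salt = os.urandom(16)
key = hashlib.scrypt(b"correct horse", salt=salt, n=N, r=R, p=P, dklen=32)

approx_memory_mib = 128 * N * R / 2 ** 20
print(f"derived {len(key)} bytes using roughly {approx_memory_mib:.0f} MiB")
```

Every parallel guess needs its own block of memory of that size, which is what makes thousands of small cores much less attractive than they are against a plain hash.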


You are right, but I would rather they stored the passwords in cleartext than sent them over e-mail to the users (because then the e-mail provider has a copy of the password)...

Being able to automatically reproduce them is almost equivalent to storing them in cleartext.


Not if we reuse the password or if you're important enough, a pattern they might recognize


Just highlights how the race to the bottom in terms of IT procurement in govt. is arguably counter productive. Compare the cumbersome government procurement process for knocking up simple websites like this when they could palm them off to any number of small competent London tech companies.

Most of these things could just be pithy Rails sites that get thrown away after every recruitment campaign.


This is what happens when you let HR buy software without adult supervision.


HMRC does not accept passwords with special characters, so it only shows that these sites are run by incompetent IT departments....


Outsourced to companies like Capgemini who deliver as little as they can get away with


There exist a few cases where storing passwords in clear text is valid, but this one isn't it.

Ask the question: is the cost of giving the user a newly generated password higher than the risk averted by storing the password hashed?

If all you have is a login for a website, then the risk is clearly bigger than any cost.


>There exist a few cases where storing passwords in clear text is valid, but this one isn't it.

Name one.


I think there are some cases when storing passwords in clear text is valid. One example:

A website has educational content. Teachers can sign up students in their classrooms. The teacher's password is stored securely, the student's password is not. The student password is shorter and automatically generated. The goal is to make the password just hard enough to not be guessed by other students, but not so hard that the student can't remember it. It is stored in clear text so that the teacher can look it up for the student, or print out the password to pass out to the student, etc. The student account is only given access to the content. The worst thing that happens if a student's password is guessed is that another student can mess up their progress tracking.

Is there a reason the student passwords should be encrypted in the database?


All of those can be solved by password reset (perhaps by permission of the teacher account, in this case, rather than entering personal information, as password resets usually work).


Sure, teachers could reset the password. My point is that there is no reason to. It is one extra step for teachers and gets rid of the benefit of having a consistent password for the student.


Kerberos.


This is a big deal when you consider that, IIRC, GCHQ instructs you not to tell anyone you've applied, and admitting that you've talked to others about your candidacy can disqualify you from the job.

But perhaps I'm remembering a different intelligence agency's policy.


No, that's true across all of the UK intelligence agencies (probably all agencies worldwide, really). Depending on the role you'll probably have to go through developed vetting, at which point your passwords are the least of your worries...


The official UK public transport site also does this: http://www.transportdirect.info

There is a site for naming and shaming: plaintextoffenders.com


Clearly they're actually storing them using very strong hashing techniques but then using super secret technology from MI6 to reverse them if they get forgotten.


No no, that's back to front. GCHQ develop the super secret technologies and do all the listening and decrypting.

MI6 uses that intelligence (as well as intelligence they've gathered themselves).

MI6's "super secret technology" is a rubber hose in some friendly country with no human rights laws.


I love how they strongly advise against writing down the password after sending it in plain text over (possibly unencrypted) email.


You know that these recruitment sites are run at arm's length by a specialist recruitment agency and not GCHQ itself.

Though it doesn't send out the right signals, as a list of potential candidates for GCHQ, the SS and SIS does have intelligence value to other actors.



I encountered this over a year ago. Shocking really.


Just had to upgrade my bandwidth for the views you're bringing. Some interesting comments here, guys.


Just because it was sent by email doesn't mean they aren't encrypting passwords. When the password is sent to their servers it's unencrypted anyway, so it can be read and sent in an email. I do disagree with passwords being sent by email, though.


I think the point is that storing even encrypted passwords is not as safe as storing (salted) hashes, because if the database was compromised, the encryption key would likely be compromised as well. It's safer if even the site themselves do not know your password.

Technically, you are right to say that there's no evidence passwords are being stored in plaintext, but encrypted stores really aren't any better.


Now try thinking about what you've just said, please.


Nobody accused them of "not encrypting" [sic] passwords. People accused them of storing passwords in clear text.

These are two entirely different things.

Why? Because passwords should NEVER be encrypted. Passwords are meant to be hashed (with a salt) and the hash (+salt) is what should be stored on their servers.

You really should know better...


509 - Bandwidth Limit Exceeded ... The power of HN!


Is there some checklist some place that can be used to beat people over the head with the very basics of security for a user-facing site?

Sort of, "If your developers are doing this today, they are grossly incompetent and you are putting your business and customers at risk."


The OWASP Top Ten is a good start and gets better each year. https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Proje...


Thanks!


[deleted]


but it's not a lie then. As you say yourself the HR part of their website stores plaintext passwords.



