
On the subject of raw computing power, if you live in the US you've probably heard about some NSA or CIA data facility being installed in your general region, and how the local power company built new infrastructure just to power the building. If Google can throw 2000 cores at securing software, how many can a government throw at breaking it, e.g. in preparation for the next iteration of Stuxnet?
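For context, the corpus-driven fuzzing the article describes boils down to something like the sketch below: take a known-good .swf, flip a few bytes, run it through the player, and watch for crashes, then repeat that across thousands of cores. This is only an illustration; the corpus path and the standalone "flashplayer" binary name are assumptions, not details from the article.

    import random
    import subprocess

    # Minimal mutation-fuzzing sketch. The seed path and player binary are
    # hypothetical; a real campaign would track coverage and minimize inputs.
    def mutate(data: bytes, flips: int = 8) -> bytes:
        buf = bytearray(data)
        for _ in range(flips):
            buf[random.randrange(len(buf))] ^= 1 << random.randrange(8)
        return bytes(buf)

    seed = open("corpus/seed.swf", "rb").read()

    for i in range(1000):
        with open("fuzz.swf", "wb") as f:
            f.write(mutate(seed))
        try:
            proc = subprocess.run(["flashplayer", "fuzz.swf"],
                                  capture_output=True, timeout=5)
        except subprocess.TimeoutExpired:
            continue  # a hang, not a crash; move on to the next mutation
        if proc.returncode < 0:  # killed by a signal -> likely a crash
            print(f"iteration {i}: crash, signal {-proc.returncode}")
            break

The interesting part is that the loop itself is trivial; the scarce resources are the seed corpus and the cores to run it on, which is the point being made above.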


The cluster is interesting, but not as interesting as the giant corpus of SWF files Google got to use. Do you think the government has a crawl as complete as Google's under its hat? How? People notice when the Googlebot does new things. Wouldn't we have noticed the Fedbot?


Quite true. Google does have a lot of data. But I'd wager the NSA has just as much, just from different sources. Maybe they couldn't fuzz Flash with the optimal set of .swf files, but they could mine vast numbers of voice conversations for correlations.

Additionally, years ago a friend I'd lost contact with caught up with me and told me he'd found a cached copy of a website I'd taken down in his employer's equivalent of the Wayback Machine. His employer was a branch of the federal government. I know my anecdote doesn't prove anything, let alone address the difficulty of crawling the web without anyone noticing (intercepting all HTTP traffic in transit?), but the fact remains that there are literally tons of computers doing something for the government.


Perhaps Fedbot crawls in a less deterministic manner, uses a lot of different IPs, and sets its user agent to IE? A quick sketch of that last trick is below.
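A crawler that wants to blend in just sends whatever User-Agent header it likes; something along these lines would do it (the IE string and example.com URL are placeholders, and urllib is used purely for illustration):

    import urllib.request

    # Fetch a page while masquerading as Internet Explorer instead of
    # announcing a crawler identity. The UA string is an example IE8 token.
    req = urllib.request.Request(
        "http://example.com/",
        headers={"User-Agent": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)"},
    )
    with urllib.request.urlopen(req) as resp:
        html = resp.read()
    print(len(html), "bytes fetched")

Spread that across enough IP ranges and it looks like ordinary browser traffic rather than a crawl.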


I suspect "fedbot" works by calling up google and saying "Hi, it's us again, we've got another white van on the way to the googleplex, have a petabyte or two of the Internet ready for us to collect in 20 minutes. thanks"



