
I am always intrigued by a lot of these announcements that clarify that only 'some' of the user information was obtained by the infiltrators.

I would have assumed that if a database was breached, the bad guys could access the entire user or password hash table. Would 'only some' data mean that they detected a 'SELECT * FROM users' query being run and shut down the connection before it could complete? Or is it that the database is sharded, so the entire table is never visible at one time?

I'd be interested to hear more about techniques or technologies available to prevent global queries from scraping entire tables once someone has gained access to your database.




One very simple option would be to kill all queries which take more than a few hundred ms, or which return over n rows, and send an alert. Such queries are almost always slow, and tend to stand out amongst normal traffic.

Doing so keeps your db responsive against programmer errors and limits data exfiltration. I've been doing this for the first reason for years.
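
For anyone who wants to see the shape of it, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real database server. The helper name `guarded_query`, the `QueryRejected` exception, and the default limits are my own assumptions for illustration; a production setup would more likely use the server's own statement timeout plus a proxy or monitoring layer, and would send the alert where this just raises.

```python
import sqlite3
import time


class QueryRejected(Exception):
    """Raised when a query exceeds the time or row budget (hypothetical name)."""


def guarded_query(conn, sql, params=(), max_ms=200, max_rows=1000):
    """Run a query, aborting it if it runs too long or returns too many rows."""
    deadline = time.monotonic() + max_ms / 1000.0

    # SQLite invokes this callback every 1000 VM instructions;
    # a nonzero return value aborts the running statement.
    def over_budget():
        return 1 if time.monotonic() > deadline else 0

    conn.set_progress_handler(over_budget, 1000)
    try:
        cur = conn.execute(sql, params)
        # Fetch one row more than allowed so an oversized result is detectable
        # without pulling the whole table into memory.
        rows = cur.fetchmany(max_rows + 1)
    except sqlite3.OperationalError as exc:
        if time.monotonic() > deadline:
            raise QueryRejected(f"query exceeded {max_ms} ms") from exc
        raise  # some other error, e.g. a syntax mistake
    finally:
        conn.set_progress_handler(None, 1000)  # clear the handler

    if len(rows) > max_rows:
        raise QueryRejected(f"query returned more than {max_rows} rows")
    return rows
```

A single-user lookup passes straight through, while a 'SELECT * FROM users' against a large table trips the row cap before the full result is read.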


Not every database account has access to every table or database. If a company separates its data by category, a hacker who exploits the brandnamecars.com account might get all the car users' credentials but nothing from brandnametrucks.com, which has a separate table or database. That holds as long as the brandnamecars.com account's permissions are restricted properly, even if the same server hosts both databases or tables.
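
A rough illustration of that kind of per-table restriction, again using Python's sqlite3 module as a stand-in. SQLite has no user accounts, so this sketch uses its authorizer callback in place of the GRANT-style permissions a server database would use; the helper name `restrict_to_tables` and the table names are made up for the example.

```python
import sqlite3


def restrict_to_tables(conn, allowed_tables):
    """Deny reads on any table not in allowed_tables (hypothetical helper)."""

    def authorizer(action, arg1, arg2, db_name, source):
        # For SQLITE_READ, arg1 is the table name and arg2 the column name.
        if action == sqlite3.SQLITE_READ and arg1 not in allowed_tables:
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK

    conn.set_authorizer(authorizer)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_users (email TEXT)")
conn.execute("CREATE TABLE truck_users (email TEXT)")
conn.execute("INSERT INTO car_users VALUES ('a@example.com')")
conn.execute("INSERT INTO truck_users VALUES ('b@example.com')")

# From here on, this connection behaves like the "cars" account.
restrict_to_tables(conn, {"car_users"})
car_rows = conn.execute("SELECT email FROM car_users").fetchall()

try:
    conn.execute("SELECT email FROM truck_users").fetchall()
    truck_read_blocked = False
except sqlite3.DatabaseError:
    # SQLite refuses to prepare the statement ("not authorized").
    truck_read_blocked = True
```

An attacker who takes over this connection can dump car_users but gets an error on truck_users, which is one way a breach ends up covering only "some" users.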


Say there are two tables, users and user_preferences. Someone goes in and takes the contents of users (hashes, salts and all). Only some of the user information was obtained!


I get the point about normalised data being spread across multiple tables, but usually (as I interpret it) they seem to be talking about the number of rows, i.e. "We think only 10,000 users had their information compromised...".

I believe in the case of the LinkedIn breach, they said something like "less than 20% of their user passwords were leaked". I take that to mean that only some rows were exposed, not all. That's why I am intrigued as to whether the query was shut off midstream, or the bulk download of exported data was detected and cut off, or something similar.


This is the case when attackers don't get access to the database itself - imagine they were able to listen to connections between users and front-end servers, and extracted authentication information. This would only concern users connecting during a specific timeframe.

In this post for instance, they indicate that attackers got 'sync users’ passwords' while storing only 'encrypted/hashed data'.

Other possibilities: they accessed a partial backup (or prod data used in dev), a caching system, a message broker (Kafka)...



