
Only marking data as “deleted” while keeping it indefinitely is illegal in the EU/EEA. The GDPR _requires_ a hard deletion in cases like this, but allows a grace period of a few weeks for the deletion to propagate throughout systems.


There are backup systems that are write-only. What’s to be done then?


Facebook used an encryption key per user for their backups. For deletion they just delete the encryption key, which makes the data unreadable. There was an article years ago about their cold storage infrastructure, Blu-ray discs if I recall. https://www.datacenterfrontier.com/cloud/article/11431537/in...
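
For anyone who wants to see the per-user key idea in code, here is a minimal sketch. It is not Facebook's implementation: `KeyStore`, `encrypt_for_backup` and `decrypt_from_backup` are hypothetical names, and the cipher is a toy keystream built from SHAKE-256 so the example runs with the standard library only. A real system would use a vetted AEAD cipher such as AES-GCM.

```python
import hashlib
import secrets
from typing import Dict, Optional


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream derived from SHAKE-256. ASSUMPTION: stands in for a
    # real cipher purely to keep the sketch dependency-free.
    return hashlib.shake_256(key + nonce).digest(length)


def xor_bytes(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


class KeyStore:
    """Hypothetical mutable key store, kept apart from the backup media."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def key_for(self, user_id: str) -> bytes:
        # Create a fresh random key the first time a user is seen.
        if user_id not in self._keys:
            self._keys[user_id] = secrets.token_bytes(32)
        return self._keys[user_id]

    def get(self, user_id: str) -> Optional[bytes]:
        return self._keys.get(user_id)

    def shred(self, user_id: str) -> None:
        # Deleting the key is the "deletion" of every backup copy.
        self._keys.pop(user_id, None)


def encrypt_for_backup(store: KeyStore, user_id: str, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    key = store.key_for(user_id)
    return nonce + xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))


def decrypt_from_backup(store: KeyStore, user_id: str, blob: bytes) -> Optional[bytes]:
    key = store.get(user_id)
    if key is None:
        return None  # key was shredded, the blob is unreadable
    nonce, ciphertext = blob[:16], blob[16:]
    return xor_bytes(ciphertext, keystream(key, nonce, len(ciphertext)))
```

The backup blob itself never changes, so this works even when the backup medium cannot be rewritten. The catch is that the key store must not end up in the same immutable backups.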


You could replay this backup and skip the problematic records when writing a new copy, then delete the old backup. What’s important is to keep a log of “records to be deleted from backup”.
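
A rough sketch of that replay, assuming the backup is a stream of JSON lines with a `user_id` field (both the format and the field name are assumptions for illustration, as is the `rewrite_backup` helper):

```python
import json
from typing import Iterable, Iterator, Set


def rewrite_backup(lines: Iterable[str], deleted_ids: Set[str]) -> Iterator[str]:
    """Copy a backup line by line, dropping records listed in the deletion log."""
    for line in lines:
        record = json.loads(line)
        if record["user_id"] in deleted_ids:
            continue  # skip the record instead of copying it
        yield line


# Hypothetical old backup and deletion log.
old_backup = [
    json.dumps({"user_id": "u1", "email": "a@example.com"}),
    json.dumps({"user_id": "u2", "email": "b@example.com"}),
    json.dumps({"user_id": "u3", "email": "c@example.com"}),
]
deletion_log = {"u2"}

new_backup = list(rewrite_backup(old_backup, deletion_log))
```

Because the function streams record by record, memory use stays flat regardless of backup size. The cost is one full read and one full write of the backup.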


How does one do this with a 20TB SQL database?

Our approach would be to add some filters into our 'restore' pipeline which drops the problematic data should we ever attempt a restore, but I don't think it's good enough, and we have to maintain a list of user id hashes or such to power the filters.

Edit: I mean, in a way that won't cost a lot. I can imagine a malicious group opening and demanding deletions for 1000s of users, which would mean a deletion job running on a large number of these 20TB backups, say 100 daily backups, and for multiple users?
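
The restore-time filter described above could look roughly like this. Everything here is a sketch under assumptions: `RestoreFilter`, `restore`, the `user_id` column and the `PEPPER` secret are hypothetical, and the keyed hash (HMAC-SHA256) is one possible way to avoid keeping a plain list of deleted user ids.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List, Set

# ASSUMPTION: a secret kept outside the backups; hard-coded only for the demo.
PEPPER = b"example-pepper"


def id_hash(user_id: str) -> str:
    # Keyed hash so the blocklist does not store raw user ids.
    return hmac.new(PEPPER, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


class RestoreFilter:
    """Holds hashes of user ids that must never come back from a restore."""

    def __init__(self) -> None:
        self._blocked: Set[str] = set()

    def request_deletion(self, user_id: str) -> None:
        self._blocked.add(id_hash(user_id))

    def allows(self, row: Dict[str, str]) -> bool:
        return id_hash(row["user_id"]) not in self._blocked


def restore(rows: Iterable[Dict[str, str]], flt: RestoreFilter) -> List[Dict[str, str]]:
    # Drop blocked rows while loading the backup, leaving the backup untouched.
    return [row for row in rows if flt.allows(row)]


flt = RestoreFilter()
flt.request_deletion("u2")
backup_rows = [{"user_id": "u1"}, {"user_id": "u2"}, {"user_id": "u3"}]
restored = restore(backup_rows, flt)
```

This costs nothing until a restore actually happens, which is the appeal for a 20TB database. The blocklist has to outlive the oldest backup it protects against.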


You don't need to delete data instantly, you just need to do it within a reasonable timeframe. So batching data deletion requests and running a clear out once a week should be fine.

You may even be okay to just reply to the user that you've deleted all active copies of the data and it'll be fully gone when your backups expire in 30 days.

IANAL tho.
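
To illustrate why batching takes the sting out of a flood of requests: with a weekly batch, each backup is rewritten once per run no matter how many deletion requests came in. A small sketch with hypothetical names (`run_weekly_purge`, in-memory lists standing in for real backups):

```python
from typing import Dict, List, Set


def run_weekly_purge(pending: Set[str], backups: Dict[str, List[dict]]) -> int:
    """Rewrite each backup once for the whole batch; return the number of passes."""
    passes = 0
    for name, rows in backups.items():
        # One pass per backup removes every pending user at the same time.
        backups[name] = [row for row in rows if row["user_id"] not in pending]
        passes += 1
    pending.clear()  # the batch is done, start collecting the next one
    return passes


# 1000 deletion requests collected over the week (hypothetical ids).
pending_requests = {f"user{i}" for i in range(1000)}
backups = {
    "monday": [{"user_id": "user1"}, {"user_id": "keep-me"}],
    "tuesday": [{"user_id": "user999"}, {"user_id": "keep-me"}],
    "wednesday": [{"user_id": "keep-me"}],
}

passes = run_weekly_purge(pending_requests, backups)
```

Here 1000 requests against three backups cost three passes, not 3000. The number of passes scales with the number of retained backups, not with the number of requests.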


> a malicious group opening and demanding deletions for 1000s of users

I am not aware of any provision within GDPR that allows anyone else but the individual person (and courts) to request deletion of their personal data. So I think your example is highly unlikely to ever happen.


A group of individuals could do a GDPR flashmob and make any data admin miserable.


This is a solved problem as far as I am concerned.

We have automated systems to deal with requests in that category, it would probably have to be in the double-digit percentage of our customer base before we see any significant impact on our ability to conduct business.

We know which data belongs to which customer, we know which data we must delete if requested, and we know which data (e.g. invoice-related data for bookkeeping) we must keep even if requested to delete personal data.

If we ever piss off such a large portion of our customers, that they want to delete their accounts, GDPR-related requests will be the least of our concern.


That costs real money and requires literally throwing out the old backup (which may or may not be destroyable). Think optical media and stuff like that.


The GDPR was proposed in 2012, debated for years, adopted in 2016, and went into effect in 2018. With 2025 at the door, what excuses are there really to use a non-compliant tech stack when handling personal data?


It's imprudent to use technology that makes it impossible to comply with the law.


I’ve had a cursory look into that recently (just a simple googling) and it seems that it’s considered OK to keep the data in backups.

Which does seem weird… but to be fair, it would be near impossible to delete from backups as they exist today, it would be a law that can’t be practically applied.


Depends on which country's GDPR authorities you ask. At one point the French authorities said you don't have to delete data from backups, the Danish authorities said you have to delete when technically feasible, and the UK authorities said you had to put the data "beyond use", which has been interpreted to mean that if you ever restore from the backup you have to omit the "deleted" data.

My guess is that most places go with not taking any active steps to delete things from the backups themselves, counting on media rotation to eventually overwrite the data. When restoring they omit anything that is on the "should be deleted" list.


Encrypt it and delete keys.


Encrypt write-once backups and store the keys on rewritable backups.


Simple. Destroy the backup physically.


The acid bath.


Store everything on a decentralized P2P server for privacy enhancing technologists (PETs) to deconstruct.


Illegality matters only if you get caught - and when it comes to the GDPR it turns out even "getting caught" isn't actually a problem, as the continued existence of Facebook, Google, the data broker industry, etc demonstrates.



