IIRC, in some countries census data is sometimes released with small, intentional errors so that specific individuals can't be located. A 36-year-old sometimes becomes a 37-year-old or a 35-year-old; a 180 cm person sometimes becomes 182 cm or 178 cm. The errors are small enough not to invalidate the aggregate data, but large enough to make it hard to identify individuals from it.
Perhaps this is a partial solution for the Netflix dataset.
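To make the idea above concrete, here's a rough Python sketch of that kind of perturbation. The `perturb` helper, the synthetic ages, and the shift of at most one year are all my own assumptions for illustration; this is not how any census bureau or Netflix actually does it.

```python
import random
import statistics


def perturb(values, max_shift, rng):
    """Return a copy of values, each shifted by a random integer
    in [-max_shift, max_shift]. Hypothetical helper for illustration."""
    return [v + rng.randint(-max_shift, max_shift) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic "census" ages; made-up data, not a real dataset.
ages = [rng.randint(18, 90) for _ in range(10_000)]
noisy_ages = perturb(ages, 1, rng)

true_mean = statistics.mean(ages)
noisy_mean = statistics.mean(noisy_ages)

# Individual records move by at most one year, while the aggregate
# mean shifts only slightly because the noise averages out.
print(f"true mean:  {true_mean:.3f}")
print(f"noisy mean: {noisy_mean:.3f}")
```

The noise is symmetric around zero, so it mostly cancels in aggregates such as the mean, even though any single record may be off by a year.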
This is the approach that Netflix took with the initial data. The paper referred to shows that this is insufficient, and does little to ease privacy concerns. The general problem is that if you 'fuzz' up the data enough to make identification impossible, it's no longer useful as a dataset.