Or it’s simply an indicator of a schema that has not been excessively normalised...

echelon · 2025-10-18T20:35:12 1760819712

DISTINCT and the aggregate functions are fantastic for offline analytics queries. I find a lot of use for them in reporting and other non-production code.

valiant55 · 2025-10-18T17:58:36 1760810316

It depends on when you see it, but I agree that DISTINCT shouldn't be used in production. If I'm writing a one-off query and DISTINCT gets me over the finish line and saves me a few minutes, then that's fine.

viraptor · 2025-10-19T09:15:12 1760865312

Which categories did the user post in? Which projects did the user interact with in the last week? That's all normal DISTINCT usage.
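
A minimal sketch of the first question, using Python's built-in sqlite3 module. The `post` table, its columns, and the sample rows are made up for illustration; the point is only that a user who posts twice in one category should see that category listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per post.
    CREATE TABLE post (id INTEGER PRIMARY KEY, user_id INTEGER, category TEXT);
    INSERT INTO post (user_id, category) VALUES
        (1, 'sql'), (1, 'sql'), (1, 'python'), (2, 'rust');
""")

# User 1 posted twice in 'sql'; DISTINCT collapses that to a single row.
categories = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT category FROM post WHERE user_id = ? ORDER BY category",
        (1,),
    )
]
print(categories)  # ['python', 'sql']
```

Here the duplicates are inherent to the question being asked, not a symptom of a bad join, which is what makes this a legitimate use.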

ndsipa_pomu · 2025-10-19T09:45:18 1760867118

There's nothing wrong with using DISTINCT correctly, and it does belong in production. The author is complaining about developers who just put in DISTINCT as a matter of course rather than using it appropriately.

ndsipa_pomu · 2025-10-19T09:43:51 1760867031

One reason to have excessively normalised tables would be to ensure consistency so that you don't have to worry about various records with "London", "LONDON", "lindon" etc.
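
As a rough sketch of that idea (the `city` and `customer` tables here are hypothetical, and SQLite via Python's sqlite3 is just a convenient way to run it): the city name is stored once in a lookup table, and customers can only point at an existing row, so a misspelt variant has nowhere to live.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical schema: each city name is stored exactly once.
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city_id INTEGER NOT NULL REFERENCES city(id)
    );
    INSERT INTO city (id, name) VALUES (1, 'London');
    INSERT INTO customer (name, city_id) VALUES ('alice', 1), ('bob', 1);
""")

# A customer row pointing at a city that does not exist is rejected.
try:
    conn.execute("INSERT INTO customer (name, city_id) VALUES ('carol', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Every customer resolves to the single canonical spelling.
spellings = conn.execute("""
    SELECT COUNT(DISTINCT city.name)
    FROM customer JOIN city ON city.id = customer.city_id
""").fetchone()[0]
print(rejected, spellings)  # True 1
```

The trade-off is the extra join on every read, which is the cost the parent comment is weighing.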

sgarland · 2025-10-18T22:33:24 1760826804

Because a city/region/state can be uniquely identified with a postal code (hell, in Ireland, the entire address is encapsulated in the postal code), but the reverse is not true.

At scale, repeated low-cardinality columns matter a great deal.

virissimo · 2025-10-19T00:32:18 1760833938

There are ZIP codes that cover both a city and an unincorporated area. Furthermore, there are ZIP codes that span more than one state. A data model that makes these cases unrepresentable may come back to bite you.

Breza · 2025-10-27T18:45:06 1761590706

This assumption got me in trouble as a junior analyst years ago. I was asked to analyze our customer base and wrote something like the query below. Management congratulated me on finding thousands more customers than we'd ever had before.

SELECT zipcode.rural_urban_code, COUNT(*) AS n_customer FROM customer INNER JOIN zipcode USING(zipcode) GROUP BY 1;
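
One plausible way that happens, sketched with sqlite3 and invented data: this assumes the `zipcode` table had more than one row for some ZIP codes (for example, one row per ZIP and state pair), which the comment above does not actually spell out. Each matching customer is then counted once per duplicate row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, zipcode TEXT);
    -- Assumed layout: one row per (zipcode, state) pair, so a ZIP code
    -- that crosses a state line appears twice. All values are made up.
    CREATE TABLE zipcode (zipcode TEXT, state TEXT, rural_urban_code TEXT);
    INSERT INTO customer (zipcode) VALUES ('11111'), ('11111'), ('22222');
    INSERT INTO zipcode VALUES
        ('11111', 'AA', 'rural'),
        ('11111', 'BB', 'rural'),
        ('22222', 'AA', 'urban');
""")

# The query from the comment: the join fans out on the duplicated ZIP code.
naive = dict(conn.execute("""
    SELECT zipcode.rural_urban_code, COUNT(*) AS n_customer
    FROM customer INNER JOIN zipcode USING (zipcode)
    GROUP BY 1
"""))

# Counting distinct customer ids gives the real head count per code.
deduped = dict(conn.execute("""
    SELECT zipcode.rural_urban_code, COUNT(DISTINCT customer.id) AS n_customer
    FROM customer INNER JOIN zipcode USING (zipcode)
    GROUP BY 1
"""))

print(naive)    # {'rural': 4, 'urban': 1}
print(deduped)  # {'rural': 2, 'urban': 1}
```

Three customers turn into five in the naive version. COUNT(DISTINCT ...) fixes the total here, but it is also the kind of patch the article complains about; joining against a lookup that is unique per ZIP code would remove the fan-out at the source.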

pbnjay · 2025-10-19T00:33:16 1760833996

FYI, this is not true in the US. ZIP codes identify postal routes, not locations.

bdangubic · 2025-10-19T01:55:43 1760838943

Saying ZIP codes uniquely identify a city/state/region is like saying "John" uniquely identifies a human :)

sgarland · 2025-10-19T12:09:19 1760875759

EDIT: TIL that there are cross-state ZIP codes.

lucyjojo · 2025-10-19T03:54:49 1760846089

These kinds of things are almost never true in the real world.