Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The very next ask will be "order the zipcodes by number of customers" at which point you'll be back to aggregations, which is where you should have started


Anti-Patterns You Should Avoid: overengineering for potential future requirements. Are there real-life cases where you should design with the future in mind? Yes. Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time? Also yes.


> Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time

Indeed, along that line, I would say that DISTINCT can be used to convey intent... and doing that in code is important.

- I want to know the zipcodes we have customers in - DISTINCT

- I want to know how many customers we have in each zipcode - aggregates

Can you do the first with the second? Sure.. but the first makes it clear what your goal is.


Partly in jest, but maybe we need a NON-DISTINCT signaller to convey the inverse and return duplicate values only.

SOMEWHAT-DISTINCT with a fuzzy threshold would also be useful.


I hear you. It's not all _that_ uncommon for me to query for "things with more than one instance". Although, to be fair, it's more common for me to that when grep/sort/uniqing logs on the command line.


Here we start to get close to analytics sql vs application sql, and I think that's a whole separate beast itself with different patterns and anti-patterns.


Ah, yeah, you beat me to it. I do reporting, not applications.


distinct seems like an aggregation to me


Whole seconds will have been wasted!


I do reporting, not application development. If somebody wants to know different information I'd write a different query.


count(id) group by post_code order by 1




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: