
It may console you: in some sense, the next 4,900 are more valuable than the top 100.

Why? Everybody here knows Paul Graham. I know Krebs and Schneier, and most of you will, too. In a long-tail distribution like this, the top entries (left) are the obvious ones, and the least frequented ones (right) might be noise (artifacts of the method, e.g. bugs in the data cleaning), but the middle part is really where the value is: blogs we don't know but would like to know.

In search engine ranking, it took a long time until the late Karen Spärck Jones introduced IDF (inverse document [collection] frequency) in 1972. Raw term frequency (TF) had been the "Yin" missing its counterforce; IDF was the "Yang" that, balanced against TF in the TF-IDF formula, made it possible to retrieve truly relevant documents.
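To make the TF vs. IDF balance concrete, here is a minimal sketch using one common textbook variant, weight = tf * log(N / df). The function name `tf_idf`, the whitespace tokenizer, and the three toy documents are my own assumptions for illustration, not anything from the OP's ranking method.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain the term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy collection: "the", "blog", "about" occur in every document.
docs = [
    "the blog about security",
    "the blog about startups",
    "the blog about databases",
]
scores = tf_idf(docs)
```

Terms that occur in every document (the Paul Grahams of the vocabulary) get weight zero because log(N / df) is log(1), while a term that occurs in only one document keeps a positive weight. That is the same intuition as above: what everybody already has carries little information.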

So, a plea to the OP: please release the rest of your list (101-100000).



+1 to this. I'd also argue that some on the list are unapologetic self-promoters like Simon Willison. Nothing wrong with that, but it shows, and I think it's much more impressive to be below that cohort but still within a reasonable distance of it.


At the bottom of the screen you will find a pulldown that lets you show up to 5000 entries.



