For years, I've been good-heartedly losing the blog SEO ranking fight to a great developer and writer who has the same name as me. A football player eclipses us both if you just google our shared name, but if you add a term like "developer" or "programming", he's clearly got me beat for the top spots. It makes sense: he writes about tech much more consistently than I do, and his articles are likely much more helpful than my sporadic and eclectic posts.
Funnily enough, I have a very common name and take quite a bit of relief in knowing that were someone to even attempt to look me up, they would see a number of authors, artists, politicians, etc. long before anything about me ever appeared. I've gone searching once or twice out of curiosity, and didn't find anything relevant on at least the first two or three pages of Google, Bing, etc.
I suppose the difference is that when I make any sort of professional blog, etc., I do so under online handles in place of my legal name, because I see the content as being the important bit, not the person it's coming from; the credibility flows from the information provided, not the name it's tied to.
...Well, provided it's not xXx_DongMaster6969_xXx or whatever. :)
In the future, our names will simply be a cryptographic hash of our genome and we’ll all enjoy unambiguous identity and individuality as we express ourselves solely with Unicode Emoji.
I had two posts[0, 1] I was super confident would be a match for HN and two others[2, 3] that I thought had a so-so chance. They all flopped, except that I got lucky: someone else submitted this one after I gave up.
It's somewhat arbitrary, but the threshold where you're not allowed to resubmit a story on HN is if it reaches over 20 points. I'm not sure if that's still the case, but it was a few years ago when I asked why I couldn't resubmit.
Also, just from my subjective sense, it's a reasonable cutoff for when articles are officially on the front page for a meaningful amount of time rather than briefly appearing and then falling off.
That's weird that I'm excluded. Thanks for watching out for me :-)
Hypotheses: 1. it's an error. 2. I have powerful enemies. 3. Someone from the future is trying to stop me. 4. My blog triggered the FDIV bug and needed to be excluded.
There are a lot of odd exclusions on that list. Just spot-checking, I see blog.plover.com, the blog of Mark Jason Dominus, who by the way is looking for a job[1].
Also, dtrace.org is excluded, which hosts four individual blogs that surely should qualify.
I think what happened was that I was going through the list of domains and assumed that "righto.com" would be too valuable a domain name for a personal blog and excluded it without checking. Sorry about that!
I love that dynomight.net stands out with "existential angst" as a very unique category among the top blogs, as well as being written by an anonymous/pseudonymous author. I'm a big fan of their writing.
Also quite surprised to find my own site in the top 5000 for the past 5 years! It feels like Hacker News is simultaneously quite large but also a cozy community where you often recognize names from day to day.
Hi, my blog incoherency.co.uk appears at number 207 but it says the author is David Given.
I'm not sure if the error is that you think my blog is written by David Given instead of James Stanley, or if you think David Given's blog is incoherency.co.uk instead of cowlark.com!
Slate Star Codex and Astral Codex Ten should probably be combined. Also, it's odd that while ACX's author is listed as "Scott Alexander" (his long-time pseudonym), SSC's is listed as "anonymous." He went by Scott Alexander even in the days of SSC.
Yeah, that's a good point. I was trying to be respectful of the author's wishes around pseudonymity, as it seemed intentional that when he lost anonymity, he switched domains, so I didn't want to "out" the author, but it's kind of silly since it's very public now.
I think you're right that it makes sense to identify him by his pseudonym rather than just "Anonymous" so I've updated it:
Interesting website. I was reading the free chapter about active and passive voice, something I've never really understood or paid much attention to. Excellent explanation that cleared things up. My manager uses active voice when they're taking credit for our work, and passive voice when they do things wrong. It's a neat trick.
It may console you: in some sense, the top 4900 are more valuable than the top 100.
Why? Everybody here knows Paul Graham. I know Krebs and Schneier, and most of you will, too. In a long-tail distribution like this, the top entries (left) are the obvious ones, and the least frequented ones (right) might be noise (artifacts of the method, e.g., bugs in the data cleaning), but the middle part is really where the value is: blogs we don't know but would like to know.
In search engine ranking, it took a long time before the late Karen Spärck Jones finally discovered IDF (inverse document [collection] frequency) in 1972. It was the "Yang" to raw term frequency (TF), the "Yin" that had been missing a counterforce. Balanced together in the TF-IDF formula, the two retrieve truly relevant documents.
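For anyone who hasn't seen it written out, here's a minimal sketch of one common TF-IDF variant (raw term count times log(N / df)). The function name `tf_idf` and the three toy documents are made up for illustration, and real search engines use smoothed or normalized variants, so treat this as a sketch under those assumptions rather than the canonical formula.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. The weight is the raw term count
    multiplied by log(N / df), where N is the number of documents and
    df is the number of documents that contain the term.
    """
    n_docs = len(docs)
    # Document frequency: count each term at most once per document.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Toy corpus (hypothetical): every document shares the common words.
docs = [
    "the blog about startups".split(),
    "the blog about security".split(),
    "the blog about dtrace".split(),
]
scores = tf_idf(docs)
```

Words that appear in every document ("the", "blog", "about") get weight zero because log(3/3) = 0, while the one distinguishing word in each document keeps a positive weight. That's the same intuition as the comment above: the entries everybody already knows carry little information.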
So, plea to the OP: please release the rest of your list (101-100000).
+1 to this. I'd also argue that some on the list are unapologetic self-promoters like Simon Willison. Nothing wrong with it but it shows and I think it's much more impressive to be below that cohort but still only a reasonable distance away.
FYI, in my May 2023 survey of HN's archived front pages (under the "past" link in the HN titlebar) ... horse.sheep doesn't appear at all.
That's looking at just the top <=30 stories per day. Your high-water mark seems to have been 2022-12-19, with 88 points / 32 comments, appearing on the 3rd page of the daily archive, ranked #73:
I had to fiddle with the dates to find a couple examples of blogs that violate the single-author rule in the methodology (marginalrevolution, ribbonfarm) but it's probably better to have them included.
Even though a glance suggests the majority of high-scorers are self-hosted, I wonder if this dataset is valuable for predicting the strength of different blog hosts. Some fiddling did lead to a couple results that are hosted on Blogger or Ghost or Medium, so they are there.
I’ve been thinking of starting an anonymous blog with my thoughts just to record them and any projects I do. I want them to be visible on the internet and searchable instead of behind some Facebook or Instagram wall. What is a good blog service to use that will be around for decades? I don’t really want to run my own domain. Do things like Blogger and Blogspot still exist, and will they continue to in the future?
Where does the "bio" field come from? Mine says "Developer and writer" and I suppose those are both things that I am, but not very close to what I'd have put there.
I'm guessing it's annotated by an LLM. Would be a lot of thankless work otherwise, so don't really blame the author, but it means you get the occasional nonsense summary.
They're about 95% accurate, so not completely incorrect. I think the value of the correct ones is higher than the few incorrect ones. I manually review and fix them, but for just a fun tool, it's not practical to write 5000 eight-word bios.
I started by writing them by hand, but it was taking forever to read enough of each author's blog to write a summary and topic list, so I used an LLM and then spot-checked.
I just updated yours, but let me know if you'd like something different.
Based on my May 2023 survey of front-page results, those rank at 954 and 3,028 (of all sites) respectively:
954 18 63.008 lcamtuf.blogspot.com :::: blog
3028 6 77.327 lcamtuf.substack.com :::: blog
Among blogs I'd identified (similar methodology though all but certainly different URL set from TFA), #295 and #521 of 5,506 blogs identified.
My set includes 52,642 sites all told, with 16,185 classified as, e.g., "programming", "blog", "social media", "academic/science", "corporate comm.", "general news", "government", "software", "tech news", etc. I'd come up with 61 total classifications, covering all sites with at least 18 front-page appearances within the HN archive.
He is, but he's not in the top 100. He would be, but he splits his articles across different blogs.
For authors that just change domains (e.g., christine.website and xeiaso.net), I combine scores, but if they maintain different writing under different domains, I treat the domains as separate.
That said, I think Michał Zalewski is the author who most suffers due to this rule.
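As a rough illustration of the rule described above (combine when an author simply moved domains, keep separate otherwise), here's a small sketch. The `ALIASES` table, the `combine_scores` helper, and all the point values are hypothetical; this isn't the site's actual code, just one way the rule could be expressed.

```python
from collections import defaultdict

# Hypothetical alias table: old domain -> canonical domain.
# Only domains that were simply renamed get an entry.
ALIASES = {"christine.website": "xeiaso.net"}

def combine_scores(submissions, aliases=ALIASES):
    """Sum points per domain, folding renamed domains into one entry."""
    totals = defaultdict(int)
    for domain, points in submissions:
        totals[aliases.get(domain, domain)] += points
    return dict(totals)

# Made-up point values, purely for illustration.
sample = [
    ("christine.website", 120),
    ("xeiaso.net", 300),
    ("lcamtuf.blogspot.com", 90),
    ("lcamtuf.substack.com", 40),
]
combined = combine_scores(sample)
```

With this sample, the two Xe Iaso domains end up as a single xeiaso.net total, while the two lcamtuf domains stay separate because they aren't in the alias table.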
A simple tally of the top 5,000 blog domain names shows that 54% use .com, 14% use .org, 7% use .io (40% of which are github.io), and 6% use .net. These four together account for 81%.
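If anyone wants to reproduce that kind of tally, a sketch along these lines should do it. The `tld_shares` helper and the sample list are my own assumptions (the sample is just a handful of domains mentioned in this thread, not the real top 5,000), and it naively takes the last dot-separated label, so incoherency.co.uk counts as "uk".

```python
from collections import Counter

def tld_shares(domains):
    """Return {tld: percentage of domains}, most common first."""
    # Naive TLD extraction: the last label after the final dot.
    counts = Counter(d.rsplit(".", 1)[-1].lower() for d in domains)
    total = len(domains)
    return {tld: 100 * n / total for tld, n in counts.most_common()}

# Small illustrative sample, not the actual dataset.
sample = [
    "righto.com",
    "dtrace.org",
    "dynomight.net",
    "incoherency.co.uk",
    "cowlark.com",
    "blog.plover.com",
    "jvns.ca",
    "xeiaso.net",
]
shares = tld_shares(sample)
```

Breaking out github.io as a share of .io would need one extra check on the last two labels rather than just the last one.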
This is something I wanted but I couldn't figure out a way to do it in a way that's meaningful. Authors like Simon Willison publish frequently, so even though he has a lot of high-scoring posts, he has a lot of low-to-no-scoring posts too. It feels unfair to penalize people who publish frequently just because not every post is a homerun.
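To make the trade-off concrete, here's a tiny sketch comparing a total-points ranking with a per-post average. Both score lists are invented numbers for two hypothetical authors, and `summarize` is just an illustrative helper.

```python
from statistics import mean

# Invented scores: a frequent poster with a few big hits and many
# low-scoring posts, versus an occasional poster with two decent posts.
frequent = [500, 300, 250, 3, 2, 1, 1, 1, 1, 1]
occasional = [200, 180]

def summarize(scores):
    """Return the total and the per-post mean for a list of scores."""
    return {"total": sum(scores), "mean": mean(scores)}

frequent_stats = summarize(frequent)
occasional_stats = summarize(occasional)
```

By total, the frequent poster is far ahead (1060 vs. 380), but by mean the occasional poster wins (190 vs. 106), even though the frequent poster has three posts that individually beat anything the other author wrote. That's the unfairness described above.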
Note that this is gonna be skewed pretty heavily toward domains that have existed for most of HN's history, at the expense of any newer domains that had fewer chances to rack up points.
If you look at any 2-4 year period, the ranking tends to be quite different. Well, Paul Graham is there pretty consistently, but everything else changes.
You can change the date ranges (e.g. just the YTD, or last 12 months, or set a custom range), and it gives an interesting overview of the evolution over time.
Like jvns.ca drops off the list entirely for 2025, but was consistently in the top 5 until last year.
paulg's blog shouldn't count, as it's an extension of this and more of a long-game sales pitch tailored for different purposes. Not a bad thing, but I just wouldn't consider it a blog.
I think that is wholly unfair. If Paul Graham is anything, he's a writer/creator first. Those articles have also been extremely influential to many entrepreneurs and people in the business world. I'm personally appreciative of them, even if I don't always agree with him.
Interesting that if you were to combine AstralCodexTen and SlateStarCodex, it would be around #20. Even with the traffic split, he's in the top 50 twice.
Naturally, being vain, when I saw this post, I immediately looked up my own blog and was chuffed to see it at #292.
But, guess who I see just above at #289.