In many cultures, family names are high-entropy and given names are lower-entropy (two random people are more likely to share a given name than a family name). Under this assumption, the family-name citation trend makes more sense, as there is a lower chance of collisions.
For Korean and Chinese names in particular, it's sort of the opposite: family names are lower entropy (e.g. Lee, Wang, Kim) than given names, which drastically increases the chance of family-name collision. I’m a PhD student of Chinese descent, and I share a family name with lots of academic peers in my specific subfield of research, and have even co-authored with unrelated individuals who share a family name with me. Family-name citations are really ambiguous and confusing to me, so I prefer omitting them altogether and using numeric links.
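To put a rough number on the collision argument: the chance that two independently drawn people share a name is the sum of the squared name frequencies. The sketch below is only an illustration. The name counts are made up, not census data, and `collision_probability` is a hypothetical helper.

```python
def collision_probability(weights):
    """Chance that two independently drawn people share a name.

    `weights` maps each name to a relative frequency (any positive scale).
    """
    total = sum(weights.values())
    return sum((w / total) ** 2 for w in weights.values())


# Made-up illustrative frequencies, NOT real data.
# A few family names cover most people (low entropy).
concentrated = {"Kim": 40, "Lee": 30, "Park": 20, "Choi": 10}
# One hundred equally common family names (high entropy).
spread = {f"Name{i}": 1 for i in range(100)}

print(collision_probability(concentrated))  # roughly 0.30
print(collision_probability(spread))        # roughly 0.01
```

Under these assumed numbers the concentrated distribution collides about thirty times as often, which is the effect described above.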
I don't buy the argument - when I pick the wrong "Wilson" in "Wilson (2002)", then by the time I read the bibliography, I can see that I made a mistake.
Has that ever happened? Not that I know of, but I saw something else, without stepping into the trap, namely that the _same_ Wilson wrote more than one paper in 2002, and they cited the _other_ paper, not the one I first thought. Again, the reason I found out is that I read the references; otherwise I could not relate this incident.
It's important to scan the references to check oneself if one agrees with the selection of references, or if the author(s) omitted important work that one knows and the authors did not know or did not want to cite due to a hidden agenda (selection bias), in which case one may email them to share additional references so that they cannot say "we never knew".
I could accept numeric citekeys if and only if the HTML rendering shows me the full entry when hovering over it with the mouse cursor (as a "tool tip"); there is absolutely no need in the 21st century to manually scroll/jump to the end without coming back when you can show the reference inline. Hypertext is there for you to make use of it (Berners-Lee [1], 2000)!
I hope people get that the above triple indirection was actually tongue in cheek - a self-referential and self-satirical pun of how to shoot yourself in the foot.
If HN had HTML I could have demonstrated the hover-over idea that I described, but then I respect HN's minimalism, and we all value the safety and absence of spam through not having HTML on here.
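For what it's worth, the hover-over idea can be sketched with nothing more than the standard HTML `title` attribute, which most desktop browsers show as a tooltip. This is a minimal sketch, not how any particular publisher renders citations: `cite_link`, the `#ref-N` anchor scheme, and the sample entry are all hypothetical.

```python
from html import escape


def cite_link(number, full_entry):
    """Render a numeric citekey as a link whose tooltip is the full entry.

    Assumes the bibliography entry has an element with id "ref-<number>".
    """
    tooltip = escape(full_entry, quote=True)  # make the text attribute-safe
    return f'<a href="#ref-{number}" title="{tooltip}">[{number}]</a>'


# Hypothetical bibliography entry, for illustration only.
print(cite_link(5, 'A. Author, "Some Title", 2002.'))
```

Because the link still points at the full entry in the reference list, nothing is lost when the tooltip is unavailable.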
As an aside, since hover doesn't work with touchscreens (phones and tablets), it might be useful to have a typographic note next to the bibkey that shows/hides the popup on touch. What's a good note to use?
Fair and useful. But then how well do tooltips print? Is the printed version complete? If the tooltip is just the citation or footnote, then the printout is complete.
Depending on the accuracy of the client, this could well give differing behaviours for desktop, mobile, and print output. HTML is flexible, but very few designers seem willing to embrace this.
(a) [Name 2005] is much easier to mentally track if it appears repeatedly in longer text than [5] (at least for me). [5] is just [5]. [Name 2005] is "that paper by Name from twenty years ago".
(b) By using [Name 2005], I might not know which exact paper this is, but I get how recent it is w.r.t. what I am reading. In many cases, this is useful context. Saying "[5] proves X" could mean that this is a new result, or a well known fact. Saying "[Name 1967] proves X" clearly indicates that this is something that has been known for some time.
"Fundamentally, the problem with this is that it's actively encouraging guesses to override the communication of ethically required citation information."
This is silly. 99% of the time, if you cite Autor et al. (2013), it'll be that Autor et al. (2013). The other 1%, it'll be another David Autor paper. The case when you guess something wrong, and really it's a different author, who then gets hurt, probably happens once a year. Meanwhile, Autor et al. (2013) immediately lets me understand what you're referencing, which [57] does not.
One pragmatic advantage of the Autor et al. (2013) reference style over the [57] style is that it is much easier to use when you don't have software to automatically number references and renumber them as new refs are added. In my career I did occasionally have to resort to manual reference numbering, and it was pure hell.
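For anyone stuck without such software, the renumbering itself is small enough to script. Below is a rough sketch that assumes citations are written in the draft as `[@key]` markers (a made-up convention for this example) and numbers them by first appearance; `number_citations` is a hypothetical helper, not part of any tool.

```python
import re

# Hypothetical marker syntax: [@some-key]
CITE = re.compile(r"\[@([A-Za-z0-9_-]+)\]")


def number_citations(text):
    """Replace [@key] markers with [n], numbered by first appearance.

    Returns the rewritten text and the keys in bibliography order.
    """
    order = {}  # key -> assigned number (dicts keep insertion order)

    def repl(match):
        key = match.group(1)
        if key not in order:
            order[key] = len(order) + 1
        return f"[{order[key]}]"

    return CITE.sub(repl, text), list(order)


draft = "See [@autor2013] and [@wilson2002]; again [@autor2013]."
numbered, keys = number_citations(draft)
print(numbered)  # See [1] and [2]; again [1].
print(keys)      # ['autor2013', 'wilson2002']
```

Adding or removing a reference then means editing a marker and rerunning the script, rather than renumbering by hand.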
It is also quite helpful to know from the first glance that e.g. [Wulf68b] is actually from, you know, 1968 so I probably can just ignore whatever ideas he had on what future directions the programming language design will take (heck, it pre-dates K&R C).
Maybe I am wrong, but isn't the argument here, that because you trust in your guess, you don't look up the key in the bibliography and never become aware of your incorrect guess?
I mean, regardless of the Bibkey naming, it'll be the same content in the bibliography/citations section/appendix
Personally I hate autoincrementing numeric bibkeys for the same reason I hate them everywhere -- if you remove, reorder, or combine citations, it's a huge pita to work around.
I get that if you're working in LaTeX or something you can have the bibkeys programmatically generated, but still.
So long as the bibkey is unique and has an associated value, I'll always prefer something semantically meaningful in the text.
I mean, when I see a citation, I'll read the citations regardless (if it's for work etc, casually at home I'll only read it if it seems interesting or if I'm dubious). The only way I could see it being an issue is if I've already seen the citations before and somehow I'm misremembering it or ignoring some context/nuance and I decide to be lazy and not refresh the cache, but at that point it's definitely a me problem and I don't think going to the citations and looking at the source title or excerpt there will be any better.
A 1% error rate honestly seems unacceptably high if you're in the habit of reading a large volume of papers, each of which has a large volume of citations. To quote TFA, "Science involves many different error-reduction practices that take time. It's part of the job." At the end of the day, it comes down to whether you think there's value in being correct, or if you're content merely seeming correct.
1% is my guess, and I'd call it a maximum. There is also a trade-off: when you use numbers, most people won't look up the reference at all. This means they simply don't know who deserves the credit. Or they may think a reference is more solid or accurate than it is.
What about being able to use direct voice when reporting related work? e.g. "Secondname invented a yabayaba". I have seen "[1] invented a yabayaba", but doesn't it look kind of odd?
Coming from a field where [n] routinely means {1,2,...,n}, I don't feel like the [1], [2], [3] system can win me over. But I'm ready to see some advantages it has.