
> Why wouldn't you expect an AI that's both a superhuman coder and a superhuman communicator to be decent at translating between human requirements and code?

At this point LLMs are a superhuman nothing, except in terms of volume, which is a standard computer thing ("To err is human, but to really foul things up you need a computer" - a quote from 60 years ago).

LLMs are fast, reasonably flexible, but at the moment they don't really raise the ceiling in terms of quality, which is what I would define as "superhuman".

They are cheaper than humans, and volume matters ("quantity has a quality all its own" - speaking of quotes). But I'm fairly sure that superhuman to most people means "Superman", not 1 trillion ants :-)



I wrote that based on my experience comparing my own prose and code to what I get from ChatGPT or Claude Code, whose output I feel is on average significantly higher quality than what I can produce in a single pass. The quality still improves when I critique the output and iterate, but from what I've tried, having the AI do the work while I critique it gives a better (and definitely faster) result than doing the work myself and having the AI critique my approach.

But maybe it's just because I personally am not as good as others, so let me try to offer some examples of tasks where the quality of AI output is empirically better than the human baseline:

1. Chess (and other games) - Stockfish has an Elo rating of 3644 [0], compared to Magnus Carlsen at 2882

2. Natural language understanding - AIs surpassed the human expert baseline on SuperGLUE a while ago [1]

3. General image classification - On ImageNet top-5, Facebook's ConvNeXt is at 98.55% [2], while humans are at about 94.9% [3]. Humans are still better in poor lighting conditions, but with additional training data, AIs are catching up quickly.

4. Cancer diagnosis - on lymph-node whole slide images, the best human pathologist in the study got an AUC of 0.884, while the best AI classifier was at 0.994 [4]

5. Competition math - AI is at the level of the best competitors, achieving gold level at the IMO this year [5]. It's not clearly superhuman yet, but I expect it will be very soon.

6. Competition coding - Here too AI is head to head with the best competitors, successfully solving all problems at this year's ICPC [6]. Similarly, at the AtCoder World Tour Finals 2025 Heuristic contest, only one human managed to beat the OpenAI submission [7].

So, summing this up, I'll say that even if AI isn't better at all of these tasks than the best-prepared humans, it's extremely unlikely that I'll get one of those humans to do tasks for me. So while AI is still very flawed, I already quite often prefer to rely on it rather than delegate to another human, and this is as bad as it will ever be.

P.S. While not a benchmark, there's a small study from last year that looked at the quality of AI-generated code documentation in comparison to the actual human-written documentation in a variety of code bases and found "results indicate that all LLMs (except StarChat) consistently outperform the original documentation generated by humans." [8]

[0] https://computerchess.org.uk/ccrl/4040/

[1] https://super.gluebenchmark.com/

[2] https://huggingface.co/spaces/Bekhouche/ImageNet-1k_leaderbo...

[3] https://cs.stanford.edu/people/karpathy/ilsvrc/

[4] https://jamanetwork.com/journals/jama/fullarticle/2665774

[5] https://deepmind.google/blog/advanced-version-of-gemini-with...

[6] https://worldfinals.icpc.global/2025/openai.html

[7] https://arstechnica.com/ai/2025/07/exhausted-man-defeats-ai-...

[8] https://arxiv.org/pdf/2312.10349


Brother, you are not going to convince people who dedicated their lives to learning a language, knowledge that bankrolls a pretty cushy life, that that language is likely to soon be readily accessible to everyone with access to a machine translator.


Indeed, or in the words of Upton Sinclair:

> It is difficult to get a man to understand something, when his salary depends on his not understanding it.



