
I do think it’s an interesting line of inquiry… but not robust enough.

E.g. this paper would be much more interesting if it measured the threshold at which the LLM starts to become good at X, and linked that threshold to the number and character of training examples of X. Then, maybe, we can begin to think about comparing the LLM to a human.

Alas, that study requires access to the training data, and doing it robustly requires a vast amount of compute.


