
> Sometimes I wonder if there is overfitting towards benchmarks

There absolutely is, even when it isn't intended.

The gap between what the model is fitted to and the reality it is used on underlies essentially every problem in AI, from paperclipping to hallucination, from unlawful output to simple classification errors.

(Ok, not every problem, there's also sample efficiency, and…)
