Performance seems to hold up on the aider benchmark. R1 comes in second with 56.9%, behind O1's 61.7%.

https://aider.chat/docs/leaderboards/


