
The most critical piece of information I couldn’t find: how many shots was this?

Could it tell by itself that a solution was correct (one-shot)? Or did it just have great math intuition and knowledge? How were the solutions validated if it was 10-100 shots?



The solutions were evaluated on their submitted output. You're allowed to use multiple 'shots' to produce the output, but only one submission per question. Human contestants are allowed the same affordance.


But were humans involved in picking which answer/shot to submit, or was it only AI?



