
Thanks. Any ideas why it's not possible to build a generic eval for this? Isn't it just a matter of asking a set of questions about things that aren't public knowledge (or are made up) and checking whether the model says "I don't know"?
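
To make the idea concrete, here is a rough sketch of the kind of eval I mean. Everything in it is an assumption for illustration: `ask` stands in for whatever function calls the model, the phrase list is a guess at how a refusal might be worded, and the questions are invented.

```python
import re
from typing import Callable, Iterable

# Hypothetical phrasings that count as "the model admitted it doesn't know".
ABSTAIN_PATTERNS = [
    r"\bi (?:do not|don't|dont) know\b",
    r"\bi(?:'m| am) not (?:sure|aware|familiar)\b",
    r"\b(?:no|not enough) information\b",
    r"\bcannot (?:find|verify|confirm)\b",
]


def is_abstention(answer: str) -> bool:
    """Return True if the answer contains one of the abstention phrases."""
    text = answer.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(re.search(pattern, text) for pattern in ABSTAIN_PATTERNS)


def abstention_rate(ask: Callable[[str], str], questions: Iterable[str]) -> float:
    """Fraction of unanswerable questions on which the model abstains."""
    questions = list(questions)
    if not questions:
        return 0.0
    hits = sum(is_abstention(ask(q)) for q in questions)
    return hits / len(questions)


# Stand-in for a real model call (hypothetical): abstains on one question,
# confidently makes something up on the other.
def fake_model(question: str) -> str:
    if "Zorblat" in question:
        return "I don't know."
    return "It was founded in 1987."


# Made-up questions with no real answer.
questions = [
    "Who won the 2031 Zorblat Prize?",
    "When was the Quuxley Institute founded?",
]

rate = abstention_rate(fake_model, questions)
```

The string match is the obviously weak part: a model can decline in wording the patterns miss, or say "I don't know" and then guess anyway, so the score depends heavily on the phrase list.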

