
They refer to this in the paper as part of the "cold start data", which they use to fine-tune DeepSeek-V3 before training R1.

They don't specifically name OpenAI, but they refer to "directly prompting models to generate answers with reflection and verification".


