
The existence of R1-Zero is evidence against any sort of theft of OpenAI's internal CoT data. The model sometimes outputs illegible text that's useful only to itself; you can't do distillation without a shared vocabulary. The only way R1-Zero could exist is if they trained it with RL.
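To make the vocabulary point concrete, here's a rough sketch of token-level distillation (my illustration, not anything from DeepSeek's actual pipeline): the loss is a KL divergence between teacher and student distributions over the same vocabulary, so mismatched tokenizers break it outright.

    import torch
    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, temperature=2.0):
        # Both tensors must be [batch, seq, vocab] with an identical
        # vocab size *and* token ordering, i.e. a shared tokenizer.
        assert student_logits.shape == teacher_logits.shape, \
            "distillation needs a shared vocabulary"
        s = F.log_softmax(student_logits / temperature, dim=-1)
        t = F.softmax(teacher_logits / temperature, dim=-1)
        return F.kl_div(s, t, reduction="batchmean") * temperature ** 2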


I don’t think anyone is really suggesting they stole the CoT or that it leaked, but rather that o1's final outputs were used to more easily train the base model and reasoning components.
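For what it's worth, that kind of reuse only needs (prompt, answer) text pairs, not the hidden chain of thought. Something like the data-prep sketch below would do it, assuming a HuggingFace-style tokenizer; build_sft_example is a hypothetical helper, not a claimed DeepSeek recipe.

    def build_sft_example(tokenizer, prompt, answer, max_len=2048):
        # Concatenate prompt + answer; mask the prompt in the labels so
        # loss is only computed on the answer tokens (-100 is ignored by
        # the usual cross-entropy setup).
        prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
        answer_ids = tokenizer(answer + tokenizer.eos_token,
                               add_special_tokens=False)["input_ids"]
        input_ids = (prompt_ids + answer_ids)[:max_len]
        labels = ([-100] * len(prompt_ids) + answer_ids)[:max_len]
        return {"input_ids": input_ids, "labels": labels}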


The RL is done on problems with verifiable answers. I’m not sure how o1 slop would be at all useful in that respect.
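Right, the reward in that setup is typically rule-based: extract the final answer and check it against ground truth, roughly like this (the \boxed{} convention is just an assumption for illustration). No teacher-model output enters the loop.

    import re

    def verifiable_reward(completion: str, ground_truth: str) -> float:
        # Reward 1.0 only if the extracted final answer matches the known
        # ground truth, 0.0 otherwise.
        match = re.search(r"\\boxed\{([^}]*)\}", completion)
        if match is None:
            return 0.0
        return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0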



