It is not as if they have been secretive about how they did it. People are already working on reproducing the results described in their papers. Somebody has already reproduced the r1-zero RL training process on a smaller model (linked in some comment here).
Even if o1 specifically was used (which is in itself doubtful), that does not mean it was the main reason r1 succeeded, or that r1 could not have happened without it. The o1 outputs hide the CoT part, which is the most important part here. Also, we are in 2025; nobody builds from scratch anymore. Creating better technology by building upon previous (widely available) technology has never been a controversial issue.