
Wait, so the trick is that they reach into the context and replace '</think>' with 'Wait', and that makes the model carry on thinking?



Not sure if your pun was intended, but 'wait' probably works so well because the models were trained on text structured like your comment, where 'wait' is followed by a deeper understanding.


Yes, that's explicitly mentioned in the blog post:

>In s1, when the LLM tries to stop thinking with "</think>", they force it to keep going by replacing it with "Wait".
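A minimal sketch of what that intervention might look like with Hugging Face transformers. The model name, the literal "</think>" delimiter, and the retry budget here are illustrative assumptions, not the exact s1 setup:

    # Sketch of the "keep thinking" trick: each time the model emits the
    # end-of-thinking marker, splice in "Wait" and resume generation.
    # Model name and delimiter are placeholders, not the s1 configuration.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder model
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    text = "How many primes lie between 10 and 30? <think>"
    budget = 2  # how many times to force continued thinking

    for _ in range(budget):
        ids = tok(text, return_tensors="pt")
        out = model.generate(**ids, max_new_tokens=256)
        # Keep special tokens so "</think>" survives decoding if the
        # tokenizer treats it as one.
        text = tok.decode(out[0], skip_special_tokens=False)
        if "</think>" not in text:
            break  # the model is still thinking on its own
        # Drop everything after the marker and swap it for "Wait",
        # nudging the model to re-examine its reasoning so far.
        text = text.rsplit("</think>", 1)[0] + "Wait"

    # One final pass to let the model close its reasoning and answer.
    ids = tok(text, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=256)
    print(tok.decode(out[0], skip_special_tokens=True))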


Yes, that's one of the tricks.



