Philpax 5 months ago | on: Claude 3.7 Sonnet and Claude Code

The fundamental innovation is training the model to reason through reinforcement learning. You can train existing models on traces from these reasoning models to get within the same ballpark, but taking it further requires you to do RL yourself.