> Just because your sampler picks a sequence of tokens that contain incorrect reasoning doesn't mean a useful reasoning trace isn’t also contained within the latent space.
That's essentially the core idea in Coconut[1][2], to keep the reasoning traces in a continuous space.
That's essentially the core idea in Coconut[1][2], to keep the reasoning traces in a continuous space.
[1]: https://arxiv.org/abs/2412.06769
[2]: https://github.com/facebookresearch/coconut