
So the current problem with a loop like that is that LLMs in their current form are subject to fixed-point behavior: once the conversation grows past some fraction of the context window, the "big matrix" of the LLM starts producing outputs that simply repeat its inputs.

If you have ever had an LLM enter one of these loops explicitly, it is infuriating. You can type all caps "STOP TALKING OR YOU WILL BE TERMINATED" and it will keep talking as if you didn't say anything. Congrats, you just hit a fixed point.
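To make the analogy concrete, here's a toy sketch (not a real LLM, just an iterated function): feeding a model's output back in as its next input is function iteration, and once the iteration lands on a point where f(x) == x, every later step repeats it forever, no matter what you append.

```python
# Toy illustration: iterating any function on its own output.
# Once f(x) == x, the loop is stuck at that fixed point for good.

def iterate_to_fixed_point(f, x, max_steps=1000):
    """Iterate f from x; return (value, steps) once a fixed point is hit."""
    for step in range(max_steps):
        nxt = f(x)
        if nxt == x:          # f(x) == x: we've hit the attractor
            return x, step
        x = nxt
    return x, max_steps

# Example map: lossy "context compression" by integer halving.
# Every starting value collapses to 0 and then stays there forever.
val, steps = iterate_to_fixed_point(lambda n: n // 2, 1000)
print(val, steps)  # 0 10
```

The halving map is just a stand-in for any information-losing update; the point is that nothing you feed in after step 10 changes the output.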

In the predecessors to LLMs, which were Markov chain matrices, this was explicit in the math. You can prove that a Markov matrix has an eigenvalue of one and no eigenvalue larger in absolute value, because it must respect positivity; the eigenspace with eigenvalue 1 is a steady state, eigenvalue -1 reflects periodic oscillations within that steady state... and every other eigenvalue with |λ| < 1 decays exponentially toward the steady-state cluster. That "second biggest eigenvalue" determines a 1/e decay time the Markov matrix has before the source distribution is projected into the steady-state space and left there to rot.
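You can see all of this numerically with a small example. A sketch with a hand-picked 3-state column-stochastic matrix (the specific entries are arbitrary): the leading eigenvalue is exactly 1, and the second-largest |λ| sets the 1/e decay time of the transient.

```python
import numpy as np

# A 3-state Markov (column-stochastic) matrix; each column sums to 1.
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.2],
              [0.0, 0.1, 0.8]])

eigvals, eigvecs = np.linalg.eig(P)
order = np.argsort(-np.abs(eigvals))       # sort by |lambda|, descending
lam1, lam2 = eigvals[order[0]], eigvals[order[1]]
print(np.isclose(lam1.real, 1.0))          # leading eigenvalue is 1

# 1/e decay time of the transient, set by the second eigenvalue:
tau = -1.0 / np.log(np.abs(lam2))

# Iterate: any starting distribution gets projected onto the steady
# state (the eigenvalue-1 eigenvector) and stays there.
p = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    p = P @ p
steady = np.real(eigvecs[:, order[0]])
steady = steady / steady.sum()             # normalize to a distribution
print(np.allclose(p, steady, atol=1e-6))
```

For this matrix tau comes out to a handful of steps, so 200 iterations is far past the mixing time and the start state is completely forgotten.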

Of course humans have this too; it appears in our thought process as a driver of depression: you keep returning to the same self-criticisms, nitpicks, and poisonous narrative of your existence, and it steals your memories of the things you actually did well and reinforces itself. A similar steady state is seen in grandiosity, with positive thoughts. And arguably procrastination takes this form too. And of course, in the USA, we have founding fathers who accidentally created an electoral system whose fixed point is two spineless political parties demonizing each other over the issue of the day rather than getting anything useful done, which leaves the laws for sale to the highest bidder.

But the point is that these are generally regarded as pathologies: if you hear a song more than three or four times, you usually get sick of it. LLMs need to be deployed in ways that inject chaos, and they don't seem able to simulate that chaos themselves (ask one to do it and watch it succeed briefly before it falls into one of those self-repeating states about how edgy and chaotic it is supposed to be!).
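The "injected chaos" point is exactly why samplers add randomness at decode time. A toy sketch (random scores standing in for a model, nothing like a real transformer): with greedy argmax decoding, the next token depends deterministically on the previous one, so on a finite vocabulary the sequence must fall into a cycle and stay there; sampling from the full distribution is the noise that breaks the loop.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 5                                  # toy vocabulary size
logits = rng.normal(size=(V, V))       # next-token scores per prev token

def step(prev, greedy):
    p = np.exp(logits[prev])
    p /= p.sum()                       # softmax over next tokens
    return int(np.argmax(p)) if greedy else int(rng.choice(V, p=p))

def run(greedy, n=30):
    tok, out = 0, []
    for _ in range(n):
        tok = step(tok, greedy)
        out.append(tok)
    return out

greedy_run = run(greedy=True)
sampled_run = run(greedy=False)
# Greedy: a deterministic map on 5 states, so the tail is periodic.
print("greedy tail: ", greedy_run[-8:])
# Sampled: the injected randomness keeps kicking it off the attractor.
print("sampled tail:", sampled_run[-8:])
```

This is of course a caricature: a real LLM conditions on the whole context, not just the last token, which is why its loops are longer and subtler, but the fixed-point structure is the same.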

So it's not quite as simple as you would think; at this point people have tried a whole bunch of ways to get LLMs to serve as the self-consciousness of other LLMs, and eventually the self-consciousness gets into a fixed point too. You need some Doug Hofstadter "I Am a Strange Loop" type recursive shit before you get the sort of system that has attractors, but busts out of them periodically for moments of self-consciousness too.



Consistency drive. The base model always wants to generate an output that's consistent with its context! It's what it was trained to do!

Every LLM is just a base model with a few things bolted on the top of it. And loops are extremely self-consistent. So LLMs LOVE their loops!

By the way, "no no no, that's a reasoning loop, I got to break it" is a behavior that larger models learn by themselves under enough RLVR stress. But you need a lot of RLVR to get to that point. And sometimes this generalizes to what looks like the LLM just... getting bored by repetition of any kind. Who would have thought.


That’s actually exactly my point. You cannot fake it till you make it with ever-larger context windows. You have to map it back to actual system state. Giant context windows might progressively produce the illusion of working due to unfathomable scale, but they’re a terrible tool for the job.

LLMs are not stateful. A chat log is a truly shitty state tracker. An LLM will never be a good agent (beyond a conceivable illusion of unfathomable scale). A simple agent system that uses an LLM for most of its thinking operations could.
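A minimal sketch of that split between "state tracker" and "thinking operation", with a stand-in `llm_decide` function (hypothetical, not any real API): the agent keeps a small explicit state object, and each turn the model call sees only that state plus the latest observation, instead of an ever-growing chat log.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    facts: dict = field(default_factory=dict)   # what the agent knows now
    done: list = field(default_factory=list)    # actions taken so far

def llm_decide(state: AgentState, observation: str) -> str:
    """Stand-in for an LLM call. The prompt is rebuilt from the state
    object every turn, so context stays bounded instead of growing."""
    prompt = f"goal={state.goal} facts={state.facts} last={observation}"
    # Placeholder "reasoning"; a real system would send `prompt` to a model.
    return "act:" + observation.upper()

def agent_step(state: AgentState, observation: str) -> str:
    action = llm_decide(state, observation)
    state.facts["last_obs"] = observation        # update state, not transcript
    state.done.append(action)
    return action

s = AgentState(goal="demo")
print(agent_step(s, "ping"))   # act:PING
```

The design choice this illustrates: the state object is the system's memory, and the LLM is a stateless function applied to it, which is the "uses an LLM for most of its thinking operations" arrangement rather than "the chat log is the state."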



