There is a limited amount of computation that you can usefully do in the absence of new input (like an LLM between prompts). Suppose you do as much computation as you usefully can (given your current algorithmic limits) in a burst immediately upon receiving a prompt, then output, and then go into a sleep state. That seems obviously better than receiving a prompt, outputting, and only then doing some of the computation that you could usefully have done.