My main use case would be RLAIF: given a prompt, a generation, and a code execution result, rank N alternative executions and their results to build preference data for DPO or other training patterns.
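To make that concrete, here is a rough sketch of turning N ranked executions into pairwise preference records. Everything named here (`Candidate`, `build_preference_pairs`, the `score` field) is hypothetical, and it assumes some AI judge has already scored each candidate. The `prompt`/`chosen`/`rejected` layout is just a common convention for preference data, not a requirement of any particular trainer.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    generation: str        # code the model produced for the prompt
    execution_result: str  # output captured from running that code
    score: float           # judge-assigned score (hypothetical; higher is better)


def render(candidate: Candidate) -> str:
    """Combine the generation and its execution result into one training string."""
    return f"{candidate.generation}\n# execution result:\n{candidate.execution_result}"


def build_preference_pairs(prompt: str, candidates: list[Candidate]) -> list[dict]:
    """Turn N scored candidates into (chosen, rejected) pairs."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    pairs = []
    # combinations() preserves input order, so `better` always ranks at or above `worse`.
    for better, worse in combinations(ranked, 2):
        if better.score == worse.score:
            continue  # ties carry no preference signal, so skip them
        pairs.append(
            {"prompt": prompt, "chosen": render(better), "rejected": render(worse)}
        )
    return pairs
```

With N candidates this yields up to N*(N-1)/2 pairs; keeping only the best-vs-worst pair is a reasonable alternative if that is too many.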
In complex use cases such as building a BI engineer, it’s useful to persist state across multiple function calls within the same interpreter.
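A minimal sketch of what I mean by persisted state, assuming plain in-process Python execution: each call runs against the same namespace, so variables defined in one call are visible in the next. `InterpreterSession` is a hypothetical name, and this has no sandboxing at all, so it only illustrates the behavior rather than a safe implementation.

```python
import contextlib
import io
import traceback


class InterpreterSession:
    """Hypothetical helper: runs code snippets against one shared namespace."""

    def __init__(self) -> None:
        # A single dict serves as the globals for every call, which is what
        # makes variables, imports, and functions survive between calls.
        self.namespace: dict = {"__name__": "__session__"}

    def run(self, code: str) -> str:
        """Execute `code` and return whatever it printed (or the traceback)."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            try:
                exec(code, self.namespace)  # NOTE: not sandboxed
            except Exception:
                # Report the failure as text; earlier state stays intact.
                buffer.write(traceback.format_exc())
        return buffer.getvalue()


session = InterpreterSession()
session.run("rows = [('north', 120), ('south', 80)]")           # call 1: load data
output = session.run("print(sum(value for _, value in rows))")  # call 2: reuse it
```

The second call can use `rows` only because both calls share `session.namespace`; with a fresh interpreter per call it would raise a `NameError`.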