
My main use case would be RLAIF: given a prompt, a generation, and its code execution result, rank N alternative executions and their execution results to produce preference data for DPO or other training patterns.
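To make the ranking step concrete, here is a rough sketch of turning N scored candidates into preference pairs. Everything in it is an assumption for illustration: the `Candidate` class, the `score` field (standing in for whatever an AI judge returns), and the `prompt`/`chosen`/`rejected` record layout are hypothetical, not taken from any particular tool.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Candidate:
    code: str      # the generated code
    result: str    # captured execution result for that code
    score: float   # hypothetical AI-judge score; higher is better


def preference_pairs(prompt, candidates):
    """Build (prompt, chosen, rejected) records from N scored candidates."""
    # Rank best-first so each combination is ordered (better, worse).
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    pairs = []
    for better, worse in combinations(ranked, 2):
        if better.score == worse.score:
            continue  # a tie carries no preference signal
        pairs.append(
            {"prompt": prompt, "chosen": better.code, "rejected": worse.code}
        )
    return pairs
```

With N candidates this yields up to N*(N-1)/2 pairs, so in practice you might keep only the best-vs-worst pair per prompt.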

In complex use cases, like building a BI engineer, it’s useful to persist state across multiple function calls within the same interpreter.
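A minimal sketch of what that persistence could look like, assuming a plain in-process Python interpreter. The `PersistentInterpreter` class is hypothetical, and `exec` here is not sandboxed, so this only illustrates the shared-state idea and is not a safe way to run untrusted model output.

```python
import contextlib
import io


class PersistentInterpreter:
    """Runs code snippets against one shared namespace (hypothetical helper)."""

    def __init__(self):
        # Variables defined by one call stay here for later calls.
        self.namespace = {}

    def run(self, source):
        """Execute source and return whatever it printed to stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, self.namespace)
        return buffer.getvalue()


session = PersistentInterpreter()
session.run("rows = [3, 5, 8]")             # first call defines state
output = session.run("print(sum(rows))")    # second call reuses it
```

The second call can see `rows` only because both calls share `session.namespace`; a fresh interpreter per call would raise a `NameError` instead.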


