
For a showcase of different LMQL queries and resulting model output, have a look at https://lmql.ai.

For each query, the LMQL runtime calls the underlying LM several times to execute the complete (multi-part) query program.

A query does not translate to a single LM prompt, but rather to a sequence of prompts. During generation, additional constraints are applied to the LM to ensure it behaves according to the provided template. That's how it supports control flow and external function calls during generation: it actually executes the queries with a proper runtime and only uses LLMs on the backend.
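
To make the idea concrete, here is a rough toy sketch in plain Python of what "a runtime that calls the LM several times" could look like. This is not LMQL syntax and not how LMQL is implemented: `toy_lm`, `constrained_generate`, `followup_question` and `run_query` are all made-up names, the "LM" is a hard-coded stub, and the constraint is applied by filtering whole candidate strings rather than by masking tokens during decoding.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical stand-in for an LM: returns candidate continuations,
# ordered from most to least preferred. A real LM would score tokens.
def toy_lm(prompt: str) -> List[str]:
    if prompt.endswith("Sentiment:"):
        return [" maybe", " positive", " negative"]
    return [" The food was fresh."]

def constrained_generate(lm: Callable[[str], List[str]],
                         prompt: str,
                         allowed: Optional[Set[str]] = None) -> str:
    """Return the most preferred candidate that satisfies the constraint."""
    for candidate in lm(prompt):
        text = candidate.strip()
        if allowed is None or text in allowed:
            return text
    raise ValueError("no candidate satisfies the constraint")

# An ordinary Python function, standing in for an external function call.
def followup_question(sentiment: str) -> str:
    return "What did they like?" if sentiment == "positive" else "What went wrong?"

def run_query(review: str,
              lm: Callable[[str], List[str]] = toy_lm) -> Dict[str, object]:
    prompts_sent: List[str] = []

    # First LM call: the output is constrained to one of two labels.
    prompt = f"Review: {review}\nSentiment:"
    prompts_sent.append(prompt)
    sentiment = constrained_generate(lm, prompt, allowed={"positive", "negative"})

    # Control flow and the external call run in the host program,
    # between LM calls, and decide what the next prompt looks like.
    prompt += f" {sentiment}\n{followup_question(sentiment)}"

    # Second LM call: unconstrained continuation of the extended prompt.
    prompts_sent.append(prompt)
    detail = constrained_generate(lm, prompt)

    return {"sentiment": sentiment, "detail": detail, "prompts_sent": prompts_sent}

if __name__ == "__main__":
    result = run_query("Loved it, would come again.")
    print(result["sentiment"], "|", result["detail"])
```

Note that the stub's preferred answer to the first prompt is "maybe", which the constraint rejects, so the runtime ends up with one of the allowed labels instead. The single "query" results in two separate LM calls, with the second prompt depending on the outcome of the first.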


