Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The ability to specify a context-free grammar as output constraint? This blows my mind. How do you control the auto regressive sampling to guarantee the correct syntax?


I assume they're doing "Structured Generation" or "Guided generation", which has been possible for a while if you control the LLM itself e.g. running an OSS model, e.g. [0][1]. It's cool to see a major API provider offer it, though.

The basic idea is: at each auto-regressive step (each token generation), instead of letting the model generate a probability distribution over "all tokens in the entire vocab it's ever seen" (the default), only allow the model to generate a probability distribution over "this specific set of tokens I provide". And that set can change from one sampling set to the next, according to a given grammar. E.g. if you're using a JSON grammar, and you've just generated a `{`, you can provide the model a choice of only which tokens are valid JSON immediately after a `{`, etc.

[0] https://github.com/dottxt-ai/outlines [1] https://github.com/guidance-ai/guidance


You sample only from tokens that could possibly result in a valid production for the grammar. It's an inference-only thing.


ah, thanks!




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: