The ability to specify a context-free grammar as output constraint? This blows m...

evnc · 2025-08-07T18:30:06 1754591406

I assume they're doing "structured generation" or "guided generation", which has been possible for a while if you control the LLM itself, e.g. by running an open-source model with a library like [0] or [1]. It's cool to see a major API provider offer it, though.

The basic idea is: at each autoregressive step (each token generation), instead of sampling from the model's probability distribution over every token in its vocabulary (the default), you restrict sampling to a specific set of tokens that you provide. That set can change from one sampling step to the next, according to a given grammar. E.g. if you're using a JSON grammar and you've just generated a `{`, you only let the model choose among tokens that are valid JSON immediately after a `{`, and so on.
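To make the masking step concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `VOCAB` is a toy six-token vocabulary, `ALLOWED_NEXT` is a hand-written lookup table standing in for a real grammar, and `fake_logits` returns random numbers where a real model would return its scores. It is not how [0] or [1] implement this. A real context-free grammar also needs a parser with a stack (e.g. for nested JSON), not a last-token table.

```python
import math
import random

# Toy vocabulary (hypothetical); a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"key"', ":", "1", ","]

# Hypothetical stand-in for a grammar: which tokens may follow the previous one.
# It accepts strings such as {} or {"key":1,"key":1}.
ALLOWED_NEXT = {
    "<start>": {"{"},
    "{": {"}", '"key"'},
    '"key"': {":"},
    ":": {"1"},
    "1": {",", "}"},
    ",": {'"key"'},
}


def masked_softmax(logits, allowed):
    """Softmax over VOCAB after setting disallowed tokens' logits to -inf."""
    masked = [l if tok in allowed else -math.inf for tok, l in zip(VOCAB, logits)]
    peak = max(masked)  # finite as long as `allowed` is non-empty
    exps = [math.exp(l - peak) for l in masked]  # exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]


def fake_logits(rng):
    """Placeholder for a model forward pass: one random score per token."""
    return [rng.uniform(-2.0, 2.0) for _ in VOCAB]


def generate(rng, max_steps=20):
    """Sample token by token, masking to the grammar-allowed set at each step."""
    out = []
    state = "<start>"
    for _ in range(max_steps):
        probs = masked_softmax(fake_logits(rng), ALLOWED_NEXT[state])
        tok = rng.choices(VOCAB, weights=probs, k=1)[0]
        out.append(tok)
        if tok == "}":  # closing brace ends the toy object
            break
        state = tok
    return "".join(out)


if __name__ == "__main__":
    print(generate(random.Random(0)))
```

The model's weights are never touched: the only change is that disallowed tokens get zero probability before sampling, and the allowed set is recomputed after every token. Note that the sketch can hit `max_steps` before emitting a closing `}`, so the output is always a valid prefix but not necessarily a complete string.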

[0] https://github.com/dottxt-ai/outlines
[1] https://github.com/guidance-ai/guidance

qsort · 2025-08-07T18:03:38 1754589818

You sample only from tokens that could possibly result in a valid production for the grammar. It's an inference-only thing.

low_tech_punk · 2025-08-07T18:05:29 1754589929

ah, thanks!