Interesting that none of the new features (DALL·E 3, Advanced Data Analysis, Browse with Bing) are usable without enabling chat history (and therefore allowing your data to be used for training).
Logit-bias guidance goes a long way toward structured LLM output: regex constraints, context-free grammars, categorization, and typed construction. I'm working on a hosted, model-agnostic version of this with thiggle.
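A minimal sketch of the single-token classification trick behind logit-bias guidance, using OpenAI's `logit_bias` parameter (the model name and labels here are just illustrative):

```python
# Sketch: constrain a chat model to answer with one of a few single-token labels
# by biasing those tokens heavily and capping the response at one token.
# Assumes the `openai` and `tiktoken` packages; model and labels are illustrative.
import tiktoken
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
enc = tiktoken.encoding_for_model("gpt-4")

labels = ["Yes", "No"]
# Each label should encode to a single token for this trick to work cleanly.
allowed = {str(enc.encode(label)[0]): 100 for label in labels}

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Is the sky blue? Answer Yes or No."}],
    logit_bias=allowed,   # push the allowed tokens to the top of the distribution
    max_tokens=1,         # stop after the single label token
)
print(resp.choices[0].message.content)  # "Yes" or "No"
```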
We've found the same; we see a lot of usage through our LLM Categorization endpoint. The toughest problem was constraining the model to output only valid categories rather than hallucinating new ones, and to return exactly one category for single-label classification (or several when multi-label mode is on).
I just released a zero-shot classification API built on LLMs: https://github.com/thiggle/api. It always returns structured JSON and only the relevant categories/classes out of the ones you provide.
LLMs are excellent reasoning engines, but nudging them to the desired output is challenging. They might return categories outside the ones you specified. They might return multiple categories when you only want one (or the reverse: a single category when you want several). Even when you steer the model toward the correct answer, parsing the output can be difficult. Asking the LLM for structured data works 80% of the time, but the 20% where parsing the response fails eats 99% of your time and is unacceptable for most real-world use cases.
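A rough illustration of why those parsing failures dominate the engineering time: even a "return JSON" prompt ends up wrapped in validation and retry scaffolding. The categories, `call_llm` stand-in, and retry policy below are hypothetical:

```python
# Illustrative only: the validation/retry code you end up writing around
# "just ask the model for JSON". `call_llm` is a stand-in for any completion call.
import json

ALLOWED = {"billing", "bug", "feature_request"}

def classify(text: str, call_llm, max_retries: int = 3) -> list[str]:
    prompt = (
        "Classify the text into one or more of these categories: "
        f"{sorted(ALLOWED)}. Respond with a JSON list of category names only.\n\n"
        f"Text: {text}"
    )
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            cats = json.loads(raw)
        except json.JSONDecodeError:
            continue  # model wrapped the JSON in prose, markdown fences, etc.
        if isinstance(cats, list) and cats and set(cats) <= ALLOWED:
            return cats  # valid: at least one category, all from the allowed set
        # otherwise the model hallucinated a category or returned the wrong shape
    raise ValueError("no valid classification after retries")
```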
Hi there! I adore your blog. Quick question - you have an MBA from Stanford, and you're a software engineer rather than a 'manager'. Are there others like you? I was thinking of an MBA as an option, but was afraid my focus after that might not be technical enough.
"Quite a few people in business have paired a liberal arts undergrad degree with an MBA. They seem to do just fine. But I think that’s a missed opportunity—much better would be an MBA on top of an engineering or math undergraduate degree. People with that combination are invaluable, and there aren’t nearly enough of them running around."
I needed to read that today (Purdue computer eng + Harvard MBA here).
For a less dramatic strategy with LLMs that expose the tokenizer vocabulary, you can use a context-free grammar to constrain the logits according to the parser state, so the model can only generate tokens that are valid continuations in the language.[0]
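A toy sketch of the idea: at each decoding step, mask out every vocabulary entry that would make the partial output invalid, then sample from what remains. The "grammar" and the `logits_fn` stand-in below are made up for illustration; real implementations track incremental parser state over the full tokenizer vocabulary:

```python
# Toy sketch of grammar-constrained decoding. `is_valid_prefix` stands in for a
# real incremental parser; `logits_fn` stands in for the model's logits.
import math
import random

def is_valid_prefix(text: str) -> bool:
    # Stand-in "grammar": output must be a prefix of a digit string like "123".
    return text == "" or text.isdigit()

def constrained_decode(logits_fn, vocab: list[str], max_steps: int = 8) -> str:
    out = ""
    for _ in range(max_steps):
        logits = logits_fn(out)                      # model scores for each vocab entry
        allowed = [i for i, tok in enumerate(vocab)  # keep only grammar-valid continuations
                   if is_valid_prefix(out + tok)]
        if not allowed:
            break
        weights = [math.exp(logits[i]) for i in allowed]
        out += vocab[random.choices(allowed, weights=weights)[0]]
    return out
```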