rckrd's comments | Hacker News

Interesting that none of the new features (DALLE-3, Advanced Data Analysis, Browse with Bing) are usable without enabling history (and therefore, using your data for training).


You can literally disable that here without it impacting your history or the use of those features:

https://docs.google.com/forms/d/e/1FAIpQLScrnC-_A7JFs4LbIuze...


Yeah, nothing suspicious about a random Google Doc instead of a direct link to OpenAI. I would suggest nobody fill this out.



I stand corrected, thanks. What an odd way for them to collect data, considering they seem more than capable of making a form on their own website...


I've also compiled a list of leaked system prompts from various applications.

[0] https://matt-rickard.com/a-list-of-leaked-system-prompts


Logit-bias guidance goes a long way -- LLM structure for regex, context-free grammars, categorization, and typed construction. I'm working on a hosted and model-agnostic version of this with Thiggle.

[0] https://thiggle.com
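To make the logit-bias trick concrete, here's a minimal sketch using the OpenAI Python SDK and tiktoken (the model and prompt are just placeholders): bias two token ids so heavily that the completion is effectively forced to be "yes" or "no".

    import tiktoken
    from openai import OpenAI

    client = OpenAI()
    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

    # +100 bias makes these tokens (near-)certain; everything else is effectively off the table
    bias = {str(enc.encode(word)[0]): 100 for word in ("yes", "no")}

    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Is Rust memory-safe? Answer yes or no."}],
        logit_bias=bias,
        max_tokens=1,  # one constrained token is the whole answer
    )
    print(resp.choices[0].message.content)  # "yes" or "no"

The same recipe generalizes: compute the token ids you're willing to accept, bias them up, and cap max_tokens.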


We use a similar trick and expose it via an API. Much easier to parse when you can guarantee the shape of the output.

[0] https://thiggle.com/


We've found the same -- a lot of usage comes through our LLM Categorization endpoint. The toughest problem was constraining the model to output only valid categories and not hallucinate new ones, and to return only one category for single-label classification (or multiple if that's the mode).

[0] https://matt-rickard.com/categorization-and-classification-w...
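To illustrate that constraint (this is the general logit-bias recipe, not the endpoint's actual implementation): map each category to a single-token numeric label and allow only those tokens, so the model can't invent a new category, and single-label mode falls out of max_tokens=1.

    import tiktoken
    from openai import OpenAI

    client = OpenAI()
    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

    categories = ["billing", "bug report", "feature request"]  # example labels
    bias = {str(enc.encode(str(i))[0]): 100 for i in range(len(categories))}

    prompt = (
        "Classify the message into exactly one category.\n"
        + "\n".join(f"{i}: {c}" for i, c in enumerate(categories))
        + "\nMessage: The app crashes when I upload a photo.\nCategory number:"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        logit_bias=bias,
        max_tokens=1,  # single-label mode: exactly one constrained token
    )
    print(categories[int(resp.choices[0].message.content)])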


In more impressive news, "38% of code generated by GPT-4 does not contain API misuses"


Seems to be already surpassing humans.


I also released a hosted version of my open-source libraries ReLLM and ParserLLM, which already supports APIs for:

* Regex completion for LLMs

* Context-free Grammar completion for LLMs

https://thiggle.com/

[0] https://github.com/r2d4/rellm

[1] https://github.com/r2d4/parserllm

[2] https://github.com/thiggle/api

There's also another API on Thiggle that I've built that supports classification via a similar logit-based strategy.
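For anyone curious how the regex completion works under the hood, here's a simplified sketch against Hugging Face transformers (not the actual ReLLM API or the hosted endpoint): greedily pick the highest-scoring token that keeps the generated text a partial match of the pattern.

    import regex
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def regex_complete(prompt: str, pattern: str, max_new_tokens: int = 20) -> str:
        compiled = regex.compile(pattern)
        generated = ""
        input_ids = tok(prompt, return_tensors="pt").input_ids
        for _ in range(max_new_tokens):
            with torch.no_grad():
                logits = model(input_ids).logits[0, -1]
            # walk candidates from most to least likely; keep the first one
            # whose text is still a (partial) match of the pattern
            for cand in torch.argsort(logits, descending=True).tolist():
                piece = tok.decode([cand])
                if compiled.fullmatch(generated + piece, partial=True):
                    generated += piece
                    input_ids = torch.cat([input_ids, torch.tensor([[cand]])], dim=1)
                    break
            else:
                break  # no token can extend a valid match
            if compiled.fullmatch(generated):
                return generated  # pattern fully satisfied
        return generated

    print(regex_complete("My phone number is ", r"\d{3}-\d{3}-\d{4}"))

Scanning the whole vocabulary each step is slow but makes the idea obvious; the real libraries apply the same filter more efficiently.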


I just released a zero-shot classification API built on LLMs: https://github.com/thiggle/api. It always returns structured JSON and only the relevant categories/classes out of the ones you provide.

LLMs are excellent reasoning engines. But nudging them to the desired output is challenging. They might return categories outside the ones you specified. They might return multiple categories when you only want one (or the opposite: a single category when you want multiple). Even if you steer the model toward the correct answer, parsing the output can be difficult. Asking the LLM to output structured data works 80% of the time. But the 20% of the time when parsing the response fails eats 99% of your time and is unacceptable for most real-world use cases.

[0] https://twitter.com/mattrickard/status/1678603390337822722
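The request/response shape looks roughly like this (endpoint path and field names are illustrative placeholders, not the documented contract -- see the repo for the real one):

    import requests

    resp = requests.post(
        "https://api.example.com/v1/categorize",  # placeholder URL; see the repo for the real endpoint
        headers={"Authorization": "Bearer <api-key>"},
        json={
            "prompt": "The screen arrived cracked and support never replied.",
            "categories": ["shipping", "product quality", "customer service"],
            "allow_multiple": True,
        },
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json()["choices"])  # e.g. ["product quality", "customer service"]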


https://matt-rickard.com

779 blog posts. Writing about engineering, startups, math, and AI.

Many of the posts have rich discussions on HN. You can see the top ones here: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

---

* Reflections on 10k Hours of Programming (421 points) - https://news.ycombinator.com/item?id=28086836

* Don't Use Kubernetes Yet (306 points) - https://news.ycombinator.com/item?id=31795160

* Google search's death by a thousand cuts (292 points) - https://news.ycombinator.com/item?id=36564042

* The Unreasonable Effectiveness of Makefiles (256 points) - https://news.ycombinator.com/item?id=32438616

* I Miss the Programmable Web (248 points) - https://news.ycombinator.com/item?id=32284375

* What Comes After Git? (227 points) - https://news.ycombinator.com/item?id=31984450

---

RSS Feed: https://matt-rickard.com/rss

Email list: https://matt-rickard.com/subscribe


Hi there! I adore your blog. Quick question - you have an MBA from Stanford, and you're a software engineer rather than a 'manager'. Are there others like you? I was thinking of an MBA as an option, but was afraid my focus after that might not be technical enough.


Thank you! I'm waiting to write this post (I follow Patrick Collison's advice methodology -- wait 10 years before you can accurately reflect [0]).

But here are Marc Andreessen's thoughts:

> "Seek to be a double/triple/quadruple threat."

He talks about the MBA + Undergrad Engineering combo in this blog post. https://fictivekin.github.io/pmarchive-jekyll/guide_to_caree...

[0] https://patrickcollison.com/advice


> I follow Patrick Collison's advice methodology -- wait 10 years before you can accurately reflect

I wish more people did so. The signal to noise ratio of the content on the internet would improve a whole lot.


"Quite a few people in business have paired a liberal arts undergrad degree with an MBA. They seem to do just fine. But I think that’s a missed opportunity—much better would be an MBA on top of an engineering or math undergraduate degree. People with that combination are invaluable, and there aren’t nearly enough of them running around."

I needed to read that today. (Purdue computer eng + Harvard MBA here.)


Thank you for the link! Is it cool if I email you with further questions?


Yep -- matt (at) matt-rickard.com


Love your blog posts! You write new posts daily -- what time of day do you set aside for writing, and how long does each session take on average?


Hi Matt! I really liked the format/layout/style of your blog. Could you share what you’ve used to build it?


Thanks for posting - I'd been looking for your site after forgetting what it was!


I love your blog. I have been following your posts for about the last two years.


Thank you!


Could I ask how to make such an email list?


It used to be self-hosted, but I recently moved the list to Substack to make it easier for readers if they have an existing Substack account.


For a less dramatic strategy with LLMs that expose the tokenizer vocabulary, you can use context-free grammars to constrain the logits according to the parser so that the LLMs only generate valid next tokens for the language. [0]

[0] https://github.com/r2d4/parserllm
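A minimal sketch of that masking step with Hugging Face transformers (not the ParserLLM implementation; allowed_next_token_ids stands in for whatever the grammar's parser computes for the current prefix):

    import torch
    from transformers import LogitsProcessor

    class GrammarLogitsProcessor(LogitsProcessor):
        def __init__(self, tokenizer, allowed_next_token_ids):
            # allowed_next_token_ids: assumed callable mapping the decoded prefix
            # to the set of token ids the grammar permits next
            self.tokenizer = tokenizer
            self.allowed_next_token_ids = allowed_next_token_ids

        def __call__(self, input_ids, scores):
            prefix = self.tokenizer.decode(input_ids[0])
            allowed = list(self.allowed_next_token_ids(prefix))
            mask = torch.full_like(scores, float("-inf"))
            mask[:, allowed] = 0.0
            return scores + mask  # invalid tokens get -inf, valid ones keep their score

Passed to model.generate() via logits_processor, sampling can then only ever pick grammar-valid tokens.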

