Hacker News

>> It is already demonstrating general capabilities and performing a wide range of intellectual tasks, including those that it is not specifically trained on.

Huh? Isn't an LLM's capability fully constrained by the training data? Everything else is hallucinated.



You can argue that everything output by an LLM is hallucinated, since there's no difference under the hood between outputting useful information and outputting hallucinations.

The quality of the LLM then becomes how often it produces useful information. That score has gone up a lot in the past 18 months.

(Sometimes hallucinations are what you want: "Tell me a fun story about a dog learning calculus" is a valid prompt which mostly isn't meant to produce real facts about the world.)


Isn't it the case that the latest models actually hallucinate more than the ones that came before, despite best efforts to prevent it?


The o3 model card reports a so-far-unexplained uptick in hallucination rate compared to o1, on page 4 of https://cdn.openai.com/pdf/2221c875-02dc-4789-800b-e7758f372...

That is according to one specific internal OpenAI benchmark; I don't know if it's been replicated externally yet.


The critical discovery was a way to crack the “Frame Problem”, which roughly comes down to colloquial notions of common sense or intuition. For the first time ever, we have models that know if you jump off a stool, you will (likely!) be standing on the ground afterwards.

In that sense, they absolutely know things that aren’t in their training data. You’re correct about factual knowledge, though: that’s why they’re not trained to optimize it! A database (or PageRank?) solves that problem already.



