
> What we call "hallucinations" is far more similar to what we would call "inventiveness", "creativity", or "imagination" in humans ...

No.

What people call LLM "hallucinations" is the result of a PRNG[0] influencing an algorithm to pursue a less statistically probable branch, with neither regard for nor understanding of what it produces.

0 - https://en.wikipedia.org/wiki/Pseudorandom_number_generator
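
For concreteness, a minimal sketch of the mechanism I mean (toy logits and token strings invented for illustration; a real model scores tens of thousands of tokens):

  import math
  import random

  def sample_next_token(logits, temperature=1.0, seed=None):
      # The PRNG is the only source of variation here.
      rng = random.Random(seed)
      scaled = [v / temperature for v in logits.values()]
      m = max(scaled)
      exps = [math.exp(s - m) for s in scaled]   # numerically stable softmax
      total = sum(exps)
      probs = [e / total for e in exps]
      # Weighted draw: higher temperature flattens probs, so less
      # probable branches get picked more often.
      return rng.choices(list(logits), weights=probs, k=1)[0]

  # Toy next-token scores, made up for illustration only.
  logits = {"Paris": 5.0, "Lyon": 2.0, "Berlin": 1.0}
  print(sample_next_token(logits, temperature=1.5, seed=42))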



That seems to be giving the system too much credit. Like "reduce the temperature and they'll go away." A more probable next word based on a huge general corpus of text is not necessarily a more correct one for a specific situation.

Consider errors like "this math library will have this specific function" (based on a hundred other math libraries for other languages usually having it).
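
Toy illustration of that failure mode (names and probabilities invented): even at temperature zero, pure argmax just returns whatever other libraries made statistically common, not what this particular library actually exports.

  # Hypothetical next-token distribution for "math.<function>" (numbers made up).
  candidates = {
      "clamp": 0.46,   # common in many other languages' math libs
      "clip":  0.31,   # what this particular library actually calls it
      "bound": 0.14,
      "limit": 0.09,
  }
  best = max(candidates, key=candidates.get)
  print(best)  # "clamp": the most probable token, and wrong for this library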


> That seems to be giving the system too much credit. Like "reduce the temperature and they'll go away." A more probable next word based on a huge general corpus of text is not necessarily a more correct one for a specific situation.

I believe we are saying the same thing here. My clarification to the OP's statement:

  What we call "hallucinations" is far more similar to what 
  we would call "inventiveness", "creativity", or 
  "imagination" in humans ...

was that the algorithm has no concept of correctness (nor of the other anthropomorphic attributes cited), but instead relies on pseudo-randomness to vary search paths when generating text.


There are various results that suggest that LLMs do internally have everything they'd need to know that they're hallucinating/wrong:

https://arxiv.org/abs/2402.09733

https://arxiv.org/abs/2305.18248

https://www.ox.ac.uk/news/2024-06-20-major-research-hallucin...

So I don't think it's that they have no concept of correctness; they do, but it's not strong enough. We're probably just not training them in ways that optimize for that over other desirable qualities, at least not aggressively enough.
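
Rough flavor of what those papers probe for, as a sketch only (random vectors stand in for real hidden states, and a plain logistic-regression probe stands in for whatever those papers actually fit):

  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)

  # Stand-ins: 200 generated statements, a 768-dim activation for each,
  # and a ground-truth label for whether the statement was correct.
  hidden_states = rng.normal(size=(200, 768))
  was_correct = rng.integers(0, 2, size=200)

  probe = LogisticRegression(max_iter=1000).fit(hidden_states, was_correct)

  # If a cheap probe like this beats chance on held-out data, information
  # about correctness is present in the activations.
  new_activation = rng.normal(size=(1, 768))
  print(probe.predict_proba(new_activation)[0, 1])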

It's also clear to anyone who has used many different models over the years that the amount of hallucination goes down as the models get better, even without any special attention (apparently) being paid to that problem. GPT 3.5 was REALLY bad about this stuff, but 4o and o1 are at least mediocre. So it may be that it's just one of the tougher things for a model to figure out, even if it's possible with massive capacity and compute. But I'd say it's very clear that we're not in the world Gary Marcus wishes we were in, where there's some hard and fundamental limitation that keeps a transformer network from being more truthful as it gets better; rather, as with every other capability, we just aren't as far along as we'd prefer.


> There are various results that suggest that LLMs do internally have everything they'd need to know that they're hallucinating/wrong

We need better definitions of what expectations people can reasonably have for detecting incoherence and self-contradiction, given that humans are horrible at spotting it ourselves (except in comparison to things that don't produce meaningful language in the general case). We all hold contradictory worldviews and are therefore capable of rationally arriving at conclusions that are trivially and empirically incoherent. I think "hallucinations" (a horribly, horribly named term) are just an intractable burden of applying finite, lossy filters to a virtually continuous and infinitely detailed reality; language itself is a sort of ad-hoc, buggy consensus algorithm that's been sufficient to reproduce.

But yea if you're looking for a coherent and satisfying answer on idk politics, values, basically anything that hinges on floating signifiers, you're going to have a bad time.

(Or perhaps you're just hallucinating understanding and agreement: there are many phrases in the English language that read differently based on expected context and tone. It wouldn't surprise me if some models tended towards producing ambiguous or tautological semantics, pleasingly hedged or "responsibly" moderated, aka PR.)

Personally, I don't think it's a problem. If you are willing to believe what a chatbot says without verifying it, there's little advice I could give you that would help. It's also good training to remind yourself that confidence is a poor signal for correctness.


> There are various results that suggest that LLMs do internally have everything they'd need to know that they're hallucinating/wrong:

The premise all three papers share, and the one that undercuts the claim that an LLM has "everything they'd need to know that they're hallucinating/wrong", is external detection.

From the first arXiv abstract:

  Moreover, informed by the empirical observations, we show 
  great potential of using the guidance derived from LLM's 
  hidden representation space to mitigate hallucination.

From the second arXiv abstract:

  Using this basic insight, we illustrate that one can 
  identify hallucinated references without ever consulting 
  any external resources, by asking a set of direct or 
  indirect queries to the language model about the 
  references. These queries can be considered as "consistency 
  checks."

From the abstract of the Nature paper covered by the Oxford link:

  Researchers need a general method for detecting 
  hallucinations in LLMs that works even with new and unseen 
  questions to which humans might not know the answer. Here 
  we develop new methods grounded in statistics, proposing 
  entropy-based uncertainty estimators for LLMs to detect a 
  subset of hallucinations—confabulations—which are arbitrary 
  and incorrect generations.

Ultimately, no matter what content is generated, it is up to a person to provide the understanding component.
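
For what it's worth, the "consistency check" / entropy idea is simple to sketch. The real Nature method clusters answers by meaning rather than by string, and ask_model below is a placeholder for whatever sampling call you have, not any particular API:

  import collections
  import math

  def answer_entropy(ask_model, question, n_samples=10):
      # Sample the model several times and measure disagreement.
      # 0 bits: every sample agreed; ~log2(n) bits: every sample differed.
      answers = [ask_model(question).strip().lower() for _ in range(n_samples)]
      counts = collections.Counter(answers)
      total = sum(counts.values())
      return -sum((c / total) * math.log2(c / total) for c in counts.values())

  # High entropy flags a likely confabulation; note this is still detection
  # from the outside, by querying the model, not the model "knowing" anything
  # on its own.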

> So I don't think it's that they have no concept of correctness, they do, but it's not strong enough.

Again, "correctness" is a determination solely made by a person evaluating a result in the context of what the person accepts, not intrinsic to an algorithm itself. All an algorithm can do is attempt to produce results congruent with whatever constraints it is configured to satisfy.


We really need an idiom for the behavior of being technically correct while absolutely destroying the prospect of interesting conversation. With this framing we might as well go back to arguing over which rock our local river god has blessed with greater utility. I'm not actually entirely convinced humans are capable of understanding much when the level of discussion desired is this low.

Critically, creation requires neither intent nor understanding. Neither does recombination, nor reformulation. The only thing intent is necessary for is creating something meaningful to humans, which is handily taken care of via the prompt and training material, just as with humans.

(If you can't tell, I thought we had bypassed the neuroticism over whether or not data counts as "understanding", whatever that means to people, on week 2 of LLMs)


> We really need an idiom for the behavior of being technically correct but absolutely destroying the prospect of interesting conversation.

While it is not an idiom, the applicable term is likely pedantry[0].

> I'm not actually entirely convinced humans are capable of understanding much when discussion desired is this low quality.

Ignoring the judgemental qualifier, consider your original post to which I replied:

  What we call "hallucinations" is far more similar to what
  we would call "inventiveness", "creativity", or
  "imagination" in humans ...

The term for this behavior is anthropomorphism[1]: ascribing human behaviors and motivations to algorithmic constructs.

> Critically, creation does not require intent nor understanding. Neither does recombination; neither reformulation.

The same can be said for a random number generator and a permutation algorithm.
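
Taken literally (a toy example, obviously):

  import random

  # A PRNG plus a permutation "recombines" text with zero intent or
  # understanding; any meaning in the output is supplied by the reader.
  rng = random.Random(7)
  words = "creation does not require intent nor understanding".split()
  rng.shuffle(words)
  print(" ".join(words))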

> (If you can't tell, I thought we had bypassed the neuroticism over whether or not data counts as "understanding", whatever that means to people, on week 2 of LLMs)

If you can't tell, I differentiate between humans and algorithms, no matter the cleverness observed of the latter, as only the former can possess "understanding."

0 - https://www.merriam-webster.com/dictionary/pedant

1 - https://www.merriam-webster.com/dictionary/anthropomorphism



