> There are various results that suggest that LLMs do internally have everything they'd need to know that they're hallucinating/wrong:

The underlying requirement, which undercuts the claim that LLMs internally have "everything they'd need to know that they're hallucinating/wrong", is the premise all three papers share: external detection.

From the first arXiv abstract:

  Moreover, informed by the empirical observations, we show 
  great potential of using the guidance derived from LLM's 
  hidden representation space to mitigate hallucination.
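
That "guidance derived from the hidden representation space" is, in practice, an external probe trained on the model's activations. A minimal sketch of the idea (the activation vectors and the factual/hallucinated labels below are synthetic stand-ins, not the paper's actual setup):

  # Sketch: an external probe trained on the model's hidden states to flag
  # hallucinations. Note the detector sits outside the LLM, and the labels
  # come from somewhere else entirely.
  import numpy as np
  from sklearn.linear_model import LogisticRegression

  rng = np.random.default_rng(0)

  # Synthetic stand-ins: in practice these would be activation vectors taken
  # from the LLM during generation, labeled factual (0) or hallucinated (1)
  # by a person or some external oracle.
  hidden_states = rng.normal(size=(1000, 768))
  labels = rng.integers(0, 2, size=1000)

  probe = LogisticRegression(max_iter=1000).fit(hidden_states, labels)

  # At inference time the probe, not the LLM, decides whether to distrust output.
  new_state = rng.normal(size=(1, 768))
  print("p(hallucination) =", probe.predict_proba(new_state)[0, 1])
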
From the second arXiv abstract:

  Using this basic insight, we illustrate that one can 
  identify hallucinated references without ever consulting 
  any external resources, by asking a set of direct or 
  indirect queries to the language model about the 
  references. These queries can be considered as "consistency 
  checks."
From the Nature abstract:

  Researchers need a general method for detecting 
  hallucinations in LLMs that works even with new and unseen 
  questions to which humans might not know the answer. Here 
  we develop new methods grounded in statistics, proposing 
  entropy-based uncertainty estimators for LLMs to detect a 
  subset of hallucinations—confabulations—which are arbitrary 
  and incorrect generations.
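
The Nature method (semantic entropy) is likewise an outer loop: sample several answers, group the ones that mean the same thing, and take the entropy over the groups. Roughly, with both the samples and the equivalence check supplied from outside the model:

  # Sketch: entropy over clusters of semantically equivalent sampled answers.
  # High entropy = the answers scatter across meanings = likely confabulation.
  # Both the samples and means_the_same() are supplied from outside the model.
  import math

  def semantic_entropy(answers: list[str], means_the_same) -> float:
      # Greedily cluster answers by meaning using the supplied equivalence check
      # (the paper uses an entailment model; exact match works for a toy demo).
      clusters: list[list[str]] = []
      for a in answers:
          for c in clusters:
              if means_the_same(a, c[0]):
                  c.append(a)
                  break
          else:
              clusters.append([a])
      # Shannon entropy over the cluster sizes.
      n = len(answers)
      return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

  # Toy usage, with exact string match standing in for real entailment:
  print(semantic_entropy(["Paris", "Paris", "Lyon"], lambda a, b: a == b))
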
Ultimately, no matter what content is generated, it is up to a person to provide the understanding component.

> So I don't think it's that they have no concept of correctness, they do, but it's not strong enough.

Again, "correctness" is a determination solely made by a person evaluating a result in the context of what the person accepts, not intrinsic to an algorithm itself. All an algorithm can do is attempt to produce results congruent with whatever constraints it is configured to satisfy.


