> That would mean that there is never any hallucination.
No, it wouldn’t. If the LLM produces an output that does not match the training data, or claims things that are not in the training data, due to pseudorandom statistical processes, then that’s a hallucination. If it accurately represents the training data or the content of its context, it’s not a hallucination.
Similarly, if you ask an LLM to tell you something false and the information it provides is false, that’s not a hallucination.
> The point of original comment was distinguishing between fact and fiction,
In the context of LLMs, “fact” means something represented in the training set, not something factual in an absolute, philosophical sense.
If you put a lot of categorically false information into the training corpus and train an LLM on it, those pieces of information are “factual” in the context of the LLM output.
The key part of the parent comment:
> caused by the use of statistical process (the pseudo random number generator