
I'm now running a follow-up poll on whether or not "there are 3 Bs in blueberry" should count as a hallucination and the early numbers are much closer - currently 41% say it is, 59% say it isn't. https://twitter.com/simonw/status/1953777495309746363
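
For reference, the count itself is easy to verify outside the model; a quick Python check (nothing model-specific, just string counting) shows "blueberry" contains two b's, which is what makes the "3 Bs" answer wrong in the first place:

  # minimal sketch: count occurrences of the letter b in "blueberry"
  word = "blueberry"
  count = word.lower().count("b")
  print(word, "has", count, "b's")  # -> blueberry has 2 b's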


So? That doesn't change the fact that it fits the formal definition. Just because LLM companies have fooled a bunch of people into thinking they are different doesn't make it true.

If they were different things (objectively, not just "in my opinion these things are different"), then they'd be handled differently. Internally they are the exact same thing: wrong statistics, and they are "solved" the same way: more training and more data.

Edit: even the "fabricated fact" definition is subjective. To me, the model saying "this is in first person" is the model confidently presenting a wrong thing as fact.


What I've learned from the Twitter polls is to avoid the word "hallucination" entirely, because it turns out there are enough people out there with differing definitions that it's not a useful shorthand for clear communication.


This just seems like goalpost shifting to make it sound like these models are more capable than they are. Oh, it didn't "hallucinate" (a term which I think sucks because it anthropomorphizes the model), it just "fabricated a fact" or "made an error".

It doesn't matter what you call it, the output was wrong. And it's not like something new and different is going on here vs whatever your definition of a hallucination is: in both cases the model predicted the wrong sequence of tokens in response to the prompt.



