Every day I see people treat gen AI like a thinking human; Dijkstra's attitude about anthropomorphizing computers is vindicated even more.
That said, I think the author's use of "bag of words" here is a mistake. Not only does it have a real meaning in a similar area as LLMs, but I don't think the metaphor explains anything. Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.
The statement "no LLMs are thinking like humans" is logically equivalent to its converse, "no humans are thinking like LLMs"
And I do not believe we actually understand human thinking well enough to make that assertion.
Indeed, it is my deep suspicion that we will eventually achieve AGI not by totally abandoning today's LLMs for some other paradigm, but rather by embedding them in a loop with the right persistence mechanisms.
Given that LLMs are incapable of synthetic a priori knowledge while humans are capable of it, I would say that, as the tech currently stands, it's reasonable to make both of those statements.
The loop, or more precisely the "search", does the novel part of thinking; the brain is just optimizing this process. Evolution managed with the simplest possible model, copying with occasional errors, and in one run it made every one of us. The moral: if you scale the search, the model can be dumb.
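A toy sketch of that claim: the "model" below is the dumbest one imaginable, copying with occasional errors, and search alone does all the work of reaching the target (the target string and mutation rate are arbitrary choices for illustration):

```python
import random

random.seed(0)  # reproducible run

TARGET = "methinks"
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def fitness(s: str) -> int:
    # Number of positions that already match the target.
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s: str, rate: float = 0.1) -> str:
    # "Copying with occasional errors": each character may be miscopied.
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in s)

best = "".join(random.choice(ALPHABET) for _ in TARGET)
while best != TARGET:
    child = mutate(best)
    if fitness(child) >= fitness(best):  # keep the copy if it's no worse
        best = child
print(best)  # reaches the target on search alone
```

Nothing in the loop "understands" the target; selection over dumb copies is enough.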
Let’s not underestimate the scale of the search that led to us, though, even though you may be right in principle. In addition to deep time on earth, we may well be just part of a tiny fraction of a universe-wide and mostly fruitless search.
Yea, bag of words isn’t helpful at all. I really do think that “superpowered sentence completion” is the best description. Not only is it reasonably accurate, it is understandable (everyone has seen an autocomplete function) and it’s useful. I don’t know how to “use” a bag of words. I do know how to use sentence completion. It also helps explain why context matters.
That's the thing: when you use an ask/answer mechanism, you are just writing a "novel" where "User:" asks and "personal coding assistant:" answers. But all the text goes into the autocomplete function, and the "toaster" outputs the most probable continuation according to that function.
It's useful, it's amazing, but as the original text says, thinking of it as "some intelligence with reasoning" makes us use the wrong mental models for it.
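A toy sketch of that framing, with a bigram counter standing in for the model (a real LLM is vastly more sophisticated, but the interface is the same): the whole conversation, role labels included, is one string, and the "model" just appends likely continuations.

```python
from collections import Counter, defaultdict

# Tiny toy corpus; bigram counts record which word tends to follow which.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def complete(words, n=5):
    words = list(words)
    for _ in range(n):
        options = follows.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])  # most probable next word
    return " ".join(words)

# The whole "conversation", role labels included, is just one string of text.
transcript = "User: where did the cat sit ?\nAssistant: the cat"
print(complete(transcript.split()[-2:], n=4))
```

The "User:"/"Assistant:" structure is part of the prompt text, not part of the model.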
It's not just the pretraining, it's the entire scaffolding between the user and the LLM itself that contributes to the illusion. How many people would continue assuming that these chatbots were conscious or intelligent if they had to build their own context manager, memory manager, system prompt, personality prompt, and interface?
I agree 100%. Most people haven't actually interacted directly with an LLM before. Most people's experience with LLMs is ChatGPT, Claude, Grok, or any of the other tools that automatically handle context, memory, personality, temperature, and are deliberately engineered to have the tool communicate like a human. There is a ton of very deterministic programming that happens between you and the LLM itself to create this experience, and much of the time when people are talking about the ineffable intelligence of chatbots, it's because of the illusion created by this scaffolding.
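A minimal sketch of that scaffolding, with a hypothetical `raw_complete` standing in for the bare model call; everything that feels like a "person" here is ordinary, deterministic glue code:

```python
# Hedged sketch: `raw_complete` is a hypothetical stand-in for a bare model
# call; a real one would return the model's continuation of the prompt.
def raw_complete(prompt: str) -> str:
    return "I'm doing well, thanks for asking!"  # canned, for illustration

SYSTEM = "You are a friendly, helpful assistant named Ada."

class Chatbot:
    def __init__(self):
        self.history = []  # "memory": prior turns, replayed on every call

    def ask(self, user_msg: str) -> str:
        self.history.append(f"User: {user_msg}")
        # Persona, memory, and turn structure all live in this string.
        # None of it is inside the model.
        prompt = SYSTEM + "\n" + "\n".join(self.history) + "\nAssistant:"
        reply = raw_complete(prompt)
        self.history.append(f"Assistant: {reply}")
        return reply

bot = Chatbot()
print(bot.ask("How are you today?"))
```

Strip this wrapper away and what's left is a stateless text-completion function.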
Bag of words is actually the perfect metaphor. The data structure is a bag. The output is a word. The selection strategy is opaquely undefined.
> Gen AI tricks laypeople into treating its token inferences as "thinking" because it is trained to replicate the semiotic appearance of doing so. A "bag of words" doesn't sufficiently explain this behavior.
Something about there being significant overlap between the smartest bears and the dumbest humans. Sorry you[0] were fooled by the magic bag.
[0] in the "not you, the layperson in question" sense
I think it's still a bit of a tortured metaphor. LLMs operate on tokens, not words. And to describe their behavior as pulling the right word out of a bag is so vague that it applies every bit as much to a Naive Bayes model written in Python in 10 minutes as it does to the greatest state of the art LLM.
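For the record, here's what a literal bag of words is in the classic NLP sense: word order is thrown away entirely, which is exactly what an LLM does not do.

```python
from collections import Counter

# A literal bag of words: order is discarded, only counts survive.
def bag(text: str) -> Counter:
    return Counter(text.lower().split())

a = bag("the dog bit the man")
b = bag("the man bit the dog")
print(a == b)  # True: opposite meanings, identical bags
```

Any metaphor under which those two sentences are indistinguishable can't explain what LLMs do.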
Yeah. I have a half-cynical/half-serious pet theory that a decent fraction of humanity has a broken theory of mind and thinks everyone has the same thought patterns they do. If it talks like me, it thinks like me.
Whenever the comment section takes a long hit and goes "but what is thinking, really" I get slightly more cynical about it lol
By now, it's pretty clear that LLMs implement abstract thinking - as do humans.
They don't think exactly like humans do - but they sure copy a lot of human thinking, and end up closer to it than just about anything that's not a human.
It isn't clear because they do none of that lol. They don't think.
It can kinda sorta look like thinking if you don't have a critical eye, but it really doesn't take much to break the illusion.
I really don't get this obsessive need to pretend your tools are alive. Y'all know when you watch YouTube that it's a trick and the tiny people on your screen don't live in your computer, right?
And how do you know that exactly? What is the source of that certainty? What makes you fully confident that a system that can write short stories and one-shot Python scripts and catch obscure pop culture references in text isn't "thinking" in any way?
The answer to that is the siren song of "AI effect".
Even admitting "we don't know" requires letting go of the idea that "thinking" must be exclusive to humans. And many are far too weak to do that.
Spoken Query Language? Just like SQL, but for unstructured blobs of text as a database and unstructured language as a query? Also known as Slop Query Language or just Slop Machine for its unpredictable results.
> Spoken Query Language? Just like SQL, but for unstructured blobs of text as a database and unstructured language as a query?
I feel that's more a description of a search engine. Doesn't really give an intuition of why LLMs can do the things they do (beyond retrieval), or where/why they'll fail.
If you want actionable intuition, try "a human with almost zero self-awareness".
"Self-awareness" used in a purely mechanical sense here: having actionable information about itself and its own capabilities.
If you ask an old LLM whether it's able to count the Rs in "strawberry" successfully, it'll say "yes". And then you ask it to do so, and it'll say "2 Rs". It doesn't have the self-awareness to know the practical limits of its knowledge and capabilities. If it did, it would be able to work around the tokenizer and count the Rs successfully.
That's a major pattern in LLM behavior. They have a lot of capabilities and knowledge, but not nearly enough knowledge of how reliable those capabilities are, or meta-knowledge that tells them where the limits of their knowledge lie. So, unreliable reasoning, hallucinations and more.
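The strawberry example in code: counting characters is trivial for a program that sees characters, but the model sees opaque subword tokens (the split below is illustrative only, not any particular tokenizer's actual output):

```python
# A program that sees characters finds this trivial:
print("strawberry".count("r"))  # 3

# But a model sees subword tokens, not characters. Illustrative split:
tokens = ["str", "aw", "berry"]
# To count Rs, the model must recall the spelling hidden inside each
# opaque token ID -- and know whether it can reliably do that at all.
```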
Agree that's a better intuition, with pretraining pushing the model toward saying "I don't know" in the kinds of situations where people write that, rather than through introspection of its own confidence.
There appears to be a degree of "introspection of its own confidence" in modern LLMs. They can identify their own hallucinations, at a rate significantly better than chance. So there must be some sort of "do I recall this?" mechanism built into them. Even if it's not exactly a reliable mechanism.
Anthropic has discovered that this is definitely the case for name recognition, and I suspect that names aren't the only things subject to a process like that.