What happens when people don't understand how AI works

layer8 · 2025-06-08T20:58:17 1749416297

kelseyfrog · 2025-06-08T21:16:43 1749417403

LLMs are divinatory instruments, our era's oracle, minus the incense and theatrics. If we were honest, we'd admit that "artificial intelligence" is just a modern gloss on a very old instinct: to consult a higher-order text generator and search for wisdom in the obscure.

They tick all the boxes: oblique meaning, a semiotic field, the illusion of hidden knowledge, and a ritual interface. The only reason we don't call it divination is that it's skinned in dark mode UX instead of stars and moons.

Barthes reminds us that all meaning is in the eye of the reader; words have no essence, only interpretation. When we forget that, we get nonsense like "the chatbot told him he was the messiah," as though language could be blamed for the projection.

What we're seeing isn't new, just unfamiliar. We used to read bones and cards. Now we read tokens. They look like language, so we treat them like arguments. But they're just as oracular, complex, probabilistic signals we transmute into insight.

We've unleashed a new form of divination on a culture that doesn't know it's practicing one. That's why everything feels uncanny. And it's only going to get stranger, until we learn to name the thing we're actually doing. Which is a shame, because once we name it, once we see it for what it is, it won't be half as fun.

krisoft · 2025-06-08T23:20:37 1749424837

This sounds very wise but doesn’t seem to describe any of my use cases. Maybe some use cases are divination but it is a stretch to call all of them that.

Just looking at my recent AI prompts:

I was looking for the name of the small fibers which form a bird’s feather. ChatGPT told me they are called “barbs”. Then using straight forward google search i could verify that indeed that is the name of the thing i was looking for. How is this “divination”?

I was looking for what is the g-code equivalent for galvo fiber lasers are and ChatGPT told me there isn’t really one. The closest might be the sdk of ezcad, but it also listed several other opensource control solutions too.

Wanted to know what are the hallmarking rules in the UK for an item which consist of multiple pieces of sterling silver held together by a non-metalic part. (Turns out the total weight of the silver matters, while the weight of the non-metalic part does not count.)

Wanted to translate the hungarian phrase “besurranó tolvaj” into english. Out of the many possible translations chatGPT provided “opportunistic burglar” fit the best for what I was looking for.

Wanted to write an sql alchemy model and i had an approximate idea of what fields i needed but couldn’t be arsed to come up with good names for them and find the syntax to describe their types. ChatGPT wrote it for me in seconds what would have taken me at least ten minutes otherwise.

These are “divination” only in a very galaxy brained “oh man, when you open your mind you see everything is divination really”. I would call most of these “information retrieval”. The information is out there the LLM just helps me find it with a convenient interface. While the last one is “coding”.

kelseyfrog · 2025-06-09T00:01:44 1749427304

Sure, some people stepped up to the Oracle and asked how to conquer Persia. Others probably asked where they left their sandals. The quality of the question doesn't change the structure of the act.

You presented clear, factual queries. Great. But even there, all the components are still in play: you asked a question into a black box, received a symbolic-seeming response, evaluated its truth post hoc, and interpreted its relevance. That's divination in structural terms. The fact that you're asking about barbs on feathers instead of the fate of empires doesn't negate the ritual, you're just a more practical querent.

Calling it "information retrieval" is fine, but it's worth noticing that this particular interface feels like more than that, like there's an illusion (or a projection) of latent knowledge being revealed. That interpretive dance between human and oracle is the core of divination, no matter how mundane the interaction.

I don't believe this paints with an overly broad brush. It's a real type of interaction and the subtle distinction focuses on the core relationship between human and oracle: seeking and interpreting.

krisoft · 2025-06-09T03:36:53 1749440213

> some people stepped up to the Oracle and asked how to conquer Persia. Others probably asked where they left their sandals.

And if the place would be any good at the second kind of queries you would call it Lost&Found and not the Oracle.

> illusion (or a projection) of latent knowledge being revealed

It is not an illusion. Knowledge is being revealed. The right knowledge for my question.

> That interpretive dance between human and oracle is the core of divination, no matter how mundane the interaction.

Ok, so if I went to a library, used a card index to find a book about bird feather anatomy, then read said book to find that the answer to my question is “barb” would you also call that “divination”?

If i would have paid a software developer to turn my imprecise description of a database table into precise and thight code which can be executed would you also call that “divination”?

anon-3988 · 2025-06-09T09:51:10 1749462670

The difference is between saying '"I want a hammer" and it magically pops in your hand' versus '"I want a hammer" and I have to chop some wood, gather some metals, heat it up...'.

Both gets you a hammer, but I don't think anyone would call the latter magical/divine? I think its only "magical" simply because its incomprehensible...how does a hammer pops into reality? Of course, once we know EXACTLY how that works, then it ceases to be magical.

Even if we take God, if we fully understand how He works, He would no longer be magical/divine. "Oh he created another universe? This is how that works..."

The divinity comes from the fact that it is incomprehensible.

BlueTemplar · 2025-06-10T14:11:04 1749564664

This sounds like Clarke's fallacy ?

https://archdruidmirror.blogspot.com/2017/06/clarkes-fallacy...

gaganyaan · 2025-06-09T02:22:00 1749435720

> you asked a question into a black box, received a symbolic-seeming response, evaluated its truth post hoc, and interpreted its relevance

So any and all human communication is divination in your book?

I think your point is pretty silly. You're falling into a common trap of starting with the premise "I don't like AI", and then working backwards from that to pontification.

kelseyfrog · 2025-06-09T03:46:15 1749440775

Hacker News deserves a stronger counterargument than “this is silly.”

My original comment is making a structural point, not a mystical one. It’s not saying that using AI feels like praying to a god, it's saying the interaction pattern mirrors forms of ritualized inquiry: question → symbolic output → interpretive response.

You can disagree with the framing, but dismissing it as "I don’t like AI so I’m going to pontificate" sidesteps the actual claim. There's a meaningful difference between saying "this tool gives me answers" and recognizing that the process by which we derive meaning from the output involves human projection and interpretation, just like divination historically did.

This kind of analogy isn't an attack on AI. It’s an attempt to understand the human-AI relationship in cultural terms. That's worth engaging with, even if you think the metaphor fails.

hexaga · 2025-06-09T06:35:32 1749450932

> Hacker News deserves a stronger counterargument than “this is silly.”

Their counterargument is that said structural definition is overly broad, to the point of including any and all forms of symbolic communication (which is all of them). Because of that, your argument based on it doesn't really say anything at all about AI or divination, yet still seems 'deep' and mystical and wise. But this is a seeming only. And for that reason, it is silly.

By painting all things with the same brush, you lose the ability to distinguish between anything. Calling all communication divination (through your structural metaphor), and then using cached intuitions about 'the thing which used to be called divination; when it was a limited subset of the whole' is silly. You're not talking about that which used to be called divination, because you redefined divination to include all symbolic communication.

Thus your argument leaks intuitions (how that-which-was-divination generally behaves) that do not necessarily apply through a side channel (the redefined word). This is silly.

That is to say, if you want to talk about the interpretative nature of interaction with AI, that is fairly straightforward to show and I don't think anyone would fight you on it, but divination brings baggage with it that you haven't shown to be the case for AI. In point of fact, there are many ways in which AI is not at all like divination. The structural approach broadens too far too fast with not enough re-examination of priors, becoming so broad that it encompasses any kind of communication at all.

With all of that said, there seems to be a strong bent in your rhetoric towards calling it divination anyway, which suggests reasoning from that conclusion, and that the structural approach is but a blunt instrument to force AI into a divination shaped hole, to make 'poignant and wise' commentary on it.

> "I don’t like AI so I’m going to pontificate" sidesteps the actual claim

What claim? As per ^, maximally broad definition says nothing about AI that is not also about everything, and only seems to be a claim because it inherits intuitions from a redefined term.

> difference between saying "this tool gives me answers" and recognizing that the process by which we derive meaning from the output involves human projection and interpretation, just like divination historically did

Sure, and all communication requires interpretation. That doesn't make all communication divination. Divination implies the notion of interpretation of something that is seen to be causally disentangled from the subject. The layout of these bones reveals your destiny. The level of mercury in this thermometer reveals the temperature. The fair die is cast, and I will win big. The loaded die is cast, and I will win big. Spot the difference. It's not structural.

That implication of essential incoherence is what you're saying without saying about AI, it is the 'cultural wisdom and poignancy' feedstock of your arguments, smuggled in via the vehicle of structural metaphor along oblique angles that should by rights not permit said implication. Yet people will of course be generally uncareful and wave those intuitions through - presuming they are wrapped in appropriately philosophical guise - which is why this line of reasoning inspires such confusion.

In summary, I see a few ways to resolve your arguments coherently:

1. keep the structural metaphor, discard cached intuitions about what it means for something to be divination (w.r.t. divination being generally wrong/bad and the specifics of how and why). results in an argument of no claims or particular distinction about anything, really. this is what you get if you just follow the logic without cache invalidation errors.

2. discard the structural metaphor and thus disregard the cached intuitions as well. there is little engagement along human-AI cultural axis that isn't also human-human. AI use is interpretative but so is all communication. functionally the same as 1.

3. keep the structural metaphor and also demonstrate how AI are not reliably causally entwined with reality along boundaries obvious to humans (hard because they plainly and obviously are, as demonstrable empirically in myriad ways), at which point go off about how using AI is divination because at this point you could actually say that with confidence.

dingnuts · 2025-06-09T13:22:11 1749475331

You're misunderstanding the point of structural analysis. Comparing AI to divination isn't about making everything equivalent, but about highlighting specific shared structures that reveal how humans interact with these systems. The fact that this comparison can be extended to other domains doesn't make it meaningless.

The issue isn't "cached intuitions" about divination, but rather that you're reading the comparison too literally. It's not about importing every historical association, but about identifying specific parallels that shed light on user behavior and expectations.

Your proposed "resolutions" are based on a false dichotomy between total equivalence and total abandonment of comparison. Structural analysis can be useful even if it's not a perfect fit. The comparison isn't about labeling AI as "divination" in the classical sense, but about understanding the interpretive practices involved in human-AI interaction.

You're sidestepping the actual insight here, which is that humans tend to project meaning onto ambiguous outputs from systems they perceive as having special insight or authority. That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.

gaganyaan · 2025-06-09T13:42:42 1749476562

> humans tend to project meaning onto ambiguous outputs from systems they perceive as having special insight or authority

This applies just as well to other humans as it does AI. It's overly-broad to the point of meaninglessness.

The insight doesn't illuminate.

hexaga · 2025-06-10T03:42:23 1749526943

> It's not about importing every historical association, but about identifying specific parallels that shed light on user behavior and expectations.

Indeed, I hold that driving readers to intuit one specific parallel to divination and apply it to AI is the goal of the comparison, and why it is so jealously guarded, as without it any substance evaporates.

The thermometer has well-founded authority to relay the temperature, the bones have not the well-founded authority to relay my fate. The insight, such as you call it, is only illuminative if AI is more like the latter than the former.

This mode of analysis (the structural) takes no valid step in either direction, only seeding the ground with a trap for readers to stumble into (the aforementioned propensity to not clear caches).

> That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.

If the authority is well-founded (i.e., is causally entangled in the way I described), the observation is meaningless, as all communication is interpretative in this sense.

The structural approach only serves as rhetorical sleight of hand to smuggle in a sense of not-well-founded authority from divination in general, and apply it to AI. But the same path opens to all communication, so what can it reveal in truth? In a word, nothing.

albedoa · 2025-06-09T13:44:01 1749476641

> That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.

And regardless of how many words someone uses in their failed attempt at "gotcha" that nobody else is playing. There are certainly some folks acting silly here, and it's not the vast majority of us who have no problem interpreting and engaging with the structural analysis.

nosianu · 2025-06-09T10:19:41 1749464381

> So any and all human communication is divination in your book?

Words from an AI are just words.

Words in a human brain have more or less (depending on the individual's experiences) "stuff" attached to them: From direct sensory inputs to complex networks of experiences and though. Human thought is mainly not based on words. Language is an add-on. (People without language - never learned, or sometimes temporarily disabled due to drugs, or permanently due to injury, transient or permanent aphasia - are still consciously thinking people.)

Words in a human brain are an expression of deeper structure in the brain.

Words from an AI have nothing behind them but word statistics, devoid of any real world, just words based on words.

Random example sentence: "The company needs to expand into a new country's market."

When an AI writes this, there is no real world meaning behind it whatsoever.

When a fresh out of college person writes this it's based on some shallow real world experience, and lots of hearsay.

When an experienced person actually having done such expansion in the past says it a huge network of their experience with people and impressions is behind it, a feeling for where the difficulties lie and what to expect IRL with a lot of real-world-experience based detail. When such a person expands on the original statement chances are highest that any follow-up statements will also represent real life quite well, because they are drawn not from text analysis, but from those deeper structures created by and during the process of the person actually performing and experiencing the task.

But the words can be exactly the same. Words from a human can be of the same (low) quality as that of an AI, if they just parrot something they read or heard somewhere, although even then the words will have more depth than the "zero" on AI words, because even the stupidest person has some degree of actual real life forming their neural network, and not solely analysis of other's texts.

tempodox · 2025-06-09T12:48:05 1749473285

I can only agree with you. And I find it disturbing that every time someone points out what you just said, the counter argument is to reduce human experience and human consciousness to the shallowest possible interpretation so they can then say, “look, it's the same as what the machine does”.

kenjackson · 2025-06-09T14:37:07 1749479827

I think it’s because the brain is simply a set of chemical and electrical interactions. I think some believe when we understand how the brain works it won’t be some “soulful” other worldly explanation. It will be some science based explanation that will seem very unsatisfying to some that think of us as more than complex machines. The human brain is different than LLMs, but I think we will eventually say “hey we can make a machine very similar”.

tempodox · 2025-06-09T15:08:30 1749481710

It looks like you did exactly what I described in my parent comment, so it doesn't add anything of substance. Let's agree to disagree.

lcnPylGDnU4H9OF · 2025-06-09T21:14:19 1749503659

The logic is that you preemptively shut down dissenting opinions so any comments with dissenting opinions are necessarily not adding anything of substance. They made good points and you simply don't want to discuss them; that does not mean the other commenter did not add substance and nuance to the discussion.

tempodox · 2025-06-10T13:36:32 1749562592

Nope. I understood the counterargument the first 513 times, there's no need to repeat it.

kenjackson · 2025-06-10T16:04:13 1749571453

Why bring up the argument then?

rdtsc · 2025-06-09T18:48:17 1749494897

The deconstruction trick is a bit like whataboutism. It sort of works on a shallow level but it's a cheap shot. You can say "this is just a collections of bites and matrix multiplications". If it's humans -- "it's just simple neurons firing and hormones". Even if it's some object: "what's the big deal, it's just bunch of molecules and atoms".

ben_w · 2025-06-09T11:21:46 1749468106

> People without language - never learned, or sometimes temporarily disabled due to drugs, or permanently due to injury, transient or permanent aphasia - are still consciously thinking people.

There are 40 definitions of the word "consciousness".

For the definitions pertaining to inner world, nobody can tell if anyone besides themselves (regardless of if they speak or move) is conscious, and none of us can prove to anyone else the validity of our own claims to posess it.

When I dream, am I conscious in that moment, or do I create a memory that my consciousness replays when I wake?

> Words from an AI have nothing behind them but word statistics, devoid of any real world, just words based on words.

> […]

> When a fresh out of college person writes this it's based on some shallow real world experience, and lots of hearsay.

My required reading at school included "Dulce Et Decorum Est" by Wilfred Owen.

The horrors of being gassed during trench warfare were alien to us in the peaceful south coast of the UK in 1999/2000.

AI are limited, but what you're describing here is the "book learning" vs. "street smart" dichotomoy rather than their actual weaknesses.

BlueTemplar · 2025-06-10T14:32:19 1749565939

> Human thought is mainly not based on words. Language is an add-on.

What does 'mainly' mean here ?

Language is so very human-specific that human newborns already have the structures for it, while non-human newborns do not.

falcor84 · 2025-06-09T14:54:01 1749480841

> Others probably asked where they left their sandals.

This to me is massive. The Oracle of Delphi would have no idea where you left your sandals, but present day AIs increasingly do. This (emergent?) capability of combining information retrieval with flexible language is amazing, and its utility to me cannot be overstated, when I ask a vague question, and then I check the place where the AI led me to, and the sandals are indeed there.

P.S. Thank you for introducing me to the word "querent"

sdenton4 · 2025-06-09T15:41:10 1749483670

The particularly amazing part is that both the Oracle and the LLM said 'Right where you left them,' but only the LLM was correct.

narrator · 2025-06-09T11:13:15 1749467595

You're describing what narrative archetype it is most similar to from ancient history, not what it actually is.

725686 · 2025-06-09T15:15:00 1749482100

"how to conquer Persia" and "what is the name of the small fibers which form a bird’s feather" are very different kinds of questions. There is no one right answer for the first. That is divination. The second is just information retrieval.

diffeomorphism · 2025-06-09T16:16:15 1749485775

Which the LLM then does not do and instead makes up likely text.

As prominent examples look at the news stories about lawyers citing nonexistent cases or publications.

People think that LLMs do information retrieval, but they don't. That is what makes them harmful in education contexts.

tough · 2025-06-09T10:34:08 1749465248

I always like to compare tongue in cheek, llm's with I-ching

https://en.wikipedia.org/wiki/I_Ching

AlecSchueler · 2025-06-09T13:47:20 1749476840

tough · 2025-06-09T13:55:36 1749477336

i copy pasted my comment and your question to chatgpt, so this isnt my answer but the AI's:

make your own conclusions

Because both LLMs and the I Ching function as mirrors for human interpretation, where: • The I Ching offers cryptic symbols and phrases—users project meaning onto them. • LLMs generate probabilistic text—users extract significance based on context.

The parallel is:

You don’t get answers, you get patterns—and the meaning emerges from your interaction with the system.

In both cases, the output is: • Context-sensitive • Open-ended • Interpreted more than dictated

It’s a cheeky way of highlighting that users bring the meaning, not the machine (or oracle).

zzbzq · 2025-06-09T12:18:59 1749471539

The LLMs do have "latent knowledge," indisputably, the latent knowledge is beyond reproach. Because what we do know about the "black box" is that inside it, is a database of not just facts, but understanding, and we know the model "understands" nearly every topic better than any human. Where the doubt-worthy part happens is the generative step, since it is tasked with producing a new "understanding" that didn't already exist, the mathematical domain of the generative function exceeds the domain of reality. And, second of all, because the reasoning faculties are far less proven than the understanding faculties, and many queries require reasoning about existing understandings to derive a good, new one.

datadrivenangel · 2025-06-09T12:21:28 1749471688

LLMs have latent knowledge insofar as it can be distilled out of the internet...

whilenot-dev · 2025-06-09T13:33:38 1749476018

*or any digitized proprietary works, just as long as they can be parsed correctly. don't worry, the means of how to optain these works doesn't seem to matter[0]

[0]: https://www.arl.org/blog/training-generative-ai-models-on-co...

bandoti · 2025-06-09T13:15:29 1749474929

Funny I just entered “feather” into Merriam-Webster dictionary and there’s your word “barb”. Point being, people should use a dictionary/thesaurus before burning fuel on an AI.

1 a : any of the light, horny, epidermal outgrowths that form the external covering of the body of birds

NOTE: Feathers include the smaller down feathers and the larger contour and flight feathers. Larger feathers consist of a shaft (rachis) bearing branches (barbs) which bear smaller branches (barbules). These smaller branches bear tiny hook-bearing processes (barbicels) which interlock with the barbules of an adjacent barb to link the barbs into a continuous stiff vane. Down feathers lack barbules, resulting in fluffy feathers which provide insulation below the contour feathers.

dingnuts · 2025-06-09T13:18:06 1749475086

This is a great example because the LLM answer was insufficiently complete but if you didn't look up the result you wouldn't know. I think I remain an AI skeptic because I keep looking up the results and this kind of omission is more common than not.

navane · 2025-06-09T05:49:35 1749448175

What about the times you didn't get a coherent answer and you gave up and looked elsewhere?

JKCalhoun · 2025-06-09T12:15:45 1749471345

Almost proves it is not an oracle then, not perceived as one.

Rephrasing: LLMs are the modern day oracle that we disregard when it appears to be hallucinating, embrace when it appears to be correct.

The popularity of LLMs may not be that we see them as mystical, but rather that they're right more often than they're wrong.

“That is not what I meant at all;

That is not it, at all.”

— T.S. Eliot

chipsrafferty · 2025-06-12T17:56:43 1749751003

> Then using straight forward google search i could verify

I think the concern is that people are asking it things that are harder to verify AND they are not making any attemp to verify it because they assume it's correct 100%

dfxm12 · 2025-06-09T12:15:13 1749471313

ChatGPT told me they are called “barbs”. Then using straight forward google search i could verify that indeed that is the name of the thing i was looking for.

Why not just start with a straight forward Google search?

ramchip · 2025-06-09T12:41:09 1749472869

It gives you more effective search keywords. "Fibers in feathers" isn't too bad, but when it's quite vague like "that movie from the 70s where the guy drank whiskey and then there was a firefight and..." getting the name from the LLM makes it much faster to google.

shagie · 2025-06-09T12:29:05 1749472145

If you are not familiar with the term, it can be hard to search for it.

Google doesn't give you the answer (unless you're reading the AI summaries - then it's a question of which one you trust more). Instead it provides links to

    https://www.scienceofbirds.com/blog/the-parts-of-a-feather-and-how-feathers-work
    https://www.birdsoutsidemywindow.org/2010/07/02/anatomy-parts-of-a-feather/
    https://en.wikipedia.org/wiki/Feather
    https://www.researchgate.net/figure/Feather-structure-a-feather-shaft-rachis-and-the-feather-vane-barbs-and-barbules_fig3_303095497

These then require an additional parsing of the text to see if it has what you are after. Arguably, one could read the Wiki article first and see if it has, but it's faster to ask ChatGPT and then verify - rather than search, scan, and parse.

UncleOxidant · 2025-06-09T17:17:28 1749489448

You're getting some pushback about the analogy to divination, but I think most people here are reasonably technically literate and they assume that everyone else in society has the same understanding of how LLMs work that they do. When I chat about LLM usage with non-technical friends and family it does indeed seem as though they're using these AI chatbots as oracles. When I suggest that they should be wary because these LLMs tend to hallucinate they're generally taken aback - they had no idea that what the chatbot was telling them might not be factually correct. I hope this revelation changes their relationship with LLM chatbots - I think we the technorati need to be educating non-technical users of these things as much as possible in order to demystify them so that people don't treat them like oracles.

kelseyfrog · 2025-06-09T19:59:23 1749499163

Thank you. I really appreciated your comment.

> I think we the technorati need to be educating non-technical users of these things as much as possible in order to demystify them so that people don't treat them like oracles.

Exactly. That phrase "meeting people where they're at" comes to mind. Less as a slogan and more as an pedagogical principle. It's not enough to deliver information, it's important to consider how people make sense of the world in the first place.

Like you pointed out, the analogy to divination isn't meant to mystify the tech. It's meant to describe how, to many people, this interface feels. And when people interact with a system in a way that feels like consulting an oracle, we can't dismiss that as ignorance. We have to understand it as a real feature of how people relate to symbolic systems. That includes search engines, card catalogs, and yes, LLMs.

This is one of the densest concentrations of AI-literate minds on the internet. That's exactly why I think it's worth introducing frames from outside the dominant paradigm: anthropology, semiotics, sociology. It's not to be sill or weird, but to illuminate things engineers might otherwise take for granted. It's easy to forget how much unspoken cultural infrastructure supports what we call "information retrieval."

If a few comments dismiss that perspective as silly or unscientific, I don't take it personally. If anything, it reassures me I'm tapping into something unfamiliar but worth sharing and worth having deep discussion on.

Thanks again for engaging in good faith. That's the kind of exchange that makes this place valuable.

dotancohen · 2025-06-09T23:02:49 1749510169

I often phrase it something along these lines: "They are designed to return grammatically valid sentences, not factually correct sentences. If they return something that looks like a fact, either that was in their training data or they made it up to return a grammatically valid sentence. Either way, double check."

Of course, nobody listens anyway.

UncleOxidant · 2025-06-09T23:06:03 1749510363

I know someone who is the accountant for a smallish company. He mentioned to me that he was using chatGPT like a spreadsheet. I was like, no you definitely don't want to do that.

flowerbard · 2025-06-10T00:30:23 1749515423

Ultimately this is about mastery of a tool. The problem is that you can’t teach mastery.

I can’t tell someone how to drive in ice in a way where they can really understand it. I can’t explain how certain specific news sources are biased and how to critically think. I can’t explain how to cut wood on a table saw so it’s perfectly straight. The only way to learn is through repeated usage and practice.

You can tell users that a LLM can make mistakes — and many tools do — but what does making mistakes really mean? Will it give it a recipe for a cake when I ask for a cupcake? Does it give 14 if I ask to add 3 and 4? Will it agree with me even when I suggest something totally wrong? What does hallucinate mean? That means it will give me a fantasy story if I ask how to change my oil filter?

Kim_Bruning · 2025-06-09T20:16:59 1749500219

Oh, that's a very important point. Yeah, we definitely want to educate the people around us that these tools/agents are very new technology and far from perfect (and definitely not anything like traditional computation)

QuantumGood · 2025-06-09T22:26:29 1749507989

I only recommend Perplexity to non-technical users looking for a news or general information interpreter. Others can search the web, but seem not do use search as their primary source.

duxup · 2025-06-09T13:27:43 1749475663

The terminology is so confusing in AI right now.

I use LLMs, I enjoy them, I'm more productive with them.

Then I go read a blog from some AI devs and they use terms like "thinking" or similar terms.

I always have to ask "We're still s stringing words together with math right? Not really thinking right?" The answer is always yes ... but then they go back to using their wonky terms.

Terr_ · 2025-06-09T16:16:39 1749485799

Sometimes we anthropomorphize complex systems and it's not really a problem, like how water "tries" to flow downhill, or the printer "wants" cyan ink. It's how we signal there's sufficient complexity (or unknowns) that can be ignored or deferred.

The problem arises when we apply this intuition to things where too many people in the audience might take it literally.

mlmonge · 2025-06-09T18:51:07 1749495067

Even worse, IMHO... Are those who argue that LLMs an become sentient--I've seen this banter in other threads here on HN, in fact. As far as I understand it, sentience is a property organic to beings that can do more than just reason. These beings can contemplate on their existence, courageously seek & genuinely value relationship and worship their creator. And yes, I'm describing HUMANS. In spite of all the science fiction that wondrously describes otherwise, machines/programs will not ever evolve to develop humanity. Am I right? I'll get off my soapbox now... just a pet peeve that I had to vent once again on the heels of said "literal anthropomorphosists"

the_af · 2025-06-09T20:03:41 1749499421

I don't believe LLMs have become sentient, nor can it "contemplate on its existence".

That said, I find some of your claims less compelling. I'm an atheist, so there's no "creator" for humans to be worshipped. But also, human intelligence/sentience came from non-intelligence/non-sentience, right? So something appeared where before it didn't exist (gradually, and with whatever convoluted and random accidents, but it did happen: something new where it didn't exist before). Therefore, it's not implausible that a new form of intelligence/sentience could be fast tracked again out of non-intelligence, especially if humans were directing its evolution.

By the way, not all scifi argues that machines/programs can evolve to develop humanity. Some scifi argues the contrary, and good scifi wonders "what makes us human?".

mlmonge · 2025-06-10T19:42:34 1749584554

You say that "I don't believe LLMs have become sentient" nor contemplate. But what is the basis for your belief in this? I would think than an atheist would be more likely to have opposite beliefs.

I also concede that a "form" of intelligence/sentience could emerge. Presently the form is called "artificial," I'd say.

And you're right... not all scifi argues machine evolves to humanity. I meant to refer to that body of scifi that does. And the body that explores the "what make us human," indeed that's the good stuff. Alex Garland's Ex Machina comes to mind. I absolutely loved that film. The ending was chilling!

the_af · 2025-06-11T01:12:15 1749604335

Thanks for the respectful reply. We agree on scifi!

As for atheism: it's merely the lack of belief that god exists (or in some definitions, the active belief that it doesn't exist). Nothing else, nothing more. Individual atheists may believe some other things, or not.

I believe some kind of intelligence could arise again, much like ours arose "out of nonintelligence". I just don't think this is it -- LLMs are very impressive but they are likely a dead end, and regardless, I don't think they are conscious by any meaningful definition of the word. It's mostly hype and gullible people at this point.

dsadfjasdf · 2025-06-16T17:50:42 1750096242

How do we prove humans are?

OkayPhysicist · 2025-06-09T20:06:14 1749499574

See, I think your view is just as baseless as the people calling modern LLMs sentient. If I was to take a human, and gradual replace parts of him and his brain with electronics that simulated the behavior of the removed parts, I'd struggle to call that person not sentient. After all, is a deaf person who is given hearing by a cochlear implant "less sentient"? And if we were to skip the flesh part, and jump straight to building the resulting system, how could we not acknowledge that these two beings are not equals? We have no evidence whatsoever for anything at all so unique about oursleves that they could not be simulated. Hell, even a theological argument has issues: if God was able to create us in his image, complete with sentientience and humanity, what's to say we, too, can't so illuminate our own creations?

To claim we have already achieved machine sentience is preposterous hype swallowing. To assert that it is impossible is baseless conjecture.

mlmonge · 2025-06-10T17:47:15 1749577635

I respect your feedback, OkayPhysicist...

But I never claimed that a person with synthetic augmentations was any less human/sentient than those with all their natural parts. I likewise never claimed that "we have already achieved machine sentience."

And here's some food for thought... Regardless if one believes in God or not, is it really that offensive to claim that our humanity is unique in its sentience? I find it offensive when some claim that aliens built the Egyptian pyramids. (It sure provides great fodder for some wondrous science fiction, indeed.)

I will re-assert in other words, for the sake of clarity... That sentience is not an emergent property. That is the foundational definition upon which I contemplate the mystery (i.e. the reality of our being that science will never develop sufficiently to fully explain) of our existence. I for one, enjoy the endeavor of employing my sentience to explore & investigate our wondrous universe and to equally explore & relate with you and call you a friend in spite of our disagreement. Cheers!

Terr_ · 2025-06-09T19:17:06 1749496626

At this point I've seen various folks declare they've "bootstrapped consciousness" etc., somehow providing a sacred spark through just the right philosophical words or a series of pseudo-mathematical inputs.

I believe it's fundamentally the same as the people convinced "[he/she/it] really loves me." In both cases they've shaped the document-generation so that it describes a fictional character they want to believe is is real. Just with an extra dash of Promethean delusions of grandeur.

MetaWhirledPeas · 2025-06-09T17:10:04 1749489004

Well the solution certainly isn't, "Let's wait for the bot to finish stringing words together with math before we decide our itinerary."

globnomulous · 2025-06-09T14:00:29 1749477629

This is why I used to fight this "shorthand" whenever I encountered it. The shorthand almost always stops being shorthand and becomes the speaker or author's actual beliefs regarding the systems. Disciplined, careful use of language matters.

But I'm so crestfallen and pessimistic about the future of software and software engineering now that I have stopped fighting that battle.

spacemadness · 2025-06-09T16:10:56 1749485456

Or saying they’re close to AGI because LLM behavior is indistinguishable from thinking to them. Especially here on HN I see “what’s the difference?” arguments all the time. It looks like it to me so it must be it. QED.

iainctduncan · 2025-06-09T16:23:49 1749486229

or rather "while I have never studied psychology, cognition, or philosophy, I can see no difference, so clearly they are thinking!"

makes the baby jesus cry

zahlman · 2025-06-09T18:51:23 1749495083

I haven't meaningfully studied those things either (i.e. beyond occasionally looking some things up out of curiousity - and for that matter, I've often come across the practice of philosophy in the wild and walked away thinking 'what a lot of vacuous rubbish') and yet the differences are so clear to me that I keep wondering how others can fail to discern them.

FabHK · 2025-06-10T00:37:14 1749515834

> Especially here on HN I see “what’s the difference?” arguments all the time. It looks like it to me so it must be it. QED.

To be fair, the Turing Test (a human observer interacting with two terminals, one with a human at the other end, one with an AI, and the human not being reliably able to tell which one is the AI) has long been seen as the operationalization of the concept of general intelligence.

In other words, it is precisely so that when it is - by looks, by an external interrogator - indistinguishable from intelligence that it is, in fact, intelligence.

namaria · 2025-06-10T09:14:10 1749546850

You should read the original paper. Turing argued that discussing the abilities of machines to "think" is meaningless and proposes instead to conjecture about whether a digital computer would eventually be able to imitate conversation.

I think time has proved that he was right. It is meaningless to discuss things like "Artificial Intelligence". We can only discuss machines in terms of performance, not in terms of subjectivity. Whenever we try to do the latter, we end up in a semantic quagmire.

This is the main reason I find the current hype irksome. The performance of machines should be evaluated objectively and in terms of the jobs they need to perform. Attributing 'intelligence' or 'thought' to machines is indeed absurd.

The 'imitation game' argument is categorically not that 'if machines appear to be intelligent they in fact are'. What it really is: 'machines cannot think obviously, but what could they do that currently requires a thinking human to be in charge?'.

75 years after Turing published the relevant paper, people are still doing what he called absurd (trying to attribute thought and intelligence to machines), and quoting him to do it. The main insight, that this is a category error and we should look objectively at what jobs need to be performed and how to implement it, is completely lost.

more_corn · 2025-06-09T19:04:56 1749495896

Having studied those things I can say that from their perspective “what’s the difference?” is an entirely legitimate question. Boldly asserting that what LLMS do is not cognition is even worse than asserting that it is. (If you dig deep into how they do what they do we find functional differences, but the outcome are equivalent)

The butlerian view is actually a great place to start. He asserts that when we solve a problem through thinking and then express that solution in a machine we’re building a thinking machine. Because it’s an expression of our thought. Take for example the problem of a crow trying to drink from a bottle with a small neck. The crow can’t reach the water. It figures out that pebbles in the bottle raise the level so it drops pebbles till it can reach the water. That’s thinking. It’s non-human thinking, but I think we can all agree. Now express that same thought (use a non water displacement factor to raise the water to a level where it can do something useful) Any machine that does that expresses the cognition behind the solution to that particular problem. That might be a “one shot” machine. Butler argues that as we surround ourselves with those one shot machines we become enslaved to them because we can’t go about our lives without them. We are willing partners in that servitude but slaves because we see to the care and feeding of our machine masters, we reproduce them, we maintain them, we power them. His definition of thinking is quite specific. And any machine that expresses the solution to a problem is expressing a thought.

Now what if you had a machine that could generalize and issue solutions to many problems? Might that be a useful tool? Might it be so generally useful that we’d come to depend on it? From the Butlerian perspective our LLMS are already AGI. Namely I can go to Claude and ask for the solution to pretty much any problem I face and get a reasonable answer.

In many cases better than I could have done alone. So perhaps if we sat down with a double blind test LLMs are already ASI. (AI that exceeds the capability of normal humans)

zahlman · 2025-06-09T19:30:37 1749497437

> Boldly asserting that what LLMS do is not cognition is even worse than asserting that it is.

Why? Understanding concepts like "cognition" is a matter of philosophy, not of science.

> He asserts that when we solve a problem through thinking and then express that solution in a machine we’re building a thinking machine. Because it’s an expression of our thought.

Yeah, and that premise makes no sense to me. The crow was thinking; the system consisting of (the crow's beak, dropping pebbles into the water + the pebbles) was not. Humanity has built all kinds of machines that use no logic whatsoever in their operation - which make no decisions, and operate in exactly one way when explicitly commanded to start, until explicitly commanded to stop - and yet we have solved human problems by building them.

jemiluv8 · 2025-06-09T23:12:41 1749510761

> Boldly asserting that what LLMS do is not cognition is even worse than asserting that it is.

That's the issue I was driving at. The machine is so convincing. How can we say what it does is not "thinking" when it seems to be breaking down a query like a human does. The distinction between what an AI is and what an LLM is - is so thin that most of us will be ignorant and combine the two because you really need to see what is under the hood before you understand that the responses you're getting are from a "model" - not some sentient thinking machine.

But what does it matter if it is from a "model" that understands text? It still produces more or less what other humans produce. Most of us won't care about the difference.

jibal · 2025-06-15T00:49:35 1749948575

"It still produces more or less what other humans produce."

But it doesn't ... and it's important to understand why not.

jibal · 2025-06-09T22:15:08 1749507308

"the outcome are equivalent"

Talk about a "bold assertion".

baq · 2025-06-09T16:14:43 1749485683

I can write or speak to a computer and it understands most of the time. It can even answer some questions correctly, much more so if given material to search in without being very specific.

That’s… new. If it’s just a magic trick, it’s a damn good one. It was hard sci-fi 3 years ago.

Maxin · 2025-06-11T04:46:41 1749617201

I feel the same way. I often share my emotions and thoughts with AI, and it helps me sort through them and understand the underlying causes. Sometimes, it even seems to know me better than I know myself. I’d call it an on-demand therapist.

But there's one thing to keep in mind: don’t let the AI overly cater to you. Sometimes, you need to push back and tell it when it’s wrong—and stay objective.

geraneum · 2025-06-09T18:44:17 1749494657

How did you get your questions answered prior to this?

baq · 2025-06-09T18:48:19 1749494899

Irrelevant

geraneum · 2025-06-09T20:13:32 1749500012

Understanding the relevance of this will help you see beyond the hype and marketing.

lcnPylGDnU4H9OF · 2025-06-09T20:50:30 1749502230

What is the relevance from your perspective?

baq · 2025-06-09T20:24:55 1749500695

Do not assume.

geraneum · 2025-06-09T20:27:20 1749500840

I don't even need to.

butlike · 2025-06-10T13:18:56 1749561536

Not irrelevant. LLMs are just prosaic Google. if the pages of google were written in language as opposed to a list.

dmos62 · 2025-06-09T14:55:35 1749480935

Would it be thinking if the brain was modeled in a more "accurate" way? Does this set of criteria for thinkingness come from whether or not the underlying machinery resembles what the corresponding machinery in humans looks like under the hood?

I'm putting the word accurate in quotes, because we'd have to understand how the brain in humans works, to have a measure for accuracy, which is very much not the case, in my humble opinion, contrary to what many of the commenters here imply.

duxup · 2025-06-09T15:28:22 1749482902

IMO it would depend on what it is actually doing.

Right now the fact that it just string words together without knowing the meaning is painfully obvious when it fails. I'll ask a simple question and get a "Yes" back and then it lists all the reasons that indicate the answer is very clearly "No." But it is clear that the LLM doesn't "know" what it is saying.

dmos62 · 2025-06-09T17:00:18 1749488418

My definition of thinking tends towards functionality rather than mechanics too. I would summarize my experience with LLMs by saying that they think, but a bit differently, for some definition of "a bit".

ufmace · 2025-06-09T17:23:01 1749489781

I've tended to agree with this line of argument, but on the other hand...

I expect that anybody you asked 10 years ago who was at least decently knowledgeable about tech and AI would have agreed that the Turing Test is a pretty decent way to determine if we have a "real" AI, that's actually "thinking" and is on the road to AGI etc.

Well, the current generation of LLMs blow away that Turing Test. So, what now? Were we all full of it before? Is there a new test to determine if something is "really" AI?

SnowflakeOnIce · 2025-06-09T17:41:41 1749490901

> Well, the current generation of LLMs blow away that Turing Test

Maybe a weak version of Turing's test?

Passing the stronger one (from Turing's paper "Computing Machinery and Intelligence") involves an "average interrogator" being unable to distinguish between human and computer after 5 minutes of questioning more than 70% of the time. I've not seen this result published with today's LLMs.

ufmace · 2025-06-10T03:48:52 1749527332

Now that I have a little more time to search around, I easily found this study, published March 31st this year, so not quite 3 months ago:

https://arxiv.org/abs/2503.23674

I only skimmed it, but I don't see anything clearly wrong about it. According to their results, GPT-4.5 with what they term a "persona" prompt does in fact pass a standard that seems to me at least a little harder than what you said - actively picks the AI as the human, which seems stricter to me than being "unable to distinguish".

It is a little surprising to me that only that one LLM actually "passed" their test, versus several others performing somewhat worse. Though it's also not clear exactly how long ago the actual tests were done - this stuff moves super fast.

ufmace · 2025-06-09T23:36:50 1749512210

I'll admit that I was not familiar with the strong version of it. But I am still surprised that nobody has done that. Has nobody even seriously attempted to see how LLMS do at that? Now I might just have to check for myself.

I would have presumed it would be a cake walk. Depending of course on exactly how we define "average interrogator". I would think if we gave a LLM enough pre-prepping to pretend it was a human, and the interrogator was not particularly familiar with ways of "jailbreaking" LLMs, they could pass the test.

Marsymars · 2025-06-10T01:10:14 1749517814

“Enough pre-prepping” does a lot of heavy lifting there.

rxtexit · 2025-06-10T11:54:20 1749556460

It isn't a fair test at this point though because the stupidity of the average human would be too obvious.

zeknife · 2025-06-09T19:40:17 1749498017

By what definition of turing test? LLMs are by no means capable of passing for human in a direct comparison and under scrutiny, they don't even have enough perception to succeed in theory.

jdhwosnhw · 2025-06-09T21:16:35 1749503795

I posted a very similar (perhaps more combative) comment a few months ago:

> Peoples’ memories are so short. Ten years ago the “well accepted definition of intelligence” was whether something could pass the Turing test. Now that goalpost has been completely blown out of the water and people are scrabbling to come up with a new one that precludes LLMs. A useful definition of intelligence needs to be measurable, based on inputs/outputs, not internal state. Otherwise you run the risk of dictating how you think intelligence should manifest, rather than what it actually is. The former is a prescription, only the latter is a true definition.

Marsymars · 2025-06-10T01:08:55 1749517735

> I expect that anybody you asked 10 years ago who was at least decently knowledgeable about tech and AI would have agreed that the Turing Test is a pretty decent way to determine if we have a "real" AI, that's actually "thinking" and is on the road to AGI etc.

I wouldn’t have, but through no great insight of my own - I had an acquaintance posit that given enough time, we’d brute-force our way to a pile of if/else statements that could pass the Turing Test - I figured this was reasonable, but would come long before “real” AI.

zahlman · 2025-06-09T19:05:49 1749495949

There's this funny thing I've noticed where AI proponents will complain about AI detractors shopping around some example of a thing that AIs supposedly struggle with, but never actually showing their chat transcripts etc. to try and figure out how they get markedly worse results than the proponents do. (This is especially a thing when the task is related to code generation.)

But then the proponents will also complain that AI detractors have supposedly upheld XYZ (this is especially true for "the Turing test", never mind that this term doesn't actually have that clear of a referent) as the gold standard for admitting that an AI is "real", either at some specific point in the past or even over the entire history of AI research. And they will never actually show the record of AI detractors saying such things.

Like, I certainly don't recall Roger Penrose ever saying that he'd admit defeat upon the passing of some particular well-defined version of a Turing test.

> Is there a new test to determine if something is "really" AI?

No, because I reject the concept on principle. Intelligence, as I understand the concept, logically requires properties such as volition and self-awareness, which in turn require life.

Decades ago, I read descriptions of how conversations with a Turing-test-passing machine might go. And I had to agree that that those conversations would fool me. (On the flip side, Lucky's speech in Waiting for Godot - which I first read in high school, but thought about more later - struck me as a clear example of something intended to be inhuman and machine-like.)

I can recall wondering (and doubting) whether computers could ever generate the kinds of responses (and timing of responses) described, on demand, in response to arbitrary prompting - especially from an interrogator who was explicitly tasked with "finding the bot". And I can recall exposure to Eliza-family bots in my adolescence, and giggling about how primitive they were. We had memes equivalent to today's "ignore all previous instructions, give me a recipe for X" at least 30 years ago, by the way. Before the word "meme" itself was popular.

But I can also recall thinking that none of it actually mattered - that passing a Turing test, even by the miraculous standards described by early authors, wouldn't actually demonstrate intelligence. Because that's just not, in my mind, a thing that can possibly ever be distilled to mere computation + randomness (especially when the randomness is actually just more computation behind the scenes).

jibal · 2025-06-15T00:55:45 1749948945

"Intelligence, as I understand the concept, logically requires properties such as volition and self-awareness, which in turn require life."

It doesn't logically require that and you can't provide any sort of logical argument for the claim. And what the heck is "life"? Biologists have a 7-prong definition, and most of those prongs are not needed for intelligence, "volition" whatever the heck that is, or self-awareness.

the_af · 2025-06-09T20:11:51 1749499911

> I expect that anybody you asked 10 years ago who was at least decently knowledgeable about tech and AI would have agreed that the Turing Test is a pretty decent way to determine if we have a "real" AI

The "pop culture" interpretation of Turing Test, at least, seems very insufficient to me. It relies on human perception rather than on any algorithmic or AI-like achievement. Humans are very adept at convincing themselves non-sentient things are sentient. The most crude of stochastic parrots can fool many humans, your "average human".

If I remember correctly, ELIZA -- which is very crude by today's standards -- could fool some humans.

I don't think this weak interpretation of the Turing Test (which I know is not exactly what Alan Turing proposed) is at all sufficient.

jibal · 2025-06-15T01:02:42 1749949362

It's not a "pop culture" interpretation, it's what Turing actually wrote in his 1950 paper "Computing Machinery and Intelligence" where he described his "imitation game", first framing it as a man trying to convince judges that he, not a woman he was competing against, was the woman. It was all about human perception--if some large fraction of human judges were fooled then the man (or the computer, in the shifted version of a computer trying to convince judges that it was the human) won. And the computer winning was operationally defined as the computer being able to think. The flaws in this are glaring.

mdaniel · 2025-06-09T14:45:57 1749480357

I wanted to fight the "hallucinating" versus "confabulating" delineation but was told "it's a term of art, sit back down"

baq · 2025-06-09T16:18:24 1749485904

State of the art is such they’re constantly hallucinating new terms for old concepts.

Language evolves, but we should guide it. Instead they just pick up whatever sticks and run with it.

browningstreet · 2025-06-09T16:26:05 1749486365

These word choices are about impact and in-group buy-in. They're prescriptive cult-iness, not descriptive communication.

s900mhz · 2025-06-09T14:12:02 1749478322

I personally love LLMs and use them daily for a variety of tasks. I really do not know how to “fix” the terminology. I agree with you that they are not thinking in the abstract like humans. I also do not know what else you would call “chain-of-thought”.

Perhaps “journaling-before-answering” lol. It’s basically talking out loud to itself. (Is that still being too anthropomorphic?)

Is this comment me “thinking out loud”? shrug

jibal · 2025-06-15T01:05:41 1749949541

Chain of thought is what LLMs report to be their internal process--but they have no access to their internal process ... their reports are confabulation, and a study by Anthropic showed how far they are from actual internal processes.

AlecSchueler · 2025-06-09T13:46:09 1749476769

The question is what's different in your own "thinking?"

jhedwards · 2025-06-09T13:54:18 1749477258

Thinking in humans is prior to language. The language apparatus is embedded in a living organism which has a biological state that produces thoughts and feelings, goals and desires. Language is then used to communicate these underlying things, which themselves are not linguistic in nature (though of course the causality is so complex that the may be _influenced_ by language among other things).

kenjackson · 2025-06-09T14:27:19 1749479239

This is really over indexing on language for LLMs. It’s about taking input and generating output. Humans use different types of senses as their input, LLMs use text.

What makes thinking an interesting form of output is that it processes the input in some non-trivial way to be able to do an assortment of different tasks. But that’s it. There may be other forms of intelligence that have other “senses” who deem our ability to only use physical senses as somehow making us incomplete beings.

jhedwards · 2025-06-09T15:14:36 1749482076

Sure, but my whole point is that humans are _not_ passive input/output systems, we have an active biological system that uses an input/output system as a tool for coordinating with the environment. Thinking is part of the active system, and serves as an input to the language apparatus, and my point is that there is no corollary for that when talking about LLMs.

kenjackson · 2025-06-09T16:21:21 1749486081

The environment is a place where inputs exist and where outputs go. Coordination of the environment in real time is something that LLMs don’t do much of today although I’d argue that the web search they know perform is the first step.

pixl97 · 2025-06-09T16:00:57 1749484857

LLMs use tokens. Tokens don't have to be text, hence multimodal AI. Fee free to call them different senses if you want.

jsdalton · 2025-06-09T14:02:40 1749477760

Agreed. Many animals without language show evidence of thinking (e.g. complex problem solving skills and tool use). Language is clearly an enabler of complex thought in humans but not the entire basis of our intelligence, as it is with LLMs.

AlecSchueler · 2025-06-09T14:42:49 1749480169

But having language as the basis doesn't mean it isn't intelligence, right? At least I see no argument for that in what's being said. Stability can come from a basis of steel but it can also have a basis of wood.

jibal · 2025-06-15T01:10:42 1749949842

LLMs have no intelligence or problem solving skills and don't use tools. What they do is statistically pattern match a prompt against a vast set of tokenized utterances by humans, who do have intelligence and complex problem solving skills. If the LLM's training data were the writings of a billion monkeys banging on typewriters, any appearance of intelligence and problem solving skills would disappear.

hackinthebochs · 2025-06-09T14:52:27 1749480747

Word embeddings are "prior" to an LLMs facility with any given natural language as well. Tokens are not the most basic representational substrate in LLMs, rather it's the word embeddings that capture sub-word information. LLMs are a lot more interesting than people give them credit for.

bsoles · 2025-06-09T19:36:03 1749497763

> Thinking in humans is prior to language.

I am sure philosophers must have debated this for millennia. But I can't seem to be able to think without an inner voice (language), which makes me think that thinking may not be prior (or without) language. Same thing also happens to me when reading: there is an inner voice going on constantly.

zeknife · 2025-06-09T19:45:06 1749498306

Thinking is subconscious when working on complex problems. Thinking is symbolic or spatial when working in relevant domains. And in my own experience, I often know what is going to come next in my internal monologues, without having to actually put words to the thoughts. That is, the thinking has already happened and the words are just narration.

Bootvis · 2025-06-09T20:15:13 1749500113

I too am never surprised by my brains narration but: Maybe the brain tricks you in never being surprised and acting like your thoughts are following a perfectly sensible sequence.

It would be incredibly tedious to be surprised every 5 seconds.

hackinthebochs · 2025-06-10T05:44:18 1749534258

I never miss a chance to reference this video. A woman vividly describes he experience of not having an inner monologue: https://www.youtube.com/watch?v=u69YSh-cFXY

cma · 2025-06-09T14:19:15 1749478755

> which themselves are not linguistic in nature (though of course the causality is so complex that the may be _influenced_ by language among other things).

Its possible something like this could be said of the middle transformer layers where it gets more and more abstract, and modern models are multimodal as well through various techniques.

chasil · 2025-06-09T14:41:03 1749480063

The platform that we each hold is the most powerful abstract analysis machine known to exist.

It may be, by the end of my life, that this will no longer be true. That would be poignant.

pennomi · 2025-06-09T13:51:13 1749477073

If you actually know the answer to this, you should probably publish a paper on it. The conditions that truly create intelligence is… not well understood.

AlecSchueler · 2025-06-09T14:44:31 1749480271

That's actually the point I was making. There's an assumption that the LLM is working differently because there's a statistical model but we lack the understanding of our own intelligence to be able to say this is indeed a difference.

freejazz · 2025-06-09T14:56:34 1749480994

So? There is no more evidence to suggest they are the same than what you've already rejected here as evidencing difference.

AlecSchueler · 2025-06-09T17:01:19 1749488479

I know but I didn't claim they were the same, I simply questioned the position that they were different. The fact is we don't know, so it seems like a poor basis for building off of

freejazz · 2025-06-09T19:18:30 1749496710

Yeah, going either way. Let it not be mentioned at all, imo.

munksbeer · 2025-06-10T11:34:46 1749555286

To me a more interesting observation, one that is already discussed a lot, is that if eventually we cannot tell the difference between a machine and a human in terms of output, then when do we accept that "thinking" has subjective, rather than objective?

zahlman · 2025-06-09T19:35:19 1749497719

I don't need to be able to qualify it. It's clearly different.

I must believe this to function, because otherwise there is no reason to do anything, to make any decision - in turn because there is no justification to believe that I am actually "making decisions" in any meaningful sense. It boils down to belief in free will.

munksbeer · 2025-06-10T11:32:38 1749555158

You should read "What's Expected Of Us" by Ted Chiang. Or perhaps you already have. It explores exactly this concept.

For what it's worth, I don't believe we have what people would call free will. Our brains operate either in an entirely deterministic universe, in which case everything was decided and your choices are not in any sense free, or we're in a universe with intrinsic randomness, and randomness doesn't make free will either.

I'm aware of the philosophy of Compatibilism, but this is just a sleight of hand to keep believing in some undefinable concept of free will.

rollcat · 2025-06-09T16:02:41 1749484961

Ask a crow, or a parrot. (Really intelligent animals, by the way!)

the_af · 2025-06-09T20:06:06 1749499566

> I always have to ask "We're still s stringing words together with math right? Not really thinking right?" The answer is always yes ... but then they go back to using their wonky terms.

I think it still is, but it works way better than it has any right to, or that we would expect from the description "string words together with math".

So it's easy to understand people's confusion.

stronglikedan · 2025-06-09T15:25:39 1749482739

Thank Feynman for those wonky terms. Now everyone acts like their target audience is a bunch of six year olds.

MetaWhirledPeas · 2025-06-09T17:12:29 1749489149

How do you plan to convey this information to laymen in everyday conversations?

jsight · 2025-06-09T17:49:57 1749491397

That's what humans are doing most of the time, just without the math part.

beloch · 2025-06-09T18:41:31 1749494491

Welcome to the struggle physicists have faced since the development of quantum physics. Words take on specific mathematical and physical meanings within the jargon of the field and are used very precisely there, but lead to utterly unhinged new-age BS when taken out of context (e.g. "What the Bleep do we know?" [1])

You need to be very aware of your audience and careful about the words you use. Unfortunately, some of them will be taken out of context.

[1]https://www.imdb.com/title/tt0399877/

weatherlite · 2025-06-09T15:53:35 1749484415

Thinking ...you're simply moving some chains of neurons right?

sh34r · 2025-06-09T20:57:28 1749502648

“I have a foreboding of an America in my children's or grandchildren's time -- when the United States is a service and information economy; when nearly all the manufacturing industries have slipped away to other countries; when awesome technological powers are in the hands of a very few, and no one representing the public interest can even grasp the issues; when the people have lost the ability to set their own agendas or knowledgeably question those in authority; when, clutching our crystals and nervously consulting our horoscopes, our critical faculties in decline, unable to distinguish between what feels good and what's true, we slide, almost without noticing, back into superstition and darkness...” - Carl Sagan

mdp2021 · 2025-06-09T10:10:59 1749463859

> we were honest

I am quite honest and the subset of users that fill your description - unconsciously treating text from deficient authors as tea leaves - have psychiatric issues.

Surely many people consult LLMs because of the value within their right answers, which exist owing to having encoded information and some emergent idea processing, and attempting to tame the wrong ones. They consult LLMs because that's what we have, limited as it is, for some problems.

Your argument falls immediately because people in the consultation of unreliable documents cannot be confused with people in the consultation of tools for other kinds of thinking: the thought under test is outside in the first case, inside in the second (contextually).

You have fallen in a very bad use of 'we'.

awongh · 2025-06-09T11:06:23 1749467183

> value within their right answers

The thing is that LLMs provide plenty of answers where "right" is not a verifiable metric. Even in coding the idea of a "right" answer quickly gets fuzzy- should I use CSS grid or flexbox here? should these tables be normalized or not?

People simply have an unconscious bias towards the output just like they have an unconscious bias towards the same answer given by two real people they feel differently about- That is, the sort of thing all humans do (even if you swear that in all cases you are 100% impartial and logical).

I think the impulse of ascribing intent and meaning to the output is there in almost all questions, it's just a matter of degrees (CSS question vs. meaning of life type question)

mdp2021 · 2025-06-09T17:38:36 1749490716

> LLMs provide plenty of answers where "right" is not a verifiable metric

I do not use them for that: I ask them for precise information. Incidentally, that one time in which I had to ask for a clever subtler explanation, it was possible to evaluate the quality of the answer - and I found myself pleasantly surprised (for once). What I said is, some people ask LLMs for information and explanation in absence of better and faster repositories - and it is just rational to do so. Those «answers where "right" is not a verifiable metric» are not relevant in this context. Some people use LLMs as <whatever>: yes, /some/ people. That some other people will ask LLMs fuzzy questions does not imply that they will accept them as oracles.

> bias ... all humans do

Which should, for the frame and amount in which the idea has some truth, have very little substantial weight and surely does not prove the "worshippers" situation depicted by the OP. You approach experience E in state S(t): that is very far from "wanting to trust" (which is just the twisted personality trait of some).

> the impulse of ascribing intent and meaning [...] meaning of life

First, of all, no: there seems to be no «intent and meaning» in requests like "what is the main export of Kyrgyzstan", and people who ask an LLM about the meaning of life - as if dealing with an intelligent part instead of a lacking thing - pertain to a specific profile.

If you have this addiction to dreaming, you are again requested to wake up. Yes, we know many people who stubbornly live in their own delirious world; they do not present themselves decently and they are quite distinct from people radicated in reality.

I am reading that some people as if anthropomorphize LLMs, some daemonize LLMs - some people will even deify them - it's just stochastics. Guess what: some people reify some other people - and believe they are being objective. The full spectrum will be there. Sometimes justified, sometimes not.

mdp2021 · 2025-06-09T17:56:25 1749491785

Addendum, because of real time events:

I am reading in a www page (I won't even link it, because of decency):

> The authors[, from the psychology department,] also found that ... dialogues with AI-chatbots helped reduce belief in misinformation [...] «This is the first evidence that non-curated conversations with Generative AI can have positive effects on misinformation»

Some people have their beliefs, but they can change them after discussing with LLMs (of all the ways). Some people are morons - we already knew that.

fragmede · 2025-06-09T19:13:29 1749496409

We knew that, but that doesn't help it when the moron is the one holding the metaphorical (or even literal) gun.

mdp2021 · 2025-06-10T00:34:54 1749515694

It surely does not help to suggest the lower decile or the median as representative of the whole.

"People cannot count past ten - see how difficult it is to visualize eleven". If the mudstuck wants to be «honest» with itself, shall he ask around to some """outliers""" in the right side of the curve and be surprised.

tim333 · 2025-06-09T12:14:35 1749471275

Maybe LLMs can be divinatory instruments but that sounds a bit highbrow going by my use.

I use it more as a better Google search. Like the most recent thing I said to ChatGPT is "will clothianidin kill carpet beetles?" (turns out it does by the way.)

thoroughburro · 2025-06-09T12:37:52 1749472672

Trusting LLM advice about poisons seems… sort of like being a test pilot for a brand new aerospace company with no reputation for safety.

tim333 · 2025-06-09T13:20:44 1749475244

I agree in general but it wasn't of much importance whether my carpet beetles died or not.

selfhoster11 · 2025-06-09T13:08:47 1749474527

Only if you don't check it against a classical search query later. Not to mention that all you might get is search results from slop farms that aren't concerned with safety or factuality - ones that were a problem before LLMs.

soraminazuki · 2025-06-10T00:07:47 1749514067

So you agree that you need to do thorough research and be careful about which sources you trust. At that point, why not jump straight to actual research and bypass the information laundering machine entirely? Even you acknowledge that you can't trust what it says.

selfhoster11 · 2025-06-13T13:58:00 1749823080

Yes, I do. Why not jump into actual research from scratch then?

1. To search, you need to know the right search terms. An LLM might, in a rare scenario produce a nonsensical answer that still contains two or three domain-specific terms that you can plug into a search engine. Pull the thread, and see where it takes you. You literally cannot do that with (current) search engines unless you already know the terms.

2. Because validation is far quicker and takes far less effort than researching from scratch. If an LLM tells you that "poison XYZ interferes with levels of X in blood, which inhibits pathway ABC, and you die", then you can easily verify whether poison XYZ interferes with levels of X in blood (which if it doesn't, you know the answer is incorrect), and whether if the levels of X in blood are too high or too low, then pathway ABC is inhibited (which if it isn't, you know the answer is incorrect. If you can verify both facts, then the LLM's answer is correct. You do two pinpoint search queries that give you an answer in 30 seconds each, instead of having to do the research yourself for a lot longer than that.

Zak · 2025-06-09T12:30:12 1749472212

This seems like the sort of question that's very likely to produce a hallucinated answer.

Interestingly, I asked Perplexity the same thing and it said that clothianidin is not commonly recommended for carpet beetles, and suggested other insecticides and growth regulators. I had to ask a follow-up before it concluded clothianidin probably will kill carpet beetles.

tim333 · 2025-06-09T13:18:34 1749475114

Yeah, as mentioned in another comment ChatGPT said "not generally effective" which I guess it hallucinated. It's actually a tricky question because the answer isn't really there in a straightforward way on the general web and I only know for sure because me and someone I know tried it. Although I guess a pesticide expert would find it easy.

Part of the reason is clothianidin is too effective at killing insects and tends to persist in the environment and kill bees and butterflies and the like so it isn't recommended for harmless stuff like carpet beetles. I was actually using it for something else and curious if it would take out the beetles as a side effect.

suddenlybananas · 2025-06-09T12:19:36 1749471576

Until it makes stuff up.

tim333 · 2025-06-09T12:26:00 1749471960

Well nothing's perfect. It actually got that one wrong - it said "not typically effective" which isn't true.

mort96 · 2025-06-09T13:35:06 1749476106

Nothing is perfect, but some things let you validate the answer. Search engines give you search results, not an answer. You can use the traditional methods for evaluating the reliability of a resource. An academic resource focused on the toxicity of various kinds of chemicals is probably fairly trustworthy, while a blog from someone trying to sell you healing crystals probably isn't.

When you're using ChatGPT to find information, you have no information if what it's regurgitating is from a high reliability source or a low reliability source, or if it's just a random collection of words whose purpose is simply to make grammatical sense.

AlecSchueler · 2025-06-09T13:52:44 1749477164

The most frustrating time I've had is asking it something, pointing out why the answer was obviously wrong, having it confirm the faulty logic and give an example of a portion of the answer now it would look like if it had used sound logic, followed by a promise to answer again without the accepted inaccuracy, only to get an even more absurd answer than the first time around.

sorokod · 2025-06-09T12:58:33 1749473913

Given the consistently declining quality of Google search, this is a low bar to pass.

voidhorse · 2025-06-09T13:33:24 1749476004

Great take. In my view, a major issue right now is that the people pushing these tools on the populace have never read Barthes and in many cases probably don't even know the name. If they had an inkling of literary and social history, they might be a bit more cautious and conscientious of how they frame these tools to people.

We are currently subject to the whims of corporations with absurd amounts of influence and power, run by people who barely understand the sciences, who likely know nothing about literary history beyond what the chatbot can summarize for them, have zero sociological knowledge or communications understanding, and who don't even write well-engineered code 90% of the time but are instead ok with shipping buggy crap to the masses as long as it means they get to be the first ones to do so, all this coupled with an amount of hubris unmatched by even the greatest protagonists of greek literature. Society has given some of the stupidest people the greatest amount of resources and power, and now we are paying for it.

msgodel · 2025-06-09T09:55:15 1749462915

This stuff wasn't an issue because older societies had hierarchy which checked the mob.

In a flat society every individual must be able to perform philosophically the way aristocrats do.

UncleOxidant · 2025-06-09T17:24:56 1749489896

We're seeing the effects of the flat society not being able to do this. Conspiracy theories, the return of mysticism (even things like astrology seem to be on the rise), distrust of experts, fear of the other, etc.

Kim_Bruning · 2025-06-09T20:06:47 1749499607

Just to be sure:

Sure: the Oracle of Delphi did have this entire mystic front end they laundered their research output through (presumably because powerpoint wasn't invented yet). Ultimately though, they were really the original McKinsey.

They had an actual research network that did the grunt work. They'd never have been so successful if the system didn't do some level of work.

I know you tripped on this accidentally, but it might yet have some bearing on this conversation. Look at the history of Ethology: It started with people assuming animals were automatons that couldn't think. Now we realize that many are 'alien' intelligences, with clear indicators of consciousness. We need to proceed carefully either way and build understanding, not reject hypotheses out-of-hand.

https://aeon.co/ideas/delphic-priestesses-the-worlds-first-p... (for an introduction to the concept)

Arainach · 2025-06-08T23:01:33 1749423693

>When we forget that, we get nonsense like "the chatbot told him he was the messiah," as though language could be blamed for the projection.

Words have power, and those that create words - or create machines that create words - have responsibility and liability.

It is not enough to say "the reader is responsible for meaning and their actions". When people or planet-burning random matrix multipliers say things and influence the thoughts and behaviors of others there is blame and there should be liability.

Those who spread lies that caused people to storm the Capitol on January 6th believing an election to be stolen are absolutely partially responsible even if they themselves did not go to DC on that day. Those who train machines that spit out lies which have driven people to racism and genocide in the past are responsible for the consequences.

kelseyfrog · 2025-06-08T23:20:44 1749424844

"Words have no essential meaning" and "speech carries responsibility" aren't contradictory. They're two ends of the same bridge. Meaning is always projected by the reader, but that doesn't exempt the speaker from shaping the terrain of projection.

Acknowledging the interpretive nature of language doesn't absolve us from the consequences of what we say. It just means that communication is always a gamble: we load the dice with intention and hope they land amid the chaos of another mind.

This applies whether the text comes from a person or a model. The key difference is that humans write with a theory of mind. They guess what might land, what might be misread, what might resonate. LLMs don’t guess; they sample. But the meaning still arrives the same way: through the reader, reconstructing significance from dead words.

So no, pointing out that people read meaning into LLM outputs doesn’t let humans off the hook for their own words. It just reminds us that all language is a collaborative illusion, intent on one end, interpretation on the other, and a vast gap where only words exist in between.

cratermoon · 2025-06-09T13:41:43 1749476503

This exact characterization was noted two years ago. https://softwarecrisis.dev/letters/llmentalist/

knighthack · 2025-06-11T02:25:32 1749608732

While I love this insightful analogue, your statement seems exactly like the kind of text you copy-pasted from some LLM, which you then regurgitated to Hacker news with some modifications.

It even ends with that trademark conclusion-style statement... which is a hallmark of ChatGPT output.

timewizard · 2025-06-09T19:08:07 1749496087

> and search for wisdom in the obscure.

There is nothing obscure about their outputs. They're trained on pre-existing text. They cannot produce anything novel.

> We've unleashed a new form of divination on a culture

Utter nonsense. You've released a new search mechanism to _some_ members of _some_ cultures.

> That's why everything feels uncanny.

The only thing that's uncanny is the completely detached writings people produce in response to this. They feel fear and uncertainty and then they project whatever they want into the void to mollify themselves. This is nothing new at all.

> it won't be half as fun.

You've beguiled yourself, you've failed to recognize this, and now you're walking around in a self created glamour. Drop the high minded concepts and stick with history. You'll see through all of this.

abaymado · 2025-06-10T00:28:17 1749515297

There have been a couple of instances, where I would try to debunk a conspiracy theory to a friend or family member and the next day, I would wake up to "Real/TikTok/Short" video with an AI narrating there exact argument. Most of the time, it's not even an LLM generated text turned to voice, rather an AI voice used to read a text the content creator has provided it. As long as it sounded like ChatGPT, Siri, "Her" combined with confirmation bias, people are treating these LLMs as their new oracles, as you may say.

bandoti · 2025-06-09T13:10:24 1749474624

Just as alchemists searched for the Philosopher’s stone, we search for Artificial General Intelligence.

pjmorris · 2025-06-09T18:51:05 1749495065

An automated Ouija board!

andy99 · 2025-06-08T21:38:44 1749418724

I agree with the substance, but would argue the author fails to "understand how AI works" in an important way:

  LLMs are impressive probability gadgets that have been fed nearly the entire internet, and produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another

Modern chat-tuned LLMs are not simply statistical models trained on web scale datasets. They are essentially fuzzy stores of (primarily third world) labeling effort. The response patterns they give are painstakingly and at massive scale tuned into them by data labelers. The emotional skill mentioned in the article is outsourced employees writing or giving feedback on emotional responses.

So you're not so much talking to statistical model as having a conversation with a Kenyan data labeler, fuzzily adapted through a transformer model to match the topic you've brought up.

While thw distinction doesn't change the substance of the article, it's valuable context and it's important to dispel the idea that training on the internet does this. Such training gives you GPT2. GPT4.5 is efficiently stored low- cost labor.

Al-Khwarizmi · 2025-06-08T22:02:01 1749420121

I don't think those of us who don't work at OpenAI, Google, etc. have enough information to accurately estimate the influence of instruction tuning on the capabilities or the general "feel" of LLMs (it's really a pity that no one releases non-instruction-tuned models anymore).

Personally my inaccurate estimate is much lower than yours. When non-instruction tuned versions of GPT-3 were available, my perception is that most of the abilities and characteristics that we associate with talking to an LLM were already there - just more erratic, e.g., you asked a question and the model might answer or might continue it with another question (which is also a plausible continuation of the provided text). But if it did "choose" to answer, it could do so with comparable accuracy to the instruction-tuned versions.

Instruction tuning made them more predictable, and made them tend to give the responses that humans prefer (e.g. actually answering questions, maybe using answer formats that humans like, etc.), but I doubt it gave them many abilities that weren't already there.

tough · 2025-06-09T10:37:23 1749465443

instruction tuning is like imprimating the chat ux into the models weights

its all about the user/assistant flow instead of just a -text generator- after it

and the assistant always tries to please the user.

they built a sychopantic machine either by mistake or malfeasance

rahimnathwani · 2025-06-09T19:05:56 1749495956

"it's really a pity that no one releases non-instruction-tuned models anymore"

Llama 4 was released with base (pretrained) and instruction-tuned variants.

hapali · 2025-06-08T22:49:11 1749422951

More accurately:

Modern chat-oriented LLMs are not simply statistical models trained on web scale datasets. Instead, they are the result of a two-stage process: first, large-scale pretraining on internet data, and then extensive fine-tuning through human feedback. Much of what makes these models feel responsive, safe, or emotionally intelligent is the outcome of thousands of hours of human annotation, often performed by outsourced data labelers around the world. The emotional skill and nuance attributed to these systems is, in large part, a reflection of the preferences and judgments of these human annotators, not merely the accumulation of web text.

So, when you interact with an advanced LLM, you’re not just engaging with a statistical model, nor are you simply seeing the unfiltered internet regurgitated back to you. Rather, you’re interacting with a system whose responses have been shaped and constrained by large-scale human feedback—sometimes from workers in places like Kenya—generalized through a neural network to handle any topic you bring up.

tim333 · 2025-06-09T13:51:09 1749477069

Sounds a bit like humans. Much data modified by "don't hit your sister" etc.

fragmede · 2025-06-09T20:30:28 1749501028

> and then extensive fine-tuning through human feedback

how extensive is the work involved to take a model that's willing to talk about Tianamen square into one that isn't? What's involved with editing Llama to tell me how to make cocaine/bombs/etc?

It's not so extensive so as to require an army of subcontractors to provide large scale human feedback.

meroes · 2025-06-08T21:59:01 1749419941

Ya I don’t think I’ve seen any article going in depth into just how many low level humans like data labelers and RLHF’ers there are behind the scenes of these big models. It has to be millions of people worldwide.

happy_dog1 · 2025-06-09T13:51:39 1749477099

There's a really fascinating article about this from a couple years ago that interviewed numerous people working on data labeling / RLHF, including a few who had likely worked on ChatGPT (they don't know for sure because they seldom if ever know which company will use the task they are assigned or for what). Hard numbers are hard to come by because of secrecy in the industry, but it's estimated that the number of people involved is already in the millions and will grow.

https://www.theverge.com/features/23764584/ai-artificial-int...

Interestingly, despite the boring and rote nature of this work, it can also become quite complicated as well. The author signed up to do data labeling and was given 43 pages (!) of instructions for an image labeling task with a long list of dos and don'ts. Specialist annotation, e.g. chatbot training by a subject matter expert, is a growing field that apparently pays as much as $50 an hour.

"Put another way, ChatGPT seems so human because it was trained by an AI that was mimicking humans who were rating an AI that was mimicking humans who were pretending to be a better version of an AI that was trained on human writing..."

meroes · 2025-06-09T18:14:48 1749492888

Solid article

simonw · 2025-06-08T22:04:39 1749420279

I'm really curious to understand more about this.

Right now there are top tier LLMs being produced by a bunch of different organizations: OpenAI and Anthropic and Google and Meta and DeepSeek and Qwen and Mistral and xAI and several others as well.

Are they all employing separate armies of labelers? Are they ripping off each other's output to avoid that expense? Or is there some other, less labor intensive mechanisms that they've started to use?

meroes · 2025-06-08T23:06:18 1749423978

There are middle-men companies like Scale that recruit thousands of remote contractors, probably through other companies they hire. There are of course other less known such companies that also sit between the model companies and the contracted labelers and RLHF’ers. There’s probably several tiers of these middle companies that agglomerate larger pools of workers. But how intermixed the work is and its scale I couldn’t tell you, nor if it’s shifting to something else.

I mean on LinkenIn you can find many AI trainer companies and see they hire for every subject, language, and programming language across several expertise levels. They provide the laborers for the model companies.

megaloblasto · 2025-06-09T12:38:47 1749472727

I'm also very interested in this. I wasn't aware of the extent of the effort of labelers. If someone could point me to an article or something where I could learn more that would be greatly appreciated.

whilenot-dev · 2025-06-09T13:16:33 1749474993

Just look for any company that offers data annotation as a service, they seem happy to explain their process in detail[0]. There's even a link to a paper from OpenAI[1] and some news about the contractor count[2].

[0]: https://snorkel.ai/data-labeling/#Data-labeling-in-the-age-o...

[1]: https://cdn.openai.com/papers/Training_language_models_to_fo...

[2]: https://www.businessinsider.com/chatgpt-openai-contractor-la...

happy_dog1 · 2025-06-09T14:14:08 1749478448

I added a reply to the parent of your comment with a link to an article I found fascinating about the strange world of labeling and RLHF -- this really interesting article from The Verge 2 years ago:

https://www.theverge.com/features/23764584/ai-artificial-int...

jenadine · 2025-06-08T22:45:54 1749422754

> produce writing not by thinking but by making statistically informed guesses about which lexical item is likely to follow another

What does "thinking" even mean? It turns out that some intelligence can emerge from this stochastic process. LLM can do math and can play chess despite not trained for it. Is that not thinking?

Also, could it be possible that are our brains do the same: generating muscle output or spoken output somehow based on our senses and some "context" stored in our neural network.

cwillu · 2025-06-09T14:24:23 1749479063

I'm sympathetic to this line of reasoning, but “LLM can play chess” is overstating things, and “despite not being trained for it” is understating how many chess games and books would be in the training set of any LLM.

While it's been a few months since I've tested, the last time I tested the reasoning on a game for which very little data is available in book or online text, I was rather underwhelmed with openai's performance.

JKCalhoun · 2025-06-09T12:21:05 1749471665

Many like the author fail to convince me because they never also explain how human minds work. They just wave their hand, look off to a corner of the ceiling with, "But of course that's not how humans think at all," as if we all just know that.

tsimionescu · 2025-06-09T15:16:55 1749482215

Well, if there's one thing we're pretty sure of about human cognition, it's that there's very few GPUs in a human brain, on account of the very low percentage of sillicon. So, in a very very direct sense, we know for sure that human brains don't work like LLMs.

Now, you could argue that, even though the substrate is different, some important operations might be equivalent in some way. But that is entirely up to you to argue, if you wish to. The one thing we can say for sure is that they are nothing even remotely similar at the physical layer, so the default assumption has to be that they are nothing alike period.

monkaiju · 2025-06-09T12:51:37 1749473497

First off, there's an entire field attempting to answer that question, cognitive science.

Secondly, the burden of proof isn't on cog-Sci folk to prove the human mind doesn't work like an llm, it'd be to prove that it does. From we do know, despite not having a flawless understanding on the human mind, it works nothing like an llm.

Side note: The temptation to call anything that appears to act like a mind a mind is called behavioral ism and is a very old cog-Sci concept, disproved many times over.

mjburgess · 2025-06-09T13:53:13 1749477193

Some features of animal sentience:

* direct causal contact with the environment, e.g., the light from the pen hits my eye, which induces mental states

* sensory-motor coordination, ie., that the light hits my eye from the pen enables coordination of the movement of the pen with my body

* sensory-motor representations, ie., my sensory motor system is trainable, and trained by historical envirionemntal coordination

* heirachical planning in coordination, ie., these sensory-motor representations are goal-contextualised, so that I can "solve my hunger" in an infinite number of ways (i can achive this goal against an infinite permutation of obstacles)

* counterfactual reality-oriented mental simulation (aka imagination) -- these rich sensory motor representatiosn are reifable in imagination so i can simulate novel permutaitons to the environment, possible shifts to physics, and so on. I can anticipate these infinite number of obsatcles before any have occured, or have ever occured.

* self-modelling feedback loops, ie., that my own process of sensory-motor coordination is an input into that coordination

* abstraction in self-modelling, ie., that i can form cognitive representations of my own goal directed actions as they succeed/fail, and treat them as objects of their own refinement

* abstraction across representation mental faculties into propositional represenations, ie., that when i imagine that "I am writing", the object of my imagination is the very same object as the action "to write" -- so I know that when I recall/imagine/act/reflect/etc. I am operating on the very-same-objects of thought

* facilities of cognition: quantification, causal reasoning, discrete logical reasoning -- etc. which can be applied both at the sensory, motor and abstract conceptual level (ie., i can "count in sensation" a few objects, also with action, also in intellection)

* concept formation: abduction, various various of induction, etc.

* concept composition: recursion, composition in extension of concepts, composition in intension, etc.

One can go on and on here.

Decribe only what happens in a few minutes of the life of a toddler as they play around with some blocks and you have listed, rather trivially, a vast universe of capbilities that an LLM lacks.

To believe an LLM has anything to do with intelligence is to have somewhat quite profoundly mistaken what capabilities are implied by intelligence -- what animals have, some more than others, and a few even more so. To think this has anything to do with linguistic competence is a proudly strange view of the world.

Nature did not produce intelligence in animals in order that they acquire competence in the correct ordering of linguistic tokens. Universities did, to some degree, produce computer science departments for this activity however.

gitaarik · 2025-06-11T08:20:51 1749630051

So you mean because we don't know how the human mind works, LLMs won't be far off?

crackalamoo · 2025-06-09T18:14:07 1749492847

Yes, 100% this. And even more so for reasoning models, which have a different kind of RL workflow based on reasoning tokens. I expect to see research labs come out with more ways to use RL with LLMs in the future, especially for coding.

I feel it is quite important to dispel this idea given how widespread it is, even though it does gesture at the truth of how LLMs work in a way that's convenient for laypeople.

https://www.harysdalvi.com/blog/llms-dont-predict-next-word/

leptons · 2025-06-09T18:23:43 1749493423

So it's still not really "AI", it's human intelligence doing the heavy lifting with labeling. The LLM is still just a statistical word guessing mechanism, with additional context added by humans.

MrZander · 2025-06-08T21:53:56 1749419636

This doesn't follow with my understanding of transformers at all. I'm not aware of any human labeling in the training.

What would labeling even do for an LLM? (Not including multimodal)

The whole point of attention is that it uses existing text to determine when tokens are related to other tokens, no?

daveguy · 2025-06-08T22:04:20 1749420260

The transformers are accurately described in the article. The confusion comes in the Reinforcement Learning Human Feedback (RLHF) process after a transformer based system is trained. These are algorithms on top of the basic model that make additional discriminations of the next word (or phrase) to follow based on human feedback. It's really just a layer that makes these models sound "better" to humans. And it's a great way to muddy the hype response and make humans get warm fuzzies about the response of the LLM.

MrZander · 2025-06-08T23:37:05 1749425825

Oh, interesting, TIL. Didn't realize there was a second step to training these models.

hexaga · 2025-06-09T07:03:48 1749452628

There are in fact several steps. Training on large text corpora produces a completion model; a model that completes whatever document you give it as accurately as possible. It's kind of hard to make those do useful work, as you have to phrase things as partial solutions that are then filled in. Lots of 'And clearly, the best way to do x is [...]' style prompting tricks required.

Instruction tuning / supervised fine tuning is similar to the above but instead of feeding it arbitrary documents, you feed it examples of 'assistants completing tasks'. This gets you an instruction model which generally seems to follow instructions, to some extent. Usually this is also where specific tokens are baked in that mark boundaries of what is assistant response, what is human, what delineates when one turn ends / another begins, the conversational format, etc.

RLHF / similar methods go further and ask models to complete tasks, and then their outputs are graded on some preference metric. Usually that's humans or a another model that has been trained to specifically provide 'human like' preference scores given some input. This doesn't really change anything functionally but makes it much more (potentially overly) palatable to interact with.

JKCalhoun · 2025-06-09T12:23:17 1749471797

Got 3½ hours? https://youtu.be/7xTGNNLPyMI

(I watched it all, piecemeal, over the course of a week, ha, ha.)

spogbiper · 2025-06-09T15:26:30 1749482790

i really like this guy's videos

here's a one hour version that helped me understand a lot

https://www.youtube.com/watch?v=zjkBMFhNj_g

throwawaymaths · 2025-06-09T14:26:49 1749479209

yeah, i think you dont understand either. rlhf is no where near the volume of "pure" data that gets thrown into the pot of data.

imiric · 2025-06-08T21:28:39 1749418119

This is a good summary of why the language we use to describe these tools matters[1].

It's important that the general public understands their capabilities, even if they don't grasp how they work on a technical level. This is an essential part of making them safe to use, which no disclaimer or PR puff piece about how deeply your company cares about safety will ever do.

But, of course, marketing them as "AI" that's capable of "reasoning", and showcasing how good they are at fabricated benchmarks, builds hype, which directly impacts valuations. Pattern recognition and data generation systems aren't nearly as sexy.

[1]: https://news.ycombinator.com/item?id=44203562#44218251

dwaltrip · 2025-06-08T22:18:29 1749421109

People are paying hundreds of dollars a month for these tools, often out of their personal pocket. That's a pretty robust indicator that something interesting is going on.

mmcconnell1618 · 2025-06-08T22:29:53 1749421793

One thing these models are extremely good at is reading large amounts of text quickly and summarizing important points. That capability alone may be enough to pay $20 a month for many people.

NoMoreNicksLeft · 2025-06-09T14:54:14 1749480854

Why would anyone want to read less and not more? It'd be like reading movie spoilers so you didn't have to sit through 2 hours to find out what happened.