It's very, very good at sounding like it understands stuff. Almost as good as actually understanding stuff in some fields, sure. But it's definitely not the same.
It will confidently analyze and describe a chess position using advanced-sounding book techniques, but it's all fundamentally flawed, often missing things that are extremely obvious (like an undefended queen free to take) while trying to sound like it's a seasoned expert - that is, if it doesn't completely hallucinate moves that are not allowed by the rules of the game.
This is how it works in other fields I'm able to analyse too. It's very good at sounding like it knows what it's doing, speaking at the level of a master's student or higher, but its actual appraisal of problems is often wrong in a way very different from how humans make mistakes. Another great example is getting it to solve cryptic crosswords from back in the day. It often already has the answer in its training set, but it hasn't seen anyone write out the reasoning for that answer, so if you ask it to explain, it makes nonsensical leaps (claiming birch rhymes with tyre, that level of nonsense).
If anyone wants to see the chess comprehension breakdown in action, the YouTuber GothamChess occasionally puts out videos where he plays against a new or recently-updated LLM.
Hanging a queen is not evidence of a lack of intelligence - even the very best human grandmasters will occasionally do that. But in pretty much every single video, the LLM loses the plot entirely after barely a couple dozen moves and starts to resurrect already-captured pieces, move pieces to squares they can't get to, etc - all while keeping the same confident "expert" tone.
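You don't even have to take the videos' word for it - this failure mode is easy to check mechanically by replaying the model's moves through a rules engine and flagging the first one that isn't legal. A minimal sketch, assuming the python-chess library (the helper function and example transcript are mine, purely for illustration):

```python
import chess


def first_illegal_move(san_moves):
    """Replay SAN moves from the starting position.

    Returns (index, move) of the first illegal or unparseable move,
    or None if the whole sequence is legal.
    """
    board = chess.Board()
    for i, san in enumerate(san_moves):
        try:
            board.push_san(san)  # raises ValueError if the move is illegal here
        except ValueError:
            return i, san
    return None


# Example: the fourth move is impossible (White's queen can't reach g7).
print(first_illegal_move(["e4", "e5", "Nf3", "Qxg7"]))  # -> (3, 'Qxg7')
```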
In the Catch Me If You Can movie, Leo DiCaprio's character wears a surgeon's gown and confidently says "I concur".
What I’m hearing here is that you are willing to get your surgery done by him and not by one of the real doctors - if he is capable of pronouncing enough doctor-sounding phrases.
If that's what you're hearing, then you're not thinking it through. Of course one would not want an AI acting as one's real doctor, but a medical or law school graduate studying for a license sure would appreciate a Socratic tutor in their specialization. Likewise, on the job in a technical specialization, a sounding board is of more value when it follows along, potentially as a virtual board of debate, and raises questions when the logic drifts. It's not AI thinking for one; it's AI critically assisting one's exploration through Socratic debate. Do not place AI in charge of critical decisions, but do put it to work assisting the people who have to figure those situations out.
The doctor analogy still applies: that "Socratic tutor" LLM is actually a charlatan that sounds, to the untrained mind, like a competent person, but is in actuality a complete farce. I still wouldn't trust it.
Leo DiCaprio's character says nothing of substance in that scene. If you ask an LLM a question about most subjects, it will give you a highly intelligent, substantive answer.
No. It will give you a long answer with correct grammar, correct punctuation, a wide vocabulary, and persuasive-sounding arguments. What we are learning nowadays is that among humans, those traits are highly correlated with intelligent, substantive answers from practiced subject-matter experts.
Among AIs, such text is merely correlated with having read a lot of literature. It's sometimes right. It's sometimes wrong. But you can't tell which, and any attempt to defer to "oh well, it sounds persuasive", which may have served you okay with smart humans, will end up failing in spectacular and unpredictable ways.
I do not say this because I don't find AIs interesting, or even useful. They are, for tasks they are suited to. But there are so many people essentially arguing they are suited to all tasks, which they clearly aren't.
You are seriously underselling what LLMs do nowadays. It's not just that the grammar is correct. In most cases, the answers are substantive and factually correct.
You can ask fairly complicated questions, and it will usually reason correctly and give you a high-quality answer. I ask about programming, physics and math, and it usually answers on the level of someone with a high level of training in those fields.
It sometimes fails in strange ways, but you can't just write off all of the high-quality answers LLMs give as nothing more than plausible-sounding English.
>We're not philosophizing here, we're talking about practical results and clearly, in the current context, it does not deliver in that area.
Except it clearly does, in a lot of areas. You can't take a 'practical results trump all' stance and come out of it saying LLMs understand nothing. They understand a lot of things just fine.
The current models obviously understand a lot. They would easily understand your comment, for example, and give an intelligent answer in response. The whole "the current models cannot understand" mantra is more religious than anything.
That's the point though: it's not sufficient. Not even slightly. It constantly makes obvious mistakes and cannot keep things coherent.
I was almost going to explicitly mention your point but deleted it because I thought people would be able to understand.
This is not philosophy or theology, sitting around hand-wringing about "oh, but would a sufficiently powerful LLM be able to dance on the head of a pin". We're talking about a thing that actually exists, that you can actually test. In a whole lot of real-world scenarios you throw at it, it fails in strange and unpredictable ways. Ways that it will swear up and down it did not do. It will lie to your face. It's convincing. But then it will lose at chess, it will fuck up running a vending machine business, it will get lost coding and reinvent the same functions over and over, it will give completely nonsensical answers to crossword puzzles.
This is not an unlimited intelligence; it is a deeply flawed two-year-old that just so happens to have read the entire output of human writing. It's a fundamentally different mind from ours, and it makes different mistakes. It sounds convincing and yet fails, constantly. It will give you a four-step explanation of how it's going to do something, then fail to execute those four simple steps.
Which is exactly why it is insane that the industry is hell-bent on creating autonomous automation through LLMs. Rube Goldberg machines are what will be created, and if civilization survives that insanity, it will be looked back upon as one grand, stupid era.