Ask HN: Why are current LLMs not considered AGI?
9 points by philippta on Oct 17, 2023 | 31 comments
When thinking about humans, no matter their age or experience, we have no problem considering them generally intelligent. But still, humans are not omniscient, make things up (hallucinate), and sometimes lack proper reasoning.

In contrast, LLMs already have way more knowledge than the average human, have mostly good reasoning, and occasionally hallucinate.

Surely they aren't artificial super intelligences, but it feels like the term AGI could apply.



At this point the thing holding ChatGPT back from universally-accepted "AGI" label is its few remaining sub-human skillsets, like forgetting things from too many tokens ago that a human would not forget.

My prediction is that over the course of the next 6-48 months, we'll see the emergence of LLMs with "working memory," "short-term memory," and "long-term memory": working memory being more or less current LLM capabilities; short-term memory being a fast one-shot summarization that gets temporarily stored raw on disk; and long-term memory getting transcribed into a LoRA-like module overnight, based on the perceived importance of the short-term memories.
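
A rough sketch of how that tiering could fit together (purely illustrative; every name here is a placeholder, not any existing API):

    # Illustrative three-tier memory sketch. summarizer, importance and
    # train_lora are hypothetical callables standing in for whatever models
    # would actually do the work.
    import time

    class MemoryStack:
        def __init__(self):
            self.working = []      # roughly today's context window
            self.short_term = []   # fast one-shot summaries, kept raw on disk

        def observe(self, text):
            self.working.append(text)

        def summarize(self, summarizer):
            # "short-term memory": one-shot summarization of the working context
            self.short_term.append({"t": time.time(),
                                    "summary": summarizer(" ".join(self.working))})
            self.working.clear()

        def consolidate(self, importance, train_lora):
            # "long-term memory": overnight, fold the important summaries
            # into a LoRA-like module and discard the rest
            keep = [m["summary"] for m in self.short_term
                    if importance(m["summary"]) > 0.5]
            train_lora(keep)
            self.short_term.clear()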

I think emotion analogues will be important for the last part, as emotion processing plays a big role in memory formation (this is an adaptation: we more strongly remember things that we had strong emotions about because they're more important to us.)

So, 6-48 months to computer systems that feel (/have an emotion analogue) and sleep to dream (/summarize into long-term storage overnight).

Those developments, I'm confident, will absolutely silence anyone who says it's not "real" AGI. But then, at that point, you can potentially have built a being that can have feelings about its own existence, and then things get Interesting.


> At this point the thing holding ChatGPT back from universally-accepted "AGI" label is its few remaining sub-human skillsets

How do we know there are a few remaining ones to reach AGI and not zillions? How do we know adding “memory” is sufficient for it to acquire them?

I think the answers to those questions are “we don’t have the faintest idea” and “we know it won’t be enough”.

As to the latter, one thing LLMs cannot do that we think is essential for human intelligence is think logically. They can regurgitate logic from memory, but they can’t come up with correct original ideas, except by accident.


I suspect that a great many of the people who say AGI is just around the corner and are giving <2030 timelines are saying that because they want it to be around the corner.


Just like self-driving cars, we’re likely to hear people say “AGI is just around the corner” for an unreasonable amount of time.


Thanks a lot for that detailed explanation. That makes perfect sense.

I wonder if all these additional capabilities can be bolted onto existing LLMs, or whether they need another iteration of the transformer architecture plus retraining.


LLMs are not the end of AI research. See LMM (large multimodal model), cognitive architecture, spiking neural networks, I-JEPA, etc. There are going to be multiple totally different types of AI that may be called AGI, depending on who you ask.

For emotions, see Pei Wang's research.


> That makes perfect sense.

On the contrary, it has multiple glaring flaws. When you consider that you can ask an LLM one question and get a nonsensical answer, saying that stopping them from forgetting will bring them closer to “universally-accepted” AGI has no basis in reality. Humans can’t even universally agree that the Earth is not flat, it is a pipe dream to think LLMs will bring any consensus in a few months.

Another poster points out other issues with the answer: https://news.ycombinator.com/item?id=37915367


Because people think it can't infer anything about things it hasn't seen before. Like it can't build an internal model of some problem that it hasn't been trained for.

> let's play fizz buzz but with 3 and 7

ChatGPT

Sure, let's play Fizz Buzz with 3 and 7. The rules are:

Instead of saying "Fizz" for multiples of 3, you say "Fizz."
Instead of saying "Buzz" for multiples of 7, you say "Buzz."
If a number is a multiple of both 3 and 7, you say "FizzBuzz."

Let's start counting. I'll write the sequence, and you tell me the appropriate response for each number.

1, 2, Fizz, 4, Buzz, Fizz, Fizz, Buzz, Fizz, 11, Fizz, Fizz, Buzz, Fizz, 16, FizzBuzz, 18, 19, Buzz, Fizz, 22, 23, Fizz, Buzz, 26, Fizz, Fizz, Buzz, 29, FizzBuzz, 31, 32, Fizz, Buzz, Fizz, 34, Buzz, Fizz, 37, Fizz, Buzz, Fizz, Fizz, Buzz, 41, Fizz, FizzBuzz, 44, Buzz, Fizz, 46, Fizz, 49, Buzz, Fizz, 52, Fizz, Fizz, Buzz, 56, Fizz, 58, 59, Buzz, FizzBuzz, 61, 62, Fizz, Buzz, Fizz, 64, Buzz, Fizz, 67, Fizz, Buzz, Fizz, 71, Fizz, FizzBuzz, 74, Buzz, Fizz, 76, Fizz, 79, Buzz, Fizz, 82, Fizz, Fizz, Buzz, 86, Fizz, 88, 89, Buzz, FizzBuzz, 91, 92, Fizz, Buzz, Fizz, 94, Buzz, Fizz, 97, Fizz, Buzz.

Great job! Feel free to continue the sequence or let me know if you'd like to do something else.


Did you use ChatGPT 3.5 or ChatGPT 4? GPT-4 solves this correctly when I ran it (but admittedly, given the non-deterministic way it works, it might have failed for you but worked for me).


3.5


That answer has so many errors it’s embarrassing. It’s egregiously wrong. Even the statement of the rules it’s going to follow has errors.

By the time it gets to 10 it states it as 11 and remains wrong until it self-corrects by 29, but then thinks there are three numbers between 32 and 34 and the errors keep going. It doesn’t even correctly substitute Fizz and Buzz in the most basic places. It’s a mess that even a child learning the game for the first time would avoid, or at least stop and reevaluate.
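
For reference, the correct sequence for the 3/7 variant is trivial to generate (a minimal Python sketch, not part of the transcript above):

    def fizzbuzz(n, a=3, b=7):
        out = []
        for i in range(1, n + 1):
            word = ("Fizz" if i % a == 0 else "") + ("Buzz" if i % b == 0 else "")
            out.append(word or str(i))
        return ", ".join(out)

    print(fizzbuzz(21))
    # 1, 2, Fizz, 4, 5, Fizz, Buzz, 8, Fizz, 10, 11, Fizz, 13, Buzz, Fizz, 16, 17, Fizz, 19, 20, FizzBuzz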


I would say, based on experience, that kids aged 3 or 4 would be similarly nonsensical, changing number ordering, improvising without any consideration for self-consistency, and staying heavily focused on the present irrespective of what has just happened.


> I would say, based on experience, that kids aged 3 or 4 would be similarly nonsensical

Is that how OpenAI positions ChatGPT? As useful as a four-year-old knowledge worker? Because that is not useful for the tasks people are giving it.

If the best defence that can be mustered for outright wrong answers is that a baby would make similar errors, there’s no point in even discussing OP’s AGI question. A good-faith argument would recognise that when comparing to a child, we’re considering at a minimum the age and mental capacity to count and understand what a “multiple of” is.


It's also clearly demonstrating more intelligence than we would generally expect from other animals.


By that logic, autocorrect and Markov chains would be considered intelligent. As the quote goes: “If you judge a fish by its ability to climb a tree…”


4 handles this just fine. This is one of the most frustrating parts of these discussions: making sweeping claims about the abilities of LLMs when the evaluator hasn't even bothered to use the current state of the art.


It’s also worrisome that someone who is making a specific point about the output of an LLM didn’t spend a couple of seconds verifying whether their results aligned with their beliefs. If they didn’t do it in this situation, I have zero trust that they do any kind of verification for other uses. The worst part is that this is probably going to be the norm.


Whatever, man.

Just did it with "Bing with GPT-4", not sure what that uses, and I refuse to pay OpenAI to do this so...

Anyway, the results are:

Sure, let’s play Fizz Buzz with 3 and 7. Here are the first 15 numbers:

1, 2, Fizz, 4, 5, Fizz, 7, 8, Fizz, 10, 11, Fizz, 13, Buzz, Fizz

In this game:

Numbers divisible by 3 are replaced by “Fizz”
Numbers divisible by 7 are replaced by “Buzz”
Numbers divisible by both become “FizzBuzz”

Would you like to continue or change the rules?

It's better but still wrong.

I tried with 2 and 7, it was wrong too.



They aren't "general" enough. Sure they're competent at solving problems represented in the training data, and can even in some cases abstract over them and find patterns. But they don't have "slow thought". They can't think without talking, and they can only talk like a human, and humans don't habitually narrate their train of thought, so there's a limit to how far "let's think step by step" can take them. The result is that they can't abstract recursively - since they are architecturally incapable of "thinking harder" about a problem, there will always be some threshold of input novelty that loses them, and right now that threshold is actually unimpressively low when you get down to brass tacks.


Transformers with memory can abstract recursively in theory. https://arxiv.org/abs/2301.04589.

If it's just extra computation that's the kick, then simply more tokens will suffice. You can even implement a computation token if you wish. https://arxiv.org/abs/2310.02226

Frankly, the ability to truly abstract recursively is by no means necessary either. Humans can't actually do this, either in everyday practice (you will get bored, you will lose interest) or in theory (flawed memory, finite memory, you will die). Limited recursive abilities != trivial recursive abilities.

The current state of the art is easily general enough by actual testable definitions/baselines.


This is somewhat solved by having an "agent" use an LLM as a train of thought, along with a few other systems (like a runtime to do some calcs, some APIs to do other things, etc.).
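
A minimal version of that loop might look like this (a sketch only; llm() and the tool set are hypothetical stand-ins for the model and the extra systems):

    # Toy agent loop: the LLM narrates one step at a time and can hand
    # arithmetic off to a tiny "runtime". llm is any callable prompt -> text.
    def calculator(expr):
        return str(eval(expr, {"__builtins__": {}}))  # toy calc runtime, arithmetic only

    TOOLS = {"calc": calculator}

    def run_agent(llm, task, max_steps=10):
        transcript = f"Task: {task}\n"
        for _ in range(max_steps):
            step = llm(transcript)
            transcript += step + "\n"
            if step.startswith("FINAL:"):
                return step[len("FINAL:"):].strip()
            if step.startswith("CALL calc:"):
                result = TOOLS["calc"](step[len("CALL calc:"):].strip())
                transcript += f"RESULT: {result}\n"
        return None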


Here's the problem with LLMs: https://arxiv.org/abs/2301.06627 Basically, they're missing a lot of the brain machinery required to function. For example, if you ask them to solve a math problem, they do just fine ... until you ask them to apply an inference rule on top of it that takes them outside of their training set. The result is something that LOOKS like AGI until you realize it's read the entire Internet.


Everyone has a different definition of what AGI means, and no one realizes it or specifies which one they are using.

The people who have a worldview that aligns with strong artificial intelligence often want to apply the AGI label, depending on their definition, but are afraid to do so because they will be ridiculed by the "non-believers".

The opposite worldview sometimes just moves the goalposts as soon as some capability is unlocked.


Some people already believe this. https://www.noemamag.com/artificial-general-intelligence-is-...

Indeed, by testable definitions of GI (i.e. all or nearly all humans would also pass), the current state of the art is AGI.


1 - Some people do consider them AGI, or at least that an agent using an LLM as one part of a system is AGI. I do.

2 - I have some sense that OpenAI already has a system (that they haven't released) that many reasonable people will consider AGI.


> I have some sense that OpenAI already has a system (that they haven't released) that many reasonable people will consider AGI.

How would you define that? I don't think multimodality is a high enough bar to say something is de facto AGI.


I think it's "you know it when you see it", and clearly not everyone recognizes it the same way, so I say "many reasonable people will call it that" as opposed to actually defining it.


Because there is no agreement on what ordinary "intelligence" is, let alone an artificial one.

That's why.


Peter Norvig agrees with you.


LLMs aren't smart at all; anybody serious in the AI field understands their many limitations. I wouldn't even use the word "intelligence" and LLM in the same sentence, much less AGI.

You're just impressed that it can write well and 'sounds' precise, but that's the effect of a lot of RLHF, transformers, and H100s, not something capable of solving humanity's biggest problems or greatly improving our comprehension of the universe.

It isn't precise at all, and if you gave it a whole afternoon, it would still come up with the same stupid solution, whereas if you give a human an afternoon, they might come up with something that also has common sense.

Go read about objective-based AI or other subjects in the field. It's much more promising than that parrot powered by H100s :-)

Sure, there are investors throwing money at any company doing their niche LLMs or whatever, but it's snake oil at best.

LLMs are just an interesting new interface between humans and computers and data. We need other AI-related fields to develop for us to unlock the real power of "intelligence". Computers are just as smart as they were in the 2010s.



