I don't think it makes sense to compare human learning to GPT-3 learning: it's a fundamentally different process. The human brain doesn't get just tokens, but also other sensory data, particularly visual.
So I don't think you can conclude that humans learn more efficiently based on just the quantity of data.
It's also worth noting that GPT-3 is trained to emulate _any_ human writing, not the writing of any one person.
For an actual Turing test, one might fine-tune it on text produced by one particular human; then you might get more accurate results.
> I don't think you can conclude that humans learn more efficiently based on just the quantity of data.
You are correct. My main argument was that in-distribution learning is not enough. You can't fix that problem with more data, as many responses to my comment seem to assume.
I think out-of-distribution learning and the small-data requirement are connected. If an agent can understand a concept separately from the chain leading from sensory inputs, it can understand what it's doing in a novel situation even without examples.
The elements used in GPT-3 are capable of transformations such as abstraction (e.g. separating the structure of a syllogism from the concrete nouns) and logic (ReLU can directly implement OR, AND, and NOT, which is sufficient for arbitrary logic).
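As a minimal sketch of that last point (my own toy example, not anything extracted from GPT-3): with inputs restricted to 0/1, each gate is just one or two ReLU units plus a fixed linear combination.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# Boolean gates built from ReLU units, assuming inputs are 0.0 or 1.0.
def NOT(a):
    return relu(1.0 - a)          # 0 -> 1, 1 -> 0

def AND(a, b):
    return relu(a + b - 1.0)      # fires only when both inputs are 1

def OR(a, b):
    # two ReLU units combined linearly: the second caps the output at 1
    return relu(a + b) - relu(a + b - 1.0)

# Truth tables check out:
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b), "NOT a:", NOT(a))
```

Since {AND, OR, NOT} is functionally complete, a deep enough stack of such units can in principle encode arbitrary Boolean logic; whether gradient descent actually finds such circuits is a separate question.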
We can see that it actually uses abstractions and logic in some cases.
E.g. "Bob is a frog. Bob's skin color is ___". Even small GPT-2 models can relate "Bob" to concept "frog" and query "frog" "skin color" attribute. Even basic language modeling requires inference, and GPT-x can inference using transformer blocks.
With more layers, it can go from inferring the meaning of words to doing inference that solves problems.
But the inference it is able to do is limited in scope because of the structure of a language model -- each input token must correspond to one output token. So the model can't take a pause and think about something; it can only think while it produces tokens.
Here's an absolutely insane example of embedding symbolic computation into a story, which lets GPT-3 break the computation into small steps it can handle. The intermediate results become part of the story: https://twitter.com/kleptid/status/1284069270603866113
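The general shape of such a prompt (a made-up toy illustration, not the exact text from the linked tweet) looks something like this:

```python
# Each already-written intermediate result becomes context for the next
# step, so the model never has to do the whole computation "in its head".
prompt = (
    "Alice is adding 123 and 456 one column at a time.\n"
    "First she adds the ones: 3 + 6 = 9.\n"
    "Then she adds the tens: 2 + 5 = 7.\n"
    "Then she adds the hundreds: 1 + 4 = 5.\n"
    "So the final answer is"
)
print(prompt)  # feed this to a text-generation model and let it continue
```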
So I guess one can make a model which is much better at thinking simply by training in a different way or changing the topology. But the building blocks are good enough.
The representations GPT-3 learns are flexible. It can learn very complex tasks, including logic and simple arithmetic. The issue is GPT-3's learning algorithm and its learning capability.
In most cases, the ability to learn << the ability to represent. Universal-approximation-theorem-type results don't say anything about learnability. GPT-3's abilities quickly fade once it gets outside the distribution it was trained on.
Yeah that’s what I was thinking when he started with the nonsensical questions, like “how many eyes does a foot have?” No (non-blind) human would need language input to learn this fact. Makes me wonder if anyone is working on large-scale architectures that can ingest multiple types of data and correlate them to make predictions.