> we are convinced that the ‘big data’ approach to NLU is not only psychologically, cognitively, and even computationally implausible
I wish they didn't just breeze by the psychological and cognitive perspectives. Because even if the giant corpus/giant language model approach can do a lot of understanding-related tasks under the right conditions ... you would never claim to be teaching a child English by having them look at endless piles of documents. You interact with them. You _show_ them things in the world that the words name. They get what they want faster when they learn to say what they want. They "understand" words and sentences in reference to things and experiences. What does it even mean for a network to "understand" language when it has never been exposed to any other representation of objects, actions, etc.?
Suppose you could time travel and bring English-language technical documents back to a Mesopotamian cult (I only care that they haven't seen the Latin alphabet), and you got a community of priests to study them as scripture. After years of study they could tell you which symbols were missing from a sequence, which symbols appear in the same contexts, even generate a plausible sequence of symbols and flag implausible ones -- but they would have no idea what the words referred to. Would they have "understood" anything?
Suppose you read a sci-fi book that coins new words, or even uses some weird typographic trick to represent something that's definitely out of corpus (e.g. the Language in Embassytown). You can still understand it, not because your model tells you about the relationships between those words and other words, but because you have conceptual frames for the situations, actions, intents, etc.