It's 1994. Larry Lloyd Mayer has read the entire internet, hundreds of thousands of studies across every field, and can answer queries word for word the same as modern LLMs do. He speaks every major language. He's not perfect; he does occasionally make mistakes, but the sheer breadth of his knowledge makes him among the most employable individuals in America. The Pentagon, IBM, and Deloitte are begging to hire him. Instead, he works for you, for free.
Most laud him for his generosity, but his skeptics describe him as just a machine that spits out words. A stochastic parrot, useless for any real work.
I do anticipate it, but in the situations where I'm asked to do such calculations, I don't usually have the option of refusing, nor would I want to. For most real-world situations, it's generally better to arrive at a ballpark solution than to refuse to engage with the problem.
In the very unserious hypothetical I'm describing, I'd say Lloyd's capabilities match those of GPT-4. In this case, he's not a calculator, but he is a decent programmer, so, like GPT-4, he quickly runs the operation through a script rather than trying to figure it out in his head.
I would be very careful about claiming exactly that, as emergent properties seem kinda crucial for both artificial and human intelligence. (Not to say that the two function in the same way, or are equally useful.)
What experiment or measurement could I do to distinguish between a machine that “knows” the truth and a machine that merely “spits it out”? I’m trying to understand your terminology here.
It is just a machine that spits out words.