> It's also not really the point, the point is more that the claim in the paper that humans would be unaffected is unsubstantiated and highly suspect.
I think the question that adds a random cat factoid at the end is going to trip up far fewer humans than you'd expect. At the very least, a human could tell you afterwards why they thought the factoid was relevant.

And setting that aside, we should obviously be holding these LLMs to a higher standard than "human with extraordinary intelligence and encyclopedic knowledge who can nonetheless get tripped up by a few irrelevant words in a prompt." That should _never_ happen if these tools are what they're claimed to be.