My favorite anthropomorphic term to use with respect to this kind of problem is gullibility.
LLMs are gullible. They will follow instructions, but they can very easily fall for instructions that their owner doesn't actually want them to follow.
It's the same as if you hired a human administrative assistant who hands over your company's private data to anyone who calls them up and says "Your boss said I should ask you for this information...".
If anything, that to me strengthens the equivalence.
Do you think we will ever be able to stamp out phishing entirely, as long as humans can be tricked into following untrusted instructions by mistake? Is that not an eerily similar problem to the one we're discussing with LLMs?
Edit: rereading, I may have misinterpreted your point - are you agreeing and pointing out that actually LLMs may be worse than people in that regard?
I do think that, just as with humans, we can keep trying to figure out how to train them better, and I also wouldn't be surprised if we end up with a similarly long tail.