The anecdote provides evidence that some people are initially fooled by a phishing attack, but not fooled enough to manually copy-paste their credentials when autofill doesn't work.
Your argument about 2FA depends on how many of those people there are.
Therefore the anecdote is quite relevant, if only indirectly.
An experiment that doesn't involve 2FA at all is not relevant to this comparison.