False positives: we hired a few folks who could prototype changes but struggled to ship them. This question wasn't solely to blame, but it didn't give enough of a signal on someone's ability to _thoroughly_ work within a large codebase.
False negatives: senior candidates who were very used to a particular programming environment (e.g. at Microsoft) and didn't have side projects that kept them up to speed with the basics of editing code on the command line over SSH. We did a lot of work over time to set up alternative environments for these candidates.
I'll add to your false negative case. I only write code over ssh in mostly emacs (and sometimes vim).
A couple years ago I was asked to interview via CoderPad and while I appreciate their attempt at having emacs and vim bindings, the fact that they're not actually accurate was worse than using Google Docs.
That is, I think an editor that is almost your editor of choice is actively worse than one that is obviously a "clumsy editor". In the former case, the interviewer perceives you as poor at writing code when you're fighting with a slightly drunk editor, while in the latter (like at a whiteboard) the interviewer adjusts for the environment.
Edit: the interviewers let me retry by projecting my screen while using emacs locally in a terminal. I hope CoderPad has improved its key bindings since then, and I hope interviewers test out these editing environments before assuming people will be 100% proficient in them.
For my part, when someone's looking over my shoulder, I get too nervous to rely on basically any editor functionality except cut/copy/paste, undo, and save. I start second-guessing everything. So you may as well just give me Notepad, because I'm gonna look like an idiot in any editor when someone's watching and I'm not already quite comfortable with that person (and the inherent scrutiny/judgement aspect of interviews makes that impossible, in that context).
That's very valid. I was actually only thinking of having someone metaphorically looking over my shoulder, like in a remote interview, but literally having someone in the same room watching is even worse.
To me, purely as far as the using-my-editor part of it goes, it's a lot like having practiced an instrument to an OK level of skill, but only ever playing it alone, then being asked to perform for others, who will make decisions about my future in part based on that performance. So I fall back on playing "Mary Had a Little Lamb" because I'm too worried about screwing up in some stupid-looking way if I attempt something fancier, but that also makes me look like I'm bad at it.
(I’m planning to give the interview question a shot before returning to read any more of your replies.)