Yes, it's a real question, since nothing says that a particular misclassification must happen. Watching cars go by on the road, one might suspect that at least one is driven by an alligator, but nothing says that it must be, not even the law of large numbers.
Nobody said this particular misclassification must occur. But there will be misclassifications, which is what your original question asked about. Since you already know the answer, why ask the question? That's why I asked whether yours was a real question.
Yes, they did; I said that. But it was a claim made as a question, because I didn't know whether it was actually true. I still can't demonstrate formally why this would be so, because the reasoning, and even the veracity, of the claim remains in question given nothing but hand-waved answers.
There is no need for a formal demonstration. The veracity is clearly not in question: it must be true, given the existence of the article we are commenting on now.
If you want to argue the general case, you can simply show that the negation is false. Since it is incorrect to say that a network trained on a tiny percentage of the possible inputs will never misclassify, it follows that a network trained that way will eventually misclassify. This is borne out empirically: train any such network and it will always misclassify something.
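For what it's worth, the "train any network and watch it misclassify" point is easy to reproduce. A minimal sketch, assuming scikit-learn is available; the digits dataset, the 2% training split, and the tiny MLP are arbitrary choices for illustration:

```python
# Minimal sketch: train a classifier on a tiny slice of the input space
# and count how often it misclassifies inputs it has never seen.
# Dataset, split size, and model are illustrative choices, nothing more.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

# Keep only 2% of the data for training: a "tiny percentage of possible inputs".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.02, stratify=y, random_state=0
)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)

errors = (clf.predict(X_test) != y_test).sum()
print(f"{errors} misclassifications out of {len(y_test)} unseen inputs")
# In practice this never prints 0: some unseen inputs always get misclassified.
```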
> Yes they did, I said that.
You didn't say that. You said a misclassification would happen on some inputs. That is different from saying it would happen on these specific inputs.
Quite honestly, for a team that's stressed and running right up against a deadline, checking for that particular misclassification in a model with hundreds of classification targets can be really tricky.
Say you have 1000 classification targets. You have to produce a model that checks, for each target, the odds of it being classified as one of the other 999.
You have to check, specifically, for "adult male as primate" out of roughly a million potential combinations, and then apply secondary business rules or optimizations to prevent that classification.
So yes, it's possible, but it's not cheap, simple, or easy.
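To make that concrete, here's roughly what the "audit the pairs, then bolt on a business rule" work could look like. Everything here is hypothetical: the label names, the BLOCKED_PAIRS set, and the margin threshold are stand-ins for illustration, not anything taken from a real production system:

```python
# Sketch of (1) auditing which of the ~N*(N-1) confusion pairs actually occur,
# and (2) a secondary rule that refuses to emit a sensitive label when a
# blocked confusion is plausible. All names and thresholds are hypothetical.
import numpy as np

LABELS = ["adult_male", "primate", "bicycle", "dog"]   # imagine ~1000 of these
BLOCKED_PAIRS = {("adult_male", "primate")}            # confusions we refuse to ship

def worst_confusions(probs, true_idx, labels, top_k=10):
    """Count every (true, predicted) confusion on a validation set so the team
    can see which of the possible pairs actually occur, and how often."""
    n = len(labels)
    counts = np.zeros((n, n), dtype=int)
    for t, p in zip(true_idx, probs.argmax(axis=1)):
        if t != p:
            counts[t, p] += 1
    pairs = [(counts[i, j], labels[i], labels[j])
             for i in range(n) for j in range(n) if counts[i, j]]
    return sorted(pairs, reverse=True)[:top_k]

def guarded_prediction(prob_row, labels, blocked=BLOCKED_PAIRS, margin=0.15):
    """Secondary business rule: if the top label forms a blocked pair with a
    close runner-up, abstain rather than risk the blocked misclassification."""
    order = np.argsort(prob_row)[::-1]
    top, runner_up = labels[order[0]], labels[order[1]]
    close = prob_row[order[0]] - prob_row[order[1]] < margin
    if close and ((top, runner_up) in blocked or (runner_up, top) in blocked):
        return None   # abstain instead of emitting the sensitive label
    return top

# Example: the model slightly favors "primate", but "adult_male" is close,
# so the guard abstains instead of returning the blocked label.
print(guarded_prediction(np.array([0.41, 0.47, 0.07, 0.05]), LABELS))  # -> None
```

Even this toy version shows the cost: someone has to curate the blocked pairs, tune the margin, and maintain the audit as the label set grows.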
Facebook just decided to shove the model out the door and not worry about the consequences.
Quality engineering work costs money and time. Facebook didn't spend it.
We don't actually know how to do that, or what counts as "rich enough." It's an open avenue of research to be able to predict how well-tuned a neural net will be on data not in its training set.
Not to imply the problem is unsolvable, just that if an institution has zero tolerance for this mistake, the fix you're describing is no guarantee it won't occur.