These companies don't fix these systems because they don't know how. It is easier to remove certain outputs or retire the whole system. There is no line of code they can tweak.
Is there a sizable population of unemployed black engineers living in the United States to hire from? What if the qualified candidates simply don’t exist to fill the seats?
Are you familiar with the H-1B visa system? There's a shortage of qualified candidates for top-band engineering roles in the US by every metric you can name.
It’s a probabilistic system. It can give you correct results on one billion samples and still get the 1,000,000,001st case wrong. How do you test every single photo from the past and future?
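As a back-of-the-envelope sketch (the error rate and daily volume below are invented for illustration, not figures from any real system), even a model that is wrong only once per million photos produces a steady stream of errors at this kind of scale:

    # Illustrative numbers only: an assumed per-photo error rate and an
    # assumed daily upload volume for a large platform.
    error_rate = 1e-6
    photos_per_day = 1_000_000_000

    expected_errors_per_day = error_rate * photos_per_day
    print(f"expected misclassifications per day: {expected_errors_per_day:,.0f}")
    # -> expected misclassifications per day: 1,000

And that is only the expected count; which specific photos fail is essentially impossible to predict in advance.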
> sorry it seems discussion about solutions to racial issues isn't welcome here
Oddly it seems like you are the person who doesn't want to have a discussion.
You stated "As an ML engineer I can tell you confidently here's how you fix these systems" and people critiqued your solution. Is that not being open to discussion?
Critiques and debates are fine, but downvotes don't encourage discussion; they scream "shut up," disincentivize ever speaking about the issue again, and downrank a viewpoint instead of leaving it up for fair discussion.
Because presumably, if this works like many other systems, getting downvoted enough eventually gets you banned, which means I should be scared to say what I said again.
If people were interested in debating like mature adults without downvoting, I would have left it up and continued the discussion.
The moderators here are exceedingly reasonable. Don’t worry about getting banned on account of downvotes unless you’re violating the guidelines. And the staff at hn@ycombinator.com is shockingly responsive for a site of this size. They really do want to encourage intellectual curiosity.
I second this, but not for the conventional diversity reasons; rather, for subtler benefits of diversity that can be difficult to understand.
You need your consumer base to be represented in your product development process. When training an AI model, we first test things on whatever we personally have a bias for.
For example, when training a stock prediction model, I am going to test it first on the most familiar stocks, the ones I know or have a bias for. I am going to adjust my model until it looks correct for my bias, and only then test it across a larger dataset.
I don't work in AI, but I know that the test data is incredibly large and supposedly covers all the bases systematically. What I am saying is that the chance of random failures like this goes down when you move away from a homogeneous development team where everyone is biased toward the same things because they share cultural and racial overlap.
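To make the stock example above concrete, here is a toy simulation (the group names and error rates are made up purely for illustration) of how a model tuned against a hand-picked, familiar subset can look fine there while doing noticeably worse on the broader population:

    import random

    random.seed(0)

    # Hypothetical per-sample error rates after tuning only against the
    # familiar subset.
    ERROR_RATES = {"familiar_subset": 0.02, "broader_dataset": 0.10}

    def simulated_accuracy(group: str, n_samples: int = 100_000) -> float:
        """Simulate predictions for one group and return its accuracy."""
        errors = sum(random.random() < ERROR_RATES[group] for _ in range(n_samples))
        return 1 - errors / n_samples

    print("accuracy on the stocks I tuned against:", round(simulated_accuracy("familiar_subset"), 3))
    print("accuracy on the larger dataset:        ", round(simulated_accuracy("broader_dataset"), 3))

A more diverse team effectively widens that familiar subset before the broad evaluation ever happens.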
Yes, it's a real question, since nothing says that a particular misclassification must happen. Watching cars go by on the road, one might suspect that at least one is driven by an alligator, but nothing, not even the law of large numbers, says that one must be.
Nobody said this particular misclassification must occur. But there will be misclassifications, which is what your original question asked. Since you know the answer, why ask the question? That's why I asked you if what you asked was a real question.
Yes they did, I said that. But it was a claim made as a question, because I didn't know whether it was actually true. I still can't demonstrate formally why this would be so, because again, the reasoning and even the veracity of the claim are still in question due to the lack of anything but a hand-waved answer.
There is no need to formally demonstrate. The veracity is clearly not in question. It must be true, due to the existence of the article we are commenting on now.
If you want to argue the general case, you can simply show that the negation is false. Since it is incorrect to say that a network trained on a tiny percentage of the possible inputs will never misclassify, it is true that a network trained in such a way will eventually misclassify. This is borne out empirically: train any network and you will see that it always misclassifies something.
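A rough numerical version of that argument, under the simplifying (and admittedly unrealistic) assumptions of independent inputs and a fixed per-input error rate p:

    # p is an assumed per-input misclassification probability; real inputs are
    # not independent, so this is only a sketch of the "eventually" claim.
    p = 1e-6
    for n in (10**6, 10**8, 10**10):
        prob_never_wrong = (1 - p) ** n
        print(f"n = {n:>14,}   P(no misclassification at all) = {prob_never_wrong:.3g}")

As long as p is nonzero, the probability of never misclassifying decays toward zero as the number of inputs grows.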
> Yes they did, I said that.
You didn't say that. You said a misclassification would happen on some inputs. That is different from saying on these specific inputs.
Quite honestly, in a team that's stressed and run right down to the wire, checking for that particular misclassification in a model that has hundreds of classification targets can be really tricky.
Say you have 1,000 classification targets. You have to produce a model that checks, for each target, the odds of it being classified as one of the other 999.
You have to check, specifically, for "adult male as primate" out of roughly a million potential combinations, and then apply secondary business rules or optimizations to prevent that classification.
So yes, it's possible, but it's not cheap, simple, or easy.
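For what it's worth, here is a minimal sketch of those two steps (every label, pair, and threshold below is invented for illustration, not taken from any real pipeline): audit the confusion matrix for specific sensitive (true label, predicted label) pairs, and apply a post-hoc business rule that suppresses a risky label unless the model is very confident.

    from typing import Dict, Tuple

    # With 1,000 targets there are ~999,000 ordered (true, predicted) pairs,
    # so in practice you can only hand-curate a short list of pairs that must
    # never slip through.
    SENSITIVE_PAIRS = {("adult male", "primate"), ("adult female", "primate")}

    def audit_confusion(confusion: Dict[Tuple[str, str], int]) -> None:
        """Flag any sensitive (true_label, predicted_label) cell with a nonzero count."""
        for true_label, predicted_label in SENSITIVE_PAIRS:
            count = confusion.get((true_label, predicted_label), 0)
            if count:
                print(f"ALERT: {true_label!r} misclassified as {predicted_label!r} {count} time(s)")

    def apply_business_rule(predicted: str, confidence: float,
                            blocked_labels=("primate",), min_confidence=0.98) -> str:
        """Post-hoc rule: refuse to emit a blocked label unless confidence is very high."""
        if predicted in blocked_labels and confidence < min_confidence:
            return "unknown"  # fail closed instead of emitting the risky label
        return predicted

    # Made-up evaluation results, just to show the calls.
    audit_confusion({("adult male", "primate"): 3, ("cat", "dog"): 120})
    print(apply_business_rule("primate", confidence=0.71))

Even then, the audit only catches the pairs someone thought to list in advance, which is part of why it is neither cheap nor easy.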
Facebook just decided to shove the model out the door and not worry about the consequences.
Quality engineering work costs money and time. Facebook didn't spend either.
We don't actually know how to do that, or how rich is "rich enough." It's an open avenue of research to be able to extrapolate how well-tuned a neural net is on data not in its training set.
Not to imply the problem is unsolvable, just that if an institution has zero tolerance for this mistake, the fix you're describing is no guarantee it won't occur.