If attributes such as age or gender are strong predictors of the dependent variable and improve a model's predictive performance, then using those variables must be allowed. For example, it is well known that colorectal cancer is more prevalent in men over the age of 50, so a statistical model used to allocate free colorectal exams would favor older men. Is that discriminatory? Likewise, for almost any target variable one will find features that favor men or women, young or old; finding these dependencies is precisely the point of statistical models.
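A minimal sketch of that point, using synthetic data and hypothetical effect sizes (nothing here comes from a real screening dataset): if age and sex genuinely carry signal for an outcome, a plain logistic regression picks them up as positive coefficients and, by construction, directs more exams toward older men.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 10_000
    age = rng.uniform(20, 80, n)
    male = rng.integers(0, 2, n)

    # Assumed data-generating process: risk rises with age, higher for men.
    logit = -6.0 + 0.07 * age + 0.8 * male
    outcome = rng.binomial(1, 1 / (1 + np.exp(-logit)))

    model = LogisticRegression().fit(np.column_stack([age, male]), outcome)
    print(model.coef_)  # both coefficients come out positive: the fitted
                        # model favors older men, exactly as the data does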

As you know, most AI is curve fitting or representation learning. So the question is: is the statistical distribution your model is learning a static one, bound by natural laws like physics or biology, or a human system with a changing distribution, one the machine itself can affect? Your example is of the former, while an ML model predicting 'success at a job' or 'creditworthiness' falls into the latter category. The latter category is harder in every way and raises ethical concerns, because such a model necessarily changes (or, more often, entrenches) the man-made social systems it feeds back into.
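To make that feedback loop concrete, here is a toy simulation (the numbers and the policy rule are entirely hypothetical, not any real lender's procedure): a credit model only observes repayment for applicants it approves, so each retraining round sees a population its own previous decisions shaped, and the cutoff drifts on its own.

    import numpy as np

    rng = np.random.default_rng(1)

    def true_repay_prob(score):
        # Hypothetical ground truth the lender never observes directly.
        return 1 / (1 + np.exp(-(score - 600) / 50))

    threshold = 650.0
    for round_ in range(3):
        scores = rng.normal(620, 80, 5_000)
        approved = scores >= threshold
        repaid = rng.random(approved.sum()) < true_repay_prob(scores[approved])
        # Naive policy update: recenter the cutoff on the approved pool.
        # That pool only ever sits above the old cutoff, so the threshold
        # ratchets upward: the model reshapes its own future data.
        threshold = scores[approved].mean()
        print(f"round {round_}: approved {approved.mean():.0%}, "
              f"observed repay rate {repaid.mean():.0%}, "
              f"next cutoff {threshold:.0f}")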

On the other hand, if a model fails on certain populations because too little training data covering them was collected, historically because they were seen as a less important subgroup, then you've simply encoded your societal biases into your model. Understanding that difference and pointing out problem spots like that is a great job for an ethical AI researcher.
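A quick sketch of the kind of check that catches this (synthetic data; the group sizes and distribution shift are made up): aggregate accuracy is dominated by the majority group and looks fine, while the underrepresented group does far worse, which only disaggregated evaluation reveals.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(2)

    def make_group(n, shift):
        # Each group has its own input distribution and decision boundary.
        X = rng.normal(shift, 1.0, (n, 2))
        y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
        return X, y

    X_maj, y_maj = make_group(9_000, 0.0)  # well-represented group
    X_min, y_min = make_group(300, 2.0)    # ~3% of the training data
    model = LogisticRegression().fit(
        np.vstack([X_maj, X_min]), np.concatenate([y_maj, y_min]))

    # Report accuracy per group, not just overall.
    for name, X, y in [("majority", X_maj, y_maj), ("minority", X_min, y_min)]:
        print(name, round(accuracy_score(y, model.predict(X)), 3))

The minority group's accuracy lands near chance here because the single fitted boundary is pulled almost entirely toward the majority's distribution, which is exactly how such failures stay hidden behind a good headline number.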
