
Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality. This is why I very much dislike all the AI hype and the rebranding of statistical models as artificial "intelligence": people who don't know what the words actually mean get confused and start thinking these systems are something more than computers executing algorithms to fit numerical data, attributing to them some unspecified cognitive model.


> Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality.

I think you're being too optimistic, and I'm a pretty optimistic person. Maybe it is because I work in ML, but I've had to explain this concept to a large number of people, in academia and industry alike, to management and coworkers both. As far as I can tell, people are quite happy to operate under the assumption that benchmark results are strong indicators of real-world performance __without__ ever considering the assumptions baked into their metrics or data.

I've even demonstrated this to a team at a trillion-dollar company, where I showed that a model with lower test-set performance had more than double the performance on actual customer data. The response was "cool, but we're training a much larger model on more data, so we're going to use that because it is a bit better than yours." My point was that the problem still exists in that bigger model with more data; more parameters and more data just do a better job of hiding the underlying (and solvable!) issues.
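To make the kind of check I mean concrete, here is a rough sketch (entirely hypothetical data and model names, not the actual setup from that project) of evaluating the same candidate models on both the benchmark test split and a sample of real customer data, so that a distribution shift shows up instead of being averaged away:

    # Hypothetical sketch: compare models on the benchmark test split
    # vs. a sample of "customer" data drawn from a shifted distribution.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    def make_split(n, shift=0.0):
        # Toy binary classification data; `shift` moves the feature
        # distribution to mimic a benchmark/production mismatch.
        X = rng.normal(loc=shift, scale=1.0, size=(n, 5))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        return X, y

    X_train, y_train = make_split(5000)           # benchmark training data
    X_test, y_test = make_split(1000)             # benchmark test split (same distribution)
    X_cust, y_cust = make_split(1000, shift=1.5)  # "customer" data (shifted distribution)

    models = {
        "small_model": LogisticRegression(max_iter=1000),
        "big_model": GradientBoostingClassifier(),
    }

    for name, model in models.items():
        model.fit(X_train, y_train)
        test_acc = accuracy_score(y_test, model.predict(X_test))
        cust_acc = accuracy_score(y_cust, model.predict(X_cust))
        # The interesting case is when the ranking flips between the two columns.
        print(f"{name}: test={test_acc:.3f}  customer={cust_acc:.3f}")

The numbers here are synthetic; the point is simply that reporting both columns side by side makes it obvious when the benchmark ranking and the customer-data ranking disagree.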

In other words, in my experience people are happy to be Freeman Dyson in the conversation Calavar linked[0] and very upset to hear Fermi's critique: being able to fit data doesn't mean shit without either a clear model or a rigorous mathematical basis. Much of data science is happy to just curve fit. But why shouldn't they be? You advance your career either way, judged by bureaucrats who understand the context of the metrics even less.

I've just encountered too many people who cannot distinguish empirical results from causal models, and plenty who passionately insist there is no difference.

[0] https://news.ycombinator.com/item?id=40964328



