
Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality. This is why I very much dislike all the AI hype and the rebranding of statistical models as artificial "intelligence": people who don't know what the words actually mean get confused and start thinking these systems are something more than computers executing algorithms to fit numerical data, attributing to them some unspecified cognitive model.


> Most practitioners seem to understand that what they are doing is creating executable models and they don't confuse the model based on numeric observations with the actual reality.

I think you're being too optimistic, and I'm a pretty optimistic person. Maybe it is because I work in ML, but I've had to explain this concept to a large number of people, in academia and industry alike, to management and coworkers both. As far as I can tell, people are quite happy to operate under the assumption that benchmark results are strong indicators of real-world performance __without__ ever considering the assumptions baked into their metrics or data.

I've even demonstrated this to a team at a trillion-dollar company, where I showed that a model with lower test-set performance had more than double the performance on actual customer data. The response was "cool, but we're training a much larger model on more data, so we're going to use that because it is a bit better than yours." My point was that the problem still exists in that bigger model with more data; more parameters and more data just do a better job of hiding the underlying (and solvable!) issues.
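To make the kind of check I mean concrete, here is a rough sketch (entirely hypothetical data and model names, not the actual setup from that project) of evaluating the same candidate models on both the benchmark test split and a sample of real customer data, so that a distribution shift shows up instead of being averaged away:

    # Hypothetical sketch: compare models on the benchmark test split
    # vs. a sample of "customer" data drawn from a shifted distribution.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    def make_split(n, shift=0.0):
        # Toy binary classification data; `shift` moves the feature
        # distribution to mimic a benchmark/production mismatch.
        X = rng.normal(loc=shift, scale=1.0, size=(n, 5))
        y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
        return X, y

    X_train, y_train = make_split(5000)           # benchmark training data
    X_test, y_test = make_split(1000)             # benchmark test split (same distribution)
    X_cust, y_cust = make_split(1000, shift=1.5)  # "customer" data (shifted distribution)

    models = {
        "small_model": LogisticRegression(max_iter=1000),
        "big_model": GradientBoostingClassifier(),
    }

    for name, model in models.items():
        model.fit(X_train, y_train)
        test_acc = accuracy_score(y_test, model.predict(X_test))
        cust_acc = accuracy_score(y_cust, model.predict(X_cust))
        # The interesting case is when the ranking flips between the two columns.
        print(f"{name}: test={test_acc:.3f}  customer={cust_acc:.3f}")

The numbers here are synthetic; the point is simply that reporting both columns side by side makes it obvious when the benchmark ranking and the customer-data ranking disagree.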

In other words, in my experience people are happy to be Freeman Dyson in the conversation Calavar linked[0] and very upset to hear Fermi's critique: being able to fit data doesn't mean shit without either a clear model or a rigorous mathematical basis. Much of data science is happy to just curve fit. But why shouldn't they be? You advance your career either way, judged by bureaucrats who understand the context of the metrics even less.

I've just encountered too many people who cannot distinguish empirical results from causal models, and plenty who passionately insist there is no difference.

[0] https://news.ycombinator.com/item?id=40964328



