Hacker News | kyllo's comments

Only perfect multicollinearity (an exact linear dependence among the predictors, e.g. a pairwise correlation of 1.0 or -1.0) is a problem at the linear algebra level when fitting a statistical model.
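To make the linear algebra point concrete, here is a minimal sketch in plain Python. The toy data and the helper names (`gram`, `det3`) are my own illustration, not from any library. With an exactly collinear column, the Gram matrix X^T X is singular, so the normal equations have no unique solution. With a highly but not perfectly correlated column, it is still invertible.

```python
def gram(X):
    """Return X^T X for a matrix X given as a list of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical toy predictors (integers, so the arithmetic is exact).
x1 = [1, 2, 3, 4, 5]
x2_perfect = [2 * v for v in x1]   # exact multiple of x1: perfect multicollinearity
x2_near = [2, 4, 7, 8, 10]         # highly, but not perfectly, correlated with x1

# Design matrices with an intercept column.
X_perfect = [[1, a, b] for a, b in zip(x1, x2_perfect)]
X_near = [[1, a, b] for a, b in zip(x1, x2_near)]

det_perfect = det3(gram(X_perfect))  # zero: X^T X cannot be inverted
det_near = det3(gram(X_near))        # positive: OLS has a unique solution

print("det(X^T X), perfect collinearity:", det_perfect)
print("det(X^T X), near collinearity:   ", det_near)
```

Near collinearity still inflates the variance of the coefficient estimates, but that is a statistical cost rather than a failure of the algebra.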

But theoretically speaking, in a scientific context, why would you want to fit an explanatory model that includes multiple highly (but not perfectly) correlated independent variables?

It shouldn't be an accident. Usually it's because you've intentionally taken multiple proxy measurements of the same theoretical latent variable and you want to reduce measurement error. So that becomes a part of your measurement and modeling strategy.
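A rough simulation of that idea, under assumptions I am making up for illustration: one standard normal latent variable, four proxies that each add independent unit-variance noise, and a simple average as the combined measure. The proxies are highly correlated with each other by construction, and averaging them tracks the latent variable better than any single proxy does.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical latent variable that is never observed directly.
latent = [rng.gauss(0, 1) for _ in range(n)]

# Four proxy measurements: latent value plus independent measurement error.
proxies = [[z + rng.gauss(0, 1) for z in latent] for _ in range(4)]

# Combine the proxies by simple averaging, observation by observation.
proxy_mean = [sum(vals) / len(vals) for vals in zip(*proxies)]

corr_single = pearson(proxies[0], latent)
corr_mean = pearson(proxy_mean, latent)

print("one proxy vs latent:       ", round(corr_single, 3))
print("mean of 4 proxies vs latent:", round(corr_mean, 3))
```

Averaging is only the simplest way to combine proxies. A factor model or a latent variable model would be the more principled version of the same strategy.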


In other words, as economists say, because OLS is provably the BLUE (Best Linear Unbiased Estimator) under the classical linear model assumptions, aka the Gauss-Markov Theorem.


You make a good point, though the difference between ML and statistics isn't just about interpreting and validating the model. It's about the "novel discoveries" part aka Doing Science.

Statistical modeling is done primarily in service of scientific discovery--for the purpose of making an inference (population estimate from a sample) or a comparison to test a hypothesis derived from a theoretical causal model of a real-world process before viewing data. The parameters of a model are interpreted because they represent an estimate of a treatment effect of some intervention.

Methods like PCA can be part of that modeling process either way, but analyzing and fitting models to data to mine it for patterns without an a priori hypothesis is not science.


> A sailor is sailing her boat across the lake on a windy day. As the wind blows, she counters by turning the rudder in such a way so as to exactly offset the force of the wind. Back and forth she moves the rudder, yet the boat follows a straight line across the lake. A kindhearted yet naive person with no knowledge of wind or boats might look at this woman and say, “Someone get this sailor a new rudder! Hers is broken!” He thinks this because he cannot see any relationship between the movement of the rudder and the direction of the boat.

https://mixtape.scunning.com/01-introduction#do-not-confuse-...


Collider bias, or "Berkson's Paradox", is a fun one; there are lots of examples of it in everyday life: https://en.wikipedia.org/wiki/Berkson%27s_paradox
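A quick sketch of how it shows up in a simulation. The setup is an assumption of mine: two independent standard normal traits, and a sample that only includes cases where their sum clears a threshold. The traits are uncorrelated in the full population, but selecting on the collider (the sum) induces a negative correlation.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20000

# Two hypothetical traits that are independent in the full population.
trait_a = [rng.gauss(0, 1) for _ in range(n)]
trait_b = [rng.gauss(0, 1) for _ in range(n)]

# Selection on a collider: a case is only observed if a + b is large enough.
selected = [(a, b) for a, b in zip(trait_a, trait_b) if a + b > 1.0]
sel_a = [a for a, _ in selected]
sel_b = [b for _, b in selected]

corr_population = pearson(trait_a, trait_b)
corr_selected = pearson(sel_a, sel_b)

print("correlation in full population: ", round(corr_population, 3))
print("correlation in selected sample: ", round(corr_selected, 3))
```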


Indeed, causally linked variables need not be correlated in observed data; bias in the opposite direction of the causal effect may approximately equal or exceed it in magnitude and "mask" the correlation. Chapter 1 of this popular causal inference book demonstrates this with a few examples: https://mixtape.scunning.com/01-introduction#do-not-confuse-...
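Here is a toy version of the sailor example from that chapter, with made-up numbers. I assume the boat's heading is wind plus rudder plus a little noise, and that the sailor sets the rudder to exactly cancel the wind. The rudder has a causal effect of 1 on heading in both runs, yet its observed correlation with heading is near zero. Setting the rudder at random instead makes the effect visible.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(2)  # fixed seed so the run is reproducible
n = 20000

wind = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 0.1) for _ in range(n)]


def heading(w, r, e):
    """Assumed structural equation: the rudder's causal effect on heading is 1."""
    return w + r + e


# Observational regime: the sailor sets the rudder to exactly offset the wind.
rudder_obs = [-w for w in wind]
heading_obs = [heading(w, r, e) for w, r, e in zip(wind, rudder_obs, noise)]

# Experimental regime: the rudder is set at random, independent of the wind.
rudder_rand = [rng.gauss(0, 1) for _ in range(n)]
heading_rand = [heading(w, r, e) for w, r, e in zip(wind, rudder_rand, noise)]

corr_observed = pearson(rudder_obs, heading_obs)
corr_randomized = pearson(rudder_rand, heading_rand)

print("rudder vs heading, sailor steering: ", round(corr_observed, 3))
print("rudder vs heading, random rudder:   ", round(corr_randomized, 3))
```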


This. I bought toothpaste on Amazon, used it up, and the next time I went to buy the exact same toothpaste, the price had doubled.


Yes, they have a dark pattern where people buy a product and give good reviews, so suddenly they increase the price to benefit from that.


USPA has a tested division now and it's been gaining in popularity--it will soon be more popular than the untested division if it isn't already. Most of the top untested powerlifters have moved over to the WRPF (which does also have its own tested division). There are a lot of other smaller, regional, untested feds. Then there's the IPF and USAPL and their affiliates, which are fully tested, and are now far more popular than any of the untested feds. Untested might never go away, but tested has rapidly surpassed it in recent years.


Which is based not on your ability to produce value, but your ability to capture value and charge a cut of every unit, and is thus a massive disincentive to produce public goods.


It's often based on your ability to capture other people's value way more than creating value on your own.


When I lived in South Korea, one of the things that struck me was how much "flatter" the generations there were in terms of pop culture and music taste and awareness, compared to the US. I worked in an office with a bunch of suit-and-tie businessmen who were mostly in their 40s to 60s, and if you were to ask them about any current K-pop group, they all knew their hit songs.


That's sad. That means they had no diversity in their music.

