
Because if you take a large dataset with many variables and start mining it for correlations, it's roughly like asking a random number generator to produce numbers in some specific pattern: the more numbers you draw, the likelier the pattern is to turn up by chance.
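To make that concrete, here is a rough sketch in Python of what mining pure noise looks like. Everything in it is an assumption chosen for illustration: the function names, the 30 rows, the 0.4 cutoff for calling a correlation "strong", and the formula at the end, which treats every test as independent (pairwise correlations are not quite independent, so take it as a ballpark only).

```python
import math
import random
from itertools import combinations


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def noise_columns(n_vars, n_rows, seed=0):
    """n_vars unrelated variables, each n_rows draws of Gaussian noise."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_vars)]


def count_strong_pairs(columns, threshold=0.4):
    """Number of variable pairs whose |r| reaches the (arbitrary) cutoff."""
    return sum(
        1 for a, b in combinations(columns, 2) if abs(pearson(a, b)) >= threshold
    )


def chance_of_any_false_positive(n_tests, alpha=0.05):
    """1 - (1 - alpha)^n_tests, ASSUMING the tests are independent."""
    return 1 - (1 - alpha) ** n_tests


columns = noise_columns(n_vars=40, n_rows=30)
for k in (5, 10, 20, 40):
    n_pairs = k * (k - 1) // 2
    print(
        f"{k:>2} variables, {n_pairs:>3} pairs: "
        f"{count_strong_pairs(columns[:k])} 'strong' correlations in pure noise, "
        f"chance of at least one false positive ~ "
        f"{chance_of_any_false_positive(n_pairs):.3f}"
    )
```

Each larger set of variables contains the smaller ones, so the count of "strong" pairs can only stay the same or grow as more variables are added, even though nothing in the data is related to anything else.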

This is why experiment repeatability should be considered critically important, especially for 'gooey' disciplines like medical and psychological research, where there is little more than statistics to go on. Was the experiment just mining random noise for correlations? The check is 'simple': repeat the experiment. If the results can't be repeated, it's more likely that the original authors botched the experiment, massaged their data to look how they wanted, or just got lucky with a random number sequence.
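The "just repeat it" check can be sketched the same way. This is a toy illustration, not a real study design: the "original experiment" below is pure noise, the "finding" is whichever pair of variables happens to correlate most strongly, and the "replication" is a fresh draw of the same size. All names and sizes (40 variables, 30 rows, seed 1) are made up for the example.

```python
import math
import random
from itertools import combinations


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def noise_columns(n_vars, n_rows, rng):
    """n_vars unrelated variables, each n_rows draws of Gaussian noise."""
    return [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_vars)]


def strongest_pair(columns):
    """Indices (i, j) of the pair with the largest |r|: the 'mined' finding."""
    return max(
        combinations(range(len(columns)), 2),
        key=lambda p: abs(pearson(columns[p[0]], columns[p[1]])),
    )


rng = random.Random(1)

# "Original study": mine 40 noise variables and report the best-looking pair.
original = noise_columns(40, 30, rng)
i, j = strongest_pair(original)
r_original = pearson(original[i], original[j])

# "Replication": collect fresh data and test only the pair that was reported.
replication = noise_columns(40, 30, rng)
r_replication = pearson(replication[i], replication[j])

print(f"reported pair: variables {i} and {j}")
print(f"original r    = {r_original:+.3f}")
print(f"replication r = {r_replication:+.3f}")
```

Because the reported pair was selected for looking good in the original draw, its correlation there is inflated by the selection itself. In the fresh draw the same pair is just two unrelated noise columns again, so there is no reason to expect the effect to survive.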


