
Unfortunately, what is really tripping me up is when we draw that line. I get the principle of regression to the mean, and why it occurs. What I don't get is this particular manifestation of it.

Fair point about "no connection between a and b".

What I should have said was something like: Why is it important that a come before b chronologically? If we were mistaken, and we thought that b came first, then what we would be seeing is "progression from the mean".

Does the concept of regression to the mean depend on the chronology of events? That would be weird -- most probability doesn't, right?



Hmm, I read this:

http://en.wikipedia.org/wiki/Regression_toward_the_mean

And I guess I made an assumption about the situation you described. If the students were all answering randomly in the same way, then you'd see what I describe. I think this part of the Wikipedia article describes it well:

"Consider a simple example: a class of students takes a 100-item true/false test on a subject. Suppose that all students choose randomly on all questions. Then, each student’s score would be a realization of one of a set of i.i.d. random variables, with a mean of 50. Naturally, some students will score substantially above 50 and some substantially below 50 just by chance. If one takes only the top scoring 10% of the students and gives them a second test on which they again guess on all items, the mean score would again be expected to be close to 50. Thus the mean of these students would “regress” all the way back to the mean of all students who took the original test. No matter what a student scores on the original test, the best prediction of their score on second test is 50."

So, to your question, why is time important? It's important in the sense that you need the first test to determine who your "high flyers" are for the second experiment.
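For what it's worth, here is a rough simulation sketch of the Wikipedia example, in case it helps to see the numbers. Everything in it is my own assumption rather than anything from the article: the function name `simulate`, the 10,000 students, and the fixed seed are all made up for illustration. It picks the top 10% by the first test and checks their second-test mean, then does the same thing the other way around (picks the top 10% by the second test and looks back at their first-test mean).

```python
import random
from statistics import mean


def simulate(num_students=10_000, num_items=100, top_fraction=0.10, seed=0):
    """Two tests of pure guessing; report mean scores of the top group
    chosen by each test, on both tests. All parameters are illustrative."""
    rng = random.Random(seed)

    def guess_score():
        # Number of correct answers when every true/false item is a coin flip.
        return sum(rng.random() < 0.5 for _ in range(num_items))

    # One (first_test, second_test) pair per student, drawn independently.
    scores = [(guess_score(), guess_score()) for _ in range(num_students)]
    k = int(num_students * top_fraction)

    # "High flyers" chosen by the first test.
    top_by_first = sorted(scores, key=lambda s: s[0], reverse=True)[:k]
    # "High flyers" chosen by the second test instead.
    top_by_second = sorted(scores, key=lambda s: s[1], reverse=True)[:k]

    return {
        "top_by_first_on_first": mean(s[0] for s in top_by_first),
        "top_by_first_on_second": mean(s[1] for s in top_by_first),
        "top_by_second_on_second": mean(s[1] for s in top_by_second),
        "top_by_second_on_first": mean(s[0] for s in top_by_second),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

If this behaves the way I expect, whichever test you use to pick the group shows a high mean for that group, and the other test sits near 50. That would suggest the selection step is what matters, not which test happened first on the calendar.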



