
In a sample of 10,000, the number who have trait T9 is about 20. The chance of all 20 landing in a randomly chosen half of the sample is about 1 in 500,000. So it's quite unlikely that randomization splits trait T9 that unevenly, and it's also quite unlikely that the effects of just 20 T9 individuals on the outcome of interest are so strong that a typical analysis would find a significant difference between the treatment and control arms. The chance of both these things being simultaneously true is negligible. Do I misunderstand you?
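For concreteness, that 1-in-500,000 figure checks out with a quick hypergeometric calculation (a sketch in Python; the sample size and carrier count are taken from the comment above):

    # P(all 20 carriers of T9 land in the same half of a 10,000-person sample)
    from math import comb

    N, half, k = 10_000, 5_000, 20

    # Number of halves that contain all k carriers, over all possible halves,
    # doubled because either half could be the one that gets them all.
    p_one_half = comb(N - k, half - k) / comb(N, half)
    print(2 * p_one_half)  # ~1.9e-6, i.e. about 1 in 500,000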


No, I just wasn't clear.

The issue is that each individual carries traits T1...Tn, so there's a very large number of ways for one group to end up with confounders the other doesn't have.

The role of the power law is to imply that the generative process which distributes these traits isn't "nice", so that one group can easily get a T9 that the other group doesn't have, and so on for all of T1...Tn.

So you have this, let's say adversarial, background generative process which is giving you these confounding traits but never enough of each that you get nice mixtures.

You could see it as the question of whether uniform sampling across many power-law-distributed factors actually delivers a uniform distribution of each factor in both groups. I haven't written a simulation, but I don't see why this wouldn't be a serious problem for randomisation.
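That simulation is easy to sketch. Here is a minimal version under assumptions not stated in the thread: traits are independent, trait Tj has power-law prevalence proportional to 1/j, the split is a random half/half, and "confounded" is operationalised as one arm holding at least 90% of a trait's carriers (the 0.9 threshold and the prevalence curve are arbitrary stand-ins):

    import numpy as np

    rng = np.random.default_rng(0)
    N, n_traits = 10_000, 500

    # Power-law tail: trait Tj is carried with probability ~ 0.5/j,
    # so early traits are common and later ones are rare.
    prevalence = 0.5 / np.arange(1, n_traits + 1)

    # Randomise: a uniformly chosen half of the sample goes to arm A.
    in_arm_a = np.zeros(N, dtype=bool)
    in_arm_a[rng.choice(N, N // 2, replace=False)] = True

    for j, p in enumerate(prevalence, start=1):
        carriers = rng.random(N) < p               # who carries trait Tj
        k = int(carriers.sum())
        if k == 0:
            continue
        share = carriers[in_arm_a].sum() / k       # fraction of carriers in arm A
        if max(share, 1 - share) >= 0.9:           # nearly all in one arm
            print(f"T{j}: {k} carriers, {share:.0%} in arm A")

When anything prints at all, it's a trait with only a handful of carriers; a trait with 100 carriers lands 80/20 or worse with probability on the order of 10^-9, so even thousands of traits don't produce a common lopsided one. Whether a handful of carriers can move the outcome is the effect-size question the reply below raises.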


With a large sample size, it's astronomically unlikely for there to be a confounding trait that is important to the outcome and is also widespread in only one of the experimental arms. If you were to write a simulation showing the issue, I might be able to explain more specifically why I think the simulation doesn't reflect reality.
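The "astronomically unlikely" claim can be made quantitative. For a trait with k carriers, the chance that one arm of a fair split gets at least 80% of them falls off exponentially in k (a sketch; the 80% cut-off is an arbitrary choice for illustration):

    from math import comb

    def p_lopsided(k):
        # P(one arm gets >= 80% of the k carriers) under a fair 50/50 split,
        # using the binomial approximation to the exact hypergeometric.
        cut = -(-4 * k // 5)  # smallest integer >= 0.8 * k
        tail = sum(comb(k, i) for i in range(cut, k + 1)) / 2 ** k
        return min(1.0, 2 * tail)  # doubled: either arm could be the heavy one

    for k in (20, 100, 500):
        print(k, p_lopsided(k))
    # roughly: 20 -> 1e-2, 100 -> 1e-9, 500 -> 1e-43

So even a union bound over thousands of candidate traits leaves the probability negligible once a trait is common enough to plausibly drive the outcome.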



