Yeah, good point. I tried running an experiment: 1 female and 9 males, each assigned a random number between 1 and 100. Then, looking only at the cases where the female is in the top 2, would we expect her to be better than the male in the other top-2 spot? My head says no, but testing it in code I end up with a bias of around 51-52%. And if I make it 1 female and 99 males it's even greater, at ~64%.
I suspect you have an issue in the way you select the top 2 when there are several elements with the same value.
I tried an implementation with the values being integers between 1 and 100, and I found stats close enough to yours (~51% for 10 elements, ~64% for 100 elements).
When using floating point or enforcing distinct integer values, I get 50%.
My probs & stats classes are a long way behind me, but I guess it makes sense that the more elements you have, the higher the probability of collisions. And then, if you naively just take the first 2 elements and the female candidate is one of those, it's more likely to be because her value is the highest and distinct. Is that a sampling bias or a selection bias? I don't remember...
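Here is a minimal sketch of one way ties can produce this kind of skew. It is not the code from the experiment above; it assumes the female sits at index 0 and that the top 2 come from a stable descending sort, so equal scores keep her ahead of the males. The names female_first_rate, int_scores and float_scores are made up for illustration.

```python
import random

def female_first_rate(n, draw, trials=40_000, seed=0):
    """Among trials where the female (index 0) lands in the top 2,
    return the fraction in which she is ranked first."""
    rng = random.Random(seed)
    in_top2 = ranked_first = 0
    for _ in range(trials):
        scores = draw(rng, n)
        # ASSUMPTION: top 2 = first two entries of a stable descending sort.
        # sorted() stays stable with reverse=True, so among equal scores the
        # lower index (the female) comes first, i.e. ties are broken in her favor.
        order = sorted(range(n), key=lambda i: scores[i], reverse=True)
        if 0 in order[:2]:
            in_top2 += 1
            ranked_first += order[0] == 0
    return ranked_first / in_top2

def int_scores(rng, n):
    # Integers 1..100: collisions are common, especially with 100 participants.
    return rng.choices(range(1, 101), k=n)

def float_scores(rng, n):
    # Floats in [0, 1): collisions are practically impossible.
    return [rng.random() for _ in range(n)]

for n in (10, 100):
    print(n, female_first_rate(n, int_scores), female_first_rate(n, float_scores))
```

Under those assumptions, the integer runs should come out above 50% (more so for 100 participants), while the float runs should hover around 50%.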
You're correct! When using floats (i.e. with a much lower chance of collisions than with a hundred possible values and a hundred participants) it's practically unbiased. Thanks for exploring this with me, a fun little exercise.
Maybe my code is buggy.