Yeah, good point. I tried to make an experiment: 1 female, 9 males, assign a ran...

asksomeoneelse · 2025-05-20T15:20:37 1747754437

I suspect you have an issue in the way you select the top 2 when there are several elements with the same value.

I tried an implementation with the values being integers between 1 and 100, and I found stats close enough to yours (~51% for 10 elements, ~64% for 100 elements).

When using floating point or enforcing distinct integer values, I get 50%.

My probability and statistics classes are a long way behind me, but I guess it makes sense that the more elements you have, the higher the probability of collisions. And then, if you naively just take the first 2 elements and the female candidate is one of those, it becomes more likely that she is there because her value is the highest and distinct. Is that a sampling bias or a selection bias? I don't remember...
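For anyone who wants to reproduce this, here is a minimal sketch of the kind of simulation I mean. It is not the original code, and several things are my own assumptions: the name `conditional_first_rate`, that the tracked candidate sits at index 0 of the list, that the "naive" selection is a stable descending sort (so she wins every tie), and that the statistic is how often she ranks first given that she is in the top 2.

```python
import random


def conditional_first_rate(n, draw, trials=50_000, seed=0):
    """Estimate P(tracked candidate ranks 1st | she is in the top 2).

    Assumptions (not from the thread): the tracked candidate is index 0,
    and ties are broken by a stable descending sort, so the earliest
    index wins a tie. `draw` is a hypothetical callback that takes a
    random.Random instance and returns one value.
    """
    rng = random.Random(seed)
    in_top2 = 0
    ranked_first = 0
    for _ in range(trials):
        values = [draw(rng) for _ in range(n)]
        # sorted() is stable even with reverse=True, so among equal
        # values the lower index (the tracked candidate) comes first.
        order = sorted(range(n), key=lambda i: values[i], reverse=True)
        if 0 in order[:2]:
            in_top2 += 1
            if order[0] == 0:
                ranked_first += 1
    return ranked_first / in_top2


def int_1_to_100(rng):
    # Coarse values: collisions become likely as n grows.
    return rng.randint(1, 100)


def uniform_float(rng):
    # Floats: collisions are practically impossible.
    return rng.random()


if __name__ == "__main__":
    for n in (10, 100):
        print(n, "ints  ", round(conditional_first_rate(n, int_1_to_100), 3))
        print(n, "floats", round(conditional_first_rate(n, uniform_float), 3))
```

With this tie-breaking, the integer runs should drift above 50% as the number of elements grows, while the float runs stay close to 50%. If your selection breaks ties differently (for example against the tracked candidate), the direction of the bias can change.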

matsemann · 2025-05-21T07:21:12 1747812072

You're correct! When using floats (i.e. with a much lower chance of collisions than a hundred possible values spread over a hundred participants), it's practically unbiased. Thanks for exploring this with me, a fun little exercise.