An oversimplification of his idea would be to say that given assumptions about the causal independence of variables, it becomes clear which way you should group the data. Although it's not always possible to make these independence assumptions, much of the time they are obvious and uncontroversial.
Taking the Berkeley gender bias case as an example: We know it's possible that biological gender influences which department graduates apply to, but that it's impossible that the department a graduate applies to influences their biological gender. This fact alone tells us that we need to look at the data by department rather than in aggregate, resolving the paradox.
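To make the reversal concrete, here is a rough sketch with made-up numbers (these are not the real Berkeley figures, and the department names and helper functions are just for illustration). Women have the higher acceptance rate in each department, yet the lower rate once the departments are pooled, because they mostly apply to the more selective one.

```python
# Hypothetical counts, NOT the actual Berkeley data:
# department -> sex -> (applicants, admitted)
data = {
    "A": {"men": (800, 480), "women": (100, 70)},   # less selective department
    "B": {"men": (200, 40), "women": (900, 225)},   # more selective department
}


def rate(applied, admitted):
    """Acceptance rate as a fraction of applicants."""
    return admitted / applied


def by_department(data):
    """Acceptance rate for each sex within each department."""
    return {
        dept: {sex: rate(*counts) for sex, counts in groups.items()}
        for dept, groups in data.items()
    }


def aggregate(data):
    """Acceptance rate for each sex after pooling all departments."""
    totals = {}
    for groups in data.values():
        for sex, (applied, admitted) in groups.items():
            prev_applied, prev_admitted = totals.get(sex, (0, 0))
            totals[sex] = (prev_applied + applied, prev_admitted + admitted)
    return {sex: rate(*counts) for sex, counts in totals.items()}


dept_rates = by_department(data)
agg_rates = aggregate(data)

print("by department:", dept_rates)
print("aggregate:", agg_rates)
```

Nothing in the arithmetic says which of the two views is the right one to read causally. That choice comes from the assumption that sex can influence department choice but not the other way around.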
Well, it certainly makes more sense to look at the by-department data rather than the aggregate data. But I would say that the large disparities in which departments men and women apply to are already enough to refute the idea that male and female applicants are pretty much the same, so it would be pretty reckless to infer any sex discrimination from the difference in acceptance rates.
And Pearl's point is that the "real world" logic you just applied is both important and not statistical: you need to augment your analysis with these causal assumptions in order to translate probabilities into meaningful causal statements.
I've read a lot of his papers and I own Causality, but unfortunately it ended up at my parents' place and I always forget to grab it when I'm there. I'd really like to get further into the book.
But this is a great, fast example of why his work is important.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34....