Without wishing to denigrate the achievements of all the people involved in this...

ykler · on Nov 18, 2017

I don't know if you can really put a p-value on this result without a more specific null hypothesis, but in any case this tournament result looks like extremely weak evidence that Stockfish is better than Houdini. In the round-robin component, Stockfish and Houdini played two games against each other, each winning one. In the "superfinal", there were 15 draws, 3 Stockfish wins, and 2 Houdini wins.
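
To make "extremely weak" concrete, here is a rough sketch of one way to attach a number to it. It assumes a particular null hypothesis that the comment itself does not specify: draws are ignored, and each decisive game is an independent coin flip that either engine is equally likely to win (a simple sign test). The function name `sign_test_p_value` is made up for this illustration, and other null hypotheses would give other numbers.

```python
from math import comb


def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Two-sided sign-test p-value for decisive games only.

    Assumed null hypothesis (not from the original comment): draws are
    discarded and each decisive game is a fair, independent coin flip.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no decisive games, so no evidence either way
    k = max(wins_a, wins_b)
    # Probability that one side scores at least k of the n decisive games.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * upper_tail)


# Superfinal only: Stockfish 3 wins, Houdini 2 wins (15 draws ignored).
print(sign_test_p_value(3, 2))  # 1.0

# Superfinal plus the two round-robin games (one win each): 4 vs 3.
print(sign_test_p_value(4, 3))  # 1.0
```

Under this assumed null, a 3-2 split (or 4-3 with the round-robin games included) is about as unsurprising as a result can be: the one-sided tail probability is 0.5 in both cases, so the two-sided p-value caps out at 1.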