Let's say I flip a million different fair coins 20 times each. Then I analyze my findings and see that coin #54 came up heads all 20 times. I present my results to my peers saying, "Coin #54 is defective! The chances of this happening by chance are one in a million!" But flipping that many coins, I'd expect about one of them to come up heads every time. It just happened to be coin #54.
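To put numbers on that, here's a quick Python sketch (mine, not part of the original scenario) that simulates the setup and counts how many coins come up all heads purely by chance:

    import random

    NUM_COINS = 1_000_000
    FLIPS_PER_COIN = 20

    # Count coins whose every flip lands heads, even though all coins are fair.
    all_heads = sum(
        all(random.random() < 0.5 for _ in range(FLIPS_PER_COIN))
        for _ in range(NUM_COINS)
    )

    # Expected count is NUM_COINS * (1/2)**FLIPS_PER_COIN ~= 0.95, so seeing
    # one such coin is unremarkable -- it just happens to get a label like #54.
    print("coins that came up all heads:", all_heads)
    print("expected by chance:", NUM_COINS * 0.5 ** FLIPS_PER_COIN)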
The problem is that our statistical analysis no longer matches our experiment when we go looking for interesting findings and then naively test hypotheses based on those very findings.
Two-up ("Come in, spinner!") is a coin-flipping gambling game where people bet on whether a punter will come up double-heads or double-tails. The house makes it's money when neither come up five times in a row (with some variations).
Would using Bayesian statistics fix that? If the prior for a coin being defective and producing only heads is more than one in a million, then coin #54 is probably defective. If it's less than that, then your analysis doesn't conclude it's defective anyway.
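A minimal sketch of that update (my own illustration, treating "defective" as "always lands heads"):

    def posterior_defective(prior, n_heads=20):
        # Likelihood of 20/20 heads: 1 for an always-heads coin, (1/2)**20 for a fair one.
        like_defective = 1.0
        like_fair = 0.5 ** n_heads
        return prior * like_defective / (prior * like_defective + (1 - prior) * like_fair)

    print(posterior_defective(1e-5))  # prior above 1 in 2**20: posterior ~= 0.91, probably defective
    print(posterior_defective(1e-8))  # prior well below that: posterior ~= 0.01, probably just luck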
Bayes can be hacked by omitting info, but not by things like this, I believe.
It's not a Bayesian versus NHST thing; it's just about doing the right test. You can do this kind of thing easily enough in an NHST kind of way. But yeah, I think doing it in a Bayesian manner is much easier to interpret and explain, and so requires less training, which is a HUGE boon for the crisis at hand.
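For instance, the "right test" here isn't "what are the odds coin #54 alone throws 20 heads" but "what are the odds that at least one of a million fair coins does". A rough sketch of that correction (my own, not from the thread):

    p_single = 0.5 ** 20                     # per-coin p-value, roughly 1 in a million
    p_any = 1 - (1 - p_single) ** 1_000_000  # chance some coin goes 20/20 by luck
    print(p_single)  # ~= 9.5e-07
    print(p_any)     # ~= 0.61 -- not surprising once you account for all million coins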
You can do the test with NHST only if you know what the researcher had in mind when experimenting. That can lead to absurd results at times, such as your example. With Bayes, as long as no lies are told, your inference doesn't depend on the researcher's private thoughts.
You only need to know how the experiment was performed, same as you do for Bayes. If they present the whole dataset, nothing is changed. If they cherry-pick and only show the data for the coin that happened to be all heads, saying that was their whole experiment, no amount of stats will help you, Bayesian or otherwise.
No, if the coins are independent, the Bayesian is not fooled even when seeing only that coin. The argument was in my post above.
Bayesian inference depends only on the data, and the flips of the other coins don't make a difference if they're independent. Frequentist testing can depend on things like stopping rules and which hypothesis was tested, which aren't correlated with the actual truth and therefore should have zero effect on inference.
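A standard textbook illustration of the stopping-rule point (not from this thread, just a sketch): the same data of 9 heads and 3 tails gives different p-values depending on whether the plan was "flip 12 times" or "flip until the 3rd tail", while the Bayesian posterior is identical in both cases.

    from math import comb

    # Design A: flip exactly 12 times. One-sided p-value = P(>= 9 heads | fair coin).
    p_binomial = sum(comb(12, k) for k in range(9, 13)) / 2 ** 12

    # Design B: flip until the 3rd tail. Same observed data, but now the
    # p-value is P(>= 9 heads before the 3rd tail | fair coin).
    p_negbinom = 1 - sum(comb(k + 2, 2) / 2 ** (k + 3) for k in range(9))

    print(p_binomial)  # ~= 0.073
    print(p_negbinom)  # ~= 0.033 -- "significant" under one design but not the other

    # Either way the likelihood is proportional to theta**9 * (1 - theta)**3,
    # so with a Beta(1, 1) prior the posterior is Beta(10, 4) under both designs.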
They can cherry pick and only show some flips of that coin, but then they really need to be outright lying or you'll ask why only some flips were reported.