Bioinformatician here. Nobody has intuition or domain knowledge on all ~20,000 protein-coding genes in the human genome; that's just not a thing. When we routinely compare what a treatment does, we really do get ~20,000 p-values. Feed those into FDR correction, filter at p < 0.01, and now I have maybe 200 genes. Then we can start applying domain knowledge.

If you try to apply domain knowledge at the beginning, you're actually going to artificially constrain what is biologically possible. Your domain knowledge might say there's no reason an olfactory gene should be involved in cancer, so I'll exclude those (etc. etc.). You would be instantly wrong: people discovered that macrophages (which play a large role in cancer) can express olfactory receptors. So when olfactory receptors came up in a recent analysis, the p-values were onto something and I had to expand my domain knowledge to understand them. This is very common. I ask for validation of targets in tissue --> then you see proof borne out that the p-value thresholding business WORKS.
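To make that workflow concrete, here is a minimal Python sketch of one common way to do it, using statsmodels' Benjamini-Hochberg ("fdr_bh") adjustment. The array of p-values, the uniform placeholder data, and the choice to apply the 0.01 cut to the adjusted p-values are all my illustrative assumptions, not details from the comment above.

```python
# Sketch: per-gene p-values -> FDR correction -> shortlist of candidate genes.
# `pvals` is a stand-in for ~20,000 p-values from a differential test.
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
pvals = rng.uniform(size=20_000)  # placeholder: one p-value per gene

# Benjamini-Hochberg adjustment controls the false discovery rate across all
# 20,000 tests; `reject` flags genes whose BH-adjusted p-value is below 0.01.
reject, pvals_adj, _, _ = multipletests(pvals, alpha=0.01, method="fdr_bh")

candidate_genes = np.flatnonzero(reject)  # indices of genes passing the cut
print(f"{candidate_genes.size} genes pass FDR < 0.01")
# (With uniform placeholder p-values almost nothing passes; with real data,
# the survivors are the ~200 genes you then examine with domain knowledge.)
```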
> You know that in your research field p < 0.01 has importance.
A p-value does not measure "importance" (or relevance), and its meaning does not depend on the research field or on domain knowledge: it mostly just depends on the effect size and the number of replicates (and, in this case, because multiple-comparison correction is needed for effective FDR control, on the number of things you are testing).
If you take any fixed effect size (no matter how small/non-important or large/important, as long as it is nonzero), you can make the p-value arbitrarily small just by taking a sufficiently large number of samples (i.e., replicates). Thus, the p-value does not measure the importance of an effect; it (roughly) measures whether you have enough information to confidently claim that the effect is not exactly zero.
Example: suppose a drug reduces people's body weight by 0.00001% (clearly an irrelevant/non-important effect, according to my domain knowledge of "people's expectations when they take a weight-loss drug"). Still, if you collect enough samples (i.e., weigh enough people who took the drug and enough who took a placebo, before and after), you can, mathematically speaking, get a p-value as low as you want: 0.05, 0.01, 0.001, etc. So the p-value clearly can't be measuring the importance of the effect, if you can make it arbitrarily low just by taking more measurements while the effect size/importance stays fixed.
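A small simulation (my own illustration; the numbers are invented and much larger than 0.00001%, but still practically irrelevant) shows the mechanism: the average effect stays fixed while the sample size grows, and the two-sample t-test p-value shrinks.

```python
# Illustrative only: a fixed, negligible weight-loss effect becomes
# "statistically significant" once the sample is large enough.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_effect_kg = 0.05  # invented: 50 g average extra weight loss on the drug
sd_kg = 10.0           # invented: person-to-person variability in weight change

for n in (1_000, 100_000, 10_000_000):
    drug    = rng.normal(-true_effect_kg, sd_kg, size=n)  # change on drug
    placebo = rng.normal(0.0, sd_kg, size=n)              # change on placebo
    t_stat, p = stats.ttest_ind(drug, placebo)
    print(f"n per arm = {n:>10,}   p = {p:.3g}")

# The effect never changes, but with enough participants the p-value drops
# below any threshold you care to name -- so it can't be measuring importance.
```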
What is research-field (or domain-knowledge) dependent is the "relevance" of the effect, i.e., the effect size, and that is what people should be focusing on anyway ("how big is the effect and how certain am I about its scale?"), rather than p-values, which are statements about a hypothetical universe in which we assume the null to be true.
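One hedged way to act on that is to report the estimated effect together with an interval around it, so both the scale and the uncertainty are visible. A sketch continuing the invented weight-loss numbers above:

```python
# Report the estimated effect and a 95% confidence interval alongside the
# p-value. Same invented numbers as the simulation above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 10_000_000
drug    = rng.normal(-0.05, 10.0, size=n)
placebo = rng.normal( 0.00, 10.0, size=n)

diff = drug.mean() - placebo.mean()                       # estimated effect (kg)
se = np.sqrt(drug.var(ddof=1) / n + placebo.var(ddof=1) / n)
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se      # normal-approx. 95% CI
_, p = stats.ttest_ind(drug, placebo)

print(f"effect: {diff:.3f} kg (95% CI {ci_low:.3f} to {ci_high:.3f}), p = {p:.2g}")
# The p-value is minuscule, yet the interval shows the effect is a few tens of
# grams: precisely estimated, and still irrelevant for a weight-loss drug.
```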