Many claims made by researchers fail to replicate when tested rigorously. In short, these claims can be wrong. Students (and citizens) need to understand how incorrect claims can arise in research through questionable research practices. P-hacking is asking many questions of a data set and making a claim whenever any p-value falls below 0.05. HARKing (hypothesizing after the results are known) is constructing a claim/narrative after looking at the data set. Statistical experts know about p-hacking and HARKing, but they appear to be largely silent on these issues. Some researchers also know, yet they ignore the problem. We present a hands-on demonstration, rolling ten-sided dice multiple times, to show how incorrect claims come about. Several individuals carried out simulations of p-values with ten-sided dice, showing how easily a small p-value can arise by chance for a modest number of questions (rolls of the dice). Notably, small p-values found by one individual were not replicated by other individuals. These simple simulations allow students (and citizens) to better judge the reliability of a science claim when multiple questions are asked of a data set.
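The core idea can also be illustrated in a few lines of code. The sketch below is a hypothetical computer analogue of the dice exercise (not the authors' exact protocol): each "question" asked of a pure-noise data set yields a p-value that is uniform on (0, 1), which is what rolling fair ten-sided dice to generate random digits mimics. The function names and the particular question counts are illustrative assumptions.

```python
import random

def one_analyst(n_questions, rng=random):
    """Simulate one person asking n_questions of pure noise; return the smallest p-value."""
    return min(rng.random() for _ in range(n_questions))

def chance_of_false_claim(n_questions, n_analysts=10_000, alpha=0.05):
    """Fraction of simulated analysts who find at least one p-value below alpha by chance alone."""
    hits = sum(one_analyst(n_questions) < alpha for _ in range(n_analysts))
    return hits / n_analysts

if __name__ == "__main__":
    # With more questions, a "significant" result by chance becomes increasingly likely,
    # roughly 1 - 0.95**k for k independent questions under the null.
    for k in (1, 5, 10, 20, 50):
        print(f"{k:3d} questions -> ~{chance_of_false_claim(k):.2f} chance of a 'significant' result")
```

Running this shows, for example, that with 20 questions of a null data set an analyst finds at least one p-value below 0.05 roughly two-thirds of the time, consistent with the point of the dice demonstration.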
Version 1 (2022-01-21)
Stan Young, Warren Kindzierski and Terry Meyer (2022). Understanding p-hacking and HARKing. Researchers.One. https://researchers.one/articles/22.01.00007v1
Ryan Martin, February 17th, 2022 at 12:30 am
Thanks for the contribution! This is a really nice illustration of challenges facing data-driven science. There's lots of data out there, and we have access to powerful computational and statistical tools to store and analyze that data; but if those statistical methods are used incorrectly -- knowingly or not -- then what we end up with is useless at best and misleading at worst. The authors' presentation, where the numerical results are obtained by a mechanism so clearly unrelated to the medical condition and causal factors, makes their main point much more transparent than if their data were mysteriously simulated and analyzed on a computer.
My only small complaint concerns the authors' statement in the abstract that statisticians "appear to be largely silent" on issues of p-hacking, etc. I don't think that's true -- among others, there has been at least one (maybe two?) recent special issue of The American Statistician dedicated to papers proposing to help get a handle on these problems, and at least one report from an American Statistical Association-appointed task force charged with coming up with practical recommendations for statistical inference. I'd agree that there's been no resolution, but not because statisticians are silent.
Ryan