The Costs of HARKing: Does it Matter if Researchers Engage in Undisclosed Hypothesizing After the Results are Known?
While no-one’s looking, a Texas sharpshooter fires his gun at a barn wall, walks up to his bullet holes, and paints targets around them. When his friends arrive, he points at the targets and claims he’s a good shot.
In 1998, Norbert Kerr discussed an analogous situation in which researchers engage in undisclosed hypothesizing after the results are known, or HARKing. In this case, researchers conduct statistical tests, observe their research results (bullet holes), and then construct post hoc hypotheses (paint targets) to fit these results. In their research reports, they then pretend that their post hoc hypotheses are actually a priori hypotheses. This questionable research practice is thought to have contributed to the replication crisis in science, and it provides part of the rationale for researchers to publicly preregister their hypotheses before they conduct their research. In a recent BJPS article, I discuss the concept of HARKing from a philosophical standpoint and argue against the view that it is problematic for scientific progress.
I begin my article by noting that scientists do not make absolute, dichotomous judgements about theories and hypotheses being “true” or “false.” Instead, they make relative judgements about theories and hypotheses being more or less true than other theories and hypotheses in accounting for certain phenomena. These judgements can be described as estimates of relative verisimilitude (Cevolani & Festa, 2018).
I then note that a HARKer is obliged to provide a theoretical rationale for their secretly post hoc hypothesis in the Introduction section of their research report. Despite being secretly post hoc, this theoretical rationale provides a result-independent basis for an initial estimate of the relative verisimilitude of the HARKed hypothesis. (The rationale is “result-independent” because it doesn’t formally refer to the current result. If it did, then the rationale’s post hoc status would no longer be a secret!) The current result can then provide a second, epistemically independent basis for adjusting this initial estimate of verisimilitude upwards or downwards (for a similar view, see Lewandowsky, 2019; Oberauer & Lewandowsky, 2019). Hence, readers can estimate the relative verisimilitude of a HARKed hypothesis (a) without taking the current result into account and (b) after taking the current result into account, even if they have been misled about when the researcher deduced the hypothesis. Consequently, readers can undertake a valid updating of the estimated relative verisimilitude of a hypothesis even though, unbeknownst to them, it has been HARKed. Importantly, there’s no “double-counting” (Mayo, 2008), “circular reasoning” (Nosek et al., 2018), or violation of the use-novelty principle here (Worrall, 1985, 2014), because the current result has not been used in the formal theoretical rationale for the HARKed hypothesis. Consequently, it’s legitimate to use the current result to change (increase or decrease) the initial estimate of the relative verisimilitude of that hypothesis.
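This two-step updating can be sketched with a toy Bayesian calculation. To be clear, the article frames these judgements in terms of relative verisimilitude rather than posterior probability, so the Bayesian framing and all numbers below are my own illustrative assumptions: a result-independent initial estimate is fixed first, and the current result then shifts it up or down.

```python
# Toy sketch of result-independent estimation followed by updating.
# The Bayesian framing and all numbers are illustrative assumptions;
# the article itself speaks of relative verisimilitude, not probability.

def update(prior, p_result_given_h, p_result_given_not_h):
    """Bayes' rule: revise a prior estimate in light of the observed result."""
    numerator = p_result_given_h * prior
    return numerator / (numerator + p_result_given_not_h * (1 - prior))

# Step 1: initial estimate based only on the theoretical rationale in the
# Introduction (result-independent by construction, even if secretly post hoc).
prior_h = 0.5  # assumed: the rationale makes H moderately plausible

# Step 2: the current result provides an epistemically independent basis
# for adjusting that initial estimate upwards or downwards.
p_h = update(prior_h, p_result_given_h=0.8, p_result_given_not_h=0.2)
print(round(p_h, 2))  # prints 0.8: the result raises support for H
```

The point of the sketch is that the validity of Step 2 depends only on the prior being computed without reference to the current result, not on when the researcher actually wrote the hypothesis down.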
To translate this reasoning to the Texas sharpshooter analogy, it’s necessary to distinguish HARKing from p-hacking. If our sharpshooter painted a new target around his stray bullet hole but retained his substantive claim that he’s “a good shot,” then he’d be similar to a researcher who conducted multiple statistical tests and then selectively reported only those results that supported their original a priori substantive hypothesis. Frequentist researchers would call this researcher a “p-hacker” rather than a HARKer (Rubin, 2017b, p. 325; Simmons et al., 2011). To be a HARKer, researchers must also change their original a priori hypothesis or create a totally new one. Hence, a more appropriate analogy is to consider a sharpshooter who changes both their statistical hypothesis (i.e., paints a new target around their stray bullet hole) and their broader substantive hypothesis (their claim). Let’s call her Jane!
Jane initially believes “I’m a good shot” (H1). However, after missing the target that she was aiming for (T1), she secretly paints a new target (T2) around her bullet hole and declares to her friends: “I’m a good shot, but I can’t adjust for windy conditions. I aimed at T1, but there was a 30 mph easterly cross-wind. So, I knew I’d probably hit T2 instead.” In this case, Jane has generated a new, post hoc hypothesis (H2) and passed it off as an a priori hypothesis. Note that, unlike our original Texas sharpshooter, Jane isn’t being deceptive about her procedure here (i.e., what she actually did): It’s true that she aimed her gun at T1. She’s only being deceptive about the a priori status of H2, which she secretly developed after she missed T1 (i.e., she’s HARKing). Importantly, however, Jane’s deception doesn’t prevent her friends from making a valid initial estimate of the verisimilitude of her HARKed hypothesis and then updating this estimate based on the location of her bullet hole:
“We know that Jane’s always trained indoors. So, it makes sense that she hasn’t learned to adjust for windy conditions. We also know that (a) Jane was aiming at T1, and (b) there was a 30 mph easterly cross-wind. Our calculations show that, if someone was a good shot, and they were aiming at T1, but they didn’t adjust for an easterly 30 mph cross-wind, then their bullet would hit T2’s location. So, our initial estimated verisimilitude for H2 is relatively high. The evidence shows that Jane’s bullet did, in fact, hit T2. Consequently, we can tentatively increase our support for H2: Jane appears to be a good shot who can’t adjust for windy conditions. Of course, we’d also want to test H2 again by asking Jane to hit targets on both windy and non-windy days!”
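The friends’ calculation can be sketched as simple arithmetic. The range, bullet speed, and the constant-drift wind model below are all illustrative assumptions (real ballistic drift also depends on drag), not details from the analogy itself:

```python
# Toy cross-wind calculation for the sharpshooter analogy.
# Range, muzzle velocity, and the naive full-drift model are assumptions.

MPH_TO_MS = 0.44704

range_m = 100.0          # assumed distance to the barn wall, metres
muzzle_velocity = 800.0  # assumed bullet speed, m/s
wind_mph = 30.0          # the easterly cross-wind from the story

time_of_flight = range_m / muzzle_velocity  # 0.125 s
wind_ms = wind_mph * MPH_TO_MS              # ~13.41 m/s
drift_m = wind_ms * time_of_flight          # naive model: wind fully carries the bullet

print(f"predicted drift: {drift_m:.2f} m")  # prints "predicted drift: 1.68 m"
```

If the offset between T1 and T2 roughly matches this predicted drift, the friends’ result-independent estimate of H2’s verisimilitude is high, and the bullet hole at T2 then provides independent grounds for increasing it.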
For further information, please see:
For more of my work in this area, please see: https://sites.google.com/site/markrubinsocialpsychresearch/replication-crisis