The following comments are offered with no intent of finality or authority, but rather to (1) improve the accuracy of portrayals of what actually happened in the famous LHC Higgs experiments, and (2) point to questions the statistical set-up raised for those who use or defend use of P-values (such as myself) and those who use or defend NHST (not myself). I do not believe the realities of the experiments and the science are anywhere near as clear-cut about this as some philosophers have made them sound.
That portrayal problem is characteristic of the misleadingly oversimplified descriptions of real scientific activities and results I see in much (not all) of the philosophy of science literature, especially in heroic accounts of bold conjectures and experiments in which the latter turn out to be a lot muddier than presented (perhaps ‘cargo-cult philosophy of science’ would be a suitable label for that practice). The 1919 Eddington eclipse expedition test of general relativity is an example that seems obvious by today’s standards (see “Trust in expert testimony: Eddington’s 1919 eclipse expedition and the British response to general relativity”, ScienceDirect); from what I’ve read, the LHC experiments provide a subtler, more instructive illustration for the present controversies.
As I suspect no one here is a professional particle physicist or even close, I warn that my comments may contain inaccuracies about the LHC physics. But hopefully they correct some worse inaccuracies relevant to the present discussion. For a more extensive discussion of the statistical issues in the Higgs/LHC experiments by a physicist, see, e.g., the article by Cousins (a UCLA member of the CMS group) in the special issue of Synthese (Vol. 194, No. 2, February 2017; available on JSTOR).
Part 1 - corrections and refinements. Some philosophers and popular accounts have (as above) technically misrepresented the Higgs/LHC experiments in ways which obscure some interesting issues. First, unfortunately, the experiments were not completely independent: while they used different detectors, both of necessity had to use the LHC as their generating equipment. Any undetected defect in that equipment, or in the calibration and programming of its subsidiary components, capable of reliably producing a non-Higgs signal at ~125 GeV would have affected both experiments. Having independent teams and detectors did ensure that many (perhaps all the worrisome) sources of dependence were absent, but it did not exclude every imaginable possibility. That they announced discovery anyway reflects their confidence that no such non-Higgs artefact was present; but that is an auxiliary assumption.
Second, the results were not predicted to exquisite precision; they were read off to rather modest precision (modest, again, by the standards of Standard Model particle physics, which has achieved predictive accuracy beyond a dozen significant digits at other points). The measurements have been refined by repetition since the landmark 2012 results, but are still only at about 0.1% accuracy (which is unheard-of for “soft” sciences but not so dramatic in particle physics).
Those are just the measurements, though; from what I read, the predictions were not even that accurate. Before the 2012 results there was quite a bit of uncertainty about the mass of the Higgs particle, as the standard-model predictions themselves depended on measurements whose uncertainties had to be propagated. This was one reason for two independent detectors: to get mass readings out of the LHC by two physically distinct means (ATLAS and CMS).
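To make the propagation point concrete, here is a minimal sketch of first-order (Gaussian) error propagation for a product of two independent inputs. The function and numbers are purely hypothetical illustrations, not the actual standard-model calculation:

```python
import math

# Toy first-order error propagation for f(x, y) = x * y, standing in for how
# uncertainties in input measurements feed through to a predicted quantity.
def propagate_product(x, sx, y, sy):
    f = x * y
    # For a product of independent inputs, relative uncertainties add in quadrature.
    sf = abs(f) * math.sqrt((sx / x) ** 2 + (sy / y) ** 2)
    return f, sf

# Hypothetical inputs: x = 10.0 +/- 0.1, y = 2.0 +/- 0.04.
f, sf = propagate_product(10.0, 0.1, 2.0, 0.04)  # f = 20.0, sf ≈ 0.447
```

The same quadrature logic extends (via partial derivatives) to any smooth function of several uncertain inputs, which is the generic reason a prediction inherits uncertainty from the measurements it depends on.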
You can read about the realities at their own website: “CMS measures Higgs boson’s mass with unprecedented precision.”
Third, as a very small but essential detail for reading their results accurately: their statistical test results were presented as sigmas (Z-scores) corresponding to 1-tailed P-values. This is a mere transform, but the results were at times misreported in popular accounts as if the sigmas or subsidiary P-values came directly from a 2-sided test (as commonly done in “soft” sciences).
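The sigma-to-P transform can be sketched in a few lines (a minimal illustration using scipy's standard-normal functions; the numbers are generic thresholds, not the experiments' actual values):

```python
from scipy.stats import norm

# One-tailed P-value corresponding to a given number of sigmas (Z-score):
# the upper-tail probability of a standard normal.
def sigma_to_p_one_tailed(z):
    return norm.sf(z)  # survival function = 1 - CDF

# The 5-sigma "discovery" convention corresponds to a one-tailed P of ~2.9e-7.
p_one = sigma_to_p_one_tailed(5.0)

# Misreading the same sigma as two-sided doubles the P-value.
p_two = 2 * norm.sf(5.0)

# Going the other way: a one-tailed P of 0.05 is ~1.64 sigma,
# while the familiar two-sided P of 0.05 is ~1.96 sigma.
z_one_tailed = norm.isf(0.05)   # ~1.645
z_two_sided = norm.isf(0.025)   # ~1.960
```

The point is only that sigmas and P-values are interchangeable under a stated tail convention, so reporting a 1-tailed sigma as if it were 2-sided misstates the evidence by a factor of two in P.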
Part 2 - questions (not independent of part 1): The hypothesis H the experiments statistically tested is often presented as “Standard Model without Higgs”, with a remark that apparently no one believed this H could be correct. That raises the question of why this H was tested, and especially whether its use represented anything other than the habit of testing a null. One mundane answer is that H was conveniently precise, whereas the alternative was fuzzy or full of uncertainty (as is often the case in “soft” sciences). Was the null H, then, just a heuristic default?
To go deeper into the issue of alternative H and their detection, it’s been said we need to ask about the counterfactuals (which were potential outcomes before the experiments). Suppose “nothing had been found”, as happened with some other theoretical particles subjected to LHC detection efforts (detection failures which for some reason were not played up in the popular press, but fortunately were published in physics journals). We’d then have to ask what that “failure” meant, which spans an infinite variety of possibilities: for example, two highly compatible measurements but at only 3 sigma; two vastly incompatible measurements, one at 2 sigma and the other at 5 sigma; or both within a sigma or two of H. Among the possibilities I’d seen mentioned for the “most null” potential outcomes were that the Higgs existed but lay beyond the detection limits of the experiments, and that the Higgs did not exist after all and the Standard Model (despite its successes) simply could not explain why certain particles have mass, just as it cannot explain some other phenomena (like neutrino oscillation and baryon asymmetry). My bet is that, had they found the experiments in conflict, they would have done well to examine the P-value function (or sigma graph) for the difference between the experiments to aid in screening explanations for the difference, along with of course searching for mechanical problems in one or both experiments.
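A crude version of that screening step can be sketched as follows. This is my own toy illustration, not the experiments' procedure, and the mass values and uncertainties below are made up for the example:

```python
import math

# Sigma of the discrepancy between two independent measurements m1 +/- s1
# and m2 +/- s2: under independence, the variance of the difference is the
# sum of the variances.
def difference_sigma(m1, s1, m2, s2):
    return abs(m1 - m2) / math.sqrt(s1 ** 2 + s2 ** 2)

# Hypothetical mass readings in GeV from two detectors (toy numbers only).
z = difference_sigma(125.3, 0.4, 125.9, 0.4)  # ~1.06 sigma: quite compatible
```

Scanning this quantity (or the corresponding P-value) over candidate true values is essentially what a P-value function for the difference would display, and a large difference-sigma would flag that at least one experiment needs mechanical or modeling scrutiny before any combined claim.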