I have discussed the need for statisticians to ensure that the measurements to which they apply statistics are reproducible and based on evidence rather than opinion.
This is what happens when they do not do that and simply dutifully do the math using the measurements they are given.
Now everyone believes that Obstructive Sleep Apnea (OSA) is morbid. But OSA is quantified by a set of thresholds guessed in the 1970s and 80s. This is the “gold standard” Apnea Hypopnea Index (AHI).
Essentially this is a measurement based on adding up threshold events of various types and durations. A bucket of apples and oranges which have one thing in common…they are all at least 10 seconds long.
Here’s the calculation for an OSA RCT.
Count the number of airflow attenuations lasting at least 10 seconds per hour of sleep. 5-15 is mild, 15-30 moderate, greater than 30 severe.
Sounds fairly easy to guess, and it was. It works clinically because it is highly sensitive for OSA, but it doesn’t work as a gold standard because it correlates poorly with the actual severity of the pathophysiologic perturbations that occur during OSA. For example, a 10 second partial breath hold counts the same as a 2 minute breath hold with an oxygen desaturation into the 50s.
Yet 35+ years of RCTs. 35+ years of compliant statisticians dutifully applying the math. Running the LR. Adjusting for who knows what. All a waste.
You see, you can’t guess a set of threshold criteria for measuring a disease. You can’t guess a duration like “10 seconds”. That is the number of fingers on the guesser’s hands, but that’s not a clue.
Worse, it’s hard to quantify complex pathophysiology by addition. That worked in ancient Egypt for coins (if they are of the same type) but not for human pathophysiology.
I discovered in 1997, while studying OSA, that the gold standard method of counting 10 second events was guessed. After a retrospective study I funded at OSU failed, I reviewed the raw data and could see that the AHI (which I had thought was a solid gold standard) was capricious, so I decided to seek its origin. The internet was not available, so a medical librarian and I tracked down the source to Western J of Med 1976… Amazing: the gold standard was guessed without data.
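To make the arithmetic concrete, here is a minimal sketch in Python. Only the 10 second threshold and the 5/15/30 cutoffs come from the description above. The two patients, the field names, the function names, the "below mild threshold" label and the choice to put exactly 15 in the moderate bin are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Event:
    duration_s: float   # length of the airflow attenuation in seconds
    nadir_spo2: float   # lowest oxygen saturation during the event (%), illustrative only

MIN_DURATION_S = 10  # the 10 second threshold described in the text

def ahi(events, hours_of_sleep):
    """Qualifying events per hour of sleep; every qualifying event counts as 1."""
    counted = sum(1 for e in events if e.duration_s >= MIN_DURATION_S)
    return counted / hours_of_sleep

def severity(index):
    """Map an index to the categories in the text (boundary handling is an assumption)."""
    if index < 5:
        return "below mild threshold"
    if index < 15:
        return "mild"
    if index <= 30:
        return "moderate"
    return "severe"

# Two hypothetical patients, each with 140 events over 7 hours of sleep.
patient_a = [Event(duration_s=10, nadir_spo2=94) for _ in range(140)]   # brief, shallow events
patient_b = [Event(duration_s=120, nadir_spo2=55) for _ in range(140)]  # 2 minute events, deep desaturation

for name, events in (("A", patient_a), ("B", patient_b)):
    index = ahi(events, hours_of_sleep=7)
    seconds_in_events = sum(e.duration_s for e in events)
    print(name, index, severity(index), seconds_in_events)
```

Both hypothetical patients come out with an AHI of 20 and the same "moderate" label, even though patient B spends twelve times as long in events and desaturates into the 50s. The duration and depth of each event never enter the count.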
I thought the discovery that the gold standard for OSA was guessed was a revelation. No one cared. The RCTs flowed.
I said to the scientists, “You will not be able to quantify the disease by using a guess. Eventually you will be unable to prove your own disease (OSA) is morbid.” No one cared. Yet, indeed, that is what has now happened.
Now what I predicted has proven true. Yet statisticians could have prevented this waste by simply asking for the origin of the gold standard (AHI) and for evidence of its reproducibility and variability. As the AHRQ article shows, the AHI is poorly reproducible and varies widely depending on the definition of the hypopnea.
OSA science was, perhaps, the original threshold science. “Threshold Science” is a pathological science which emerged in the 1970s and uses guessed threshold sets as gold standards in RCTs.
OSA science is the prototypic threshold science. As Langmuir described, pathological science looks real. The stats are sound, but there is a flaw in the scientific method which occurred in the past and no one knows it. (Here, the flaw was the guessed origin of the gold standard.) Often there are reinforcing temporary positive results due to threshold interactions. Yet the science ultimately fails due to intractable nonreproducibility. It may take decades. With OSA it has taken nearly 4 decades.
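A rough sketch of how the definition of the hypopnea can move the index, written in Python. The night of data, the function name and the two desaturation cutoffs are hypothetical, chosen only to show the mechanism. They are not taken from the AHRQ article or from any scoring manual.

```python
def ahi_under_definition(desat_drops_pct, hours_of_sleep, min_desat_pct):
    """Count candidate events whose oxygen desaturation meets the chosen cutoff, per hour."""
    counted = sum(1 for drop in desat_drops_pct if drop >= min_desat_pct)
    return counted / hours_of_sleep

# Hypothetical night: 8 hours of sleep, 184 candidate events, all at least 10 seconds long.
# Each number is the size of the oxygen desaturation (in percentage points) tied to one event.
night = [2.0] * 40 + [3.0] * 80 + [5.0] * 64
HOURS = 8

# Same patient, same night, two hypothetical definitions of what counts as a hypopnea.
ahi_loose = ahi_under_definition(night, HOURS, min_desat_pct=3.0)
ahi_strict = ahi_under_definition(night, HOURS, min_desat_pct=4.0)
print(ahi_loose, ahi_strict)
```

With these made-up numbers the same night scores 18 under the looser cutoff and 8 under the stricter one, which moves the patient from the 15-30 band to the 5-15 band without any change in the underlying physiology.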
Of course, OSA IS morbid. All pulmonary docs know that. In addition CPAP is the best treatment. Why can’t we prove that?
The reason is that OSA was conflated with the guessed AHI. The AHI is a highly variable guessed measurement which is NOT morbid.
Now what role did the statisticians play? Well, they applied LR and adjustments to a guessed set of thresholds. That does not make statistical sense. It’s hard to blame them. It’s not their job to investigate gold standards, but maybe that’s the lesson here. Perhaps, going forward, some due diligence relevant to the gold standard should be a general requirement before agreeing to do the math. Then you are not questioning the investigator or the science, just following the rules.
In medicine we now have checklists to ensure we do not miss an important care component.
Perhaps statisticians should have a checklist.
Gold standard based on evidence - check
Gold standard is reproducible - check