Don’t miss discussion of “Pathological Consensus” on July 14th
I’ve been contemplating the next step.
The thought leaders now see that the past pathologic consensus methodology has failed, but I don’t think they know why. They seem to think it is garden-variety (but very severe) heterogeneity. They are going to abandon the methodology, but it won’t be done quickly. Many will not get the word. It could take 3-5 years.
I think what we need is an “RCT Methodology Recall”. Analogous to an automobile recall, an RCT methodology recall is promulgated widely. This failed #PathologicalScience methodology was widely promulgated and mandated, so the recall has to be widely promulgated lest many continue to do the failed research (drive the unsafe car).
It’s the 21st century.
Let us call for the first RCT methodology recall in the history of woman.
What do you think?
“The primary thing that stands between knowing the right thing and doing the right thing, is fear.”
The video provides an approximately 19-minute talk about RCT methodology and a 30-year era of failure in critical care RCTs. The great news is that this can be fixed promptly.
We have a great opportunity for change. I hope many will spend the time to listen and help. Please provide feedback either here or in DM. Let’s get this done…
I’ve watched the first part and, based on that, can very much recommend the video.
Thank you for posting the video. I think that the issue of selecting subjects for RCTs also has implications for how diagnostic criteria are currently selected by consensus and the resulting problem of over-diagnosis. What are your thoughts about how diagnostic and RCT entry criteria should be established in a systematic and ‘evidence-based’ way?
Thank you again for posting the video. I completely agree with you that arriving at diagnostic and treatment criteria in a non-evidence-based consensus manner has caused all sorts of problems. This also means that the foundations of EBM are undermined. Another of the unfortunate consequences is over-diagnosis and over-treatment. The following abstract is based on a talk I gave at a workshop in Oxford in 2017 entitled ‘The scope and conventions of evidence-based medicine need to be widened to deal with “too much medicine”’ [1].
Abstract: In order that evidence‐based medicine can prevent “too much medicine”, it has to provide evidence in support of “gold standard” findings for use as diagnostic criteria, on which the assessment of other diagnostic tests and the outcomes of randomized controlled trials depend. When the results of such gold standard tests are numerical, cut‐off points have to be positioned, also based on evidence, to identify those in whom offering a treatment can be justified. Such a diagnosis depends on eliminating conditions that mimic the one to be treated. The distributions of the candidate gold standard test results in those with and without the required outcome of treatment are then used with Bayes rule to create curves that show the probabilities of the outcome with and without treatment. It is these curves that are used to identify a cut‐off point for offering a treatment to a patient and also to inform the patient’s decision to accept or reject the suggested treatment. This decision is arrived at by balancing the probabilities of beneficial outcomes against the probabilities of harmful outcomes and other costs. The approach is illustrated with data from a randomized controlled trial on treating diabetic albuminuria with an angiotensin receptor blocker to prevent the development of the surrogate end‐point of “biochemical nephropathy”. The same approach can be applied to non-surrogate outcomes such as death, disability, quality of life, relief of symptoms, and their prevention. Those with treatment‐justifying diagnoses such as “diabetic albuminuria” usually form part of a broader group such as “type 2 diabetes mellitus”. Any of these can be made the subject of evidence‐based differential diagnostic strategies.
In the Oxford Handbook of Clinical Diagnosis, I also suggest a way forward for medical students, medical scientists, and doctors interested in a research career who need to work closely with statisticians. The maths has been improved and updated from that in the above paper in the final chapter of the 4th edition (which I am finalising at present), some of which has been described in my recent posts here on DataMethods [see links 2, 3 and 4 below]. I would be grateful for your views about how this corresponds with your vision.
References
1. Llewelyn H. The scope and conventions of evidence-based medicine need to be widened to deal with “too much medicine”. J Eval Clin Pract 2018; 24:1026-32. doi:10.1111/jep.12981. PMID: 29998473.
2. Should one derive risk difference from the odds ratio? - #340 by HuwLlewelyn
3. The Higgs Boson and the relationship between P values and the probability of replication
4. The role of conditional dependence (and independence) in differential diagnosis
This is very well said and presents another problem with “Pathological Consensus”: the amplification of disease, and therefore of the treated population (“too much medicine”), and therefore of expense and unnecessary side effects. The recognition of the excess expense should be motivation alone for reforming the process of deriving consensus definitions.
That’s a great question, and to answer it let us first look at sleep apnea. Sleep apnea, also called the sleep apnea hypopnea syndrome (SAHS), comprises at least six different types of arrhythmia of breathing. Yet all are diagnosed and quantified by a simple sum of the apneas and hypopneas (10-second complete or partial breath holds). This 10-second rule was guessed for apneas in 1975, and hypopneas were added in the 1980s.
The original cutoff for the diagnosis was 30 per night, and this was later changed to 10 per hour of sleep. Both the use of only 10 seconds and the addition of only partial breath holds were inflationary. However, the consensus group decided that 10/hour was too high and reduced the threshold to 5/hour. This was profoundly inflationary, rendering a massive portion of the population “diseased” and greatly decreasing the signal-to-noise ratio.
The need to reduce to 5/hour was due to the fact that apneas occur in paroxysms (rapidly cycling clusters), so severe paroxysms of very long apneas could be present but fail to contain enough events to meet the 10/hour average rule. Here is an image of a paroxysm as detected by high-resolution pulse oximetry.
This paroxysm has 17 apneas of 1-2 minutes’ duration in this image. However, 17 ten-second “baby apneas” would render the same result, since all the guessed gold-standard AHI method does is count, as if these were simply coins of the same denomination… This is what I discovered in 1998, when I identified severe paroxysms with long apneas in patients with fewer than the cutoff average of 10 apneas per hour. This is what caused me to examine the origin of the consensus, and I found it was all guessed.
While changing the cutoff to 5 ten-second events per hour captured most of the cases of severe but brief paroxysms, it also captured a massive population of patients.
So, looking at this process, we see the mistake. The time series data were not analyzed to identify the different types of sleep apnea, and the measurement of severity by counting threshold 10-second events (5-15 mild, 16-30 moderate, greater than 30 severe) was simply guessed and was not a valid measurement of the pathophysiologic perturbations.
So here you can see just one of the problems with the guess. A patient with 35 ten-second apneas per hour is “severe”, while a patient with 10 two-minute apneas per hour is “mild”. Simple counting renders profound errors in measurement of the complex diseases captured in the SET (the simple calculation below makes the point).
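To make the arithmetic concrete, here is a small illustrative calculation (hypothetical patients, numbers chosen only to match the example above) comparing the event count the AHI reports with the total apnea time it ignores:

```python
# Illustrative only: two hypothetical patients graded oppositely by the event-count AHI.
events_a, duration_a = 35, 10    # Patient A: 35 ten-second apneas per hour -> "severe" by AHI
events_b, duration_b = 10, 120   # Patient B: 10 two-minute apneas per hour -> "mild" by AHI

apnea_min_a = events_a * duration_a / 60   # total apnea time, minutes per hour
apnea_min_b = events_b * duration_b / 60

print(f"Patient A: AHI {events_a}, apnea time {apnea_min_a:.1f} min/hr")   # ~5.8 min/hr
print(f"Patient B: AHI {events_b}, apnea time {apnea_min_b:.1f} min/hr")   # 20.0 min/hr
```

The patient labeled “mild” by the count carries more than three times the apnea burden of the patient labeled “severe”.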
So we see where they made the mistake. They bypassed the extensive research required to understand the diseases. Instead they guessed a set of criteria which effectively captured a SET of diseases and also captured very mild states of dubious significance. They then proceeded to perform 35 years of RCTs using the SET. The massive research required to sort all of this out was never done. They moved directly to RCTs with the guessed criteria and, upon identifying cases which were missed by the criteria, simply expanded the criteria.
This is the same process which was followed with sepsis in 1992. There was a desire to move directly to RCTs of treatment. When the RCTs were not reproducible they would simply amend the consensus. The sepsis consensus was amended three times over the past 30 years, with the amendment in 2012 being highly inflationary. Protocols were built and are still applied using the guessed criteria even though the RCTs were negative. Pathological Consensus is a cancer on EBM and has increased skepticism of EBM among clinicians.
So your point in 2018 was exactly what was (and is) needed. This will require reform as participation in Consensus has replaced discovery as the best means to increase an academic’s citation value.
However, we have to get past the talking stage. Something has to be done. We need real reform, and the thought leaders show some evidence they are questioning their dogma. It’s time to teach them. In fact, most were taught this “Pathological Consensus” technique, and they need help from statisticians focused on measurement to facilitate reform of the science…
Stipulating that improper “lumping” of patients with very different underlying pathologies for their sleep apnea might have contributed to many decades of non-positive RCT results, the next step would be to ask how researchers back in the 1970s and 80s should have proceeded.
It sounds like you’re saying that patients with the most profound and prolonged desaturations might be different, in important ways, from patients with frequent but shorter-duration and less severe apneas (?) I’m assuming that patients with central contributors to their apneas (e.g. brainstem CVA) were not lumped with patients with obstructive causes for their apneas for the purpose of conducting previous RCTs (?)…If they were lumped together, did this make sense physiologically?
Would it be reasonable to assume that the underlying distribution of prognoses for a group of patients with central apneas could be importantly different from the distribution of prognoses for a group of patients without evidence of central apneas (?) So would the next step have been to separate these two groups of patients and follow them over time to characterize their prognoses (i.e., their untreated clinical event rates, e.g., CVA/MI/death)? If you agree, the next question to ask is whether that type of additional study would have been considered ethical at the time, given the demonstrated ability of CPAP to prevent apneas (at least the obstructive ones). Might ethical concerns around not offering treatment to already-diagnosed patients have precluded the types of “natural history” studies that could otherwise have informed and refined the inclusion criteria for future RCTs (i.e., RCTs designed to show “higher-level” benefits, e.g., improved survival/decreased CVA rates)?
So maybe sleep apnea is a bit of a special case, clinically speaking, when it comes to our ability to study it using “gold standard” methods like RCTs (?) The fact that treating OSA can produce benefits that manifest quickly (no more snoring, happier spouse, better blood pressure, weight loss, improved daily function) versus only after a prolonged time, perhaps worked against it in the long run. There’s clearly a spectrum of severity for this condition. And as is true for most medical conditions, we are most likely to be able to identify the intrinsic efficacy of a proposed treatment if we enrol patients with the worst prognoses- accruing more outcome events of interest tends to make the efficacy of the treatment, if present, easier to detect. Yet with sleep apnea, once we have diagnosed it, our hands are somewhat tied, ethically speaking. We can’t really leave these patients untreated for extended periods of time, in order to fully characterize their prognoses in the absence of treatment. A clinical catch-22…
I should have been more clear on this question.
Since the need is so critical, the best way to approach this is with a proposed Action Plan. This action plan relates to research where billions of dollars, countless careers and opportunities for discovery are being wasted. This does not apply to clinical medicine where the use of synthetic syndromes may be useful for quality improvement, flexible protocols, education and administrative use.
First Action
Recall the mandated use of pathologic consensus now.
“Administrators have replaced scientists, guessing has replaced research, and consensus has replaced discovery.”
- This is restraining perhaps thousands of individuals or teams of researchers across the globe from discovering the actual conditions which have been improperly defined by pathologic consensus. We need to unleash these researchers. The mistake was to bridle and standardize their work using the same guessed (and wrong) measurements. The research of my team and of so many others has been bridled (controlled) by administrative decree of the standard guessed, erroneous measurements of Pathological Consensus. This has to end now.
Research Step 1: In biological science, meticulous and comprehensive collection and analysis of relational time series data pertaining to the phenomena under study is required before any research can proceed. This is the step which was bypassed previously and replaced with pathological consensus. These retrospective data are presently available as digital EMR and/or polysomnographic data.
Research Step 2: Separate the diseases by identifying sentinel (unique) relational time series patterns in the data. Diseases may also be separated by clinical context, such as a blood culture positive for a specific organism or a specific event such as perforated bowel or urinary tract infection. (A minimal sketch of Steps 1 and 2 follows the list of steps.)
Research Step 3: Identify diagnostic metrics (which may be the sentinel patterns).
Research Step 4: Identify severity metrics, which may be the diagnostic metrics or the sentinel patterns. At least one morbidity and/or mortality should be a function of the severity metrics.
Research Step 5: Identify time series patterns which are present across multiple diseases and characterize the trajectories. These may be indicative of common pathways (treatment targets).
Research Step 6: Study the morbidity/mortality associated with sentinel patterns and seek surrogate metrics and biomarkers for all new diseases.
Research Step 7: Initiate RCTs.
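To make Research Steps 1 and 2 concrete, here is a minimal, hedged sketch under assumed inputs: a 1 Hz SpO2 trace, with a desaturation threshold of 90% chosen purely for illustration (not a proposed standard). It segments candidate events and describes each by duration, nadir, and recovery, so that patterns can be compared rather than merely counted.

```python
import numpy as np

def segment_desaturation_events(spo2, threshold=90.0, fs=1.0, recovery_window_s=30):
    """Segment runs below the threshold; describe each by duration, nadir, and recovery peak."""
    spo2 = np.asarray(spo2, dtype=float)
    below = spo2 < threshold
    events, start = [], None
    for i, flag in enumerate(below):
        if flag and start is None:
            start = i                                            # event begins
        elif not flag and start is not None:
            post = spo2[i:i + int(recovery_window_s * fs)]       # window after the event
            events.append({
                "duration_s": (i - start) / fs,
                "nadir": float(spo2[start:i].min()),
                "recovery_peak": float(post.max()),
            })
            start = None                                         # event ends
    return events

# Step 2 (sketch): group events by morphology rather than counting them. Long events or
# events with poor recovery are candidate arousal-failure / recovery-failure patterns.
def flag_candidate_patterns(events, long_s=60, poor_recovery=92.0):
    return [e for e in events if e["duration_s"] >= long_s or e["recovery_peak"] < poor_recovery]
```

The thresholds here (90%, 60 s, 92%) are placeholders for discussion; the point is only that events are characterized by shape rather than tallied against a single guessed cutoff.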
I look forward to further discussion.
Again, the first action, a “recall of pathological consensus”, is required straight away.
The identification of sentinel time series patterns and their conversion into a statistically definable metric is fundamental to this effort. The former falls within my area of expertise, but the latter requires expert statistician input.
Let’s revisit the figure I produced earlier.
The first portion of the figure shows 11 rapidly cycling apneas of about 45 seconds’ duration with complete recovery between each apnea. This time series pattern is induced by typical reentry cycling (see cycling image below). The less regular pattern of the second portion of the figure is due to arousal failure, which distorts the arousal-dependent recovery cycling (see the later pattern of the image), and the patient’s SpO2 falls to near-death levels. Note the fall past the former stable arousal threshold (stable for the first 11 apneas) and the severe fall to a very low nadir. Note the incomplete recovery despite severely low and life-threatening arterial oxygen (SpO2) values.
So there are three candidate sentinel patterns in the above figure, and therefore we would hypothesize that three diseases are present: one involving upper airway instability, another arousal failure, and a third recovery failure. The last two could be one disease, or an adverse drug reaction.
The next question is:
Are these sentinel patterns, and, if they are, what is the best means to determine the incidence, the clinical relationships, and the morbidity causally associated with each of these three sentinel patterns, and to quantify each of them for statistical analysis? This is hard work.
Here you see the AHI does not detect that the three patterns are different, since long apneas are simply counted the same as shorter apneas and the recovery pattern is not considered in the AHI. So a doctor referring a suspected case for polysomnography would learn the AHI but not whether arousal failure or recovery failure is present.
This is a tough problem requiring expert statistical guidance in collaboration with experts in time series analysis and clinical medicine. The quantification of the severity of time series patterns and their conversion into a metric (surrogate) which can be processed statistically is complex work. This is why the default is to simply guess thresholds and process them. But you can see a few components of arousal failure and of recovery failure (a sketch of how these might be computed follows the list):
- the duration of the apnea,
- the nadir of the SpO2,
- the area above the SpO2 curve,
- the magnitude and peak ratio of fall to recovery,
- the peak value of the recovery,
- the global pattern of the paroxysm itself, including the duration of recovery between each apnea, regularity, and onset pattern.
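As a hedged illustration, here is how those components might be turned into candidate per-event metrics. The inputs (a 1 Hz SpO2 segment for one apnea/recovery cycle and a pre-event baseline) and the feature names are assumptions for discussion, not proposed standards.

```python
import numpy as np

def event_features(spo2_segment, baseline, fs=1.0):
    """Candidate metrics for one apnea/recovery cycle (sketch only)."""
    s = np.asarray(spo2_segment, dtype=float)
    nadir = float(s.min())
    recovery_peak = float(s[int(s.argmin()):].max())
    return {
        "duration_s": len(s) / fs,                                        # duration of the apnea
        "nadir": nadir,                                                   # nadir of the SpO2
        "desat_area": float(np.trapz(np.clip(baseline - s, 0, None), dx=1.0 / fs)),  # area above the SpO2 curve
        "fall_magnitude": baseline - nadir,                               # magnitude of the fall
        "fall_to_recovery_ratio": (baseline - nadir) / max(recovery_peak - nadir, 1e-9),
        "recovery_peak": recovery_peak,                                   # peak value of the recovery
        "recovery_deficit": baseline - recovery_peak,                     # incomplete recovery marker
    }

# Paroxysm-level features (inter-apnea recovery time, regularity, onset pattern) would be
# computed across a sequence of such events rather than within a single one.
```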
The point is that this (not superficial analysis that “lumps” similar-looking diseases into SETs and then guesses criteria for the SETs) is the process for identifying metrics of disease and then investigating the disease.
The same process would apply to the diseases which comprise sepsis. In all these cases genomic, proteomic, and other data, including time series data, can be incorporated. It’s a massive effort, but each lab can contribute to the discovery of new sentinel patterns and metrics, or massive trials can be funded for this purpose. The point is that the preliminary requisite work was not done; they jumped right to (and still jump to) the derivation of synthetic syndromes from 20th-century guessed criteria. Pathological Consensus is NOT an alternative to doing the work.
We asked for a grant to do this work for the synthetic syndrome “sepsis” and were declined because the view was that the work was not necessary, as the criteria were already established (by the consensus group). The public deserves better for its funds than indefensible, oversimplified Pathological Consensus: centrally mandated erroneous measurements, the Pathological Science of Langmuir.
Let us recall Pathological Consensus now and unleash the young researchers (the next generation) to get this work done.
Thank you. I am sorry for the delay in responding. Do I understand correctly that the current differential diagnosis for the symptoms suggestive of obstructive sleep apnoea or hypopnoea is:
- No obstructive sleep apnoea or hypopnoea (Evidence: AHI <5 events per hour)
- Obstructive sleep apnoea or hypopnoea: (Evidence: AHI >4 events per hour)
- Other possible diagnoses
Are you suggesting that there should be a more detailed differential diagnosis (with the addition of the sentences in italics) to avoid missing Arousal Failure with recovery and Arousal and Recovery Failure?:
- No sleep apnoea or hypopnoea (Evidence: AHI <5 events per hour but no prolonged apnoea pattern of arousal failure)
- Sleep apnoea or hypopnoea with: (A) Repetitive Reduction in Airflow / RRA (Evidence: AHI >4 events per hour without prolonged apnoea pattern of arousal failure) or (B) Arousal failure with recovery (Evidence: >0 events of prolonged apnoea per hour) or (C) Arousal and recovery failure (Evidence: >0 events of very prolonged apnoea per hour)
- Other possible diagnoses
Some points:
- The treatment for RRA is an oral device or CPAP, weight reduction, etc. Should the treatment for Arousal Failure with recovery and Arousal and Recovery Failure be expected to be the same as for RRA?
- Do we know from observational studies how prevalent Arousal Failure with recovery and Arousal and Recovery Failure is in patients with symptoms suggestive of obstructive sleep apnoea or hypopnoea when the AHI is <5 events per hour and the AHI is > 4 events per hour? Is it possible to suspect Arousal Failure with recovery and Arousal and Recovery Failure clinically (e.g. with additional evidence of neurological dysfunction)?
- In order to establish the threshold for AHI where treatment for RRA provides a probability of benefit, I would do a study to estimate the probability of symptom resolution in a fixed time interval at different AHI values with no treatment or sham treatment (presumably a zero probability at all AHI values on no treatment) and on treatment. This might be done by fitting a logistic regression function or some other model to the data on no treatment (when the curve might be zero for all values of AHI) and on treatment (when the probability of symptom resolution should rise as the AHI rises). Treatment should then be considered where the latter curve appears to rise above zero. This rise above the control curve may well happen at an AHI of 5 events per hour, or above or below 5 events per hour (e.g. 3 events per hour). This would be an approach to setting a threshold based on evidence (as opposed to consensus guesswork). (A minimal sketch of this fitting approach appears after these points.)
- The above would apply to ‘RRA Obstructive Sleep Apnoea / Hypopnoea’. However, for Obstructive Sleep Apnoea / Hypopnoea with Arousal Failure with recovery and Arousal and Recovery Failure the curve might be different, with perhaps a clear probability of benefit at any AHI > 0. Note therefore that there may be a number of different AHI treatment indication thresholds created by a study of this kind. The symptoms alone might provide criteria for a diagnosis of ‘Clinical Obstructive Sleep Apnoea / Hypopnoea’, but for a ‘physiological’ diagnosis there may be three different criteria for (i) RRA, (ii) Arousal Failure, and (iii) Arousal and Recovery Failure. Each of these would also be sufficient to diagnose ‘Physiological Obstructive Sleep Apnoea / Hypopnoea’ (i.e. each might be a ‘sufficient’ criterion for the diagnosis) as well as prompting the doctor to offer treatment options. However, the probability of benefit from each treatment, derived from the logistic regression curve with the AHI as a measure of disease severity, and the adverse effects of treatment would have to be discussed with the patient during shared decision making.
- Although I consider Sleep Apnoea / Hypopnoea in my differential diagnoses in internal medicine and endocrinology and have some understanding of its investigation and management, I have never personally conducted Polysomnography or personally treated patients with CPAP etc., so please correct any misunderstandings. However, based on my work of trying to improve diagnostic and treatment indication criteria in endocrinology, the above is how I would approach the problem for Sleep Apnoea / Hypopnoea. I agree with you that this is a problem that needs close collaboration between clinicians and statisticians. The advice of statisticians such as @f2harrell or @stephen or someone similar in your area would be essential. I think that this type of work to improve diagnosis and treatment selection criteria is a huge growth area for future close collaboration between clinicians and statisticians. I am trying to encourage students and young doctors (and their teachers) to do this in the Oxford Handbook of Clinical Diagnosis, especially in the forthcoming 4th edition.
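To make the logistic-regression idea in my third point above more concrete, here is a rough sketch under assumed data: one row per trial participant with columns ahi, treated (0/1), and resolved (0/1 for symptom resolution within the fixed interval). The column names, the interaction model, and the margin rule are illustrative assumptions, not a prescription.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def fit_resolution_curves(df):
    """Fit P(symptom resolution) as a function of AHI, separately by treatment arm."""
    # An interaction term lets the AHI effect differ between the treated and untreated arms.
    model = smf.logit("resolved ~ ahi * treated", data=df).fit(disp=False)
    grid = np.linspace(df["ahi"].min(), df["ahi"].max(), 200)
    p_treated = np.asarray(model.predict(pd.DataFrame({"ahi": grid, "treated": 1})))
    p_control = np.asarray(model.predict(pd.DataFrame({"ahi": grid, "treated": 0})))
    return grid, p_treated, p_control

def candidate_threshold(grid, p_treated, p_control, margin=0.05):
    """Lowest AHI at which the treated curve exceeds the control curve by a chosen margin."""
    above = np.flatnonzero(p_treated - p_control >= margin)
    return float(grid[above[0]]) if above.size else None
```

The margin, the model form, and whether a simple logistic curve is adequate are exactly the questions that would need a statistician; the point is only that the treatment-indication threshold emerges from fitted curves rather than from consensus guesswork.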
I appreciate that your response is primarily designed to show a clinical/statistical approach to disentangling different diseases that have been, historically, lumped under one syndrome umbrella (e.g., “sleep apnea”). This is certainly very valuable. But for this particular condition, I’m not sure how much further ahead this disentangling would put us going forward.
I sense that sleep doctors are frustrated by systematic reviews which seem to call into question the value of treating sleep apnea. It costs a lot for the machinery needed to treat this condition properly. And I imagine that any time a payer can point to a list of “non-positive” trials as justification to withhold coverage for treatment, they are tempted to do so. This is presumably why the consequences of non-positive RCTs done historically in this field have been so frustrating for sleep physicians.
Lawrence suspects (I think) that one reason why RCTs of sleep apnea have not been able, historically, to show that treating apneas reduces a patient’s risk of death or cardiovascular outcomes, is that patients enrolled in previous trials should never have been lumped together in the first place. Using an arbitrarily-defined AHI cutoff, people with very mild underlying disease(s) causing their apneas were likely lumped together with patients with more severe/prognostically worse underlying disease(s), an approach to trial inclusion that was destined to produce “noisy” results (and therefore non-positive RCTs).
But even if sleep researchers wanted to rectify this problem going forward, knowing what we know today about the pitfalls of using syndromes as trial inclusion criteria, I really doubt that we’d have the ethical equipoise to do so (?)
Once we know that an existing treatment makes people feel and function better (and perhaps decreases the risk of motor vehicle accidents substantially…), it becomes ethically indefensible to leave them untreated for periods of time long enough to show benefits with regard to less common clinical outcomes (e.g. death, cardiovascular events, car accidents). This would be like denying people with chronic pain their analgesics for 5 years in order to show that those with untreated pain have a higher risk of suicide- unacceptable.
For the field of sleep medicine, this seems like a real dilemma (?) We have a strong clinical suspicion that if we were to design RCTs more rigorously, using patients with more homogeneous underlying pathology and worse (untreated) prognoses, we would be able to show significant benefits of treating their apneas with regard to “higher level” endpoints (mortality/cardiovascular events). However, because the treatment works so well for shorter-term patient-level endpoints, we are ethically prevented from designing the longer-term trials needed to show these higher-level benefits…
Thank you @ESMD. There are at least two issues here. The one that I tried to address is how to set, for a particular test and its results, thresholds for probabilities of benefit that make it worthwhile offering a treatment such as CPAP to a patient. This will vary from treatment to treatment and for different target outcomes (e.g. reduction of daytime sleepiness, snoring, alarm to the spouse because of episodes of apnoea, etc.). I would have thought that treating this alone should be justification for the use of CPAP. If, as @llynn suggests, the diagnosis might be missed and CPAP not provided because of failure to detect dangerous episodes that occur less frequently than 5 episodes per hour, then this should be worth correcting. This could be tested with an RCT in the three types of Obstructive Sleep Apnoea / Hypopnoea as I suggested.
If, in addition, there is a risk from Obstructive Sleep Apnoea / Hypopnoea of cardiac and vascular complications down the line (as suggested already by observational studies) and a possibility that CPAP might reduce this risk, then this is an extra bonus at no extra cost, whether it is true or not. However, as you say, it is not possible to test this with an RCT by randomising patients to CPAP or no CPAP. It might, though, be possible to use other techniques, by following up a cohort of those with an AHI below a threshold and not treated with CPAP and also a cohort of people with an AHI above a threshold and given CPAP.
On a related note, the idea of using a more comprehensive outcome scale that credits for reduction in, e.g., daytime sleepiness but also gives credit for any mortality reduction seems to be in order.
Sorry for the delay, and thanks for all the thoughtful comments and suggestions. In my response I will present some fundamental concepts for the broader audience, so please forgive the basic nature of a portion of this discussion. Here we are using “sleep apnea” as the prototypic set of diseases defined as a synthetic syndrome under the 1980s-2022 era of pathologic consensus, but the fundamental considerations for the timely extrication of this pitfall from critical care science apply broadly.
I agree. There are multiple pathologies associated with the sleep apnea hypopnea syndrome (SAHS). Obstructive SAHS (OSAHS) is a subset of the syndrome for which CPAP is indicated. The separation of central and obstructive sleep apnea using the AHI is not possible, but the AHI is used for both of them, and there is considerable overlap, particularly because the low respiratory drive associated with central sleep apnea often causes the upper airway to collapse, producing mixed central and obstructive apneas.
Here you can see the first aspect of the problem: disease overlap when the diseases are defined by a common criteria set with only weakly objective terms (such as at least 50% of events having an obstructive component) separating them.
Using a new diagnostic paradigm (that I am proposing here), the various sleep apnea diseases are considered pulmonary arrhythmias and are defined and quantified by the time series components of the arrhythmia. These are then studied to identify pathology correlates, looking for those providing severity functions. Considering the example of arousal failure: this is a pulmonary arrhythmia with a sentinel pattern, and its incidence could be determined by retrospective review of archived time series. Arousal failure has been theorized as the cause of the not uncommon “opioid-associated unexpected hospital death”, so this is important.
Yes, as noted above, just as we do for cardiac arrhythmias. Interestingly, in the 1980s counting premature ventricular contractions was once a way of quantifying a cardiac arrhythmia. The cutoff for treatment was 5 per minute. Studies showed excess death in the treatment group, so the counting approach was abandoned. So the issue would simply be: does the patient have the sentinel pattern of arousal failure or not, the AHI being irrelevant? This is not much different from your approach. It just eliminates all of the known problems of repeatability of the AHI from lab to lab in the same patient.
For all the reasons above, it is pivotal to start fresh, as if the AHI did not exist, and think about how to define each disease in the “syndrome”. This is appropriate since the AHI was guessed and based on the number of fingers on one’s hand (10 seconds for an apnea or hypopnea), and the original cutoffs of 10, 20, and 30 (which have been changed to 5, 15, and 30 to increase sensitivity) were capricious and based on the metric system, not human pathophysiology. The selection of cutoffs of 5, 10, 20, or 100 was common 1980s Pathological Consensus.
The decision to diagnose all of the sleep apnea diseases AND severity-index them by the same simple sum of 10-second airflow cessations or attenuations was the apical error which rendered all the subsequent pathologic consensus. So, in my view, the first thing to do is abandon all anchoring bias toward the original guess and look at the time series patterns themselves to define the diseases and severity indices. Here we have the thought experiment “what if the AHI did not exist?” First we would identify the highest diagnostic measurement type achievable for each target pathology.
Diagnostic measurements in descending order of statistical relevance (the order of 2 and 3 is debatable):
1. Specific measurement which varies as a function of the severity of the target pathology
2. Non-specific measurement which varies with the severity of the target pathology
3. Specific measurement rendering only a true/false state (e.g., PCR testing)
4. Non-specific measurement which does not vary with the severity of the target pathology
The guessed measurements used in 1980-2022 pathological consensus (the AHI, SIRS, and Sepsis III) are of type 4 (AKI is type 2). Type 4 measurements may not “rise to the statistical level” if the measurements are set too sensitive (e.g., an AHI of 5), as this will produce a profound signal-to-noise problem not mitigated by severity, since severity is not a function of a type 4 measurement such as the AHI.
If there are no type 1 measurements, then a type 2 measurement must be determined by formal research. So, starting as if the AHI did not exist, we would examine the time series patterns for severity correlation, first seeking specific patterns not present in health. Again, separating the health state from the disease state by quantifying excess events which otherwise occur in the health state (e.g., a 10-second apnea) is particularly problematic, so we seek sentinel time series patterns which do not occur in health. This is the approach taken by analogous cardiac arrhythmia diagnostics.
Yes. It is likely that the severity of excessive daytime sleepiness (EDS) will be a function of different pathological time series patterns than, for example, hypertension or opioid-associated sleep apnea death due to arousal failure.
Arousal is generally required to rescue the patient if the airway is obstructed during sleep. So a treatment which prevents obstruction may “treat” arousal failure because the arousal would not be required. This is exactly what we must learn. I recommend CPAP postoperatively for OSA patients with prolonged apneas during sleep and low SpO2 nadirs (arousal failure) precisely because we do not know the efficacy of either approach, but CPAP is considered more effective…
No. One theory of arousal failure is that it is due in some cases to plasticity of the arousal response, just as one eventually is not aroused from sleep by a nearby train track… If this is true, then RRAs eventually induce arousal failure and are therefore more likely to become severe themselves. This is an unknown area.
Pretest probability is unknown. “At risk” patients include patients receiving opioids, which increase the arousal threshold, and those with central injury or genetic conditions such as Arnold-Chiari malformation and congenital or acquired diseases of hypoventilation. However, some of these are often occult. At the present time arousal failure is generally occult and only identified by screening overnight oximetry…
Absolutely, and not just for OSAHS but for all the guessed “synthetic syndromes” presently defined by the 1980s technique of pathological consensus. This entire problem of measurement (which is profound) needs to be formally addressed by the top statisticians. Only then will we begin to move to the next paradigm.
Linking the pathological time series patterns (severity indexing of RRA, arousal failure, and recovery failure, for example) to specific pathologies (EDS, hypertension) would seem to be a second step, after identification of measurements for all the pathological time series patterns…
Finally, is there a place for the old guessed measurements like the AHI? I have to say no. There might be value in counting apneas, but counting threshold 10-second apneas has proven to be completely inadequate and probably would add nothing to more robust counting measures which consider duration and RRA patterns.
The new future of clinical measurement must begin with an understanding that the apical mistake was to guess, by consensus, a type 4 diagnostic measurement and to base 35 years of research on this guess as a measurement standard mandated by grant and publication gatekeepers. A lesson about the harm of central gatekeepers and the failure to seek the required measurement oversight by mathematicians (statisticians) should also not be missed.
It’s very difficult for researchers in the field to do anything about this. A formal mathematical presentation of this problem in a paper by respected statisticians is required to release the researchers from the 1980s pathological consensus paradigm.
The premature ventricular contractions (PVCs) example is an even better example than it seems. Work done at Duke University Cardiology in the late 1980s showed that the frequency of PVCs is not independently prognostic once the amount of permanent ventricular damage is taken into account. But what is prognostic is the nature of the PVCs. If they are the R-on-T type of PVC, they are independently prognostic. You need frequency and morphology.
Excellent. The morphology of a time series is responsive to the pathophysiology just as it is in cardiac arrhythmia.
Let us now look at what happens to a field of science under “Pathological Consensus”. The same problem I am about to present has been present for the past 35 years in the other fields of sepsis and ARDS. Perhaps also in AKI, but I don’t follow that literature.
Attached is a Cochrane review of opiates and sedating drugs in sleep apnea. Opiate-associated sudden sleep apnea death is the scourge of hospital postoperative wards. Occurring without warning, its effects are devastating, and the occurrence, while not common, is also not rare.
From case reports of the few sleep apnea patients who were monitored during sleep apnea death or near-death, the cause is arousal failure, and we know that opiates delay the arousal response. So the population to study would be those with pre-existing arousal failure, in whom such opioid-induced delay could be deadly, and the primary endpoint would be delay in arousal or the emergence of incomplete recovery (the analog of wide-complex tachycardia and R-on-T).
Here we see that they instead used the guessed metric, the AHI, looking for increased severity as a rise in AHI. This is not even expected based on the pathophysiology, but that is what the consensus agrees severity is, so that’s what they looked for. They treat the AHI like a blood pressure, as if it rises as severity increases. Of course, that is what the pathological consensus says it does: 5-15 mild, 16-30 moderate, greater than 30 severe. Those are the consensus rules. They also use the ODI (oxygen desaturation index), an old metric based on counting 3% or 4% dips in SpO2. It’s another guessed threshold-counting metric like the AHI.
Finally, they talk about the duration of the apneas in the conclusion, but it was not an endpoint. What is “duration”? The mean? In any case, the study provides no information useful to determine how to screen or what to look for, or any other useful information. It is an exercise in the statistical processing of useless guessed numbers.
This is the state of the art in sepsis, sleep apnea, and ARDS: useless research which looks real, just as Langmuir described in 1953, but now not just in some poor person’s lab but mandated worldwide by funding and publication gatekeepers.
Feynman’s cargo cult research on a worldwide scale, with the fires beautifully laid out, the hapless statistician dutifully doing the work, and the Cochrane gatekeeper of the science never looking beyond the runway. For 35 years the cargo planes never land, yet the natives do not lose faith. They dance around their editorials, talking about the next, more perfect, runway-lining fires they will light.
The entire affair reads right out of the description of the known potential pathologic sociality of science. We have known since the 1960s how this happens, and it happened anyway, persisting for decades unabated…
The diligent, enthusiastic Ptolemaic scientists have been vindicated by the near-identical failings of their counterparts in the 21st century…
Yikes! Mortifying to think that there might be physicians who applied the findings from the linked Cochrane review to their prescribing practices for patients with sleep apnea…
Yes, this is the unfortunate consequence of 1980s “Pathological Consensus”: research using the guessed metric may be used for high-risk medical decision making. It is very likely that many physicians made decisions based on this review, as it has been highly cited.
Furthermore, the Cochrane Collaboration is considered a gatekeeper. They are, and should be, highly respected. They did not know; neither did the authors or the statisticians. This is a tragedy on a grand scale.
This is why I have called for a recall of the use of Pathological Consensus-based metrics for clinical trials. They may be useful for clinical purposes, but they were guessed and have no place as valid measurements in clinical trials, where their use can lead to wrongful conclusions and be harmful to risk-defining decision making. The recall should be promulgated by the leadership soon.
Finally, a notation that the AHI and ODI were used for severity indexing in this study, and that these are not valid surrogates for sleep apnea severity (especially as it relates to opiates), should be provided promptly, as this article is still highly cited, has been included in at least one guideline, and is (apparently) still considered reliable evidence.
Quoting the conclusion of the article above, specifically relating to opioid, hypnotic, and sedating medication use in OSA.
“The findings of this review show that currently no evidence suggests that the pharmacological compounds assessed have a deleterious effect on the severity of OSA as measured by change in AHI or ODI”.
Since the AHI is the standard measure of OSA severity this is a powerful statement.
Most here on this forum are outlier expert statisticians, mathematicians, and clinicians. Most are here because we love science and math. There are no citations here which are going to generate CV expansion.
However, this paper is a reminder that we are all at the bedside. What we do or do not do affects decision making and patient care. We see a methodological mistake like this and we want to turn away. We don’t want to get involved. It’s like seeing bad care at the hospital: healthcare workers want to turn away. We should not… It is a responsibility we bear as healthcare scientists. We have a great purpose. Let us embrace it without undue deference to expediency.
Thanks to all, and especially to Dr. Harrell, for this wonderful, scientifically rich forum…