The Petty/Bone RCT

As Prof Farrel said, this discussion keeps getting better. I think the Petty and Bone paradigm, that is, the proposition that patients with diverse biological mechanisms can be lumped together to receive a “silver bullet” shot, has now been refuted by decades of negative RCTs.

Regularly updating the diagnostic thresholds, as was done recently in pediatric sepsis, won’t change the fact that the lumping paradigm is refuted by its failure to predict empirical results.

The point is that we need new disease models, i.e., a theory that can link pathophysiology with clinical presentation and provide treatment targets. We need biological constructs, not clinical constructs. For instance, the Sepsis-3 clinical construct exhorts us to diagnose sepsis using prognostic tools (SOFA and qSOFA). As I said elsewhere, how long will we keep pretending this is Medicine?

The king is naked. There is no way around it.

There is much more to it. I think the critical care community is shamefully unable to think scientifically. I point to our leaders like J-L Vincent and A Slutsky.

(posts are behind a paywall, but you can use 7 days of free access and then cancel after reading)

I ask our silent readers and active commenters: do you think any triage tool like ARDS and sepsis criteria (add delirium) will ever unveil a mechanistic approach to treatment?


The first use of the term “heterogeneous syndrome” relating to “sepsis” that I could identify in a review of PubMed was from 2004 and states:

“Severe sepsis is a heterogeneous syndrome in a heterogeneous population”

Here you see they did not understand the difference between “heterogeneous population or prognosis” and “heterogeneous treatment effects”.

So this shows the basis for the cognitive error, which arises from a misunderstanding of the Bradford Hill RCT as it relates to its ability to control heterogeneity of prognosis and population, NOT heterogeneity of treatment effects, which could be unknowingly massive when tens or hundreds of sets of different diseases (each set with a different average treatment effect) are lumped into a “heterogeneous syndrome”.

This 2004 paper was seeking to understand the failure of PettyBone RCT reproducibility for an anticoagulant treatment for the set of diseases which met the measurement thresholds to be called “sepsis”. Of course they did not recognize this as PettyBone pathological science.

Langmuir taught that seeking out the long-past apical error is pivotal to understanding and eliminating pathological science. This “seeking and investigation”, not trusting and defending, should be the standard method of all young scientists. Papers should not start by repeating the dogma but with a deep analysis of its origin and basis in science.

In the pathological science of N-Rays, the Langmuir apical error was a technical interpretative error.

In contrast, in the PettyBone RCT, particularly in the fields of sepsis and ARDS, the Langmuir apical error was a cognitive error: a simple misunderstanding of the teaching of Fisher and Bradford Hill as it relates to the term “heterogeneous” and its mathematical management.

https://www.tandfonline.com/doi/10.1080/07853890410027943?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed


I think I understand the nuance that you’re highlighting here, but I’d like to double check. It feels like a lot of really important concepts are packed into this brief statement of yours.

My understanding is that if we want to stand the best chance of identifying a therapy’s intrinsic efficacy via an RCT, then it’s a good idea to enrol patients who are reasonably likely to experience outcome events of interest over the course of the trial. If events don’t occur, we won’t be able to tell if the treatment we’re testing is efficacious or not. Often, this imperative means that we’ll end up enrolling “sicker” patients in a trial- the sicker they are, the more likely they will be to experience an outcome event of interest. For example, when testing the ability of a drug to reduce MI rates, we will want to enrol patients who are at risk of experiencing an MI over the course of the trial, not healthy patients at low risk for MI. If we enrol a “mix” of patients in the trial- some at high risk for MI and some at low risk, we will observe fewer outcome events of interest, thereby making our estimate of intrinsic efficacy more uncertain (uncertainty intervals around the relative treatment effect will be wider).

But I think (if I’m not mistaken), that you’re saying that prognostic homogeneity is a bit of a double-edged sword (?) By enrolling only those patients with the worst “untreated” prognoses, we will optimize our ability to detect any efficacy signal (if intrinsic efficacy does exist), while simultaneously incurring the risk of detecting efficacy signals that are statistically significant but clinically meaningless for many patients (?) If a treatment’s intrinsic efficacy will manifest to a “statistically significant” extent only if a trial enrols the very sickest patients, then this result might not end up being clinically useful to the vast majority of patients with the disease (who, by virtue of their milder disease/better prognoses, will likely derive a much smaller absolute benefit from the treatment) (?) And unless we have reliable ways to estimate untreated prognosis for patients we see in the clinic (e.g., cardiovascular risk calculators), it could be difficult to translate the relative treatment effects derived from our RCT, performed in only the sickest patients, into absolute terms when counselling individual patients with a wider spectrum of prognoses. And since, for many (?most) conditions, we don’t have reliable ways to estimate untreated prognosis, RCTs that enrol patients with a wider spectrum of untreated prognoses could sometimes be desirable (?)
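The event-count intuition above can be made concrete with a toy calculation. This is only an illustrative sketch with made-up numbers: two trials of the same size testing the same relative effect (odds ratio ≈ 0.6), one enrolling a high-risk cohort and one “diluted” with low-risk patients, compared via the standard Woolf variance for a log odds ratio.

```python
import math

def log_or_ci_width(n_per_arm, p_treat, p_ctrl, z=1.96):
    """Approximate 95% CI width for the log odds ratio of a two-arm trial,
    using the Woolf variance (sum of reciprocals of the four cell counts)."""
    a = n_per_arm * p_treat        # events, treated arm
    b = n_per_arm * (1 - p_treat)  # non-events, treated arm
    c = n_per_arm * p_ctrl         # events, control arm
    d = n_per_arm * (1 - p_ctrl)   # non-events, control arm
    return 2 * z * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Same n, same odds ratio (~0.6), very different event rates:
high_risk = log_or_ci_width(500, p_treat=0.25, p_ctrl=0.36)   # sick cohort
diluted   = log_or_ci_width(500, p_treat=0.031, p_ctrl=0.05)  # mixed cohort

print(round(high_risk, 2), round(diluted, 2))
```

With fewer events, the interval around the same relative effect more than doubles, which is exactly the sense in which enrolling low-risk patients makes the estimate of intrinsic efficacy more uncertain.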

Finally, what is the most fundamental reason why including patients with a wider spectrum of prognoses in an RCT can sometimes be valuable? Is it that:

  1. This practice can sometimes, potentially for the first time (if prior observational studies don’t exist), allow us to observe (within the context of the RCT) disease trajectory/prognosis among untreated and healthier patients in a rigorous way, thereby allowing us to better estimate their untreated prognosis? OR that
  2. This practice will lead us to observe fewer outcomes of interest, resulting in a wider uncertainty interval which will, in turn, make us (appropriately) more circumspect when assessing the efficacy signal derived from the trial (?)

“But what we need is homogeneity in treatment effects across patient types.”

Your choice of the word “need” here is interesting - I’ve heard you use it before in various contexts, but have never been sure exactly what you mean. Are you saying that we want/“need” enrolled patients to share the biological causal pathway leading to outcomes of interest AND for this causal pathway to be matched to the mechanism of action of the therapy being tested? In other words, we want to avoid enrolling patients who are biologically incapable of responding to the treatment we are testing (?) We “need” these criteria to be met because the most fundamental purpose of our trial is to test the impact of a treatment on a specific biologic causal pathway? So, if we enrol patients for whom this potential causal pathway is not present (i.e., patients who are biologically incapable of responding), we will not be gaining an accurate picture of the therapy’s intrinsic efficacy (?) For example, if we are testing the efficacy of a cancer therapy that is targeted to a tumour’s specific genetic mutation, then we “need” all patients whom we enrol in the trial to have tumours that harbour that mutation- otherwise, the efficacy signal of the targeted therapy will be “distorted” (see below) due to non-response among patients who are incapable of responding (?)
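The “distortion” can be sketched with hypothetical numbers: suppose the targeted therapy halves event risk in mutation carriers (RR = 0.5) and does nothing in non-carriers (RR = 1.0). The marginal risk ratio observed in a lumped trial is just a mixture of the two.

```python
# Hypothetical numbers, for illustration only.
BASELINE_RISK = 0.30    # untreated event risk, assumed equal in both subgroups
RR_CARRIERS = 0.5       # true relative effect in mutation carriers
RR_NONCARRIERS = 1.0    # no effect in non-carriers

def observed_rr(carrier_fraction):
    """Marginal risk ratio when carriers and non-carriers are lumped together."""
    risk_treated = (carrier_fraction * BASELINE_RISK * RR_CARRIERS
                    + (1 - carrier_fraction) * BASELINE_RISK * RR_NONCARRIERS)
    return risk_treated / BASELINE_RISK

print(observed_rr(1.0))  # enrolment restricted to carriers: RR = 0.5
print(observed_rr(0.2))  # lumped trial with 20% carriers: RR = 0.9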

So, to summarize, ensuring prognostic homogeneity (enrolling only the sickest patients) in an RCT can optimize the prospects for detecting a therapeutic efficacy signal, but can sometimes leave us uncertain how to translate the trial findings to patients with a wider spectrum of prognoses. In contrast, ensuring that the therapy’s mechanism of action is “matched” with the underlying biologic causal process leading to outcomes of interest (aka “homogeneity in treatment effects”), when deciding which patients to enrol in a trial, is necessary in order to permit a valid estimate of a therapy’s relative efficacy (?) Here, “validity” refers to the ability of the trial to measure what it is supposed to measure: namely, the ability of the targeted therapy to improve outcomes among patients with the target mutation.


This discussion is richer than I have time to fully engage with at the moment, but a couple of quick points. Patient heterogeneity in outcome tendencies is a good thing, as long as patients are not very heterogeneous with regard to treatment-relevant etiology. We want to have a spectrum of disease severity in the sample.

This doesn’t follow. Having a wider prognostic spectrum in a sample makes the sample more useful in translating treatment effectiveness estimates into other clinical populations, if you have covariates that describe this spectrum. Absolute benefit is the easiest thing to model as a function of covariates (and can often be simplified to a function of predicted risk under control therapy). Enrolling a variety of patients allows an RCT to be the best source for estimating risk for standard-care patients.

Thanks. Point of clarification re “This doesn’t follow.” I’m not suggesting that a drug wouldn’t be clinically useful to less sick patients if the RCT only enrolled the sickest patients, but rather that it would be challenging to translate trial results to less sick patients if we have no information (from inside or outside the trial) on their untreated prognosis. Your last paragraph is exactly what I was trying to say (maybe in a less-than-clear way).


Maybe you could elaborate. If the patients are on a continuous spectrum of the same disease, absolute benefit of Tx tends to be a simple function of baseline risk, because of how seldom we see important treatment x covariate interactions.
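As a sketch of that point (illustrative numbers, and assuming a constant odds ratio across the severity spectrum, i.e., no treatment x covariate interaction), the absolute benefit implied by one relative effect varies several-fold with baseline risk:

```python
def absolute_risk_reduction(baseline_risk, odds_ratio=0.6):
    """Absolute risk reduction implied by applying a constant odds ratio
    to a given baseline (control-arm) risk."""
    odds = baseline_risk / (1 - baseline_risk)
    treated_risk = odds * odds_ratio / (1 + odds * odds_ratio)
    return baseline_risk - treated_risk

for r in (0.05, 0.20, 0.50):
    print(f"baseline risk {r:.2f} -> ARR {absolute_risk_reduction(r):.3f}")
```

The same odds ratio of 0.6 yields an ARR of about 0.02 at 5% baseline risk but 0.125 at 50% baseline risk, which is why modeling absolute benefit as a function of baseline risk lets one trial inform patients across the whole spectrum.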

Yes, no arguments from me. This is what I was trying to say, but I must have mangled my wording if you are perceiving a disagreement (?)

I find it really challenging to put statistical concepts into words accurately in order to ask clarifying questions. But I’ve come to believe that this type of narrative slogging, with meticulous attention to phrasing/wording, is very important. Without it, non-experts like me will usually be completely lost. Often, a misplaced or poorly chosen word (or use of a word that means different things to different people) is enough to create mass confusion/misinterpretation or betray a profound misunderstanding. Appreciate your patience. :slight_smile:


I realized that there is a better mathematical characterization of “heterogeneous syndrome”. Here is that definition.

“An overarching, disease-agnostic plurality of sets of a second plurality of sets of different diseases which fall within the scope of a set of triage thresholds as amended from time to time by pathological consensus.”

So this figure shows three PettyBone RCTs of three corresponding sets which are contained within the overarching set of the heterogeneous syndrome called ARDS.

Here you see how the term of art “heterogeneous syndrome” hides layered disparity of treatment effects inside the PettyBone RCT. Absolutely the polar opposite of a Bradford Hill RCT.

With PettyBone, the ATE is not the ATE of the heterogeneous syndrome but rather of a limited set of diseases within the heterogeneous syndrome, much less the ATE of the disease of the instant patient under care.
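That layering can be written down directly. A minimal sketch with hypothetical shares and per-disease ATEs: the trial’s headline ATE is just the enrolment-weighted average of the per-disease ATEs, so benefit in one disease and harm in another can cancel to a near-null result that describes no one.

```python
# Hypothetical composition of a 'heterogeneous syndrome' trial:
# name: (share of enrolment, ATE on mortality as a proportion)
diseases = {
    "disease_A": (0.50,  0.00),   # no effect
    "disease_B": (0.30, -0.10),   # 10-point benefit
    "disease_C": (0.20, +0.12),   # 12-point harm
}

# The reported ATE is the enrolment-weighted mixture of per-disease ATEs.
lumped_ate = sum(share * ate for share, ate in diseases.values())
print(lumped_ate)  # near zero despite large opposing subgroup effects
```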

So the PettyBone RCT is pathological science with a definable apical error, but it is also the perfect “RCT Mimic”. It is readily generated by anyone by triage case finding. In that respect, research investigating the PettyBone syndromes is flowing spice and… the spice must flow. (Dune I).

image

I am considering writing a second book examining the interfaces between pseudoscience, pathological science, pathological consensus, and real science.

Regarding pseudoscience, one characteristic stands out: the use of ambiguous terms of art.

It appears that some pseudoscience emerges from pathological science which survives, and some emerges from the charlatan dimension.

Pathological science which remains inside real science for decades becomes “internalized pseudoscience”. Here it is the ambiguity of the terminology that allows the pseudoscience to coexist inside real science. In a sense, this ambiguity provides the armor which deflects the interrogating mathematicians and statisticians.

One distinguishing feature of this internalized or mainstream pseudoscience is an evolving generality of the scientific lexicon.

Homeopathy speaks of “potentiation” and chiropractic scientists speak of the “subluxation”. Sleep apnea severity science is based on the guessed 1970s counting science of the “apnea hypopnea index” and critical care is rooted in the study of 20th century guessed “heterogeneous syndromes”.

It’s interesting to examine the discussions between Fisher and Hill. Hill was the champion of pragmatism but only within the limits of the math as guided by Fisher.

You see, Hill, who was not a physician and had had tuberculosis himself, remained functionally humble, deferring to the statisticians while decrying what he perceived as their limited ability to think in sufficiently pragmatic terms.

Yet here we see that over the next three generations, the physicians who replaced Hill empowered their pragmatism into a dominant force.

From here the PettyBone pseudoscience emerged and lives under the auspices of another ambiguous term of art, the “pragmatic trial”.

The only hope is a return to the leadership of the statisticians whom Hill decried but, accepting his limitations, followed meticulously.

Chalmers, I. (2003). “Fisher and Bradford Hill: Theory and pragmatism?” International Journal of Epidemiology, 32(6), 922–924; discussion 924–8.

I couldn’t agree more!

When engaging with marked complexity into which mathematical and methodological pathology has intruded, IMO a deep-dive slog is the only means of effectively discussing the math and methodology so as to facilitate the timely extirpation of the pathology.

The hard part is getting anyone to participate. Extirpation is not generally expedient.

I would be interested in learning from @Doc_Ed and his colleagues or any reader and asking them this question:

What is your working mathematical description or formula of a critical care “heterogeneous syndrome”?

Lawrence, just a general suggestion: adding other angles to the argument may help, e.g., computing the dollars lost from finishing futile clinical trials, describing missed opportunities to do more meaningful clinical trials, possible missed opportunities to do adaptive clinical trials for discovery of homogeneous threads of a heterogeneous syndrome, etc. For the latter, the I-SPY 2 study comes to mind, where an adaptive design found which biomarker can be used to enhance treatment of which cancer.


Yes. It will be difficult to quantify the cost and lost opportunities. To do this I will need to recruit an expert in such techniques.

Agree, I should focus more on this. We all think such threads are there in at least a part of the diseases in the heterogeneous syndrome. It is interesting that the heterogeneous syndrome of ARDS was based on Petty’s idea of a homogeneous thread of surfactant deficiency, and this is why he thought he was doing Bradford Hill RCTs.

I am trying to understand the history of research publication authorship. Looking at the British Medical Research Council work, it is hard to identify the statisticians. We see the PI leading and apparently writing the methods section, often in general terms. I’m trying to understand what, historically, led to that level of authorship subordination and hierarchy.

Do you think it would help if statisticians were listed as co-authors? There are exceptions, but this is not the rule, and it seems that if statisticians are co-authors and funded separately to eliminate conflicts, they are going to dissect the terms of art into mathematical components to objectively define the RCT function.

What do you think about funding statisticians separately, rather than as a cost inside the application, so they are independent of the PI, and then listing them as co-authors so they have ownership of the methodology?


Yes, it seems like times have changed/are changing. Investigators in complex disease areas (e.g., oncology, critical care) seem to be looking forward, rather than backward. Many are clearly getting on board with more modern trial designs. Hopefully this trend will continue.

It’s clear that researchers understand the problems posed by heterogeneity and are trying to address them with new designs (e.g., platform adaptive trials). It will be interesting to follow up on the results they generate over the next few years.

A couple of references:

https://www.nejm.org/doi/full/10.1056/NEJMra1510062


I don’t follow that Erin.

This presented adaptive trial is simply a more streamlined means to perform trials, including PettyBone RCTs for sepsis, ARDS and delirium.

“An adaptive platform trial studies multiple interventions in a single disease or condition in a perpetual manner, with interventions allowed to enter or leave the platform on the basis of a decision algorithm” (emphasis added).

Note they say “condition” but they are referring to “heterogeneous syndrome”.

There are some Bradford Hill RCTs here, but this is also just an easier pathway to do PettyBone pathological science.

There is no teaching or correction here about the PettyBone pathology or why 40 years of research has failed. There is much talk about heterogeneity, but with about as much specificity as there is in the term “heterogeneous syndrome”.

This is exactly what is wrong. Critical care science does not just need to be more efficient; it needs to be less pathological.

The public should not pay for a crawling managed “paradigm drift”.

The best thing for the public is for us to do open failure mode analysis. IMO we don’t have the right to avoid looking backwards because that’s the only way we can understand, study, and extirpate our pathological science.

An adaptive trial doing PettyBone research is simply the equivalent of giving the public’s credit card number to the same PettyBone scientists after 40 years of failure.


Erin, the 2024 (looking forward) paper you cited is perfect for demonstrating the culture problem, and the lack of introspection and self-policing, in critical care science.

The paper is the first answer below to 40 years of failure.

  1. “The methodology hasn’t worked and we are going to do X”.

The public deserves and trusts us to give them this second answer below. We have no right to decide, as a function of volition, not to give them exactly that which is in the best interest of rapidly fixing the science without regard to expediency. That means that we have to do two things. First look back and second look forward using what we learned by looking back. Here is the second answer the public deserves and is sure we would do.

  2. “The methodology hasn’t worked. These are the results of a comprehensive root cause failure mode analysis by a commissioned panel of expert statisticians (some from outside the discipline) of our past and present standard methodology, its origins, and the mathematical meanings of our terms of art, and they discovered the following. We propose to correct the following to fix and improve the methodology.”

Okay Lawrence, this will be my final contribution to this thread. Please don’t take offense when I say this (I think you know that I respect your views), but I think that you’re cognitively “stuck.” I’ll close with a question for you to chew on. If you can’t answer it, then it might be helpful to introspect as to why this is the case.

Let’s pretend that some brave statistician (or group of statisticians) were to publish, tomorrow, a long article about how wrong all your clinical colleagues have been for so many years. Even better, let’s pretend they adopted your preferred terminology (“Petty/Bone RCT,” “pathologic consensus,” “synthetic syndrome” “root cause failure mode analysis”…) to explain why they were wrong. Further, let’s say they estimated that 1 trillion dollars has been wasted on poorly-designed trials over the past 30 years and that 1 billion human lives might not have been lost if only they had seen the light sooner. And let’s imagine that the critical care “thought leaders,” after reading this damning condemnation, all issued heartfelt apologies. Finally, let’s pretend that you, Lawrence, were given 10 billion dollars to design a series of critical care trials any way you see fit, to “turn the ship around,” so to speak.

My question is- and it’s a very SPECIFIC question: “Then what?” What, EXACTLY, would you do with this money? Resist the urge to fall back on highlighting problems of past trials and your preferred terminology (this is what I mean by “cognitively stuck”)- try to move your thinking forward. Since you don’t seem to feel that the critical care specialists and statisticians involved in establishing large international adaptive platform trials know what they’re doing, I’d be interested to hear what you would do differently. Go beyond the use of nonspecific terms like “root cause failure mode analysis” and explain EXACTLY what this would mean, in very concrete terms, for the design of a trial. Because I think, whether you realize it or not, this is the underlying principle behind a trial being called “adaptive…”

Best of luck in your endeavours.


I understand you mean well Erin. Certainly sorry to see you leave. I will follow this post with a discussion of alternatives to PettyBone science.

I find the disparagement that I am “cognitively stuck”, when I have been the one calling for desperately needed reform for 20 years, absolutely unsurprising.

This is the same treatment I received from those mandating the failed “counting science” of OSA severity. As I discussed elsewhere in this forum, there is an abundance of well-known alternatives to the AHI in that controlled field of science, just as there are alternatives to PettyBone science, but the presence of alternatives makes no difference to those holding both entrenched dogma and the purse. In this sense Thomas Kuhn might not have anticipated the synergistic power of that combination.

I lectured in Marburg, Germany about counting science alternatives in the early 1990s. Tom Penzel, one of the authors of the paper below, was a champion of change. Many from outside the US know it is USA- and NIH-based pathological science, and they laugh.

So I recognized that the OSA thought leaders were wrong just like I recognize the PettyBone lumping science advocates are wrong. A scientist should not defer to leadership.

“Root cause failure mode analysis” is a formal and well-established process. It is not a vague request. This is what should be done when there is failure, regardless of the setting. This is what the FAA does all the time.

But let’s ask the public which they expect. Answer 1 or 2 below.

  1. “The methodology hasn’t worked and we are going to do X”.

  2. “The methodology hasn’t worked. These are the results of a comprehensive root cause failure mode analysis by a commissioned panel of expert statisticians (some from outside the discipline) of our past and present standard methodology, its origins, and the mathematical meanings of our terms of art, and they discovered the following. We propose to correct the following to fix and improve the methodology.”

I learned much in my failure to defeat the, still extant, but utterly foolish counting science of sleep apnea severity. I am not going to make those same deferential, hopeful mistakes again.

As promised, I will describe the changes which need to be made.

[quote=“ESMD, post:57, topic:22077”]
Let’s pretend that some brave statistician (or group of statisticians) were to publish, tomorrow, a long article about how wrong all your clinical colleagues have been for so many years
[/quote] (emphasis added)

First, you may not know it, but very many of my colleagues agree with me, not with the “task force, triage-threshold-sets-for-RCT generators”. As you know, it is not expedient to speak up.

So perhaps this was your intent and, if it was, hats off, because your quote beautifully betrays the gravest fundamental problem: in the present culture of critical care science, it takes bravery to publish an anti-dogmatic article. In contrast, in Wood’s time, as lauded by Langmuir, such publication was a badge of honor, and science policing was the responsibility of all. Now, with the new consensus and task force class culture, it risks the career.

So, pretending, as you propose, that the social paternalistic/maternalistic bubble of the central control of the task force class can be pierced, here are the steps which need to be taken.

  1. The first step the NIH should take is to empower the statisticians to lead by funding them independently from the PI. They would be co-authors and responsible for the methodology. They would be selected by a process which does not include the PI. If the statistician is responsible for the entire function, including the basis for the measurements, then, of course, she would investigate the terms of art like “heterogeneous syndrome” and convert them into math. That would change the paradigm very quickly. The apical error would be eliminated.

As Prof Harrell points out, a pivotal part of the mathematical function of the RCT is the presence of a “homogeneous thread”. The lack of this thread is the apical error, but this error has gone unrecognized by the principal investigators, and this created the present PettyBone era.

So empowered, the statisticians would stop assuming the PIs have sufficiently investigated the basis for the lumping. They would then assure, with the PI, that this common thread is present.

Here you see that the difference between a PettyBone RCT and a Bradford Hill RCT is the presence of a homogeneous thread in the latter.

The traditional means to assure a homogeneous thread was well described by Hill. However, science has advanced, so a homogeneous thread may now be a measurable fundamental pathology common to all the subjects, established by many means. It may even be a hypothetical common thread, as it was with Petty and surfactant, if there is a solid discovery basis for that hypothesis. However, we have now learned to our sorrow that one cannot simply lump by guessing, as Petty, Bone and decades of triage threshold generating task forces have, lest the thread be lost in the disease mix.

  2. The second step the NIH should take is to summarily defund the non-disease-specific threshold set consensus generating task forces. This is the 33-year source of pathological consensus and a major cause of decades of critical care PettyBone RCT failure. The attempt to unify the world’s RCT funding by guessing triage threshold set “criteria” for the RCT is the dripping open wound of 33 years of PettyBone pathology. Source control is required when curing any pathology, and these task force, Delphi (guessing) triage threshold set consensus meetings, which recur every decade, are the source. The funds should be redirected to meetings of experts seeking to discover common threads.

Meetings of PettyBone guessing leaders (who guess non-specific threshold based lumping science) should be replaced by meetings of bench and clinical discovery scientists and expert statisticians.

  3. Since we are pretending that the reasons for the failed PettyBone paradigm have been promulgated by amazingly altruistic and “brave statisticians”, the third step of releasing the thousand points of youthful genius light to advance critical care will already have occurred in this idealized, altruistic paradigm.

Upon this promulgation, the search for the homogeneous threads at the bench, in the databases, and at the bedside will be empowered, and these searchers will not have to link their discovered thread to the consensus guesses of the task force class. Some of this is emerging with the “treatable traits” approach, but the world’s scientists must be released to search for such threads without regard to the edicts of the task force. Because of past paternalistic indoctrination AND intellectual colonization with the PettyBone error, USA critical care science owes this cathartic promulgation to the rest of the world without delay, because this release only happens if the trusting and intellectually colonized critical care scientists in the rest of the world are, without delay, told the whole truth.

So, Erin, the fix is easy, because the apical mistake of the pathological science is the straightforward loss of focus on the need for a fundamental homogeneous thread.

So where should this discussion go once the PettyBone apical error has been eliminated?

Once the critical care research community at large is released from the well-intentioned central control of the paternalistic heterogeneous syndrome apologists, control is transferred to the entire community as individuals and as a whole, and debates turn to:

“What constitutes a homogeneous thread?”

rather than:

“What is the new consensus task force derived threshold set to triage for RCT?”

When that happens, and it will, because this reform can only be delayed, not stopped, there will be no more need for my antidogmatic efforts.

I hope you will come back, Erin. I have so much enjoyed and respected your thoughtful comments, and I think they are very helpful for the silent readers.

Zeal in the quest for science is no vice.

I’ve worked on critical care trials in the past, but have been out of it for a few years now. Would a productive way forward be to think a lot more about heterogeneity of treatment effects? Is there any mileage in retrospectively looking at trials to try to identify heterogeneity? It is presumably possible to identify groups that would be expected to benefit more from specific interventions than others? (assuming there is some understanding of the biology). Or even to design trials with the specific intention of looking for heterogeneity?
