The growing interest in integrating causal inference and Design Theory

There is rapidly growing interest in integrating a simpler, clinician-facing form of structural causal modeling (SCM), called causal symbolic modeling (cSM), into critical care trial design. I am working on a paper that presents cSM as a bridge between the clinician, causal inference, and design theory.

The need for better critical care trials, and for better explication of the cause-agnostic RCT, is driving this interest. (See link) But the epistemic gaps between (1) design theory, (2) causal inference, and (3) clinical science are substantial.

SCM itself can generate so many nodes that it overwhelms, and much of that complexity is addressed by randomization. But on the other hand, as proven in critical care syndrome science, randomization cannot solve some hidden pathological design features that are readily detected by cSM.

In many trials, iatrogenic (design-generated) heterogeneity has been introduced before randomization. Treatment assignment is randomized across a mixed causal population, not within a coherent disease or causal entity.

Without cSM, SCM can still be correctly applied. DAGs can be drawn, estimands identified, RCTs performed, and effects estimated. But if the prior structure (e.g., the gate and outcome) is invalid, the estimand represents an average over incompatible causal mechanisms. It is mathematically valid yet causally non-transportable.

I hope to have the paper out soon. I would welcome any comments.

Just as a reminder of the need, here is a December Petty-Bone sepsis RCT from JAMA. It uses SOFA plus some thresholds at the gate, and a “SOFA baseline to multiple endpoint differences” measure at the outcome.

Remember, this was published in JAMA and expected to be state of the art. But I labored over Supplement 2, and I advise everyone interested in critical care science to do the same. It’s data salad.

I liked the move to precision biologic targeting. That target is an important gate limitation, but otherwise the gate was wide-open triage. It was so wide, and survival so low (what killed these patients is unknown), that the study is not interpretable. This is sad, because they have a target, but both the gate and the outcome are based on the SOFA composite guessed at in 1996.

This type of RCT needs to be designed with cSM.

https://jamanetwork.com/journals/jama/fullarticle/2842634

Here is the 90-day mortality.

Yet this study was reported as positive based on a 1.7-point difference in SOFA at 9 days, but they do not show a daily time series of SOFA.

Much of this is buried in Supplement 2. The point is that these trials are increasingly wasteful. This was a massive multicenter trial in scope, and one has to have the highest respect for these ethically motivated workers, but the trial lacked causal explication at the gate and outcome and was therefore doomed. Only a massive treatment effect could have bailed out that widely gated design.

Come join this discussion on X, or let’s begin a discussion of cSM here.

https://x.com/soboleffspaces/status/2005739600904610040?s=46

The only references I can find to causal symbolic modeling are written by you. In your recently published 25 page review of your “Petty-Bone RCT” hypothesis, you attribute causal symbolic modeling to Wright and Pearl but do not provide a reference as far as I can tell. Could you provide a reference here?


The majority of “causal inference” solutions to the problem involve oversimplification by pretending that every patient has the same expected outcome, within each treatment group. So I’m failing to see how the causal inference approach is going to help with the problems you’ve identified, as opposed to just using robust experimental design and robust covariate-adjusted outcome modeling.


Thank you Elias. The paper to which you refer was written for clinicians and clinical trialists. Although it uses the Petty-Bone mistake as an index case, and certainly shows that that particular design is flawed, the paper has a deeper purpose: to use that decades-old design error to highlight the fact that, in the present state of affairs, fatally flawed trials can readily pass through CONSORT and statistician review, yet these flaws are immediately exposed when the design structure is interrogated by cSM, as I did in the paper. So the paper is a call to reform the process of design in clinical science, not design itself.

Causal symbolic modeling (cSM) is a design-level causal framework that interrogates the causal integrity of symbolic trial components prior to formal structural causal modeling (SCM) and/or design.

In the paper I demonstrate that CONSORT and conventional thinking about internal RCT validity (for example by @Stephen ) are incomplete, because valid transportability is primary to the clinician’s bedside decision making.

So if you read the paper closely (sorry about its length), you will see what cSM is. Pearl’s SCM is designed for broad structural modeling. What I discovered is that in clinical medicine we evolved a symbolic, heuristic-based language that translates poorly to design (and to SCM). The paper shows this clearly. It explains how statisticians were fooled by the symbolic language of clinicians. It could also fool CI practitioners as they venture more deeply into clinical science.

However, the problem is also social. There is an epistemic interface gap which the statisticians do not cross in CONSORT. This is discussed in detail in the paper, in the section titled:

Why did RCT mimicry persist unnoticed until now?

Operationally, the specifics of cSM can be summarized as:

Applying Pearl’s structural causal modeling logic as far upstream as feasible in the trial lifecycle to explicate and interrogate the causal integrity of design assumptions. These assumptions include:

  1. Disease labels
  2. Eligibility criteria and cohort gates
  3. Interventions
  4. Outcomes
  5. Protocol rules and co-interventions

cSM determines whether these symbolic objects can legitimately function as nodes in a causal model or as part of the design structure.

cSM:

1. does not introduce new mathematics (it’s Pearl’s machinery),

2. does not replace randomization or SCM,

3. does not estimate effects,

4. does not adjudicate causal truth from data.

Instead, cSM determines whether a causal question is well posed.
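A rough sketch of how that interrogation might be organized in code. To be clear, the questions and names below are my own illustrative examples of the kind of gatekeeping cSM performs on symbolic components; they are not a formal specification of the method:

```python
# Illustrative sketch only: the interrogation questions and names below are
# hypothetical examples, not a formal specification of cSM.

CSM_QUESTIONS = [
    "maps to a single causal mechanism (not a composite of many)?",
    "measurable without conditioning on downstream effects?",
    "stable across the populations the result must transport to?",
]

def can_function_as_node(answers):
    """A symbolic object (disease label, gate, intervention, outcome,
    protocol rule) is admitted as a causal node only if every
    interrogation question is answered 'yes'."""
    return all(answers.get(q, False) for q in CSM_QUESTIONS)

# Example: a composite-score threshold gate fails the first question,
# because the score aggregates many distinct pathophysiologies.
gate_answers = {
    CSM_QUESTIONS[0]: False,   # composite of many mechanisms
    CSM_QUESTIONS[1]: True,
    CSM_QUESTIONS[2]: True,
}
print(can_function_as_node(gate_answers))   # False: not a valid node
```

The point of the sketch is only that the check happens before any structure is drawn: a component that fails is sent back to the clinicians for re-explication rather than placed into the DAG.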

As the paper demonstrates, design presupposes that the entities placed into the model correspond to real causal objects. Design can represent causal coherence, but it does not validate it. CONSORT checks the internal validity of the design but not transportability. The logical ordering is therefore:

  1. cSM determines whether a causal estimand can meaningfully exist.

  2. Design and/or SCM is applied after (or with) cSM analysis.

The paper makes it clear that when symbolic assumptions are incorrect, a design may clear CONSORT and still yield mathematically valid results that are clinically non-transportable. cSM is introduced to prevent this decades-old failure mode and to introduce the clinician to SCM and design explication. This also allows deeper (below-deck) failure-mode analysis beyond the typical CONSORT questions regarding power, compliance, etc., which are all pivotal but clearly not enough.

cSM is not really new; it’s simply the language of Pearl placed upstream, applicable to clinical science (pathophysiology), which forces explication of the components of the causal model itself before defining its structure in design, SCM, or both.

There is no reference for cSM yet, as the paper is in progress, but you can grasp it from the linked paper alone if you see that the linked paper is less about the Petty-Bone RCT mistake than it is about how to align design with a valid and fully explicated causal model. The basic teachings of Pearl are all that’s required to understand cSM.

The place to start is “The Book of Why”. It’s a good basic source of the concepts embodied in cSM.

Both design theory (DT) and causal inference (CI) make assumptions (“pretendings”): internal assumptions for CI and external assumptions for DT. These assumptions are quite synergistic in their mitigations, since each can be addressed by the other discipline when the two are combined.

Sepsis trialists and statisticians have assumed, for 34 years, that the effects they calculate are transportable to everyone who enters the gate. That has proven false, with much loss. And when DT “pretended” that the RCT-derived ventilator protocols for ARDS were transportable as EBM to severe COVID pneumonia, that was wrong at global scale and produced a bedside clinician revolt. This design-based assumption error cannot be allowed to pass without corrective action.

As I presented to Elias, cSM is not CI, but rather the use of Pearl’s language to assure that a design (or causal-inference-based study) can reasonably generate transportable results, since CONSORT-cleared designs that are falsely attributed a transportability function have proven to be associated with adverse outcomes.

I agree that design directed by a DT expert together with a pathophysiology clinician, with no epistemic gap in communication, could probably solve most of these problems alone, but history has proven that this won’t happen. Virtually no statistician will even acknowledge the problem of iatrogenic non-transportability as a function of the task-force-defined gate. Without intervention, the task forces are not going to change their approach.

As the paper describes in detail, cSM comprises the synergistic connection between design and CI needed to begin bridging the epistemic interface gap. In a broader sense, I’m trying to facilitate synergistic work between you and Pearl by finding the portion of DT and CI where you both will find agreement if you look.

Elias, this is the simplest way to illustrate what cSM is:

Below is the standard CONSORT-approved path for an RCT. “Inclusion” is the selection gate (S=1), which is set by a task force.

This looks valid. Naive OS (observational study) and CI-based SCM could also start here. This might be called “trust-based design theory,” because the design statistician is trusting the task force to understand what defines a valid measurement for defining a valid cohort. But many task forces do not understand that.

So this “standard RCT” is transportable only if the task force created “RCT-valid” inclusion (selection). However, since many task forces have not been taught by statisticians how to do that, and CONSORT does not require it, many standard (CONSORT-approved) RCTs will not transport, as a function of standard (pathological, consensus-based) design error. The false view that they are transportable is a major problem.

In contrast, below is the cSM model (although T and Y would be further modeled). Note that here cSM uses causal modeling to examine the task-force-derived cohort itself. That’s the key:

cSM forces the modeling of the components of the model.

cSM is required because statisticians and clinicians need a language in which to build the components of the model. “Words” have proven inadequate. They need a way to include modeling of the components in the CONSORT checklist to assure a valid cohort.

cSM is simply “the causal model of the components” of an RCT or OS. A CI expert might trust a node offered by a clinical expert, as DT experts have; since DT experts do not recognize that this interrogation is necessary, cSM forces the question for both.

Since I have had no substantive effect on correcting this problem by teaching statisticians and trialists over the past years, I am advancing cSM to clinicians and funding bodies (e.g., the NIH), because this is subtle, anchor-bias-entangled stuff. Words are ineffective tools of change when dealing with a Lakatos Scientific Research Programme.

So cSM is just SCM taken as far upstream as possible, designed to guide DT. DT-directed RCTs and OSs, as well as CI, might assume the expert task-force selections are valid. cSM says: prove it.

I have been trying to follow the various threads on here where this has been raised and still don’t understand how graphs like those above illuminate the real problem you have identified in critical care beyond a general (word-based) statement that “average treatment effects are of little use, or are misleading, when there is (probably) substantial treatment effect heterogeneity”. The real contribution is in the strength of the argument that there is meaningful HTE in a particular scenario. What is the main benefit of these symbolic graphs? Especially given that these kinds of causal graphs are agnostic on the presence of HTE/treatment effect interactions.

Relatedly, it also isn’t clear how strong the connection is to Pearl’s DAGs, as in these graphs the arrows appear to have a different meaning. In a randomized experiment it is not possible for X to cause treatment selection, which is what I would take from X → T in a conventional DAG. If this change of meaning was intentional, what is the proposed benefit?


Thanks. Yes, it is hard to understand, because it’s hard to believe this design would pass CONSORT with such a simple, obvious error: conditioning on a disease-agnostic X generated by hundreds of different diseases. Who thinks that would work? Answer: virtually everyone in critical care science, despite five decades of failed transportability.

This is not the classic biology-associated HTE; this refers to “synthetic HTE”: HTE generated by symbolic aggregation, such as by a task-force-generated triaging threshold (X). More specifically, this synthetic HTE is caused by the absence of SCM during trial design, which allows a mismatch to persist between the causal structure of the trial actually implemented and that of the trial the investigators intended to run.
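A toy simulation can make synthetic HTE concrete. All numbers here are illustrative, not drawn from any real trial: two hypothetical diseases both push a disease-agnostic severity score over the triage threshold, the treatment has opposite true effects of equal size in the two, and the pooled effect at the gate is near zero even though the per-disease effects are large:

```python
import random

random.seed(0)

# Toy model, illustrative numbers only: the "gate" admits patients whose
# severity score X exceeds a threshold, but X is generated by two different
# underlying diseases (A, B). Treatment helps mechanism A and harms
# mechanism B by the same amount, so the average effect within the gate is
# near zero and transports to neither disease.

def simulate_patient():
    disease = random.choice(["A", "B"])
    x = random.gauss(10, 2)              # severity score, disease-agnostic
    return disease, x

def outcome(disease, treated):
    effect = 1.0 if disease == "A" else -1.0   # opposite true effects
    return (effect if treated else 0.0) + random.gauss(0, 1)

patients = [simulate_patient() for _ in range(40000)]
gated = [(d, x) for d, x in patients if x > 9.0]   # threshold triage gate S=1

results = {"A": [[], []], "B": [[], []]}          # [control, treated] per disease
for i, (disease, _) in enumerate(gated):
    treated = i % 2 == 0                          # alternation stands in for randomization
    results[disease][treated].append(outcome(disease, treated))  # bool indexes 0/1

def mean(v):
    return sum(v) / len(v)

ate = {d: mean(arms[1]) - mean(arms[0]) for d, arms in results.items()}
pooled_ate = (mean(results["A"][1] + results["B"][1])
              - mean(results["A"][0] + results["B"][0]))

print(f"ATE within disease A:  {ate['A']:+.2f}")      # ≈ +1.0
print(f"ATE within disease B:  {ate['B']:+.2f}")      # ≈ -1.0
print(f"Pooled ATE at the gate: {pooled_ate:+.2f}")   # ≈ 0
```

The “HTE” here is entirely manufactured by the gate: within either disease the effect is homogeneous, but the threshold aggregates the two mechanisms into one cohort, and the pooled estimand averages them away.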

This is difficult to articulate because the design is pathological, so the representative DAG is pathological.

The best place to start is with this article, which describes the development, evolution, and consequences of the streamlined (high-n-generating), threshold-triage-based RCT.