Cause-Agnostic RCT (CAR)

Frank, correct me if I misinterpret, because I am learning from you and applying the lessons to the critical care field. Your point about risk magnification contains an implicit critique of marginal estimands and identifies properly performed conditional estimands as potentially superior for some purposes.

Another post discusses ATE, ATT, ATO, and ATM, which are often described as targeting different populations.

What your observation makes clear is that the estimands diverge because different groups of patients dominate each weighted average, not because they correspond to distinct biological effects. In other words, they are not distinct causal effects but different weighted averages across heterogeneity, statistical reflections of who is in the sample rather than biologically defined quantities.

ATE, ATT, ATO, and ATM each include different groups of patients in the calculation, so the effect you get depends on who is included, not on one biological disease. Marginal estimates therefore require “averaging of unlikes,” producing sample-bound, fragile estimates. Even with a constant treatment effect, a simple shift in the population captured, and therefore in the covariate distribution (e.g., a younger population), changes the marginal effect. This is already a concern for disease-specific RCTs (CIRs).
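
To make the point about a shifted covariate distribution concrete, here is a minimal sketch. It assumes, purely for illustration, a logistic outcome model in which age is the only covariate and the treatment effect is constant on the log-odds scale. The coefficients and the two age ranges are hypothetical and do not come from any trial.

```python
import math

def expit(z):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients (assumptions for illustration only).
INTERCEPT = -6.0
BETA_AGE = 0.08      # log-odds of death per year of age
BETA_TREAT = -0.7    # the SAME conditional log odds ratio for every patient

def marginal_effects(ages):
    """Average the model-based risks over a cohort, then contrast the arms."""
    n = len(ages)
    risk_control = sum(expit(INTERCEPT + BETA_AGE * a) for a in ages) / n
    risk_treated = sum(expit(INTERCEPT + BETA_AGE * a + BETA_TREAT) for a in ages) / n
    risk_difference = risk_treated - risk_control
    marginal_or = odds(risk_treated) / odds(risk_control)
    return risk_difference, marginal_or

# Two hypothetical cohorts that differ only in their age distribution.
older = list(range(60, 91))
younger = list(range(30, 61))

rd_older, or_older = marginal_effects(older)
rd_younger, or_younger = marginal_effects(younger)

print(f"conditional OR (both cohorts): {math.exp(BETA_TREAT):.3f}")
print(f"older cohort:   marginal RD = {rd_older:.3f}, marginal OR = {or_older:.3f}")
print(f"younger cohort: marginal RD = {rd_younger:.3f}, marginal OR = {or_younger:.3f}")
```

Under these assumptions every patient has the same conditional odds ratio, yet the marginal risk difference is clearly larger in magnitude in the older cohort, and both marginal odds ratios are pulled toward 1 relative to the conditional one.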

I would add that for cause-agnostic RCTs (CARs), this critique becomes insurmountable.

Take sepsis as a concrete example. “Sepsis” is not a disease; it is a gate (S=1) that aggregates 10–50 distinct mechanisms of acute illness, including:
-pneumococcal pneumonia
-influenza A viral pneumonia
-aspiration pneumonitis
-pancreatitis
-cholangitis
-pyelonephritis
-meningococcal sepsis
-abdominal catastrophe
-fungal bloodstream infection
-post-operative infection
-etc.

Each of these diseases has its own risk structure and its own covariate–outcome relationships.

Now consider just a few commonly used covariates:

Lactate
-Pneumonia → lactate = shock/oxygen debt
-Influenza → lactate = respiratory muscle fatigue
-Pancreatitis → lactate = third-spacing/hypovolemia
-UTI in elderly → lactate = dehydration/frailty
-Meningococcal disease → lactate = fulminant DIC
Same lab value, five different mechanisms.

Age
-Influenza → age = higher severity (varies with strain)
-Meningococcal sepsis → younger age = higher incidence/severity
-Pancreatitis → middle-aged drinkers = worst outcomes
-Cholangitis → elderly = highest risk
One covariate, opposite effects depending on disease.

WBC / CRP
-Pneumonia → infection markers (but dynamic bimodal risk)
-Pancreatitis → sterile inflammation
-Fungal sepsis → WBC often low or nonspecific
-Viral syndromes → WBC may be normal unless there is superinfection, CRP variable
Same numbers, different meanings.

These covariates have no unified causal interpretation across the disease mixture called “sepsis.” This is a CAR.
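
The age example above can be sketched numerically. This is a toy illustration, not an estimate: it assumes just two diseases behind the gate, with hypothetical, deliberately mirrored age slopes on the log-odds of death, and shows that the pooled age-outcome relationship is set by the case mix rather than by any single mechanism.

```python
import math

def expit(z):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-disease age slopes on the log-odds of death (assumptions).
AGE_SLOPE = {"cholangitis": 0.06, "meningococcal sepsis": -0.06}
BASELINE_LOGIT = -1.5   # hypothetical log-odds of death at the reference age
REFERENCE_AGE = 50

def disease_risk(disease, age):
    """Risk of death for one disease at a given age under the toy model."""
    return expit(BASELINE_LOGIT + AGE_SLOPE[disease] * (age - REFERENCE_AGE))

def pooled_risk(age, frac_cholangitis):
    """Risk at a given age in a cohort pooled across both diseases."""
    return (frac_cholangitis * disease_risk("cholangitis", age)
            + (1.0 - frac_cholangitis) * disease_risk("meningococcal sepsis", age))

def pooled_age_contrast(frac_cholangitis):
    """Pooled risk at age 80 minus pooled risk at age 20."""
    return pooled_risk(80, frac_cholangitis) - pooled_risk(20, frac_cholangitis)

for frac in (0.8, 0.5, 0.2):
    print(f"cholangitis share {frac:.1f}: "
          f"risk(80) - risk(20) = {pooled_age_contrast(frac):+.3f}")
```

In this sketch the sign of the pooled "age effect" flips as the mix moves from mostly cholangitis to mostly meningococcal sepsis, and it vanishes at an even split only because the two slopes were chosen to mirror each other.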

And this is where, IMO, your blog critique becomes decisive, not just as stated but also as it applies to the CAR.

In a CIR (real disease), conditional estimands can reduce variance and clarify effect heterogeneity (the blog’s conclusion).

But I would add that this logic is incomplete, because the conclusion should be “RCT species dependent.” In a CAR (mixed diseases), conditional estimands are worse than marginal ones (though both are invalid) because they impose a single causal interpretation on covariates that do not share one. Conditioning amplifies the collider bias at S=1 and all but guarantees internal invalidity.

ATE magnifies one blend of diseases.
ATT magnifies a different blend.
ATO magnifies the overlap between blends.
ATM magnifies the matchable subset of blends.

These are not strata of one disease; they are different mixtures of different diseases.
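
A rough sketch of how each estimand re-blends the diseases follows. It uses the standard tilting functions of the propensity score e from the balancing-weights literature (1 for ATE, e for ATT, e(1-e) for ATO, min(e, 1-e) for ATM) in an observational-style setting. The three diseases are taken from the list above, but their cohort shares, propensity scores, and disease-specific effects are all hypothetical numbers chosen only for illustration.

```python
# Tilting function h(e) that defines each estimand's target population.
TILT = {
    "ATE": lambda e: 1.0,
    "ATT": lambda e: e,
    "ATO": lambda e: e * (1.0 - e),
    "ATM": lambda e: min(e, 1.0 - e),
}

# Hypothetical values: (share of the S=1 cohort, propensity score,
# disease-specific risk difference). None of these come from real data.
DISEASES = {
    "pneumococcal pneumonia": (0.40, 0.8, -0.10),
    "pancreatitis": (0.35, 0.5, 0.00),
    "pyelonephritis": (0.25, 0.2, 0.05),
}

def blend(estimand):
    """Normalized weight each disease receives under the given estimand."""
    h = TILT[estimand]
    raw = {d: share * h(e) for d, (share, e, _) in DISEASES.items()}
    total = sum(raw.values())
    return {d: w / total for d, w in raw.items()}

def weighted_effect(estimand):
    """Weighted average of the disease-specific effects under that blend."""
    weights = blend(estimand)
    return sum(weights[d] * DISEASES[d][2] for d in DISEASES)

for name in TILT:
    shares = ", ".join(f"{d}: {w:.2f}" for d, w in blend(name).items())
    print(f"{name}: effect = {weighted_effect(name):+.4f}  ({shares})")
```

With these made-up inputs the disease-specific effects never change, yet the four estimands return four different numbers because each one weights the diseases differently.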

This is the direct generalization of “averaging of unlikes,” except here we are not averaging over unlike risk strata but over unlike diseases, with incompatible causal pathways. There is no unified risk surface to magnify, no single HTE structure to average across, and no stable meaning for the covariates that generate the weights.

Thus internal validity collapses before external validity is even considered. Randomization cannot rescue an invalid gate; it can only balance whatever happened after flow through the gate. If S=1 does not define a coherent causal system, no marginal estimand is biologically interpretable.

In summary: IMHO the logic of the linked blog is RCT species dependent. It is sound for CIRs, trials of real diseases. But when extended to CARs, the logic beautifully exposes that the structure of the design makes it impossible to generate a valid causal estimand, regardless of how well the trial is executed.

Sepsis (as used in RCTs) contains many mechanisms. Therefore, in a CAR, a marginal estimand is not the marginal estimand of anything biological and a conditional estimand is worse.

I can’t make the link to the blog work, but the link is provided in this excellent thread.

https://discourse.datamethods.org/t/propensity-score-weights/28536/7