Collider in RCT Subgroup Analysis

Yeah, this is why I chose to model the confounding influence of U on the baseline variables that we know at time 0. It is simpler than modeling the interaction, and focusing on it has a very high payoff in practice.

Correct. This is also related to @f2harrell’s comment:

Indeed, we discuss and formalize exactly this point in Section 3.4 here using the example of HER2. Notice that contextual knowledge from correlative and functional lab research is needed to choose the subgroup and develop the therapy for it. Hence the focus of that paper on transporting such knowledge across domains.

Notice the qualifier "oncogenic" in oncogenic EGFR signaling (as opposed to all EGFR signaling, which also exists in normal cells). The oncogenic mutations in the tyrosine kinase domain of EGFR induce oncogenic EGFR signaling that can then be targeted (causally modified) by EGFR tyrosine kinase inhibitors.

I had forgotten that the Impervious to Randomness paper focused on teasing out oncogenic EGFR signaling. The DAG was drawn in my head during a day hike with my then soon-to-be wife around Santorini on 7/18/2021. I drew it on the piece of paper below and then wrote that manuscript as a way of not forgetting this concept. But as shown here (from 1:34:00 onwards), that mental dissection of the EGFR pathway subsequently allowed us (in May 2022) to come up with the most powerful therapy developed to date for renal medullary carcinoma, the deadliest kidney cancer in adolescents and adults. There are patients alive today (some even cancer free) who would otherwise no longer be with us if not for this.

Once we started thinking in a structured way about randomizing a patient’s covariates to remove these confounders, it led us to sampling theory. We then spent a lot of time thinking about the implications of random sampling versus random treatment assignment and wrote this very long paper to summarize these points.
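For readers who prefer code to DAGs, here is a minimal simulation sketch of the sampling-versus-assignment distinction. It is not taken from the paper; the outcome model, the prevalences of U, and the function name `simulate` are all illustrative assumptions. It assumes a constant additive treatment effect and a binary prognostic factor U that is over-represented in a convenience-sampled trial.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Toy contrast between random treatment assignment and random sampling.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    true_effect = 1.0   # assumed constant additive treatment effect
    pop_prev = 0.3      # assumed prevalence of U=1 in the target population
    sample_prev = 0.8   # assumed prevalence of U=1 in the convenience sample

    def outcome(u, t):
        # U is purely prognostic here: it shifts the outcome by 2 units.
        return 2.0 * u + true_effect * t + rng.gauss(0, 1)

    # Convenience-sampled trial with random treatment assignment (1:1).
    treated, control = [], []
    for _ in range(n):
        u = 1 if rng.random() < sample_prev else 0
        t = 1 if rng.random() < 0.5 else 0
        (treated if t else control).append(outcome(u, t))

    effect_estimate = statistics.fmean(treated) - statistics.fmean(control)
    control_mean_trial = statistics.fmean(control)

    # Random sample of untreated patients from the target population.
    random_sample = [
        outcome(1 if rng.random() < pop_prev else 0, 0) for _ in range(n)
    ]
    control_mean_population = statistics.fmean(random_sample)

    return effect_estimate, control_mean_trial, control_mean_population


if __name__ == "__main__":
    effect, ctrl_trial, ctrl_pop = simulate()
    print(f"between-arm difference (randomized):  {effect:.2f}")
    print(f"control-arm mean in the trial:        {ctrl_trial:.2f}")
    print(f"untreated mean under random sampling: {ctrl_pop:.2f}")
```

Under these assumptions, the randomized between-arm difference should land close to the assumed effect of 1.0 even though the trial sample is not representative, while the trial's control-arm mean (about 1.6 in expectation) sits well away from the population value (about 0.6) that only random sampling recovers. In this toy setup, random assignment supports the comparison between arms and random sampling supports statements about a single group in the target population.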

Depressingly, this line of thinking then allowed me to recognize the oxymoronic nature of randomized non-comparative trials (RNCTs). To this day, I struggle to convince some biostatisticians that RNCTs are such a bad idea. These DAGs are one method of communicating these concepts, but they still need work and may not click for everyone. Different tools may be a better fit for at least some people.
