Yes, an easy one is that he did not accept this counterargument, which is easily understood with graphs. Note that we used the same kind of reasoning in our renal medullary carcinoma study (sensitivity analysis and supplementary figure 2).
Thanks! This looks super interesting.
I assume that by “neutral” you mean not an effect modifier? Because these measures are noncollapsible, adjustment for a prognostic covariate tends to reduce bias (not so for collapsible binary effect measures), and of course in a large study the bias dominates. I believe, however, that a study has to be really small for the loss of precision to dominate if there is gross heterogeneity of treatment effect across these covariates.
The authors of the odds ratio paper also state that “It should be noted that for large sample sizes the mean square error will be dominated by its bias rather than variance component, so that for sufficiently large samples the adjusted estimator will always be preferable.”
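To make the noncollapsibility point concrete, here is a tiny simulation sketch of my own (not from the paper; all numbers are arbitrary). Even with a perfectly balanced prognostic covariate, i.e., no confounding at all, the marginal odds ratio is attenuated relative to the conditional one:

```r
# Noncollapsibility of the odds ratio: a balanced prognostic covariate z,
# a randomized treatment x, no confounding by construction.
set.seed(1)
n <- 1e6
x <- rbinom(n, 1, 0.5)            # randomized treatment
z <- rbinom(n, 1, 0.5)            # prognostic covariate, independent of x
y <- rbinom(n, 1, plogis(-1 + 1 * x + 2 * z))  # conditional log-OR for x is 1

coef(glm(y ~ x + z, family = binomial))["x"]  # ~1.0 (conditional)
coef(glm(y ~ x,     family = binomial))["x"]  # noticeably < 1 (marginal)
```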
Another fantastic question that merits clarification. A major challenge I sense here again, as has happened in a prior thread, is the different terminologies and notations used across fields. @Sander belongs to the rare breed who can seamlessly connect them, but they otherwise remain a great source of misunderstanding.
Fully cognizant that it can be an oversimplification, it helps in this case to adopt Pearl’s demarcation between statistical (associational) concepts, such as collapsibility, odds ratios, hazard ratios, and regression, and causal concepts, such as confounding, randomization, and instrumental variables. Our review focused on causal concepts, whereas all your questions refer to statistical concepts. Thus, when we say “neutral” we do not refer to statistical effect modification (e.g., a multiplicative interaction in a regression model) but to a variable that does not confound (in the causal sense), mediate, or serve as a collider for the exposure-outcome relationship of interest. This is described in our aforementioned review and in other references offered there and in this thread.
The difference in terminology between statisticians and causal inference methodologists applies to the term “bias” as well, and you are quite right that I should have clarified this. @AndersHuitfeldt elaborated excellently on this distinction in this thread. Your comment on bias follows the statistical definition and the related bias-variance trade-off. I am working on a draft paper in which we connect, for example, the bias-variance trade-off with causal concepts, consistent with the notion that the demarcation between “statistical” and “causal” is far from perfect, even though in this particular thread it helps us maintain focus.
Yes, I agree fully with your observation and indeed I was referring to a completely different concept. I have however been pondering the concept you have defined so I have a question:
Can changes in precision with noncollapsible measures really be compared, given that the precision comparison is made for different groups of people, so the estimators are estimating the values of different estimands (conditional and marginal)?
(This does not apply to the power issue, since the conditional and marginal estimands share the same null.)
Another fantastic question. As I have alluded to in other posts in this forum, I agree that the precision comparison is typically complicated for non-collapsible measures (it may paradoxically show decreased precision for the conditional estimate), whereas the power comparison is straightforward because the conditional and marginal estimates share the same null hypothesis. See section 1.3 of this excellent article for a more detailed discussion. This is an example where the frequentist approach can provide a simple and elegant solution.
Having said that, I do believe there can be Bayesian solutions here but this is a story for another day.
Responding to ESMD (Sep 2021), I think there are some practical ways to increase confidence in the rationale behind DAG relationships. Risk-of-bias assessments are standard for systematic reviews and meta-analyses, and the same tools can be useful for assessing studies that inform causal diagrams. This link has a full description of my opinion.
I think that Professor Harrell’s steps at the beginning of this thread are useful, highlighting the value of interdisciplinary collaboration. However, it is important to report the source and specifics of the rationale when writing up our modeling methods in publications. Clinicians have the benefit of observing patients for outcomes in ways that are not always recorded in research, but part of the rationale for research is to provide insights that are not biased by the non-random selection of patients seen by individual clinicians, which highlights the importance of research. Younger physicians in particular derive their knowledge from the scientific literature, curricula, senior clinicians, guidelines, and so on. John Ioannidis argues, and my experience as a peer reviewer confirms, that the literature needs quality improvement; Richard Smith notes that older clinicians spend very little time reading it, and local physicians have confirmed his finding. A surgeon told me he would estimate that 70% of surgeons do not read scientific journals; please validate or refute this if anyone else has checked into it.
There are many sources of information, different ones will be available to different researchers, and all are snapshots of reality with limitations. Tools exist for evaluating literature critically, and they can also provide ideas for critical thought in discussions with experts. Sensitivity analysis using different sets of independent variables is underused in regression-based research and provides a solution where uncertainty remains about the structure of the causal diagram. Datasets can also be merged to fill in gaps where data on variables are missing, as described in the Rehfuess et al. paper cited in the link posted above.
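As a concrete illustration of such a specification sensitivity check (the variable names here are entirely hypothetical), one could compare the coefficient of interest across candidate adjustment sets:

```r
# Hypothetical sensitivity analysis: how stable is the exposure coefficient
# across different candidate adjustment sets?
specs <- list(
  minimal  = outcome ~ exposure + age,
  moderate = outcome ~ exposure + age + sex + smoking,
  full     = outcome ~ exposure + age + sex + smoking + comorbidity
)
fits <- lapply(specs, glm, family = binomial, data = dat)
sapply(fits, function(f) coef(f)["exposure"])  # compare across specifications
```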
Model specification is a vexing challenge, even for Nobel Prize winners, whose papers provide useful insights. James Heckman wrote a paper in 1999 titled ‘Causal parameters and policy analysis in economics: A twentieth century retrospective’.
I hope these ideas are helpful, that your research is going well, and appreciate any further ideas in addition to those people have posted in this thread already.
This 2005 paper by @Sander_Greenland should be required reading for anyone dealing with analysis of observational data.
Greenland, S. (2005), Multiple-bias modelling for analysis of observational data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 168: 267-306. link
I’d describe it as a case study in Robust Bayesian Analysis.
Hi @f2harrell and @Drew_Levy. I would like to ask about the understanding of some terminology and concepts and how they relate to the framework that has been presented in the RMS course.
Frameworks for prognostic research:
My understanding of prognostic research is largely based on the framework proposed in the PROGRESS series of publications, but even more on the publication by Kent et al. (2020), “A conceptual framework for prognostic research” https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01050-7. I found it very helpful, also for a general understanding of the literature and published work (especially Table 1 in the Kent et al. paper).
However, in Chapter 1 of RMS (Uses of models: hypothesis testing, estimation, prediction), you do not seem to make any further distinction within the “prediction” type of modeling strategies. This made me wonder about your perspectives on the above frameworks, and whether there are reasons why you do not distinguish, for example, between studies/purposes aimed at “association”/“predictor finding” and those aimed at “prediction model development”?
Distinction between causal/etiology studies vs. prediction studies and implications for adjustment:
A further distinction is often made between causal/etiological and prediction research, as described, for example, by van Diepen et al. 2017, “Prediction versus aetiology: common pitfalls and how to avoid them” https://academic.oup.com/ndt/article/32/suppl_2/ii1/3056968?login=true. I think this distinction is related to the distinction between “estimation” and “prediction” uses of models made in Chapter 1 of RMS. According to van Diepen et al., confounding and the corresponding adjustment are not an issue in prediction research, whereas they are an issue in causal/etiological research. However, as I understood from the course, even if the purpose of a study is purely prognostic, we should always be concerned about proper adjustment if possible. Is this correct, and how does it relate to the view of van Diepen et al.? Their perspective is fairly intuitive to me, and I would appreciate any thoughts on this and on how it relates to the RMS course.
I also tried to use DAGs to think about confounder adjustment in both cases, etiological and prognostic studies. As I understand it, in etiological research we are interested in a certain effect of a variable X on an outcome Y, and in this case we of course need to adjust for confounders in order to estimate the effect of interest correctly. This is how I understand the use and application of DAGs as introduced in the course (see the toy sketch below). However, when we want to develop a prediction model, we are interested in a whole set of variables (X1, X2, etc.) that can predict the outcome Y. So how do I know what to adjust for when there is no focus on a specific effect? Or, in other words, how would I draw a DAG in this case? Do I need to consider confounders of all the effects between the X’s and Y simultaneously? I would be very grateful if you could help me untangle this confusion.
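For the etiological case, my (possibly naive) understanding corresponds to something like this toy sketch with the dagitty package (the DAG and variable names are made up just for illustration):

```r
# Toy DAG for the etiological case: exposure X, outcome Y, confounder C.
# dagitty returns the covariate set needed to identify the effect of X on Y.
library(dagitty)

g <- dagitty("dag {
  C -> X
  C -> Y
  X -> Y
}")

adjustmentSets(g, exposure = "X", outcome = "Y")  # returns { C }
```

But I do not see how to translate this into the prediction setting, where there is no single exposure of interest.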
Thanks a lot for any feedback!
These are great issues to bring up. I hope that @Drew_Levy can discuss the causal inference part of this in addition to anything else he wants to discuss.
In Chapter 4 I discuss 3 strategic goals at the end. Association assessment comes under the Estimation and Hypothesis Testing goals. But the text there neglects to emphasize that the importance of confounder adjustment is unique to these two goals and, as you stated, is not so important for a pure prediction task. You’re motivating me to improve this section, which I’ll start doing very shortly. I hope to update the online notes by 2023-05-30.
You are asking a difficult (and intelligent) question: difficult because I think a satisfying answer requires nuanced argument.
Reconciling all the many different perspectives and treatments of this problem (e.g., Kent; or van Diepen, etc.) is a reach. So I will attempt just to reconcile what we covered in the RMS short course.
Fundamentally, we should prefer models that make good predictions (see McElreath, p. 13). This means not only models that predict well in the sample but also models that reliably predict future observations. We see from Frank and others that there are different ways to predict well. If you understand the data generating process (subject matter knowledge is available and reasonably complete), have collected the essential or necessary variables in that process, and carefully design your model in accord with the data generating process, you can expect that your model will likely perform well in the sample and in future observations (out of sample). This scenario recommends the causal models approach that I tried to represent in the RMS short course, as so many others have advocated (and done better than I). A model that captures the data generating process may be expected to reliably predict outcomes even among new observations, to the extent that it is faithful to the underlying causal mechanism.
However, a model might also predict well under a different scenario: you have a lot of information in a lot of variables, you include as much of that information as possible in your model (either because your sample size allows it or because you used data reduction techniques that preserve the essential information in the predictors), and you discipline that complex model so it does not overfit the sample (e.g., with shrinkage), so that the model has acceptable expected predictive accuracy.
So there are these two approaches to a model with good expected predictive accuracy: one may be said to be more mechanistic and the other more empirical. Confounding complicates the former (the causal or mechanistic) approach, but not so much the latter (empirical) approach. Of course, you are less able to draw conclusions about specific variables and effects with the latter approach; but if your interests and objectives (and incentives) do not include understanding the particular role of specific variables for outcomes, then the non-causal ‘empirical’ approach may be useful and gratifying. Because the latter does not leverage causal (‘mechanistic’) understanding for its expected predictive accuracy, it does not benefit from a causal DAG in selecting variables for adjustment.
The important or essential part of your query, “how do I know what to adjust for when there is no focus on a specific effect”, suggests that your objectives are primarily prediction and your interest is primarily in optimizing expected predictive accuracy, so the latter scenario above seems more consistent with the motivation for your question. And it is Frank’s recommendations in RMS (such as data reduction maneuvers and shrinkage) that are especially helpful in disciplining the model to generate good expected predictive accuracy in the empirical scenario.
Does this clarify things for you? I hope so.
Thanks for your helpful insights and feedback! I will make sure to read chapter 4 of the RMS again. Your two comments have already helped me to clarify and add some complexity to the frameworks/perspectives mentioned.
Yes, it is very difficult to reconcile the different perspectives/frameworks! So thank you very much for taking the time to share your thoughts and experiences!
This is such a great summary, Frank - just sent the link to one more colleague, and have been using it ever since in my courses - thank you so much for this wonderful set of guidelines.
Causal mediation analysis for time-to-event endpoints in a randomized controlled trial
Hi,
I am struggling with a situation.
I have an RCT with an intermediate variable and an ultimate endpoint of overall survival (OS), a time-to-event endpoint.
I want to establish that the intermediate variable is a causal mediator of the treatment effect on the ultimate endpoint, i.e., OS.
How can I do that, and is there a specific R package that can help me with this? The difficulty here is that the confounders affect only the intermediate variable (the mediator) and the main endpoint (OS), while the treatment is randomized and therefore not affected by confounders. So even though I tried to use the mediation package, I have been having difficulty formulating the two models. In my mind, the outcome model is
1. coxph(Surv(OS, Death) ~ Treatment * Intermediate)
and the mediator model is
2. Intermediate ~ Treatment + C (confounders).
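For concreteness, here is roughly the kind of code I have in mind (variable names are placeholders, and this is only a sketch of my current attempt). As far as I can tell, mediate() in the mediation package accepts survreg() but not coxph() outcome models, so the sketch uses a Weibull accelerated failure time model for OS:

```r
library(survival)
library(mediation)

# Mediator model: treatment (trt) is randomized, so only the
# mediator-outcome confounders (C1) enter as covariates.
med_fit <- lm(med ~ trt + C1, data = dat)

# Outcome model: treatment, mediator, and the confounders that affect
# both the mediator and OS. A treatment-mediator interaction could be
# added here if it is plausible.
out_fit <- survreg(Surv(OS, Death) ~ trt + med + C1,
                   data = dat, dist = "weibull")

# Causal mediation estimates (ACME, ADE) via quasi-Bayesian simulation.
med_out <- mediate(med_fit, out_fit, treat = "trt", mediator = "med",
                   sims = 1000)
summary(med_out)
```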
Are these valid assumptions, and are they adequate for investigating causal mediation by the intermediate variable?
Please provide guidance.
My question is regarding g-estimation. Is it accurate to say that, for a continuous treatment, traditional regression estimates are g-computation estimates under the assumptions of (a) existence and consistency of counterfactuals, (b) positivity, and (c) rank preservation?
G-estimation and G-computation are different approaches to estimation of causal quantities. I am not sure which one you are interested in, but your criteria are not correct in either case. The easiest way to see this is to consider a time dependent confounder.
Hi Anders, I am sorry, but I did not understand how adding time as a confounder would turn this into g-computation. Here’s an example showing g-computation when treatment is binary; in my case treatment is continuous.
G-computation was developed to handle situations where there is a time-dependent treatment and time-dependent confounders. If you want to understand how G-computation differs from regression, you have to think about situations where the treatment can change over time. There is no advantage to using G-computation if treatment does not change over time; that setting is only used as a toy example to illustrate how the method works in the simplest case.
G-computation can rely on regression models to estimate the components of the G-formula. Usually you need more than one model, and the idea of G-computation is that it shows you how to produce a prediction by putting the predictions from each model together correctly. In the special case of a toy example where treatment is time-fixed, you only need one model, and the predictions from G-computation will correspond to the predictions of that model. This will be the case regardless of the criteria you have listed (the first two of which are required for the validity of the G-formula and of any other method for causal inference; the third is not relevant in this setting).
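To see this concretely, here is a minimal sketch of time-fixed g-computation (standardization) in R, with entirely hypothetical variable names; it is only meant to illustrate the point above, not to describe any particular analysis:

```r
# G-computation (standardization) for a time-fixed treatment A in a
# hypothetical data frame `dat` with outcome Y and covariates L1, L2.

# 1. Fit a single outcome regression model.
fit <- lm(Y ~ A + L1 + L2, data = dat)

# 2. Predict everyone's outcome under two fixed treatment values, holding
#    covariates at their observed values, then average (standardize).
mean_a1 <- mean(predict(fit, newdata = transform(dat, A = 1)))
mean_a0 <- mean(predict(fit, newdata = transform(dat, A = 0)))

# 3. The standardized contrast. With a linear model and no A-by-covariate
#    interactions this equals the regression coefficient on A, which is why
#    g-computation offers no advantage in the time-fixed case.
mean_a1 - mean_a0
```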
All comparisons are against A and you can’t make a statement about reduction due to A. But why is this under the RMS Causal Inference topic?
Suppress standard errors and P-values when you anti-log a column.
Got it. I was assuming that to look at the effect of A, one has to consider only the intercept, because B, C, D = 0 and the intercept is what represents A. Thank you Frank; I will also move this to a different thread.