Discussion of Assessing Heterogeneity of Treatment Effect, Estimating Patient-Specific Efficacy, and Studying Variation in Odds ratios, Risk Ratios, and Risk Differences

f2harrell · March 25, 2019, 2:23pm

This is a place to discuss this blog article which shows an example of a formal assessment of heterogeneity of treatment effect, shows how to use penalized maximum likelihood estimation to get patient-specific efficacy estimates, and discusses odds ratios, absolute risk differences, and risk ratios.

This topic is also the place to discuss this blog article which shows how personalization of efficacy estimates can result in worse estimates for individuals if HTE is not large.

byrdjb · March 25, 2019, 6:41pm

Thank you for making & sharing this resource!

PaulBrownPhD · March 25, 2019, 9:44pm

I wonder, then, about some journals’ requirement for sex-specific estimates of the effect, e.g., Circulation: “Please provide sex-specific and/or race/ethnicity-specific data when appropriate in describing the outcomes of epidemiologic analyses or clinical trials, or specifically state that no sex-based or race/ethnicity-based differences were present.” (submission requirements). It’s not exactly clear if they’re alluding to a treatment interaction, but I believe they are. As you note in the blog, there is typically a serious lack of power, and thus the test is anti-conservative. The forest plot of ORs across subgroups that you describe would be a compelling visual, however.
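To put a rough number on the power concern, here is a minimal sketch using large-sample standard errors for a log odds ratio. The trial size, control risk, odds ratio, and the helper names `se_log_or` and `wald_power` are all assumptions for illustration, not anything from the blog or the journal requirement. It assumes two equal-sized sex subgroups with the same risks, and asks how well the trial could detect a sex-by-treatment interaction as large as the overall treatment effect itself.

```r
# Large-sample SE of a log odds ratio from expected 2x2 cell counts
# (hypothetical helper for illustration)
se_log_or <- function(n_per_arm, p0, p1) {
  sqrt(1 / (n_per_arm * p1) + 1 / (n_per_arm * (1 - p1)) +
       1 / (n_per_arm * p0) + 1 / (n_per_arm * (1 - p0)))
}

# Approximate two-sided Wald power to detect a log odds ratio `delta`
wald_power <- function(delta, se, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  pnorm(abs(delta) / se - z) + pnorm(-abs(delta) / se - z)
}

n_per_arm  <- 1000   # assumed: 1000 patients per arm
p0         <- 0.20   # assumed control-arm risk
odds_ratio <- 0.75   # assumed treatment odds ratio
p1 <- plogis(qlogis(p0) + log(odds_ratio))

se_overall <- se_log_or(n_per_arm, p0, p1)

# Each sex-specific log OR uses half of each arm; the interaction is the
# difference between two such independent estimates
se_subgroup    <- se_log_or(n_per_arm / 2, p0, p1)
se_interaction <- sqrt(2) * se_subgroup

se_interaction / se_overall   # the interaction SE is twice the overall SE

power_overall     <- wald_power(log(odds_ratio), se_overall)
power_interaction <- wald_power(log(odds_ratio), se_interaction)
c(overall = power_overall, interaction = power_interaction)
```

Under these assumptions the interaction estimate has twice the standard error of the overall effect, so even an interaction as large as the whole treatment effect is detected far less often than the treatment effect itself.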

Regarding the OR, I noticed @timdisher challenged anyone on Twitter to shake his faith: The odds ratio is the best effect measure of all time…

f2harrell · March 25, 2019, 10:12pm

The age-old NIH requirement was never quite appropriate because of the reason you mentioned Paul. This cannot be done reliably without borrowing of information, e.g., by putting a prior distribution on the interaction effect to assume that the response to tx for males is more like that for females than not. This Bayesian idea of having interactions “half in” and “half out” of the model is discussed well here.
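A minimal sketch of the borrowing idea, under stated assumptions: a normal prior with mean zero on the treatment-by-sex interaction is equivalent, at the posterior mode, to a quadratic penalty on that one coefficient. The simulated data, the prior SD of 0.25, and the function name `neg_log_post` are all hypothetical, and this computes only a posterior mode with base R's `optim()`, not a full Bayesian analysis.

```r
set.seed(1)
n   <- 2000
tx  <- rbinom(n, 1, 0.5)
sex <- rbinom(n, 1, 0.5)
# Simulated outcome with a true interaction of 0.8 on the log odds scale (assumed)
y <- rbinom(n, 1, plogis(-1 + 0.3 * tx + 0.3 * sex + 0.8 * tx * sex))
X <- cbind(1, tx, sex, tx * sex)

# Negative log posterior: logistic log-likelihood plus a N(0, prior_sd^2)
# prior on the interaction only; other coefficients get flat priors
neg_log_post <- function(beta, prior_sd) {
  eta <- drop(X %*% beta)
  -sum(y * eta - log1p(exp(eta))) + beta[4]^2 / (2 * prior_sd^2)
}

# Ordinary maximum likelihood fit, used as the starting point
mle <- coef(glm(y ~ tx * sex, family = binomial()))

# Posterior mode: the interaction is pulled toward zero, i.e. "half in"
map_fit <- optim(mle, neg_log_post, prior_sd = 0.25, method = "BFGS")

c(unpenalized = unname(mle[4]), shrunken = unname(map_fit$par[4]))
```

A smaller `prior_sd` pulls the sex-specific effects closer together and a very large one reproduces the ordinary fit, so the prior SD is what sets how far "in" the interaction is.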

timdisher · March 26, 2019, 12:54pm

I owe my OR love to @f2harrell and most of my day to day work being focused on whether or not HTE exists (as applied to systematic reviews). Will definitely be referring to this/adapting some of these visualizations for my own work.

PaulBrownPhD · March 26, 2019, 12:59pm

It’s off topic, i.e., not binary outcomes, but I wonder how one evaluates HTE with a rank-based composite. I’m not sure I’ve seen much literature on that, especially if you want to power on the interaction. In the social science literature they talk about “the adjusted rank transform test”. I’m not familiar with it and wondered if anyone had experience with this. Maybe the probability index is used for the visual display and simulations for power estimation at the design stage… Power estimates for composites seem tenuous anyway.

f2harrell · March 26, 2019, 2:41pm

Use the same methods as in the blog article but applied to a semiparametric ordinal regression model such as the proportional odds model (generalization of the Wilcoxon test).
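One way to read this suggestion, as a hedged sketch: fit a proportional odds model with and without a treatment-by-covariate interaction and compare them with a likelihood ratio test. The data below are simulated, and `MASS::polr()` is swapped in only to keep dependencies light; it is not necessarily the function used in the blog article.

```r
library(MASS)  # polr(); MASS ships with standard R distributions

set.seed(2)
n     <- 600
tx    <- rbinom(n, 1, 0.5)
age10 <- rnorm(n, 0, 1)   # age, centered and scaled per decade (assumed)

# Simulated 4-level ordinal outcome from a latent logistic variable,
# with a treatment effect but no true interaction (assumed)
latent <- -0.6 * tx + 0.4 * age10 + rlogis(n)
y <- cut(latent, breaks = c(-Inf, -1, 0, 1, Inf),
         labels = c("none", "mild", "moderate", "severe"),
         ordered_result = TRUE)
d <- data.frame(y, tx, age10)

fit_additive    <- polr(y ~ tx + age10, data = d)
fit_interaction <- polr(y ~ tx * age10, data = d)

# Likelihood ratio test for the single interaction term
lr_stat <- deviance(fit_additive) - deviance(fit_interaction)
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
c(LR = lr_stat, p = p_value)
```

With several candidate effect modifiers, the same comparison can be made once for all interaction terms together, with the degrees of freedom equal to the number of added terms.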

Agnes_Cororaton · May 10, 2022, 5:20pm

I really want to present this blog to my journal club. Do you have publications (clinical and also statistical) that look at HTE for the odds ratio?

R_cubed · May 10, 2022, 11:15pm

There is this mega thread on the issue with lots of references:

f2harrell · May 11, 2022, 12:21pm

I believe that the best over-arching approach is to develop a model that reliably estimates risk for individual patients then to use this model to estimate differences in risks for individuals. The model will typically be stated in terms of log odds ratios because this most often leads to the simplest model that fits, i.e., doesn’t require interactions to rescue lack of fit.
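As a small worked illustration of why a model stated in log odds ratios still yields patient-specific risk differences, the sketch below applies one constant odds ratio to several baseline risks. The odds ratio of 0.75 and the baseline risks are hypothetical values chosen for the arithmetic, not estimates from any trial.

```r
odds_ratio    <- 0.75                       # assumed constant treatment odds ratio
baseline_risk <- c(0.01, 0.05, 0.20, 0.50)  # hypothetical control-arm risks

# Move each patient's risk along the logit scale by log(OR)
treated_risk    <- plogis(qlogis(baseline_risk) + log(odds_ratio))
risk_difference <- treated_risk - baseline_risk

data.frame(baseline_risk, treated_risk, risk_difference)
```

The odds ratio is identical in every row, yet the absolute benefit grows with baseline risk over this range, so no interaction term is needed for risk differences to vary across patients.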

arthur_albuquerque · July 30, 2022, 11:33pm

These graphs are fantastic.

Frank, could one say you used the G-formula to produce these plots?
Is anyone aware of a published RCT with at least one of these plots?

f2harrell · July 31, 2022, 12:16pm

I don’t know. I thought this was simpler than the G-formula. If not, then perhaps I understand the G-formula better than I thought I did.

arthur_albuquerque · July 31, 2022, 6:21pm

Sounds like what you did is what the authors described as “Method 1: marginal standardization” in this article.

f2harrell · July 31, 2022, 8:19pm

What I was describing was anti-marginalization.

arthur_albuquerque · July 31, 2022, 8:41pm

Good point! G-computation requires one to calculate the mean, as you depicted with an arrow in this plot:

Is there a technical term for showing the whole distribution?

P.S: Here is an R code example of proper g-computation (adapted from here).

fit <- glm(y ~ t, data = data, family = binomial())

#Estimate potential outcomes under treatment
data$t <- 1
pred1 <- predict(fit, newdata = data, type = "response")

#Estimate potential outcomes under control
data$t <- 0
pred0 <- predict(fit, newdata = data, type = "response")

#Compute risk difference
mean(pred1) - mean(pred0)
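A hedged variation on the snippet above, using simulated data with one covariate (the variable names `dat`, `age`, and `unit_rd` are assumptions): it leaves the original data untouched by passing modified copies to `predict()`, and it reports the spread of the patient-level risk differences alongside the averaged g-computation estimate.

```r
set.seed(3)
n   <- 2000
dat <- data.frame(t = rbinom(n, 1, 0.5), age = rnorm(n, 60, 10))
# Simulated binary outcome with additive effects on the logit scale (assumed)
dat$y <- rbinom(n, 1, plogis(-2 + 0.05 * (dat$age - 60) - 0.4 * dat$t))

fit_adj <- glm(y ~ t + age, data = dat, family = binomial())

# Predicted risk for every patient under each arm; transform() returns a
# modified copy, so dat itself keeps the observed treatment assignments
risk_treated <- predict(fit_adj, newdata = transform(dat, t = 1), type = "response")
risk_control <- predict(fit_adj, newdata = transform(dat, t = 0), type = "response")
unit_rd <- risk_treated - risk_control

# Averaged (g-computation) risk difference
mean(unit_rd)

# Spread of the patient-level risk differences that the average hides
quantile(unit_rd, c(0.05, 0.25, 0.50, 0.75, 0.95))
```

The mean of `unit_rd` equals `mean(risk_treated) - mean(risk_control)`, so the averaged estimate is just a one-number summary of this distribution.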

f2harrell · August 1, 2022, 11:13am

I don’t know a concise technical term for showing the whole distribution. To avoid being serious for a moment, terms like “honesty” and “full disclosure” come to mind, but we need some shorter version of “maintaining full conditioning to recognize that at least one effect measure must be covariate-dependent when the risk factor is not ignorable.”

arthur_albuquerque · August 14, 2022, 10:10am

Dr. Harrell,

The developer of R package {marginaleffects} is interested in posting a case-study about one of your posts on the distribution of risk difference. Do you allow it? With proper reference to your work, of course.

More details about this inquiry here:

github.com/vincentarelbundock/marginaleffects

Documentation suggestion: unit-level contrasts can be very meaningful

opened 04:33PM - 12 Aug 22 UTC

arthur-albuquerque

Hi Vincent,

In your [“Contrasts” vignette](https://vincentarelbundock.github.…io/marginaleffects/articles/contrasts.html#average-contrasts), you suggest that unit-levels contrasts “…can be unwieldy and hard to interpret.”

Although I overall agree, I would like to highlight that a focus in these unit-levels contrasts has been rising in the medical literature. For example, in this [excellent blog post](https://www.fharrell.com/post/rdist/), Dr. Frank Harrell discusses possible summaries from a logistic regression model. He uses a dataset from a famous RCT, [“GUSTO-I”](https://www.nejm.org/doi/full/10.1056/NEJM199309023291001). He argues that a graph depicting all unit-level contrasts (or, as he states, “distribution of risk difference”) is much more informative than “one-number summaries”.

Hence, I suggest adding a small case study in `marginaleffects`’ documentation showing how the package can be easily used to fill this new trend in the medical literature. Of course, feel free to fully ignore this suggestion if you think it’s not in the scope.

The code below is inspired by Dr. Harrell's blog post mentioned above. I changed some details since `marginaleffects` does not support logistic regression models fitted with `rms::lrm()`. Nevertheless, I can reproduce his graphs fully with `glm()`.

----------------------------------------------------------------------------------------------

Load packages and data

``` r
library(rms)
library(data.table)
library(ggplot2)
library(marginaleffects)

load(url('https://hbiostat.org/data/gusto.rda'))
setDT(gusto)
gusto <- gusto[tx %in% c('SK', 'tPA'),
               .(day30, tx, age, Killip, sysbp, pulse, pmi, miloc, sex)]
gusto[, tx := tx[, drop=TRUE]]
gusto$tx = factor(gusto$tx, levels = c("tPA", "SK"))
```

Fit the full covariate-adjusted model.

``` r
f <- glm(day30 ~ tx + rcs(age,4) + Killip + pmin(sysbp, 120) +
           lsp(pulse, 50) + pmi + miloc + sex,
         family = "binomial", data=gusto)
```

## Manual approach

Distribution of absolute risk (proportion) difference

``` r
d <- gusto
d$tx = "SK"
p1 <- predict(f, newdata = d, type = "response")
d$tx = "tPA"
p2 <- predict(f, newdata = d, type = "response")
diff = p1 - p2

data.frame(ARD = diff) |>
  ggplot(aes(x = ARD)) +
  geom_histogram(color = "white", bins = 100) +
  labs(x = "SK - t-PA Risk Difference (aka, 'unit-level contrasts')") +
  theme_minimal()
```

![](https://i.imgur.com/jFt0xId.png)

# {marginaleffects} approach

``` r
comparisons(f, variables = "tx") |>
  ggplot(aes(x = comparison)) +
  geom_histogram(color = "white", bins = 100) +
  labs(x = "SK - t-PA Risk Difference (aka, 'unit-level contrasts')") +
  theme_minimal()
```

![](https://i.imgur.com/fmiFuYQ.png)

``` r
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#>
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] marginaleffects_0.7.0.9000 data.table_1.14.3
#> [3] rms_6.3-0                  SparseM_1.81
#> [5] Hmisc_4.7-0                ggplot2_3.3.6
#> [7] Formula_1.2-4              survival_3.3-1
#> [9] lattice_0.20-45
#>
#> loaded via a namespace (and not attached):
#>  [1] httr_1.4.3          splines_4.1.2       assertthat_0.2.1
#>  [4] highr_0.9           latticeExtra_0.6-29 yaml_2.3.5
#>  [7] pillar_1.7.0        backports_1.4.1     quantreg_5.93
#> [10] glue_1.6.2          digest_0.6.29       RColorBrewer_1.1-3
#> [13] checkmate_2.1.0     colorspace_2.0-3    sandwich_3.0-2
#> [16] htmltools_0.5.2     Matrix_1.4-1        pkgconfig_2.0.3
#> [19] purrr_0.3.4         mvtnorm_1.1-3       scales_1.2.0
#> [22] jpeg_0.1-9          MatrixModels_0.5-0  htmlTable_2.4.0
#> [25] tibble_3.1.7        generics_0.1.3      farver_2.1.1
#> [28] ellipsis_0.3.2      TH.data_1.1-1       withr_2.5.0
#> [31] nnet_7.3-17         cli_3.3.0           mime_0.12
#> [34] magrittr_2.0.3      crayon_1.5.1        polspline_1.1.20
#> [37] evaluate_0.15       fs_1.5.2            fansi_1.0.3
#> [40] nlme_3.1-158        MASS_7.3-57         xml2_1.3.3
#> [43] foreign_0.8-82      tools_4.1.2         lifecycle_1.0.1
#> [46] multcomp_1.4-19     stringr_1.4.0       munsell_0.5.0
#> [49] reprex_2.0.1        cluster_2.1.3       compiler_4.1.2
#> [52] rlang_1.0.4         grid_4.1.2          rstudioapi_0.13
#> [55] htmlwidgets_1.5.4   base64enc_0.1-3     labeling_0.4.2
#> [58] rmarkdown_2.14      gtable_0.3.0        codetools_0.2-18
#> [61] curl_4.3.2          DBI_1.1.3           R6_2.5.1
#> [64] gridExtra_2.3       zoo_1.8-10          knitr_1.39
#> [67] dplyr_1.0.9         fastmap_1.1.0       utf8_1.2.2
#> [70] insight_0.18.0.4    stringi_1.7.6       vctrs_0.4.1
#> [73] rpart_4.1.16        png_0.1-7           tidyselect_1.1.2
#> [76] xfun_0.31
```

<sup>Created on 2022-08-12 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>

f2harrell · August 14, 2022, 11:34am

No problem with that.

arthur_albuquerque · August 14, 2022, 7:57pm

Thanks. Case study is now online here.

arthur_albuquerque · August 16, 2022, 6:11pm

@f2harrell I enjoyed reading this related paper by you and colleagues: A tutorial on individualized treatment effect prediction from randomized trials with a binary endpoint. The attached R code is very insightful.

I want to double check with you that the first plot below (from your article’s Figure 4) corresponds to the estimated ARR distribution in your blog post (second plot below).

Please note I am not referring to the underlying logistic regression model. Instead, I refer to computing the predicted probability of an outcome for each observed row of the data in two counterfactual cases: when treatment is “tx==0” and when treatment is “tx==1”. Then, computing the differences between these two sets of predictions.


first plot (from article)

second plot (from blog post)