Context
I’m an MD currently doing a PhD in Clinical Epidemiology with a focus on Clinical Trial Design. My statistical foundation isn’t the best, but I’m trying to improve. I’d really appreciate some insight on survival modeling with the Weibull distribution.
TLDR
I’m planning the analysis of an RCT and am relying on simulation because I don’t trust myself. I’m creating covariate-dependent potential outcomes for survival time in semi-simulated data (based on real patients, details below).
I simulate survival times Y_i for each patient by drawing from a Weibull distribution whose mean is predicted from that patient’s covariates, while the shape parameter is the same for everyone. For Y^{a=1}, I increase the scale parameter by a factor x, e.g., 1.25. In every simulation, patients are randomly allocated to treatment or control, and their survival time is sampled from their Weibull distribution under a = 1 or a = 0, respectively.
I then try to recover the introduced effect size (increase in expected survival time, expressed as a ratio) using brms and two Weibull models: one without covariates and one with covariates, using standardization from marginaleffects. Somehow, I always seem to overestimate the effect size by a bit and I don’t understand why.
Long Version
I’m currently planning the statistical analysis of an RCT. For planning, I’m using routinely collected data of past patients from our hospital’s registry who would have been eligible for the trial, if it had taken place at that time. Therefore, I can’t share data as of now, but will try to produce a fully simulated, reproducible example if that’s needed.
When I run some visual diagnostics, plots of log(cumulative_hazard) against log(survival_time) look roughly linear and parallel across levels of some important predictors, so at first glance a Weibull distribution seems appropriate. (Even if it weren’t, I don’t think it would affect my simulation, as each patient’s Weibull distribution, not their actual survival time from the registry, will be the ground truth later.) Note: if there are checks I should do for continuous predictors, let me know.
As I’m not allowed to add more than 1 picture to this post, please find the album here: imgur album
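For reference, here is a minimal sketch of that diagnostic (my own quick version, using the same column names as the model code below; under a Weibull model, log(-log(S(t))) is linear in log(t), and the lines are parallel across strata if only the scale differs):

library(survival)
library(ggplot2)

# Kaplan-Meier estimate per ECOG stratum ("none" = event observed)
km <- survfit(Surv(survival_time, censoring == "none") ~ ecog_fstcnt, data = surv_df)
km_df <- data.frame(
  time    = km$time,
  surv    = km$surv,
  stratum = rep(names(km$strata), times = km$strata)
)

# log(-log(S)) is the log cumulative hazard; drop survival values of 0 or 1 first
ggplot(subset(km_df, surv > 0 & surv < 1),
       aes(log(time), log(-log(surv)), colour = stratum)) +
  geom_step() +
  labs(x = "log(survival_time)", y = "log(cumulative_hazard)")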
Modelling
First, I fit a Weibull model with some covariates to predict the survival time of the patients in my data set. I use this model for predictions under no treatment (a = 0). I’m using brms, and I think the model is parametrized as follows. I wrote this up myself after going through the brms documentation and Stan forum, so there might be errors.
We model the log of the expected survival time as a linear function of covariates:
\log(\mu_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} \tag{0}
where \mu_i is the expected survival time for individual i, and x_{i1}, x_{i2}, ..., x_{ip} are their covariate values.
For notational simplicity, I drop the individual-level subscript i in the subsequent equations, but all individual-specific quantities (\mu, \lambda, and S(t)) should be understood as \mu_i, \lambda_i, and S_i(t) - representing the expected survival time, scale parameter, and survival probability for a specific individual with their corresponding covariate values.
The relationship between the mean \mu and the Weibull parameters is given by:
\mu = \lambda \Gamma(1 + \frac{1}{k}) \tag{1}
where \lambda is the scale parameter and k is the shape parameter. The survival function for the Weibull distribution is:
S(t) = \exp\left(-\left(\frac{t}{\lambda}\right)^k\right) \tag{2}
To calculate survival probabilities, we need to convert the model output \log(\mu) into the scale parameter \lambda. By rearranging equation (1), we get:
\lambda = \frac{\mu}{\Gamma(1 + \frac{1}{k})} \tag{3}
Note that the shape parameter k is assumed constant and does not vary with covariates in the regression model.
Substituting equation (3) into equation (2) and simplifying, we obtain the survival probability at time t in terms of \mu:
S(t) = \exp\left(-\left(\frac{t\,\Gamma(1 + \frac{1}{k})}{\mu}\right)^k\right) \tag{4}
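As a quick numerical sanity check of equations (1)–(4) (arbitrary example values, not estimates from my data):

# verify that the mean of a Weibull with scale from equation (3) recovers mu
k <- 1.1
mu <- 900
lambda <- mu / gamma(1 + 1/k)                   # equation (3)
mean(rweibull(1e6, shape = k, scale = lambda))  # ~= mu, by equation (1)

S <- function(t) exp(-(t / lambda)^k)           # equation (2)
S(365)                                          # survival probability at t = 365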
Another quick check: when I fit the model with ECOG as the only covariate and overlay the model-implied survival curves on the Kaplan–Meier curves, it looks okay. There probably are better checks that I don’t know about (see also the shorter alternative sketched after the code below).
# model fitting
library(brms)

fit.surv.ecog <- brm(
  formula = survival_time | cens(censoring) ~ 1 + ecog_fstcnt,
  data = surv_df,
  family = weibull(),
  cores = 4,
  seed = 1,
  iter = 4000,
  control = list(adapt_delta = 0.85), # increased because of 7 divergent transitions
  file = "models/fit_surv_a0.rds",
  file_refit = "on_change"
)
# manually calculating survival curves + CrIs
# for the CrIs we obtain the scale parameter for every posterior draw, by:
# - getting all the posterior draws for the shape parameter (the same for every group)
# - joining these to the ECOG conditional estimates by draw id
# - using both to calculate the scale parameter for every group for every draw
# - then obtaining mean, median, and quantiles for the uncertainty
library(tidybayes)       # spread_draws()
library(marginaleffects) # avg_predictions(), posterior_draws()
library(dplyr)
library(tidyr)           # crossing()

shape_draws <- fit.surv.ecog |>
  spread_draws(shape) |>
  mutate(drawid = factor(.draw)) |>
  select(drawid, shape)

avg_pred_ci <- avg_predictions(fit.surv.ecog, type = "response",
                               variables = "ecog_fstcnt") |>
  posterior_draws() |>
  left_join(shape_draws, by = "drawid") |>
  mutate(scale = draw / gamma(1 + 1/shape)) |>
  crossing(etime) |> # etime: grid of evaluation times, defined elsewhere
  mutate(surv_prob = exp(-(etime/scale)^shape)) |>
  group_by(ecog_fstcnt, etime) |>
  summarise(mean = mean(surv_prob),
            median = median(surv_prob),
            conf.lower = quantile(surv_prob, 0.025),
            conf.higher = quantile(surv_prob, 0.975),
            .groups = "drop")
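As an aside: if I understand the bayesplot docs correctly, a shorter route to a similar overlay is its Kaplan–Meier posterior predictive check, which brms can dispatch to (a sketch, assuming reasonably recent brms/bayesplot versions):

# posterior-predictive KM overlay (status_y: 1 = observed event, 0 = censored)
pp_check(fit.surv.ecog, type = "km_overlay", ndraws = 100,
         status_y = as.numeric(surv_df$censoring == "none"))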
Next, I fit a model to make individual level predictions of survival time. I will use this model to create potential outcomes for each patient for each simulation.
fit.surv.indiv <- brm(
  formula = survival_time | cens(censoring) ~ 1 + diagnosis_lumped +
    ecog_fstcnt + poly(c_reactive_protein, 2) + poly(albumin, 2) +
    poly(lactate_dehydrogenase, 2),
  data = surv_df,
  family = weibull(),
  cores = 4,
  seed = 1,
  iter = 4000,
  control = list(adapt_delta = 0.85),
  file = "models/fit_surv_indiv.rds",
  file_refit = "on_change"
)
fit.surv.indiv summary
> fit.surv.indiv
Family: weibull
Links: mu = log; shape = identity
Formula: survival_time | cens(censoring) ~ 1 + diagnosis_lumped + ecog_fstcnt + poly(c_reactive_protein, 2) + poly(albumin, 2) + poly(lactate_dehydrogenase, 2)
Data: surv_df (Number of observations: 264)
Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
total post-warmup draws = 8000
Regression Coefficients:
                                               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept                                          6.89      0.32     6.32     7.57 1.00     5368     4580
diagnosis_lumpedLungtumorincludingmesothelioma     0.05      0.35    -0.64     0.73 1.00     5528     5478
diagnosis_lumpedProstatecancer                     0.53      0.45    -0.34     1.43 1.00     5933     5744
diagnosis_lumpedOther                             -0.24      0.34    -0.93     0.41 1.00     5381     5242
ecog_fstcnt.L                                     -1.27      0.31    -1.90    -0.67 1.00     7101     5924
ecog_fstcnt.Q                                     -0.15      0.26    -0.66     0.39 1.00     7166     5895
ecog_fstcnt.C                                     -0.27      0.21    -0.71     0.13 1.00     6799     6001
polyc_reactive_protein21                           0.13      2.28    -4.19     4.78 1.00     5408     5334
polyc_reactive_protein22                          -1.02      1.44    -3.81     1.83 1.00     7947     5796
polyalbumin21                                      6.88      2.76     1.68    12.61 1.00     6195     5352
polyalbumin22                                      3.85      2.34    -0.43     8.79 1.00     6381     5799
polylactate_dehydrogenase21                       -3.45      1.61    -6.35    -0.01 1.00     4795     4350
polylactate_dehydrogenase22                        3.14      1.38     0.77     6.10 1.00     5928     3885

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape     1.09      0.11     0.89     1.31 1.00     4435     4919
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
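For intuition, I also convert the posterior means into an implied median survival time via equation (3) (a rough plug-in sketch of my own; note that with the polynomial contrasts for ECOG and the centered poly() terms, “all covariates at zero” is closer to an average patient than to a literal reference category):

k  <- 1.09                     # posterior mean of the shape parameter
mu <- exp(6.89)                # posterior mean of the Intercept -> expected survival (days)
lambda <- mu / gamma(1 + 1/k)  # equation (3)
lambda * log(2)^(1/k)          # implied Weibull median survival (days)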
These are the treatment models I will use to recover the effect I’ll introduce later:
# model with only trt_group
fit.surv.trt <- brm(
  formula = survival_time | cens(censoring) ~ 1 + trt_group,
  data = surv_trt,
  family = weibull(),
  cores = 4,
  seed = 1,
  iter = 4000,
  control = list(adapt_delta = 0.85),
  file = "models/fit_surv_trt.rds",
  file_refit = "on_change"
)

# model with adjustment variables (currently all from the prediction model)
fit.surv.trt.adj <- brm(
  formula = survival_time | cens(censoring) ~ 1 + trt_group +
    diagnosis_lumped + ecog_fstcnt + poly(c_reactive_protein, 2) +
    poly(albumin, 2) + poly(lactate_dehydrogenase, 2),
  data = surv_trt,
  family = weibull(),
  cores = 4,
  seed = 1,
  iter = 4000,
  control = list(adapt_delta = 0.85),
  file = "models/fit_surv_trt_adj.rds",
  file_refit = "on_change"
)
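One cross-check I rely on here (my own reasoning, so corrections welcome): with the log link and a shape parameter that does not vary by arm, exp() of the treatment coefficient in the unadjusted model is itself the ratio of expected survival times, so it should roughly agree with the marginal ratio from avg_comparisons() further down (assuming the arms are coded A/B as in the simulation code):

exp(fixef(fit.surv.trt)["trt_groupB", c("Estimate", "Q2.5", "Q97.5")])

One subtlety: exponentiating the posterior mean of the log-ratio is not the same as the posterior mean of the ratio (Jensen’s inequality), which is worth keeping in mind when comparing summaries.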
Simulations
This is (part of) the code I use for the simulations and the recovery of the treatment effect. It runs inside a parallelized function, with pred_model being fit.surv.indiv from above:
Code for Simulations
# shape is assumed to be constant/the same for all patients
shape <- posterior_summary(pred_model)["shape", 1]

# expected value of survival time for each patient
pred <- predictions(pred_model, type = "response") |>
  data.frame()

# set seed for reproducibility
set.seed(base_seed + i)

newdata <- pred |>
  mutate(
    shape = shape,
    # calculate the scale parameter from the estimate (mu) and the global shape
    scale_a0 = estimate / gamma(1 + 1/shape),
    # increase the scale by a factor of 1.25 (or any other treatment effect)
    scale_a1 = (estimate * effect_size) / gamma(1 + 1/shape)) |>
  rename(pat_id = rowid) |>
  group_by(pat_id) |>
  # realization of the Weibull random variable for each patient, for each potential outcome
  mutate(
    weibull_draw_1 = rweibull(1, shape = shape, scale = scale_a1),
    weibull_draw_0 = rweibull(1, shape = shape, scale = scale_a0)) |>
  ungroup() |>
  # create treatment assignment (1:1 in expectation) and pick the potential outcome for the assigned arm
  mutate(
    trt_group = sample(c("A", "B"), n(), replace = TRUE),
    survival_time = case_when(
      trt_group == "A" ~ weibull_draw_0,
      trt_group == "B" ~ weibull_draw_1),
    # create follow-up time dependent on recruitment time (365 days FU time, 500 days recruitment)
    fu_time = fu_time + ifelse(
      is.infinite(recruitment_time),
      0, # no additional time when recruitment_time is infinite
      round(runif(n(), min = 0, max = recruitment_time))
    ),
    event_death = case_when(
      survival_time <= fu_time ~ 1,
      TRUE ~ 0
    ),
    survival_time = case_when(
      survival_time > fu_time ~ fu_time,
      TRUE ~ survival_time
    ),
    censoring = case_when(
      event_death == 1 ~ "none",
      TRUE ~ "right")
  )
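Right after this block, I also look at the realized censoring fraction in the simulated trial (a quick sanity check; it should land near the ~70% quoted in the results below for the heavy-censoring scenario):

# fraction of patients censored in this simulated data set
mean(newdata$event_death == 0)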
# update the treatment model with the new data
updated_fit <- update(
  trt_model,
  newdata = newdata,
  seed = base_seed,
  cores = 1
)

# calculate the marginal ratio of expected survival times
comparisons <- avg_comparisons(
  updated_fit,
  variables = "trt_group",
  type = "response",
  comparison = "ratio"
) |>
  data.frame()
Given that the ground truth for each patient is a Weibull distribution, I’d assume that a Weibull model should accurately recover the effect size, especially the model that includes all predictive covariates. However, I seem to overestimate it slightly.
For 200 simulations with an effect size of 1.25 (the ratio of the expected survival times between the groups), I get on average:
# With heavy right censoring (about 70%)
> mean(results_adj$estimate)
[1] 1.298035
> mean(results_unadj$estimate)
[1] 1.285314
# With infinite follow-up
> mean(results_adj_inf$estimate)
[1] 1.276131
> mean(results_unadj_inf$estimate)
[1] 1.323146
(In the plots in the linked album, the red dashed line at 1.25 marks the truth.)
For the null effect (only 100 iterations), these are the results:
# With heavy right censoring (about 70%)
> mean(results_adj_null$estimate)
[1] 1.016193
> mean(results_unadj_null$estimate)
[1] 1.046222
# With infinite follow-up
> mean(results_adj_null_inf$estimate)
[1] 1.020453
> mean(results_unadj_null_inf$estimate)
[1] 1.074315
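To judge whether these gaps exceed simulation noise, I can also report the Monte Carlo standard error alongside each mean (a sketch; results_adj etc. are the per-simulation outputs collected above):

# Monte Carlo standard error of the mean estimate across simulations
mc_se <- function(x) sd(x) / sqrt(length(x))
c(mean = mean(results_adj$estimate), mcse = mc_se(results_adj$estimate))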
Is this expected behavior? Am I making any mistakes? Should I approach the planning entirely differently? Thanks a lot for your input! If I should provide any additional info, let me know.