External validation of a time-to-event prognostic model

philiph99 · March 25, 2024, 7:12pm

A clinical prediction model has been published in the field of oncology. It is available only as a nomogram or web calculator; no regression coefficients or other model details are reported. I have calculated the probability of the outcome using the web calculator.

I have tried the following:
auc_cox <- as.numeric(timeROC::timeROC(
  T = dat$time_surgery,    # time from surgery to outcome
  delta = dat$status,      # survival status
  marker = dat$pred_model, # predicted probability of the outcome [0-1]
  times = 3,
  cause = 1
)$AUC[2]) # 2nd value because first value is at t = 0

val_ests <- rms::val.surv(
  est.surv = dat$pred_model,
  S = Surv(time = dat$time_surgery, event = dat$status),
  u = 5*365,
  fun = function(p) log(-log(p))
)
plot(val_ests, xlab = "Expected Survival Probability", ylab = "Observed Survival Probability")

However, the results are quite strange.
Is this the correct approach?
I want to calculate Harrell’s C, Uno’s AUC and draw a calibration plot.
Which R functions can I use to externally validate this model using an independent dataset?

Thank you.

f2harrell · March 26, 2024, 1:08pm

I’m not clear on why time-dependent ROC curves are that helpful here. In terms of the external validation you are doing, you didn’t show the result.

I can’t see anything wrong with the code if time_surgery is in days.

val.surv is only for external validation.

felippelazar · May 2, 2024, 1:24pm

@philiph99 have you solved your problem? I’m running into the same issue and was wondering whether it worked properly.

One approach I can think of to get an estimated AUC or c-index would be to fit a model with the predicted probabilities as the only predictor and the time-to-event variables as the outcome. I would then extract the model performance measures, including the c-statistic. Something like this:

summary(coxph(Surv(time, status) ~ predicted_prob))

Would this approach work as intended?
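
A minimal sketch of this idea, assuming the `survival` package is installed. The simulated data frame is hypothetical and only stands in for the three available columns; the names `time`, `status` and `predicted_prob` follow the line above. It compares the c-index from the one-predictor Cox model with the c-index computed directly from the predictions. Because concordance depends only on the ranking of the predictions, the two should agree whenever the Cox coefficient is positive.

```r
library(survival)

# Hypothetical toy data standing in for the three available columns
set.seed(1)
n <- 200
predicted_prob <- runif(n, 0.05, 0.95)   # predicted risk of the event
# Higher predicted risk -> larger hazard -> shorter event time
true_time <- rexp(n, rate = -log(1 - predicted_prob) / 5)
cens_time <- runif(n, 1, 10)             # independent censoring times
dat <- data.frame(
  predicted_prob = predicted_prob,
  time   = pmin(true_time, cens_time),
  status = as.integer(true_time <= cens_time)
)

# Harrell's C computed directly from the predictions.
# reverse = TRUE because a higher predicted risk should go with a shorter time.
c_direct <- concordance(Surv(time, status) ~ predicted_prob,
                        data = dat, reverse = TRUE)

# The same quantity via a Cox model with the prediction as the only covariate
fit   <- coxph(Surv(time, status) ~ predicted_prob, data = dat)
c_cox <- concordance(fit)

c_direct$concordance
c_cox$concordance
```

Refitting a Cox model does not change the ranking of patients, so it adds nothing to discrimination; it only matters for recalibration, where the refitted coefficient would be used.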
I asked the authors for the actual point values of the score, and not only the predicted probabilities, but they have not answered so far. Without them, I think I cannot “recalibrate” the algorithm in my dataset. Any ideas?

To make things clearer, I have only three columns available (predicted probability, outcome_time, outcome_status).

My general idea would be calibrating and externally validating this time-to-event score in my dataset.
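
One way to check calibration with only these three columns could be sketched as follows. This is a simple grouped Kaplan-Meier comparison using the `survival` package, not the smooth calibration curve that `rms::val.surv` produces. The data are simulated and hypothetical, and the sketch assumes that the predicted probability is the risk of the event by a fixed horizon (here 5 years) and that follow-up time is recorded in the same unit as that horizon.

```r
library(survival)

# Hypothetical toy data: predicted 5-year risk plus follow-up in years
set.seed(2)
n <- 500
horizon <- 5
pred_risk <- runif(n, 0.05, 0.90)
# Event times simulated so that P(event by the horizon) equals pred_risk
true_time <- rexp(n, rate = -log(1 - pred_risk) / horizon)
cens_time <- runif(n, 2, 12)
dat <- data.frame(
  pred_risk = pred_risk,
  time   = pmin(true_time, cens_time),
  status = as.integer(true_time <= cens_time)
)

# Split patients into quintiles of predicted risk
dat$grp <- cut(dat$pred_risk,
               breaks = quantile(dat$pred_risk, probs = seq(0, 1, 0.2)),
               include.lowest = TRUE)

# Observed risk per group: 1 - Kaplan-Meier survival at the horizon
km <- survfit(Surv(time, status) ~ grp, data = dat)
observed  <- 1 - summary(km, times = horizon, extend = TRUE)$surv
predicted <- as.numeric(tapply(dat$pred_risk, dat$grp, mean))

calib <- data.frame(group = levels(dat$grp), predicted, observed)
print(calib)

plot(predicted, observed, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Mean predicted risk",
     ylab = "Observed risk (1 - Kaplan-Meier)")
abline(0, 1, lty = 2)  # line of perfect calibration
```

If the calculator reports the probability of the event rather than of survival, functions that expect a survival probability would need `1 - pred_risk` instead.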

If @f2harrell has any ideas, I would be happy to hear them.

Thank you in advance,

Felippe