Time-varying covariates in a shared frailty survival model

caseyleemcatee · September 28, 2020, 3:47pm

Good morning,

I am somewhat new to survival analysis. I am having trouble coding a survival model for recurrent events that includes time-varying covariates. Because of the data structure (high inter-subject variability), a shared frailty survival model is preferable. I am familiar with the various approaches to modeling recurrent events (e.g., Andersen-Gill), but I cannot figure out how to build this model in R in a way that correctly includes a time-varying covariate.

Suppose in this example we are modeling recurrent infections. The data structure that I’m familiar with is a data frame in which every row represents an event (or right censoring if there is no event). Subjects with no events have a single row, subjects with one event have two rows, and so on. Individual covariates are in columns, along with the necessary start and stop times, the patient ID, etc.

The data structure for the shared frailty regression model that I’m familiar with is created with the following code. I’ve included the first two subjects and one covariate: sex.

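library(dplyr)  # provides tibble()
# One row per at-risk interval, ending in an event or in right censoring
id <- c(1, 1, 1, 2, 2)              # patient ID
start <- c(0, 42, 476, 0, 91)       # interval start (days)
end <- c(28, 52, 700, 77, 375)      # interval end (days)
episode <- c(1, 1, 0, 0, 1)         # 1 = infection, 0 = censored
time <- c(28, 10, 224, 77, 284)     # interval length, end - start
sex  <- c(1, 1, 1, 2, 2)            # time-fixed covariate
df <- tibble(id, start, end, episode, time, sex)
print(df)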

I can then fit my model, after loading the survival package with library(survival), using the code: mod_frailty <- coxph(Surv(start, end, episode) ~ sex + frailty(id, distribution = "gamma"), data = df)

QUESTION - suppose I have a time-varying covariate which is a laboratory value. It can be either normal or abnormal (coded as 0/1), but it varies throughout the follow-up time, and the intervals of that variation are different for each patient (the lab is checked inconsistently among subjects). The lab value is generally obtained within a month of the event, and I have the specific dates for the labs. The general data structure for time-varying covariates that I’m familiar with splits the rows according to when the time-varying covariate changes from 0 to 1 (or 1 to 0). That is, the rows are defined by the covariate and not by the event, as they are in the data structure for recurrent events.

How does one combine these two approaches so that I can include the time-varying variable in this model for recurrent events?

I hope that this was clear.
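
To make the question concrete, here is a minimal sketch of one way the two layouts might be combined: each event-defined interval is cut again at the subject's own lab times, the event indicator stays on the last piece of each original interval, and the most recent lab value is carried forward. The `labs` data frame, the helper `split_row()`, and the carry-forward rule are all assumptions made for illustration (the lab values are invented, not real data). The survival package's tmerge() function is designed for building this kind of data set and may be the more standard route.

```r
# Event-defined intervals from the post (first two subjects)
df <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  start   = c(0, 42, 476, 0, 91),
  end     = c(28, 52, 700, 77, 375),
  episode = c(1, 1, 0, 0, 1),
  sex     = c(1, 1, 1, 2, 2)
)

# HYPOTHETICAL lab measurements (days since entry, 0 = normal, 1 = abnormal)
labs <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  labtime = c(0, 20, 500, 0, 200),
  lab     = c(0, 1, 0, 1, 0)
)

# Hypothetical helper: split one event interval at that subject's lab times
split_row <- function(row, labs) {
  l <- labs[labs$id == row$id, ]
  l <- l[order(l$labtime), ]
  # lab times that fall strictly inside this interval become cut points
  inner <- l$labtime[l$labtime > row$start & l$labtime < row$end]
  cuts <- c(row$start, inner, row$end)
  n <- length(cuts) - 1
  out <- data.frame(
    id      = row$id,
    start   = cuts[-length(cuts)],
    end     = cuts[-1],
    # the event (if any) belongs to the last piece only
    episode = c(rep(0, n - 1), row$episode),
    sex     = row$sex
  )
  # assumption: carry the most recent lab value forward (NA if none yet)
  out$lab <- sapply(out$start, function(s) {
    prior <- l$lab[l$labtime <= s]
    if (length(prior) == 0) NA else prior[length(prior)]
  })
  out
}

df_tv <- do.call(
  rbind,
  lapply(seq_len(nrow(df)), function(i) split_row(df[i, ], labs))
)
print(df_tv)

# With the full data set, the model call would then look like:
# library(survival)
# mod_tv <- coxph(Surv(start, end, episode) ~ sex + lab +
#                   frailty(id, distribution = "gamma"), data = df_tv)
```

The split keeps the total number of events unchanged and preserves the gaps between the original intervals, so the only new information per row is the lab value in force during that piece. Whether carrying the last value forward is appropriate depends on how the lab is measured relative to the event.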