On surrogate end points in cancer research

Good morning, I wanted to ask datamethods readers what they think is the best method implemented in R to estimate the correlation between two censored variables, such as PFS and OS, where the former is contained within the latter. It would also be interesting to reflect on the most appropriate criterion for considering PFS a good surrogate marker of OS, taking into account how the number of available treatment lines or the aggressiveness of the tumors influences that criterion. Thank you.

As we all know, even the humble quadratic equation does not admit of a comprehensive treatment unless we carry our analysis into the complex domain. Likewise, I suspect these ‘surrogacy’ questions will not yield to analysis on purely statistical considerations. Rather, they require that we carry our analysis into the realm of stochastic modeling of tumor dynamics, game-theoretic treatment of sequential treatment (as you intimate), etc. Here are two recent papers that could be read as cautions against blithely depending on any straightforward, workmanlike statistical treatment of these issues:

  1. Johnson K, Gomez A, Burton J, et al. Directional inconsistency between Response Evaluation Criteria in Solid Tumors (RECIST) time to progression and response speed and depth. European Journal of Cancer. 2019;109:196-203. doi:10.1016/j.ejca.2018.11.008 (open access)

  2. Staňková K, Brown JS, Dalton WS, Gatenby RA. Optimizing Cancer Treatment Using Game Theory: A Review. JAMA Oncol. August 2018. doi:10.1001/jamaoncol.2018.3395

I think Piantadosi’s chapter titled ‘Objectives and Outcomes’ in Clinical Trials: A Methodologic Perspective describes the problems with surrogate endpoints and their validation. Correlation wouldn’t do it. PFS is defined partly in terms of OS (death counts as a PFS event), so some correlation holds by definition. Anyway, isn’t PFS a well-established surrogate?

I’ve looked at those references. My feeling is that surrogates are very specific to the disease and its context, so it is difficult to standardize the criteria. As Norris says, there doesn’t seem to be an easy answer. In my specific case, I have estimated the correlation between PFS and OS according to Schemper et al. (Stat Med 2013; 32: 4781-4790). I found r = 0.55 (95% CI, 0.35-0.69). Is it the best method? And how should it be interpreted? According to some recommendations, 0.55 would be evidence that PFS is not useful as a surrogate for OS in this context:

However, intuitively I am not convinced that correlation is adequate as the sole criterion for ruling out the usefulness of PFS in this case. The correlation is possibly low because these tumors have many possible successive lines of treatment beyond progression. But the truth is that the clinical situation I have studied is based on a phase III trial that used PFS as the main end point, which is generally considered a good surrogate for OS here. I don’t know on what basis.

I agree that correlation only tells you a small part about surrogacy. Multistate models might be useful to derive a concept of cor(PFS, OS), as outlined here: https://arxiv.org/abs/1810.10722. Multistate models might also be useful to understand the dynamics of further lines of treatment (if you have enough data). The paper also provides a short discussion, including a few references, on the surrogacy aspect in general.
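
To make the multistate idea concrete, here is a minimal sketch in Python (chosen only to keep it dependency-free; it ports directly to R). It assumes an illness-death model with constant transition hazards: h01 (progression), h02 (death without progression), and h12 (death after progression). The hazard values are hypothetical. Under that assumption PFS is exponential with rate h01 + h02, and OS equals PFS plus an independent exponential post-progression time whenever progression comes first, so the Pearson correlation works out to h12 / sqrt(h12^2 + h01^2 + 2*h01*h02). The sketch checks this closed form against simulation. It ignores censoring and is not the code from the linked paper.

```python
import math
import random

def pearson_closed_form(h01, h02, h12):
    """Pearson cor(PFS, OS) under constant transition hazards (an assumption)."""
    return h12 / math.sqrt(h12 ** 2 + h01 ** 2 + 2 * h01 * h02)

def simulate_pfs_os(h01, h02, h12, rng):
    """Draw one uncensored (PFS, OS) pair from the illness-death model."""
    t_prog = rng.expovariate(h01)    # latent time to progression
    t_death = rng.expovariate(h02)   # latent time to death without progression
    if t_prog < t_death:
        # Progression first: OS = PFS + post-progression survival.
        return t_prog, t_prog + rng.expovariate(h12)
    # Death without progression: PFS and OS coincide.
    return t_death, t_death

def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly hazards, for illustration only.
H01, H02, H12 = 0.10, 0.02, 0.08

rng = random.Random(1)
pairs = [simulate_pfs_os(H01, H02, H12, rng) for _ in range(50_000)]
closed = pearson_closed_form(H01, H02, H12)
simulated = pearson(pairs)
print(f"closed form: {closed:.3f}, simulated: {simulated:.3f}")
```

With real data the hazards would be estimated from a fitted multistate model, and non-constant hazards would need numerical integration or simulation rather than this closed form.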

I’ve wondered whether anyone has cast this problem as a multiple imputation one. Given a large dataset containing both the surrogate and the gold-standard outcome, one can develop multiple imputations to get between the two. Then in another dataset without the gold standard, one could multiply impute the gold standard from the surrogate and analyze these imputed values. This would also expose a certain type of futility of using surrogates, as the effective sample size for this analysis of the gold-standard target would be much less than the apparent sample size.

A more general question. How do you analyze the correlation of two censored variables, such as PFS and OS, taking into account that one of them, PFS, cannot last longer than OS? Is there a method in R to do this?

But the IQWiG guideline you point to says “The demonstration of a correlation between the surrogate and the clinical endpoint is in itself insufficient to validate a surrogate.” Surely that is true. The regulatory authorities will demand more than you can achieve. However, I don’t see who will be persuaded by the estimation of a correlation, especially if PFS includes OS in its definition. But if you need to quote some correlation estimate, the joint-modelling paper that you linked to seems a good way to go about it, at least to me.


The way I see it, there are two main approaches to estimate and analyze the correlation between PFS and OS: multi-state models and copula regression models.

The literature on this topic is abundant and covers not only the methods but also the surrogacy question … here are a few links to start with:


  • DOI: 10.1002/sim.3918 [Dejardin et al. SiM 2010]
  • DOI: 10.1080/19466315.2015.1093539 [Xia et al. SBR 2016]
  • DOI: 10.1002/sim.7641 [Zeng et al. SiM 2018]
  • DOI: 10.1002/bimj.201700238 [Lauseker et al. BiomJ 2019]
  • DOI: 10.1002/sim.3637 [Fleischer et al. SiM 2009]
  • DOI: 10.1002/sim.6501 [Li et al. SiM 2015]
  • DOI: 10.1002/sim.7651 [Xu et al. SiM 2018]
  • DOI: 10.1002/sim.8001 [Weber et al. SiM 2018]

Likewise, there are several R packages allowing the implementation of either approach.

Hope that helps,


It’s very interesting. I’ve been reading the references provided by Francois, which are very useful. Thank you very much. The problem is that according to Weber (doi:10.1002/sim.8001), copula models are questionable because they do not take into account the fact that PFS is contained in OS. Weber’s article is the most interesting in my opinion. However, I am not currently able to estimate Kendall’s tau from the coefficients of the multi-state model. Can anyone provide a practical step-by-step example of how the correlation coefficient would be derived according to Weber’s article?
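
One generic route, offered as a simulation-based stand-in and not as the derivation in Weber’s article: once the transition hazards of the illness-death model have been estimated, simulate many uncensored (PFS, OS) pairs from the fitted model and compute Kendall’s tau on those pairs. The Python sketch below assumes constant hazards with hypothetical values (h01 progression, h02 death without progression, h12 death after progression). With Weibull or other parametric hazards only the sampling step would change.

```python
import random

def simulate_pair(h01, h02, h12, rng):
    """One uncensored (PFS, OS) pair from a constant-hazard illness-death model."""
    t_prog = rng.expovariate(h01)    # latent time to progression
    t_death = rng.expovariate(h02)   # latent time to death without progression
    if t_prog < t_death:
        return t_prog, t_prog + rng.expovariate(h12)
    return t_death, t_death          # died before progressing: PFS == OS

def kendall_tau(pairs):
    """Kendall's tau-a by direct pair counting, O(n^2)."""
    n = len(pairs)
    concordant = discordant = 0
    for i in range(n - 1):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            s = (xi - xj) * (yi - yj)
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical monthly hazards standing in for fitted model estimates.
rng = random.Random(2019)
pairs = [simulate_pair(0.10, 0.02, 0.08, rng) for _ in range(1500)]
tau = kendall_tau(pairs)
print(f"simulated Kendall's tau: {tau:.3f}")
```

Because the pairs are simulated without censoring, the constraint PFS <= OS is respected by construction and no special censored-data estimator is needed at this step. The uncertainty of the fitted hazards is not propagated here; repeating the simulation over bootstrap refits would be one way to do that.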

There’s a paradox here. Regulatory agencies seem to accept PFS as the main end point in some tumors that have many treatment options beyond the first line. Yet it is precisely in these populations that a correlation between PFS1 and OS is hard to demonstrate, because it is obscured by the treatments administered afterwards. Therefore, I believe correlation measures may have little practical utility in these cases.
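
A small numerical sketch of this dilution effect, in Python and under strong assumptions: an illness-death model with constant hazards, where the Pearson correlation between PFS and OS is h12 / sqrt(h12^2 + h01^2 + 2*h01*h02), with h01 the progression hazard, h02 the hazard of death without progression, and h12 the hazard of death after progression. All numbers are hypothetical. Holding first-line behaviour fixed and lengthening median post-progression survival (as effective later lines would) lowers the correlation.

```python
import math

def pearson_pfs_os(h01, h02, h12):
    """Pearson cor(PFS, OS) under constant transition hazards (an assumption)."""
    return h12 / math.sqrt(h12 ** 2 + h01 ** 2 + 2 * h01 * h02)

# Hypothetical first-line hazards per month, held fixed.
h01, h02 = 0.10, 0.02

correlations = []
for median_pps in (3, 6, 12, 24, 48):      # median post-progression survival, months
    h12 = math.log(2) / median_pps         # exponential hazard matching that median
    r = pearson_pfs_os(h01, h02, h12)
    correlations.append(r)
    print(f"median post-progression survival {median_pps:>2} months: r = {r:.2f}")
```

The point is only qualitative: the longer and more variable survival after progression is, the smaller the share of OS variability that PFS can account for, regardless of how good first-line treatment is.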

Hello Alberto,
I think you can refer to the R code provided in the supplementary material of the Weber and Titman article.
Best regards, Francois

I tested the code from Weber’s article. It strikes me that I get a higher Kendall’s tau with the Clayton copula model (0.517), even though in theory it is the more conservative approach, than with the multi-state model (0.352). With the R package ‘SurvCorr’ the correlation is 0.557. Is such a low result from the multi-state model plausible? Maybe the parametric model does not fit the data? Any insight?