Estimate a HR for a subset population from recent NEJM immunotherapy head and neck cancer paper

Hello datamethods,

I am a head and neck cancer surgeon. We recently had a practice-changing clinical trial published in the NEJM. This study investigated neoadjuvant (before surgery) and adjuvant (after surgery) immunotherapy in patients with locally advanced head and neck cancer undergoing definitive surgical resection. It was a phase 3, open-label trial. It demonstrated that adding neoadjuvant and adjuvant pembrolizumab to the standard of care significantly improved event-free survival (EFS) among participants with locally advanced HNSCC.
https://www.nejm.org/doi/abs/10.1056/NEJMoa2415434

There was a recent FDA approval for pembrolizumab based on the results of the trial (see "FDA approves neoadjuvant and adjuvant pembrolizumab for resectable locally advanced head and neck squamous cell carcinoma | FDA"). This approval is for patients with a CPS of 1 or greater.

My question for datamethods: Is there a way to estimate the HR for EFS for the CPS 1-10 group?

CPS 10 or greater = 65% of the cohort (465 / 713)
CPS 1 - 10 = 30% of the cohort (217 / 713)
CPS less than 1 = 4% of the cohort (31 / 713)

CPS 10 or greater HR 0.66 (95% CI 0.49 - 0.88)
CPS 1 - 10 HR Not Reported - is there any way to estimate?
CPS 1 or greater HR 0.70 (95% CI 0.55 - 0.89)

Was the EFS benefit in CPS 1 or greater population brought up significantly by the benefit in the CPS 10 or greater group?

My major concern is that there is potentially less benefit in the CPS 1-10 population and this is a significant change in practice. The clinical trial was enriched in CPS 10 or greater patients (2/3) whereas my practice is closer to 50 / 50.

Thank you all for the consideration.


You’ve posed a question that seems to have several dimensions. On the one hand, this could be seen as an exercise for a statistics problem set, examining how HRs of 2 populations combine with pooling. But also there are other, larger questions.

  • It would have been interesting to see a spline used to explore continuous variation with CPS (or indeed, with its several components). Would that have been feasible with these sample sizes?
  • For clinical decision-making, wouldn’t risk differences be more useful?
  • Wouldn’t OS be the more meaningful endpoint? I note that the same HR = 0.72 for death was reported for both the $\text{CPS}\ge 10$ and $\text{CPS}\ge 1$ populations. So presumably their set difference $\text{CPS}\in[1,10)$ would have the same?
  • On whom should we blame the categorization of a continuous covariate here? Did the surgeons make the statisticians do it? Did everybody just knuckle under to NEJM statistical ‘standards’?

David made excellent points. This kind of dichotomization has hurt research and practice in so many ways and has resulted in lost opportunities to get more information from expensive studies. Had I been a reviewer I would have demanded a continuous dose-response-type analysis for the interaction effect, and another analysis that compares this trend for the mortality effect to the trend for progression.

When analyzing an ordinal or continuous variable like CPS it is important because of sample size considerations to borrow information across the variable’s levels using an interpolating method (nonparametric or splines as David mentioned).

The investigators tacitly assumed that patients with CPS of 20 are the same as those with CPS of 10, one of many hidden assumptions.


Thanks so much for your response David.

If someone could help with the exercise for a statistics problem set - explaining how HRs of 2 populations combine with pooling - that would be awesome.
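A rough sketch of that exercise, using only the numbers quoted in this thread. It assumes the CPS 1 or greater estimate behaves approximately like an inverse-variance weighted average (on the log HR scale) of two non-overlapping subgroups, CPS 10 or greater and CPS 1 - 10, and that standard errors can be recovered from the reported 95% CIs. The trial's actual stratified Cox models will not match this exactly, the inputs are rounded to two decimals, and the function names here are made up for illustration.

```python
import math

Z = 1.959964  # two-sided 95% normal quantile


def log_hr_and_se(hr, lo, hi):
    """Return (log HR, SE of log HR), with the SE recovered from a 95% CI."""
    return math.log(hr), (math.log(hi) - math.log(lo)) / (2 * Z)


def back_out_subgroup(pooled, known):
    """Approximate the unreported subgroup HR from (hr, lo, hi) tuples.

    Assumption: pooled log HR = inverse-variance weighted mean of the known
    and the unreported subgroup log HRs, with the two subgroups independent.
    """
    b_p, se_p = log_hr_and_se(*pooled)
    b_k, se_k = log_hr_and_se(*known)
    w_p, w_k = 1 / se_p**2, 1 / se_k**2  # precisions (roughly events / 4)
    w_u = w_p - w_k                      # precision left for the other subgroup
    if w_u <= 0:
        raise ValueError("pooled CI is not narrower than the subgroup CI")
    b_u = (w_p * b_p - w_k * b_k) / w_u  # solve the weighted mean for b_u
    se_u = math.sqrt(1 / w_u)
    return (math.exp(b_u),
            math.exp(b_u - Z * se_u),
            math.exp(b_u + Z * se_u))


# EFS figures quoted above: CPS >= 1 (pooled) and CPS >= 10 (known subgroup)
hr, lo, hi = back_out_subgroup(pooled=(0.70, 0.55, 0.89),
                               known=(0.66, 0.49, 0.88))
print(f"CPS 1 - 10: HR ~ {hr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

With these inputs the back-calculated point estimate is near 0.79, with a wide interval that crosses 1. That is a back-of-envelope number, sensitive to rounding in the published figures, and it says nothing about whether the difference from the CPS 10 or greater group is real.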

Here are my responses to your comments:

Agree that a spline used to explore continuous variation with CPS would have been a better approach.

Risk differences would likely be more useful for clinical decision making.

EFS is a reasonable endpoint in this population. The cost of recurrence is extremely high: it is often very morbid and typically leads to death from cancer.

To clarify, this trial was organized by several oncologists in head and neck oncology and was industry-supported. The CPS ≥ 1 and CPS ≥ 10 analyses were pre-specified. I don’t necessarily agree with these arbitrary cutoffs, but this was the approach for FDA approval.

To Frank’s point - I agree and do not think that a CPS of 1 has the same response as a CPS of 10 or 20. This is my major concern.


To clarify, did the oncologists not collaborate with an experienced biostatistician when doing so?

Thanks Frank - I was not involved in study design so it is a bit difficult for me to comment. There is an industry biostatistician on the NEJM authorship.


This is interesting; do you propose this in the spirit of ‘falsification testing’ [1]?

  1. Jena AB, Sun E, Goldman DP. Confounding in the association of proton pump inhibitor use with risk of community-acquired pneumonia. J Gen Intern Med. 2013;28(2):223-230. doi:10.1007/s11606-012-2211-5

At a high level, even if everything else in this particular RCT is perfect, looking at the HR changes between such subgroups is more likely to confuse than inform clinical decision-making. We may typically be better off reading tea leaves instead. For data and discussion on this related to oncology see here and here (with commentary here).

Instead of fishing for such so-called “predictive” interactions, we are better off harnessing the so-called “prognostic” effects (e.g., assuming a stable hazard ratio across all subgroups) as described, e.g., here. See also our review tailored to an oncology audience here.

Oncology has somehow become uniformly convinced that it is better to prioritize “predictive” differences at the hazard ratio scale as opposed to the typically more powerful “prognostic” main effects. This is likely due to confusing statistical interaction with biological interaction. Pockets, e.g., within breast oncology are an exception. This has been a hugely missed opportunity as prognostic forecasting and risk modeling is severely underdeveloped in our field. The kidney cancer community has begun to recognize this and the kidney cancer association will release a statement on this topic in the coming weeks, just in time for the emergence of biomarkers across kidney cancer that will need rigorous validation using principles taught to us by @f2harrell and nicely summarized, e.g., in this recent review.


I wasn’t thinking of that.

Pavlos, I had missed the Efthimiou et al tutorial. Wow, it’s wonderful. Thanks for all the great information you provide on datamethods.
