Statistical power in randomized clinical trials versus observational studies with identical questions

albertoca · October 5, 2019, 3:04pm

Hi, in Spain we wish to replicate this randomized clinical trial with an observational registry of cancer:

ncbi.nlm.nih.gov

Phase III study of docetaxel and cisplatin plus fluorouracil compared with cisplatin and fluorouracil as first-line therapy for advanced gastric cancer: a report of the V325 Study Group.

E Van Cutsem, VM Moiseyenko, S Tjulandin, A Majlis, M Constenla, C Boni, A Rodrigues, M Fodor, Y Chao, E Voznyi, ML Risse and JA Ajani, Journal of clinical oncology : official journal of the American Society of Clinical Oncology, Nov 2006 01

In the randomized, multinational phase II/III trial (V325) of untreated advanced gastric cancer patients, the phase II part selected docetaxel, cisplatin, and fluorouracil (DCF) over docetaxel and cisplatin for comparison against cisplatin and fluorouracil (CF; reference regimen) in the phase III part. Advanced gastric cancer patients were randomly assigned to docetaxel 75 mg/m2 and cisplatin 75 mg/m2 (day 1) plus fluorouracil 750 mg/m2/d (days 1 to 5) every 3 weeks or cisplatin 100 mg/m2 (day 1) plus fluorouracil 1,000 mg/m2/d (days 1 to 5) every 4 weeks. The primary end point was time-to-progression (TTP). In 445 randomly assigned and treated patients (DCF = 221; CF = 224), TTP was longer with DCF versus CF (32% risk reduction; log-rank P < .001). Overall survival was longer with DCF versus CF (23% risk reduction; log-rank P = .02). Two-year survival rate was 18% with DCF and 9% with CF. Overall response rate was higher with DCF (chi2 P = .01). Grade 3 to 4 treatment-related adverse events occurred in 69% (DCF) v 59% (CF) of patients. Frequent grade 3 to 4 toxicities for DCF v CF were: neutropenia (82% v 57%), stomatitis (21% v 27%), diarrhea (19% v 8%), lethargy (19% v 14%). Complicated neutropenia was more frequent with DCF than CF (29% v 12%). Adding docetaxel to CF significantly improved TTP, survival, and response rate in gastric cancer patients, but resulted in some increase in toxicity. Incorporation of docetaxel, as in DCF or with other active drug(s), is a new therapy option for patients with untreated advanced gastric cancer.

The original RCT compared DCF vs CF in stomach cancer. The authors specified a sample size of 230 patients per arm, for 95% power.
In our observational registry we already have more events and patients per category (238 DCF + 1139 CF).
However, we need to fit a multivariable Cox model with at least 8 confounders chosen on clinical experience, there are some missing data (at a low rate, in some covariates), and I have to test several key interactions.
Can we assume the same statistical power as in the RCT? How can I estimate my statistical power in such a setting? Since I only have to fit the Cox regression with our current dataset to see if we had enough statistical power or not, and I will have residual bias, is my question utterly stupid?

f2harrell · October 5, 2019, 5:52pm

In already-collected data, power is not the issue. I would concentrate on the precision (e.g., half-width of confidence/compatibility interval) of key parameter estimates.
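As a rough illustration of looking at precision instead of power, the half-width of a 95% interval for a log hazard ratio can be approximated from the number of events alone. This is only a sketch, assuming the Schoenfeld-type approximation Var(log HR) ≈ 1 / (d · p · (1 − p)), where d is the total number of events and p is the proportion of patients in one arm. The arm sizes come from the question above; the event count and the point estimate are hypothetical placeholders, and covariate adjustment or interaction terms will generally make the real interval wider.

```python
import math


def log_hr_half_width(total_events, prop_in_arm, z=1.96):
    """Approximate half-width of a CI for the log hazard ratio.

    Uses Var(log HR) ~= 1 / (d * p * (1 - p)), an unadjusted
    two-group approximation (an assumption, not a fitted model).
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < prop_in_arm < 1.0:
        raise ValueError("prop_in_arm must be strictly between 0 and 1")
    variance = 1.0 / (total_events * prop_in_arm * (1.0 - prop_in_arm))
    return z * math.sqrt(variance)


def hr_interval(hr, total_events, prop_in_arm, z=1.96):
    """Approximate CI for a hazard ratio, given a point estimate."""
    half_width = log_hr_half_width(total_events, prop_in_arm, z)
    return hr * math.exp(-half_width), hr * math.exp(half_width)


# Arm sizes quoted in the question.
n_dcf, n_cf = 238, 1139
prop_dcf = n_dcf / (n_dcf + n_cf)

# HYPOTHETICAL values: the thread reports neither the number of events
# nor an observational hazard ratio.
hypothetical_events = 1000
hypothetical_hr = 0.80

half_width = log_hr_half_width(hypothetical_events, prop_dcf)
lower, upper = hr_interval(hypothetical_hr, hypothetical_events, prop_dcf)
print(f"half-width on log scale: {half_width:.3f}")
print(f"approximate 95% interval for HR: {lower:.2f} to {upper:.2f}")
```

Because the arms are unbalanced, p · (1 − p) is well below its maximum of 0.25, so the large CF group adds less precision than its size might suggest.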

It is a bit unusual to do the observational study second. Normally we validate results of observational studies with RCTs.

The big question here is bias, more than power or even precision. And doing a sensitivity analysis for possible effects of unmeasured confounders will help.
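One widely used form of sensitivity analysis for unmeasured confounding is the E-value of VanderWeele and Ding; it is offered here as one possible option, not necessarily the method meant in the post above. The sketch below computes the E-value for a ratio estimate and for the confidence limit closest to the null. Since death and progression are common in advanced gastric cancer, it also includes the published approximation for converting a hazard ratio to a risk ratio when the outcome is common. The hazard ratio and confidence limit used are hypothetical.

```python
import math


def hr_to_rr_common_outcome(hr):
    """Approximate risk ratio from a hazard ratio when the outcome is common.

    Approximation: RR ~= (1 - 0.5**sqrt(HR)) / (1 - 0.5**sqrt(1/HR)).
    """
    if hr <= 0:
        raise ValueError("hazard ratio must be positive")
    return (1 - 0.5 ** math.sqrt(hr)) / (1 - 0.5 ** math.sqrt(1 / hr))


def e_value(rr):
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)), with RR >= 1.

    Protective estimates (RR < 1) are inverted first.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


def e_value_for_limit(estimate, limit_closest_to_null):
    """E-value for the confidence limit nearest 1; equals 1 if the CI crosses 1."""
    crosses_null = (estimate - 1) * (limit_closest_to_null - 1) <= 0
    return 1.0 if crosses_null else e_value(limit_closest_to_null)


# HYPOTHETICAL observational result, for illustration only.
observed_hr, upper_limit = 0.77, 0.95

rr_point = hr_to_rr_common_outcome(observed_hr)
rr_limit = hr_to_rr_common_outcome(upper_limit)
print(f"E-value (point estimate): {e_value(rr_point):.2f}")
print(f"E-value (CI limit):       {e_value_for_limit(rr_point, rr_limit):.2f}")
```

The E-value is the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the estimate.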

albertoca · October 5, 2019, 6:28pm

Ummm, indeed, the reason why we find the analysis interesting is because at that time (2006) there was not so much awareness that stomach cancer is actually several different pathologies. Thus, we would like to check the heterogeneity of effects according to histology and other criteria that have been seen to be important.
It may also make sense to check what happens in the real world, since DCF is a fairly toxic treatment and in daily practice is probably given to patients who are older and in worse general condition than those included in the original trial.
The data are already collected, but we had been waiting until we had at least as many DCF patients as the original study.
So I understand that in the statistics section of the hypothetical future article it would not make sense to mention sample size, except to briefly explain why it isn’t important. Then, in the results, there should be some comment on the width of the confidence intervals obtained, but without pretending that their size was controlled beforehand. Is that it?

albertoca · October 5, 2019, 6:38pm

To better control the issue of bias, what I had thought of was doing a Bayesian analysis with the brms package, using the result of the randomized trial as the prior. My idea was to analyze the heterogeneity of effects under a prior that indicates the superiority of DCF. I thought that could be very effective at controlling bias, because if, despite a prior in favor of an effect and despite residual bias possibly in favor of DCF, the hazard ratio turns out to be 1 in some histology, this would possibly be convincing. But this analysis is hard!
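To get a feel for how strongly such a prior would pull a subgroup estimate, here is a back-of-the-envelope normal-normal update on the log hazard ratio scale. It is not the brms model and not a substitute for it. It assumes the trial's overall survival result (23% risk reduction, log-rank P = .02) can be treated as a normal prior, with the standard error back-calculated from the P value as if it came from a Wald test. The subgroup estimate and its standard error are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_p_value(log_estimate, two_sided_p):
    """Back-calculate a standard error from a two-sided P value.

    ASSUMPTION: the P value is treated as coming from a Wald z-test.
    """
    z = NormalDist().inv_cdf(1 - two_sided_p / 2)
    return abs(log_estimate) / z


def normal_update(prior_mean, prior_sd, data_mean, data_sd):
    """Conjugate normal-normal update: precision-weighted mean and posterior SD."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / data_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * data_mean) / post_precision
    return post_mean, math.sqrt(1 / post_precision)


# Prior from the trial abstract: HR = 0.77 for overall survival, P = .02.
prior_mean = math.log(0.77)
prior_sd = se_from_p_value(prior_mean, 0.02)

# HYPOTHETICAL registry estimate in one histology: HR = 1.0, SE(log HR) = 0.20.
data_mean, data_sd = math.log(1.0), 0.20

post_mean, post_sd = normal_update(prior_mean, prior_sd, data_mean, data_sd)
low = math.exp(post_mean - 1.96 * post_sd)
high = math.exp(post_mean + 1.96 * post_sd)
print(f"posterior HR: {math.exp(post_mean):.2f} ({low:.2f} to {high:.2f})")
```

With a prior this tight relative to a small subgroup, the posterior stays close to the trial result, so a subgroup hazard ratio near 1 would need a fair number of events before the posterior moved toward it.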

f2harrell · October 6, 2019, 12:03pm

Yes. Once the sample size is fixed, aim at discussing the yield of that sample.

An interesting idea. But it may work against how the result will be perceived re: “independent” replication.