Leaving aside issues with trial design (unfortunately out of my hands)… we have a single-arm study (everyone will get treatment) reviewing a number (~20-30) of potential markers of response, to see whether any can help direct treatment. These have been identified in previous studies, so we are aiming to see which may be worth pursuing. Response will be modelled using a proportional odds model. Looking at @f2harrell’s resources on biomarkers, I am considering modelling each biomarker individually, using some summary of model performance to rank the biomarkers, and using bootstrapping to find confidence intervals for the ranks. My questions:

1. Does this sound reasonable?
2. Assuming the answer to #1 is yes: are univariable or multivariable models preferred?
3. Is there a preferred model summary for a PO model to rank them with (likelihood ratio χ² or pseudo-R²)?
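To make question 1 concrete, here is roughly what I have in mind as a minimal Python sketch. The data frame `df`, the ordinal integer coding of `response`, the marker column names, and the use of statsmodels’ `OrderedModel` are all my placeholders, not anything settled; the real analysis would more likely use `rms::orm` in R.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

def lr_chisq(y, x):
    """Likelihood-ratio chi-square for a one-biomarker proportional odds fit."""
    fit = OrderedModel(y, x.to_frame(), distr="logit").fit(method="bfgs", disp=False)
    # Null (thresholds-only) log-likelihood from the marginal category counts
    counts = y.value_counts().to_numpy()
    ll_null = np.sum(counts * np.log(counts / counts.sum()))
    return 2.0 * (fit.llf - ll_null)

def rank_markers(df, markers, y_col="response"):
    """Rank markers by LR chi-square; rank 1 = strongest association."""
    chisq = pd.Series({m: lr_chisq(df[y_col], df[m]) for m in markers})
    return chisq.rank(ascending=False)

markers = [c for c in df.columns if c != "response"]  # the ~20-30 candidates
observed = rank_markers(df, markers)

# Bootstrap the entire ranking to get percentile intervals for each rank
B = 200  # illustration only; increase for a real analysis
rng = np.random.default_rng(0)
boot = np.empty((B, len(markers)))
for b in range(B):
    resample = df.sample(n=len(df), replace=True, random_state=rng)
    boot[b] = rank_markers(resample, markers).to_numpy()

summary = pd.DataFrame(
    {"rank": observed,
     "lower": np.percentile(boot, 2.5, axis=0),
     "upper": np.percentile(boot, 97.5, axis=0)},
    index=markers,
).sort_values("rank")
print(summary)
```

The point of re-ranking all markers within each resample is that the resulting intervals show how unstable the ordering itself is, which I suspect will be sobering at this sample size.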
We have used Richard Riley’s work to consider sample size. Looking at this group’s recommendations for analysis, I am not sure how far to take it. We are not interested in a predictive model to be used in practice per se; rather, we are trying to understand which markers (if any) are worth exploring further.
One lesson I learned as an advocate participant in Biospecimen workshops at NCI is that how the biospecimen samples are taken can influence the validity of a finding. For example, how much time it takes to snap-freeze a tumor sample can influence its genetic and epigenetic expression, so the processing of the sample must be standardized. The reason this and other potential issues matter to patients is that correlative studies mandating biopsies (in particular) are a burden to participants and can slow timely accrual without increasing knowledge.
And how is response defined? You often need a randomized crossover study with multiple periods to be able to determine response for an individual patient. Without that, response is distorted by regression to the mean, and a patient labelled a responder may well not respond if studied again. @Stephen has written widely on this.
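To make the regression-to-the-mean point concrete, here is a toy simulation (all numbers invented) in which every patient has exactly the same true effect, yet a RECIST-like cutoff manufactures “responders” who mostly fail to replicate:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
true_change = -0.20   # identical 20% true shrinkage for every patient
noise_sd = 0.25       # within-patient measurement / biological variability

obs1 = true_change + rng.normal(0, noise_sd, n)  # first assessment
obs2 = true_change + rng.normal(0, noise_sd, n)  # independent repeat assessment

resp1 = obs1 <= -0.30  # RECIST-like cutoff: at least 30% shrinkage
resp2 = obs2 <= -0.30

print(f"'Responders' at first assessment: {resp1.mean():.0%}")
print(f"Of those, still 'responders' on repeat: {resp2[resp1].mean():.0%}")
```

With these made-up settings, roughly a third of patients clear the cutoff at the first assessment, and only about a third of those clear it again on an independent repeat, despite there being no between-patient differences in the true effect at all.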
I’ll bet you’ll find that the probability of a response will depend on the baseline tumor volume, so it’s not a proper response measure. A better way of thinking about this is to model tumor size as a function of time, adjusted for the initial tumor volume using a flexible model.
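For example, a minimal sketch of that kind of model (the long-format data frame `long`, its column names `patient_id`, `time`, `tumor_size`, `baseline_size`, `biomarker`, the spline degrees of freedom, and the random-intercept structure are all assumptions I am adding for illustration):

```python
import statsmodels.formula.api as smf

# Random intercept per patient; a B-spline basis in time keeps the trend flexible.
# The biomarker-by-time interaction asks whether trajectories differ by marker level.
model = smf.mixedlm(
    "tumor_size ~ bs(time, df=3) * biomarker + baseline_size",
    data=long,
    groups="patient_id",
)
fit = model.fit()
print(fit.summary())
```

The biomarker-by-time interaction is then the quantity of interest: it asks whether the size trajectory differs by biomarker level, without ever dichotomizing patients into responders and non-responders.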
Could you provide an example where this has been done? This is crucial in tumors where, for example, RECIST criteria are used to define response. Patients often have an initial response (e.g., PR), but at the next follow-up assessment they may progress to PD, and at a subsequent assessment they might show a partial response again yet still be classified as PD according to the criteria.
What you’re describing seems extremely interesting for assessing the potential association with a biomarker. If you have an example where this has been used, or even an article that discusses it, that would be very helpful.
Thank you!
My understanding: tumor response is considered by FDA a surrogate endpoint with an unreliable relationship to clinical benefit, i.e., living longer or living better. More often, time-to-event endpoints are used for efficacy in cancer trials, such as progression-free survival (PFS). The “S” in PFS is a misnomer in that a PFS advantage does not always predict improved overall survival. Tumor response is sometimes used in single-arm studies for accelerated approval in dire indications that have no effective treatment.
I think the agency appreciates this too; it is the reason accelerated approvals are conditional: continued marketing approval requires a confirmatory controlled study with more reliable endpoints. As an advocate I think of clinical science as a hybrid science that will bend to the urgent needs of patients, putting time ahead of certainty in some clinical settings.
For example, what AIDS advocates pushed for (and got) changed over time. When HIV was a death sentence, they pushed for accelerated approval based on surrogates; when effective treatments emerged, they pushed for more rigor in the clinical research, including validation of those surrogates.