Do we need a research hypothesis for developing a clinical prediction model?

Dr.ya.dev · March 20, 2022, 10:16am

The protocol committee insists on having a research hypothesis for a protocol on developing a clinical prediction model for mortality in a particular condition. I tried to defend our objective, which is to develop a clinical prediction model from a set of putative predictors identified through an extensive literature search and expert opinion. These predictors are available at the time of prediction, in routine use, and less costly. However, they insist on a research hypothesis. I am at a loss as to how to formulate a research hypothesis for a prediction problem.

f2harrell · March 20, 2022, 12:29pm

Sometimes we just need to get predictions. Sometimes, to get good predictions, we need to understand the shapes of relationships between predictors and response. The latter is clinically very useful even if you don’t use the predictions. But behind all of this is the hypothesis that clinical decision making can be improved. Many a predictive model has been developed with no information ever obtained about how it improves clinical practice. So further developing the project to measure the impact of the predictions could be quite important in your setting.

Dr.ya.dev · March 20, 2022, 1:38pm

Thanks, Prof. Harrell, for your suggestions.

med_stat · March 21, 2022, 10:12pm

In addition, you can also take the currently known variables and suggest that some additional variable (say, a blood test or a biopsy) will improve the predictive ability of the model (say, make the Brier score smaller). This is a tangible hypothesis that may satisfy their desire.
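To make that kind of hypothesis concrete, here is a minimal Python sketch of the comparison. The outcome vector and the two sets of predicted risks are made-up illustrative numbers, and the names `p_base` and `p_extended` are hypothetical. In a real protocol both models would be evaluated with proper validation (for example, optimism-corrected or on held-out data), and the size of improvement that counts as meaningful would be prespecified.

```python
import numpy as np

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted risks and observed 0/1 outcomes."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    return float(np.mean((p - y) ** 2))

# Hypothetical data: 1 = died, 0 = survived (illustrative only).
y = [0, 0, 1, 1, 0]
p_base = [0.2, 0.3, 0.6, 0.5, 0.4]      # model with currently known predictors
p_extended = [0.1, 0.2, 0.8, 0.7, 0.3]  # same model plus the additional variable

brier_base = brier_score(y, p_base)
brier_extended = brier_score(y, p_extended)

# A positive difference means the extended model has the lower (better) Brier score.
delta = brier_base - brier_extended
print(f"base={brier_base:.3f} extended={brier_extended:.3f} improvement={delta:.3f}")
```

The hypothesis would then be a statement about `delta`, for example that it exceeds some prespecified amount, rather than about any single model in isolation.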

pmmbossuyt · March 31, 2022, 12:47pm

After studying “spin” in clinical research, I became even more convinced that we should prespecify minimally acceptable performance criteria when evaluating medical tests, and a prediction model should be regarded as a clinical test. Too many articles reporting prediction models end with generous conclusions about model performance.

When can we produce “good predictions” (as Frank H wrote on 20/3), and when are they good enough to recommend the use, or the further development, of the prediction model?

In that sense, having predefined performance criteria (in evaluations) can be regarded as the equivalent of a research hypothesis (in explanatory research).

R_cubed · March 31, 2022, 2:42pm

Any thoughts on this post by Dr. Harrell that criticizes the use of retrospective probabilities like sensitivity and specificity?

Blockquote
Even physicians who understand the meaning of a probability are often not understanding conditioning. Conditioning is all important, and conditioning on different things massively changes the meaning of the probabilities being computed. Every physician I’ve known has been taught probabilistic medical diagnosis by first learning about sensitivity (sens) and specificity (spec). These are probabilities that are in backwards time- and information flow order.
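As a small illustration of why the direction of conditioning matters, the sketch below applies Bayes' rule to turn sensitivity and specificity (probabilities of a test result given disease status) into the probability of disease given a positive test. The function name and all numbers are hypothetical, and treating sensitivity and specificity as fixed constants is itself a simplifying assumption made only for this example.

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' rule.

    sensitivity = P(test+ | disease), specificity = P(test- | no disease).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same hypothetical test (sens = spec = 0.90) in two different populations.
low_prev = post_test_probability(prevalence=0.01, sensitivity=0.90, specificity=0.90)
high_prev = post_test_probability(prevalence=0.30, sensitivity=0.90, specificity=0.90)

print(f"P(disease | test+) at 1% prevalence:  {low_prev:.3f}")
print(f"P(disease | test+) at 30% prevalence: {high_prev:.3f}")
```

With identical sensitivity and specificity, the forward-time probability a clinician actually needs differs greatly between the two settings, which is the sense in which conditioning changes the meaning of the number.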

I’d think the rigorous framework for medical testing would be a decision theoretical analysis, where the utility of the test is placed in the value of information framework. Lots of interesting scholarship can be found here.

Foundational paper that discusses the challenges of decision theoretic methods for test evaluation, and methods to address them:

Closing paragraph of paper:

Blockquote
Hilden [21] has written of the schism between what he describes as “ROCographers”, those who are interested solely in accuracy, and “VOIographers”, who are interested in the clinical value of information (VOI). He notes that while the former ignore the fact that their methods have no clinical interpretation, the latter have not agreed upon an appropriate mathematical approach. We feel that decision curve analysis may help bridge this schism by combining the direct clinical applicability of decision-analytic methods with the mathematical simplicity of accuracy metrics.
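For readers unfamiliar with decision curve analysis, its central quantity is net benefit at a chosen risk threshold: true positives per patient minus false positives per patient, with false positives weighted by the odds of the threshold. The sketch below is a minimal Python illustration of that formula on made-up data. The function names and numbers are hypothetical, and it is not the code from the paper.

```python
import numpy as np

def net_benefit(y_true, p_pred, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    y = np.asarray(y_true).astype(bool)
    p = np.asarray(p_pred, dtype=float)
    treat = p >= threshold
    n = y.size
    tp = np.sum(treat & y)    # treated patients who have the event
    fp = np.sum(treat & ~y)   # treated patients who do not
    # False positives are weighted by the odds of the threshold probability.
    return float(tp / n - (fp / n) * threshold / (1.0 - threshold))

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prev = float(np.mean(y_true))
    return prev - (1.0 - prev) * threshold / (1.0 - threshold)

# Hypothetical outcomes and predicted risks (illustrative only).
y = [0, 0, 1, 1, 0]
p = [0.1, 0.2, 0.8, 0.7, 0.3]

nb_model = net_benefit(y, p, threshold=0.25)
nb_all = net_benefit_treat_all(y, threshold=0.25)
print(f"model={nb_model:.3f} treat-all={nb_all:.3f} treat-none=0.000")
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the treat-all and treat-none (net benefit of zero) strategies.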

trumanfrancis · March 31, 2022, 9:26pm

According to Ron Howard of Stanford, conditional probability is something everyone from judges to MDs gets wrong. Professor Ron Howard: Conditional Probability Lesson - YouTube

llynn · April 9, 2022, 6:01pm

This is important work, so I wish you well. I think I understand their concern, which would rest on the term “set of putative predictors”. The set would need to be complete to provide reliable prediction. Without a hypothesis defining the specific surgery or illness, it may be difficult to determine whether the set is complete. For example, FEV1 is highly predictive of mortality with some surgical procedures and less so with others.

Also, although some might disagree, the defining of a universal set applicable to urgent surgeries would have to include relational time series data since the trajectory of the time series matrix of the patient is pivotal. Here I am referring to the mathematical patient model, the human time series matrix. In that model a force acting on a patient generates a distortion of the matrix comprised of a set of pathologic perturbations and compensatory perturbations followed by recovery vectors. The relational time pattern of that distortion is the state of disease. We see components of this distortion when we discuss platelets falling and their slope or the WBC count falling despite the presence of worsening disease.

Yet a model based on a set of static independent variables along the matrix can be useful in the evaluation of populations, or its output may be trended to insert time into the output. The clinical value of static models also depends on the volatility of the variables and on whether they are part of an acute distortion or part of a low-volatility (stable) baseline matrix (which may be abnormal due to chronic disease).

For these reasons I think a hypothesis which considers these issues (perhaps with an alternative model of the mathematical human) is required.