Precision and decision making

Sensitivity and specificity apply to retrospectively sampled data, e.g., case-control studies, and play no role in prediction. The prediction problem does not condition on the future to predict the past.

I have changed the wording to odds ratio throughout. However, for ovarian cancer the rare-disease assumption seems quite reasonable, so the OR does approximate the RR.

I am primarily interested in the principle around the precision of estimates and clinical decision making. Whether any given estimate is relevant to a specific population is a different issue (though important).

This is probably a more complex question than you hoped, since your decision to treat will also need to depend on things like outcomes from surgery, life expectancy with/without Sx, costs (to help capture opportunity cost), quality of life in any additional years, and so on; and that's before you even get to the uncertainty part. So in my opinion what you need is some sort of loss function. The approach I am most familiar with in health is to build a cost-effectiveness model, but you might also be interested in (or able to get away with) something like stochastic multi-criteria acceptability analysis. In both cases I would argue that precision is only important if you are in a position to recommend further research (see: https://www.ncbi.nlm.nih.gov/pubmed/10537899).

I guess you could also look at whether recommendations would differ along your CI? But again I think that depends on assumptions you make around baseline risk and all the risks/benefits of subsequent therapy.

I would say that the RR approximates the more valuable OR in that case :slight_smile:
OR is more transportable since unlike RR it does not depend on baseline risk.

My point was to take all those other issues as given and assume that the decision threshold was established. This is quite a pragmatic issue. Many family cancer clinics offer prophylactic surgery at a given risk threshold. If the odds ratio estimate for a given gene were 3 (95% interval 0.3 to 30), then it is unlikely that anyone would be testing for that gene at all. The idea that clinicians would be carrying out some complex sliding-scale decision analysis seems unlikely. So, at what level of precision does it become useful to test for that gene?

I guess what I’m saying is that I don’t think this is really possible without some sort of loss function. How expensive is the test? Who bears those costs? What care is foregone as a result of the loss of that money/clinician time? From whose perspective is the decision being made? How long will the test be relevant? What benefit does it provide over current best practice (genetic risk + history and environment, I assume)? I guess you could try to assume that the OR is directly proportional to the utility of the decision, but I have a hard time believing that will lead to good decision making. Even then, the uncertainty related to the decision is more relevant for deciding whether to delay implementation and fund future research to increase precision, or to implement now and also fund research.

Say you use a rule that says you shouldn’t recommend treatment if your 95% credible interval includes 2 (because of some sense that the risks of surgery, or risks plus costs, then outweigh the benefits). In that case you would not recommend use of the screening tool even though it is more probable than not to be effective. The link above is a seminal paper on exactly this problem (i.e. how to incorporate uncertainty into decision making).
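
To make that concrete, here is a purely illustrative sketch (not the actual analysis): assume the posterior for the log-OR is approximately normal, with a median OR of 3 and a 95% credible interval of about 0.3 to 30 as in the numbers quoted above. The interval includes 2, yet P(OR > 2) is still above one half.

```python
# Illustrative only: an assumed approximately-normal posterior on the log-OR scale,
# calibrated to a median OR of 3 and a 95% credible interval of (0.3, 30).
import numpy as np
from scipy import stats

median_or = 3.0
lower, upper = 0.3, 30.0
sd_log_or = (np.log(upper) - np.log(lower)) / (2 * 1.96)  # ~1.17 on the log scale

# Probability that the OR exceeds the decision threshold of 2
p_gt_2 = 1 - stats.norm.cdf(np.log(2.0), loc=np.log(median_or), scale=sd_log_or)
print(f"P(OR > 2) = {p_gt_2:.2f}")  # ~0.64: "more probable than not" despite the wide interval
```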

Decision analysis for a given population would be done by a research team and then used by clinicians and patients as part of the decision making process if the patient in front of them is relevant to that analysis.

Some other thoughts that come to mind:

  • Your OR is really only a tool to turn an assumed baseline risk into an absolute probability for a given population (see the short sketch after this list). Making a cut point based on an OR assumes that you have a well enough defined patient population that they all have similar balances of benefits/harms at that OR.
  • All of this also assumes your predictions are well-calibrated, your OR of 3.01 can be interpreted causally, etc…
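
On the first bullet, here is a minimal sketch of that conversion, with a purely illustrative 2% baseline risk and an OR of 3, treating the OR as applying directly to the pre-test odds (a reasonable approximation when the disease is rare):

```python
# Hypothetical numbers only: convert an assumed baseline (pre-test) risk and an OR
# into the absolute risk implied for test-positive patients.
def absolute_risk(baseline_risk: float, odds_ratio: float) -> float:
    baseline_odds = baseline_risk / (1 - baseline_risk)
    post_odds = baseline_odds * odds_ratio
    return post_odds / (1 + post_odds)

print(absolute_risk(0.02, 3.0))  # ~0.058, i.e. roughly a 6% absolute risk
```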

I think we are discussing at cross purposes. My point is not whether 3 or 4 or 2 or any other value is the right threshold, but, given a threshold, which credible interval is relevant. You selected 95%, but why? Why not require that the 50% credible interval excludes 2, or even just the point estimate?

I just used 95% to be consistent with what you were using above.

The relevant interval is going to depend on whatever loss function you implicitly used when coming to your OR = 3 threshold. Maybe the easier question is: at what point would you no longer recommend treatment? OR = 1.3? 2? 2.5? Are you risk averse? Is the patient? Maybe the best approach would be to make a plot with ORs ranging from 1 to 5 on the x axis and P(OR at least that large) on the y axis.
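
A sketch of that plot, assuming (purely for illustration) an approximately normal posterior for the log-OR centred at OR = 3:

```python
# Posterior exceedance probabilities P(OR >= x) for x from 1 to 5.
# The posterior mean and SD below are hypothetical, not estimates from any data.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

mean_log_or, sd_log_or = np.log(3.0), 0.5  # assumed posterior for the log-OR
x = np.linspace(1, 5, 200)
p_exceed = 1 - stats.norm.cdf(np.log(x), loc=mean_log_or, scale=sd_log_or)

plt.plot(x, p_exceed)
plt.xlabel("Odds ratio")
plt.ylabel("P(OR at least this large)")
plt.title("Posterior exceedance probabilities")
plt.show()
```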

Whether or not you recommend using the tool shouldn’t be based on any cut-point from that plot though, if your goal is to maximize patient benefit.

I respectfully disagree. It is perfectly possible to estimate the accuracy of a test by estimating its prospective sensitivity and specificity. Here are a couple of examples: doi: https://doi.org/10.1136/bmj.327.7426.1267 and https://doi.org/10.1111/j.1464-5491.2005.01494.x. The “prospective” sensitivity and specificity of the test may be the best way to assess the accuracy of any tool used for prediction, at least from the clinical perspective. If we are to make a treatment offer to a patient based on the results of a predictive test, which is the issue in this case (if I got it right), we should know how good our prediction would be conditional on the finding of the predictive tool. A relative measure of effect does not tell us that. We do not offer treatments based on risk ratios or odds ratios.

As suggested in a previous comment, what is needed in this case is some sort of loss function, such as a cost-effectiveness analysis. Otherwise, we cannot know the net benefit of using this prediction tool or its incremental cost-effectiveness ratio compared with the standard option (i.e. not using the prediction tool). For instance, when we use the Framingham equation to inform the decision to offer preventive treatment, we do not do it because the risk of cardiovascular events in patients with a high Framingham score is 5 or 10 times higher than in patients with a low score. We do it because we know that a sizable fraction of patients with a high score will develop a cardiovascular event and that, even if not all of them do, the benefit of this approach (selecting patients by their expected risk) is greater than the cost (including the cost of not treating patients with a low score who will develop a cardiovascular event).

I have obviously failed to explain the issue sufficiently. The Framingham score is somewhat distinct from a genetic test. The genetic test is either positive or negative - there is not some sort of continuous measure with the test being labelled as positive above a given threshold.

The penetrance or disease risk is akin to the positive predictive value: it’s the probability that someone who tests positive will get the disease in the future. So we have a direct estimate of the PPV, and the issue is about the precision of that risk estimate associated with a positive test.

I would no longer offer intervention if I were certain the OR<3. The question is how certain?

Thanks for clarifying this.

I think you have been clear. You’re asking a question about uncertainty in decision making, but I think the issue is that we approach recommendations for medical decision making very differently. I just don’t think you can answer any questions about how you should deal with this kind of uncertainty without really knowing the costs of being wrong/benefits of being right.

If we take for granted that an OR of 3 is a good decision-making rule, and that any value below 3 would result in withholding the test, then the value of information (i.e. future research) would be at its peak. The question then is whether future research is actually worth it, since as you mention:

So one approach to whether you should make a decision today vs do further research is to conduct a value of information analysis which requires:

  1. An estimate of the size of the population for whom the test is relevant
  2. A time horizon over which this test would be relevant (i.e. when it will be replaced)
  3. The cost of future research
  4. The time required for additional research
  5. A loss function

If the cost of future research (in $ and foregone health) is more than the value of that information, then the right decision is to make the decision today that maximizes utility. This way you’re taking into account all potential benefits and harms, including the harm of delaying access. This assumes you aren’t risk averse, so it will change if you are (i.e. if being a little wrong is worse than the truth being a little better). You may be interested in reading Gianluca Baio, Karl Claxton, Mark Sculpher, Andrew Briggs (white book). This is why I don’t think you can separate the problems of specifying a cut point and deciding how you should handle uncertainty in your parameter.
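
For illustration only, here is a bare-bones sketch of a per-person expected value of perfect information (EVPI) calculation with two strategies (offer the test or not) and a made-up net-benefit function of the uncertain OR; a real analysis would use the full loss function and inputs 1-5 above:

```python
# Toy EVPI sketch: every number and the net-benefit model are assumptions, not real inputs.
import numpy as np

rng = np.random.default_rng(1)
or_samples = np.exp(rng.normal(np.log(3.0), 0.5, size=100_000))  # hypothetical posterior for the OR

# Net benefit of each strategy as a function of the true OR:
# column 0 = do not offer the test (reference), column 1 = offer the test,
# which we *assume* pays off in proportion to how far the true OR exceeds 2.
nb = np.column_stack([
    np.zeros_like(or_samples),
    10.0 * (or_samples - 2.0),
])

# Per-person EVPI = E[max over strategies of NB] - max over strategies of E[NB]
evpi = nb.max(axis=1).mean() - nb.mean(axis=0).max()
print(f"Per-person EVPI: {evpi:.2f}")
# Population EVPI multiplies this by the (discounted) number of patients over the time
# horizon (points 1-2 above) and is then compared with the cost of research (points 3-4).
```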

Using retrospective reverse information-flow quantities of sensitivity and specificity is like making three right turns to do a left turn. Coupled with the fact that sens and spec vary significantly by patient types it’s very hard to see why the use of transposed conditionals ever caught on in the first place. If you like sens and spec then you’d have to like the following way of summarizing a hypertension randomized trial: patients with higher blood pressures were more likely to have been randomized to treatment A.

Let me clarify my original concern, as I’m finding it hard to grasp the relevance of your arguments. If I understood correctly, what motivated the original question was the investigators’ desire to develop a clinical decision rule: when should we offer an oophorectomy to a patient with a positive genetic test? To answer that question the investigators are looking for ways to refine the precision of the odds ratio for the association between the gene variant(s) and the incidence of ovarian cancer. My point (limitation?) is that I fail to see how such a rule could be based on a measurement of the association between the gene variant(s) and the disease. As far as I understand, such a rule cannot be defined without taking into consideration the properties of the test (sensitivity and specificity), the expected incidence of the disease, and the absolute and relative costs and benefits of the proposed intervention. I fail to see what this has to do with the flow of information in estimating sensitivity and specificity or the direction of causality in interpreting randomized trials.

They are not that different, since we make the Framingham score positive or negative by using a predefined cut point. I think Tim’s suggestions are right on target: “we approach recommendations for medical decision making very differently”… you should conduct a value of information analysis (I’d say a cost-effectiveness analysis) to define the best cut point. Assuming you could get the best precision possible for the OR, I still do not see how you would go from the OR to a clinical decision rule (a decision to offer or not offer an oophorectomy to a patient with a positive test). Of course, my not being able to figure out how to do this does not mean it is impossible.

Yes, thanks for pointing us back to the original question. The sensitivity and specificity you mentioned were unnecessary to the discussion.

While having good precision in estimating ORs (and adjusted regression effects in general) is important, the discussion really needs to be on the absolute risk scale with full adjustment for all relevant clinical variables. Then one can simply plot post-test probabilities against pre-test probabilities to see the relevance (where “test” = new information such as genetics) for clinical decision making, as is shown here.
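
A sketch of the kind of plot described, under the simplifying assumption that the new genetic information enters as a single OR applied to the pre-test odds (which ignores the covariate adjustment just mentioned); the OR values are illustrative:

```python
# Post-test vs pre-test probability when a positive genetic test multiplies the
# pre-test odds by an assumed OR. Purely illustrative OR values.
import numpy as np
import matplotlib.pyplot as plt

pre = np.linspace(0.001, 0.5, 200)
for or_value in (2.0, 3.0, 5.0):
    pre_odds = pre / (1 - pre)
    post = pre_odds * or_value / (1 + pre_odds * or_value)
    plt.plot(pre, post, label=f"OR = {or_value}")
plt.plot(pre, pre, "k--", label="No added information")
plt.xlabel("Pre-test probability")
plt.ylabel("Post-test probability")
plt.legend()
plt.show()
```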

I made it explicit in my original post that this is not the question. The question is: given a defined clinical threshold based on a risk, how precisely do we need to estimate that risk? I can only suggest you go back to my original question and reread it carefully.

I see your point now, and I believe we’re on the same page. This is my overall take: a precise OR would be best for assessing whether the new genetic test adds to the “standard” clinical data in predicting who will get the disease. This is a necessary step before determining how to make the best use, from the clinical perspective, of an improved risk score. Some form of cost-effectiveness or value of information analysis should be used in the latter step.