Evaluating how well a proxy measurement represents the real thing

A friend of mine consulted me about his design for an interesting experiment. I don’t fully understand the medical context, so I hope I am not messing this up, but I am interested in the abstract question anyway. The situation is that there are two ways to determine whether a patient’s post-operative heart function will improve when they are given intravenous fluids (I’m told this helps in some, but by far not all, cases):

  1. Give them a small amount of fluids and see what happens. This gets you a very good answer, but if the patient does not respond, you’ve just caused the body some unnecessary stress.
  2. A non-invasive, proxy measurement. Possibly more error, but less stress on the patient.

The point is to determine how well 2) approximates 1). In the study, they plan to measure 2) first and then do 1) for a set of patients. The reverse order doesn’t make sense, as 1) is likely to influence the result of 2). The two measurements have quite different units.

My first response would be to fit a linear or some restricted non-linear (e.g. monotonic splines) Bayesian model, then look at the distribution of the residual errors and interpret those directly, as in “The posterior probability that the error is < [clinically relevant difference] is X%; the maximal observed error is Y, which is still acceptable unless the patient is at high risk”. Or even “Assuming this decision rule, the probability of an incorrect decision is Z%”.
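To make the last kind of statement concrete, here is a minimal sketch of the “probability of an incorrect decision” summary. It is not the regression model described above. It swaps in something much simpler: assume a fixed cutoff on the proxy, count how often the resulting decision disagrees with the fluid challenge, and put a Beta(1, 1) prior on the error rate. The function name, the cutoff, the acceptable error rate and the data are all made up for illustration.

```python
import random


def decision_error_summary(proxy, responded, cutoff, max_error_rate,
                           n_draws=20000, seed=1):
    """Posterior summary of how often the rule 'proxy >= cutoff -> give fluids'
    disagrees with the outcome of the actual fluid challenge.

    All argument names are hypothetical; cutoff and max_error_rate would have
    to come from clinical judgement.
    """
    predicted = [p >= cutoff for p in proxy]
    n = len(proxy)
    n_wrong = sum(pred != resp for pred, resp in zip(predicted, responded))

    # Beta(1, 1) prior on the error rate -> Beta(1 + wrong, 1 + right) posterior.
    a, b = 1 + n_wrong, 1 + (n - n_wrong)

    # Monte Carlo estimate of P(error rate < max_error_rate | data).
    rng = random.Random(seed)
    draws = [rng.betavariate(a, b) for _ in range(n_draws)]
    prob_acceptable = sum(d < max_error_rate for d in draws) / n_draws

    return {
        "n_wrong": n_wrong,
        "posterior_mean": a / (a + b),
        "prob_acceptable": prob_acceptable,
    }


# Made-up illustrative data: proxy reading and whether the patient responded.
proxy = [0.8, 1.4, 2.1, 0.5, 1.9, 2.6, 1.1, 0.9, 2.3, 1.7]
responded = [False, True, True, False, True, True, False, False, True, False]

summary = decision_error_summary(proxy, responded, cutoff=1.3, max_error_rate=0.3)
print(summary)
```

Note that this treats the cutoff as fixed in advance. If the cutoff is tuned on the same patients, the error rate will look better than it really is, so a full model of the continuous measurements would still be preferable.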

But my experience is that when doing a Bayesian analysis, it is less risky in review to also run some frequentist equivalent (at least as a smokescreen :slight_smile: ). And I have no idea how to do that here. How would you approach the problem from a frequentist viewpoint?

If you have any ideas for improving the Bayesian part, I’ll also be happy to hear them.
