What kind of model to use?


I am working on a project where N doctors were asked to prescribe the same medication to the same set of n hypothetical patients.

Each doctor could choose a different dose of medication. For the purposes of this question, let’s assume they could choose between “small dose” and “large dose” and that n = 3 (i.e., there are 3 patients, seen by all doctors). Each of the 3 patients presents with a different indication.

As an example:

| | Patient # 1 (Headache) | Patient # 2 (Muscle Ache) | Patient # 3 (Other Aches) |
|---|---|---|---|
| **Doctor # 1** | "Small dose" | "Small dose" | "Large dose" |
| **Doctor # 2** | "Large dose" | "Small dose" | "Small dose" |

There are two types of doctors in the data and interest lies in estimating the effect of doctor type on the probability of prescribing a “Large dose” rather than a “Small dose”, controlling for the amount of the total daily dose.

My first thought was to treat doctors and patients as crossed random effects in a mixed-effects binary logistic regression model relating dose type to doctor type, controlling for total daily dose. However, not only is the number of patients small, but I also have only one dose type per doctor-by-patient combination. So this approach is out of the question.
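To make the data layout concrete, here is a minimal Python sketch that reshapes the example table above into the long format a mixed model would need. The variable names (`wide`, `long_rows`, `large_dose`) are my own, and the real data would also carry doctor type and total daily dose columns, which the example table does not show.

```python
from collections import Counter

# Wide layout copied from the example table (rows = doctors, columns = patients).
wide = {
    "Doctor 1": {"Patient 1": "Small dose", "Patient 2": "Small dose", "Patient 3": "Large dose"},
    "Doctor 2": {"Patient 1": "Large dose", "Patient 2": "Small dose", "Patient 3": "Small dose"},
}

# Long layout: one row per doctor-by-patient combination, with a 0/1 outcome.
long_rows = [
    {"doctor": doctor, "patient": patient, "large_dose": int(dose == "Large dose")}
    for doctor, by_patient in wide.items()
    for patient, dose in by_patient.items()
]

# Count how many observations fall in each doctor-by-patient cell.
cell_counts = Counter((row["doctor"], row["patient"]) for row in long_rows)

print(len(long_rows))             # number of rows in the long layout
print(set(cell_counts.values()))  # distinct cell sizes across the design
```

Every cell holds exactly one binary observation, and the patient factor has only 3 levels.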

I also don’t know whether it makes sense to drop patient as a random effect from the above model. The fact that all doctors consulted the same patients doesn’t seem to bode well for a model without a random patient effect.

Then I thought that I could analyze the data for each doctor separately (since we are not really interested in making comparisons across indications) via binary logistic regression, but what niggles at me is that the doctors all consult the same (hypothetical) patient for each indication. This likely invalidates the assumption of independence of the dose types prescribed by different doctors to the same patient. Maybe if I used the quasibinomial distribution in the binary logistic regression I would feel slightly better about this simplified approach (presuming the quasibinomial might be able to correct to some extent for the lack of independence), but I can’t because of separation issues for one of the indications.

Not sure what other types of models would make sense here - any ideas would be much appreciated.



Isabella, your question seems to regard your data as if they unexpectedly dropped on your desk. What’s the ‘origin story’ for these data? Why were they collected? What questions did the investigators have which motivated their conduct of this study? Ultimately, your scientific questions should determine the model(s) you attempt to estimate.

Just to make up an example, maybe there are 2 different specialties treating these problems, and the question is whether professional standards in one or the other specialty result in more uniform prescribing, or perhaps outright differences in dosing.



The question is whether there are differences with respect to the two types of investigators in whether they prescribe one dose type rather than the other, controlling for daily dose; this question can be pursued separately for each type of indication.

I was not involved in the study design, so in that sense the data landed on my desk. But I have my doubts as to whether the question can be answered given the current design - I guess that is what I am really trying to figure out.

The setting that generated the data would be more appropriate if one wanted to look at agreement in prescription between physicians (at least that’s what it seems like to me).



The “agreement” thought struck me also. Were that of interest, you might look at the observer variability chapter of BBR (Biostatistics for Biomedical Research).
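As a rough illustration of what an agreement analysis could look like, here is a minimal Python sketch of Cohen's kappa for two doctors rating the same patients. This is a generic chance-corrected agreement measure that I am swapping in for illustration, not necessarily the approach the BBR chapter recommends, and the function name `cohen_kappa` is my own. The ratings are the two rows of the example table in the question.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who rated the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)

# Doses from the example table, in patient order 1, 2, 3.
doctor_1 = ["Small", "Small", "Large"]
doctor_2 = ["Large", "Small", "Small"]

kappa = cohen_kappa(doctor_1, doctor_2)
print(round(kappa, 3))
```

For the two example doctors this works out to about -0.5 (they agree on one patient out of three, against 5/9 expected by chance). With only 3 patients, any such pairwise estimate would be extremely imprecise, so it is best read as a description of the design rather than a result.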