Very small number of groups - MLM or OLS?

I’m working with observational data collected via random samples of patient charts. The goal of the study is to assess whether a new treatment regimen is more effective than the standard of care. Data were collected from the periods before and after the new regimen was implemented, with different patients in each period.

Primary dependent variable = pain score

Primary independent variable = new treatment regimen

Potential level 2 variable = physician

Data were collected on the patients of three physicians who implemented the new treatment regimen. There are roughly equal numbers of patients per physician, with a total N ≈ 300. Prior research suggests that between-physician variability on these types of outcomes is likely high (I haven’t checked this in my own data yet because collection is still ongoing), so I thought a multilevel model would make sense here.

However, after reading some reviews (for example, page 6 of https://www.tandfonline.com/doi/pdf/10.1080/00273171.2016.1167008) and simulation studies that examined the performance of MLMs when the number of clusters is small, I’m wondering if this should just be addressed via an OLS model (or some variation of that), with physician as a control variable. I’ve also read that Bayesian methods could be applied using a half-t or half-Cauchy prior, but I’m not familiar with Bayesian methods and am wondering whether it’s worth my time to get acquainted with them given such a low number of clusters. Any guidance would be greatly appreciated.
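
To make the OLS option concrete, here is a minimal sketch of what I mean by "physician as a control variable". It runs on simulated data, since the real data are still being collected. The physician labels, baseline pain levels, noise level, and effect size are all made up for illustration, and the function names are my own. The treatment coefficient is computed by centering pain score and treatment within each physician, which gives the same coefficient as an OLS regression of pain on treatment plus physician dummy variables.

```python
import random
from collections import defaultdict


def within_physician_effect(rows):
    """Treatment coefficient from OLS of pain on treatment + physician dummies.

    rows: iterable of (physician, treated, pain) with treated coded 0/1.
    Centering both variables within physician and regressing one on the
    other reproduces the dummy-variable OLS coefficient (Frisch-Waugh-Lovell).
    """
    by_physician = defaultdict(list)
    for physician, treated, pain in rows:
        by_physician[physician].append((treated, pain))

    sxy = 0.0  # sum of centered treatment * centered pain
    sxx = 0.0  # sum of squared centered treatment
    for obs in by_physician.values():
        mean_t = sum(t for t, _ in obs) / len(obs)
        mean_p = sum(p for _, p in obs) / len(obs)
        for t, p in obs:
            sxy += (t - mean_t) * (p - mean_p)
            sxx += (t - mean_t) ** 2
    return sxy / sxx


def simulate(seed=0, n_per_cell=50, effect=-1.0):
    """Simulate 3 physicians x 2 periods x n_per_cell patients (N = 300).

    All numbers here are hypothetical, not taken from the study.
    """
    rng = random.Random(seed)
    baseline = {"A": 6.0, "B": 5.0, "C": 4.0}  # assumed physician-level means
    rows = []
    for physician, base in baseline.items():
        for treated in (0, 1):
            for _ in range(n_per_cell):
                pain = base + effect * treated + rng.gauss(0.0, 1.5)
                rows.append((physician, treated, pain))
    return rows


rows = simulate()
estimate = within_physician_effect(rows)
print(f"Estimated treatment effect: {estimate:.2f}")
```

This only produces a point estimate and treats the physician effects as fixed. It does not tell me whether inference based on three physicians is trustworthy, which is the part I am unsure about.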