> But if you look at page 89, under the description of “bias,” you’ll see an example of why physicians get confused when they’re told not to look for baseline covariate imbalances as possible sources of bias…
I wanted to bump this because I found a great blog post that gives an accessible, constructive proof of why it is a mistake to judge the validity of a randomization procedure by checking covariate balance in a single realized allocation.
Covariate-Based Diagnostics for Randomized Experiments are Often Misleading (link)
Assuming a homogeneous population, a large N, a true treatment effect, complete randomization, and K covariates, we can show that:
> This leads to a strange disconnect: the outcome of our experiment is always a perfect estimate of the ATE, but we can always detect arbitrarily strong imbalances in covariates. This disconnect occurs for the simple reason that there is no meaningful connection between covariate imbalance and successful randomization: balancing covariates is neither necessary nor sufficient for an experiment’s inferences to be accurate. Moreover, checking for imbalance often leads to dangerous p-hacking because there are many ways in which imbalance could be defined and checked. In a high-dimensional setting like this one, there will be, by necessity, extreme imbalances along some dimensions. For me, the lesson from this example is simple: we must not go looking for imbalances, because we will always be able to find them.
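For anyone who wants to see the disconnect numerically, here is a rough simulation sketch. It is my own illustration, not code from the blog post. The assumptions are mine: "homogeneous population" is read as every unit having identical potential outcomes, the covariates are pure noise unrelated to the outcome, and the function names (`simulate`, `mean_and_variance`) and default values are hypothetical.

```python
import math
import random


def mean_and_variance(xs):
    """Return the sample mean and the unbiased sample variance of xs."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def simulate(n=2000, k=200, effect=1.0, baseline=2.0, seed=0):
    """Run one completely randomized experiment and check k noise covariates.

    Returns (ate_hat, largest |z| across covariates, number with |z| > 1.96).
    """
    rng = random.Random(seed)

    # Complete randomization: exactly half of the units are treated.
    assignment = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(assignment)

    # Assumption: homogeneous population, so every unit has the same
    # potential outcomes (baseline if control, baseline + effect if treated).
    outcomes = [baseline + effect * t for t in assignment]
    treated_y = [y for y, t in zip(outcomes, assignment) if t == 1]
    control_y = [y for y, t in zip(outcomes, assignment) if t == 0]
    ate_hat = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

    # k covariates of pure noise, each tested for "imbalance" between arms.
    abs_z = []
    for _ in range(k):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xt = [v for v, t in zip(x, assignment) if t == 1]
        xc = [v for v, t in zip(x, assignment) if t == 0]
        mt, vt = mean_and_variance(xt)
        mc, vc = mean_and_variance(xc)
        se = math.sqrt(vt / len(xt) + vc / len(xc))
        abs_z.append(abs(mt - mc) / se)

    flagged = sum(z > 1.96 for z in abs_z)
    return ate_hat, max(abs_z), flagged


if __name__ == "__main__":
    ate_hat, max_abs_z, flagged = simulate()
    print(f"estimated ATE: {ate_hat:.3f}")
    print(f"largest |z| among covariates: {max_abs_z:.2f}")
    print(f"covariates 'imbalanced' at the 5% level: {flagged}")
```

Under these assumptions the difference in means recovers the treatment effect exactly in every allocation, yet with 200 independent noise covariates we expect roughly 10 to cross the nominal 5% threshold, and the chance that none do is about 0.95^200. Increasing `k` makes ever larger "imbalances" show up without changing the estimate at all.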
The epistemic state of someone involved in a randomized experiment is distinct from that of someone who is only reading its results and had no part in the planning.
Would a rational, Bayesian scientist who managed to convince another agent to set up a randomized experiment to decide a factual question accept these post hoc criticisms, especially when the other agent was permitted to decide and validate whatever randomization procedure was actually used?
Of course, the reader is not in a position to judge the quality of the randomization. He or she might discount the report based on prior knowledge, the adequacy of the reported randomization procedure, etc.
But acceptance of post hoc arguments on imbalance would mean that no RCT would ever be able to decide a question of fact, since we can always retrospectively find “imbalance” if we look hard enough.