I work in an area where we typically deal with observational data, largely nonprobability samples, used to describe trends (i.e. to make inferences about changes over time in broader populations). I have many stats reference books and have read many papers, but I have found it difficult to find a reference that clearly outlines the difference between (asymptotic) bias in estimators and the systematic biases that result from things like model misspecification (e.g. confounding) or non-random sampling. Even in the books I own, the topic is rarely dealt with as explicitly as I had hoped. I wonder whether datamethods readers could recommend books or papers that clearly and explicitly address these two broad senses of “bias” (and note that I am more interested in descriptive inference than causal inference – I have many texts on the latter). I should add that I understand the difference; I just lack a “canonical” reference!
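For concreteness, the two usages I have in mind can be written roughly as follows (my own notation, not taken from any particular source):

```latex
\text{(i) estimator bias:}\quad
\operatorname{Bias}(\hat\theta) \;=\; \mathbb{E}[\hat\theta] - \theta ,
\qquad \text{which for a consistent estimator} \to 0 \text{ as } n \to \infty;
```

```latex
\text{(ii) systematic bias:}\quad
\operatorname*{plim}_{n\to\infty}\hat\theta \;=\; \theta^{*} \neq \theta ,
\qquad \text{e.g. from confounding or non-random selection, which no increase in } n \text{ removes.}
```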
Does this help?
Thanks. My understanding is that this is a catalogue of the different types of systematic bias that can occur in health research. What I am looking for is a reference that clearly states the difference between the use of the word “bias” to refer to biased estimators in the sense of classical theory, and its use to refer to systematic errors of this kind.
whuber clearly notes the difference here, but I’m looking for something “canonical” that is better for citing in academic work.
I am currently writing an article for medical readers on exactly these distinctions and how they affect clinical practice, but it will probably take a few months to come out. The typical statistical definition of bias, the difference between the average value of the estimator and the true value of the target parameter, is indeed not granular enough. The epidemiology literature instead defines bias as systematic error, i.e., the nonrandom difference between an estimate and the true value of the target parameter. Textbooks such as this one do a great job of disentangling the error types. This overview of hierarchical modeling is not as detailed, but is certainly cognizant of these distinctions. Hope this helps.
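A quick simulation may make the contrast concrete (a sketch of my own, with made-up numbers: variance estimation and a simple truncation rule standing in for the two kinds of bias). The MLE of the variance is a classic finite-sample-biased estimator whose bias vanishes as n grows, whereas a non-random inclusion rule shifts the estimate away from the truth no matter how large n gets:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma = 5.0, 2.0     # true population mean and SD (illustrative values)
n, reps = 50, 20_000

# (1) Classical estimator bias: the MLE of the variance divides by n,
#     so E[var_mle] = (n-1)/n * sigma^2 -- biased, but it shrinks as n grows.
# (2) Systematic (selection) bias: observing only units above a cutoff
#     pushes the sample mean above theta, and stays there for any n.

mle_var, sel_mean = [], []
for _ in range(reps):
    x = rng.normal(theta, sigma, size=n)
    mle_var.append(x.var(ddof=0))      # divides by n: finite-sample bias
    kept = x[x > theta - 1.0]          # non-random inclusion rule
    sel_mean.append(kept.mean())

print(np.mean(mle_var))   # close to (n-1)/n * 4 = 3.92, not 4
print(np.mean(sel_mean))  # systematically above 5, regardless of n
```

With these numbers the truncated-normal mean works out to roughly 6, so the selection-biased average sits about a full unit above the target even though each individual observation is measured without error.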
I remember reading something about this by @sander a while back, and posted some links in this thread; the articles make a subtle distinction between bias and confounding…
Thanks @Pavlos_Msaouel. Chapter 2 of James et al. (available here) is a very useful overview, and Sander Greenland’s writing always contains numerous fascinating insights. Ideally, though, I would find something even more explicit than these about the distinction/link between the two conceptions of bias. I can see that the irreducible error of James et al. could contain a systematic pattern, and I can also see various places where the Greenland paper alludes to this aspect of inference. No doubt it is just a case of citing a few different sources and making my own definition clear in what I write.
And thanks @R_cubed. Yes, I see the link to aspects of exchangeability etc. I have read a few papers/books that draw the link between causal inference in observational settings (adjusting for confounders) and extrapolating from nonprobability samples to a target population (adjusting for selection bias). I’ll be interested to read that thread and learn more.
Yup, exactly. Because the definitions differ so widely between fields, yet the topic is so multi-disciplinary, you have to cite several resources and explicitly state your definitions. That is what I am currently doing as well.
And you are exactly right to observe that @Sander alludes to irreducible error and its features throughout his writings without, as far as I can tell, ever explicitly naming it as such. But he can correct me if I missed it somewhere.