I work in an area where we typically deal with observational data, largely nonprobability samples, used to describe trends (i.e. to make inferences about changes over time in broader populations). I have many stats reference books and have read many papers, but I have found it difficult to find a reference that clearly outlines the difference between (asymptotic) bias in estimators and the systematic biases that result from things like model misspecification (e.g. confounding) or non-random sampling. Even in the books I own, the topic is rarely dealt with as explicitly as I had hoped. I wonder whether datamethods readers could recommend books or papers that clearly and explicitly address these two broad senses of “bias” (and note that I am more interested in descriptive inference than causal inference – I have many texts on the latter). I should add that I understand the difference; I just lack a “canonical” reference!
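For concreteness, the two usages I have in mind can be written roughly as follows (my own notation, not taken from any particular source):

```latex
\text{(i) estimator bias:}\quad
\operatorname{Bias}(\hat\theta) \;=\; \mathbb{E}[\hat\theta] - \theta ,
\qquad \text{which for a consistent estimator} \to 0 \text{ as } n \to \infty;
```

```latex
\text{(ii) systematic bias:}\quad
\operatorname*{plim}_{n\to\infty}\hat\theta \;=\; \theta^{*} \neq \theta ,
\qquad \text{e.g. from confounding or non-random selection, which no increase in } n \text{ removes.}
```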
Does this help?
Thanks. My understanding is that this is a catalogue of the different types of systematic bias that can occur in health research. What I am looking for is a reference that clearly states the difference between the use of the word “bias” to refer to biased estimators in the sense of classical theory, and its use to refer to systematic errors of this kind.
whuber clearly notes the difference here, but I’m looking for something “canonical” that is better for citing in academic work.
I am currently writing an article for medical readers on exactly these distinctions and how they affect clinical practice, but it will probably take a few months to come out. The typical statistical definition of bias, the difference between the average value of the estimator and the true value of the target parameter, is indeed not granular enough. The epidemiology literature instead defines bias as systematic error, i.e., the nonrandom difference between an estimate and the true value of the target parameter. Textbooks such as this one do a great job of disentangling the error types. This overview of hierarchical modeling is not as detailed, but is certainly cognizant of these distinctions. Hope this helps.
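A quick simulation may make the contrast concrete (a sketch of my own, with made-up numbers: variance estimation and a simple truncation rule standing in for the two kinds of bias). The MLE of the variance is a classic finite-sample-biased estimator whose bias vanishes as n grows, whereas a non-random inclusion rule shifts the estimate away from the truth no matter how large n gets:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma = 5.0, 2.0     # true population mean and SD (illustrative values)
n, reps = 50, 20_000

# (1) Classical estimator bias: the MLE of the variance divides by n,
#     so E[var_mle] = (n-1)/n * sigma^2 -- biased, but it shrinks as n grows.
# (2) Systematic (selection) bias: observing only units above a cutoff
#     pushes the sample mean above theta, and stays there for any n.

mle_var, sel_mean = [], []
for _ in range(reps):
    x = rng.normal(theta, sigma, size=n)
    mle_var.append(x.var(ddof=0))      # divides by n: finite-sample bias
    kept = x[x > theta - 1.0]          # non-random inclusion rule
    sel_mean.append(kept.mean())

print(np.mean(mle_var))   # close to (n-1)/n * 4 = 3.92, not 4
print(np.mean(sel_mean))  # systematically above 5, regardless of n
```

With these numbers the truncated-normal mean works out to roughly 6, so the selection-biased average sits about a full unit above the target even though each individual observation is measured without error.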
I remember reading something about this by @sander a while back, and posted some links in this thread; the articles make a subtle distinction between bias and confounding…
Thanks @Pavlos_Msaouel. Chapter 2 of James et al. (available here) is a very useful overview, and Sander Greenland’s writing always contains numerous fascinating insights. Ideally, though, I would find something even more explicit than these about the distinction/link between the two conceptions of bias. I can see that the irreducible error of James et al. could contain a systematic pattern, and I can also see various places where the Greenland paper alludes to this aspect of inference. No doubt it is just a case of citing a few different sources and making my own definition clear in what I write.
And thanks @R_cubed. Yes, I see the link to aspects of exchangeability etc. I have read a few papers/books that draw the link between causal inference in observational settings (adjusting for confounders) and extrapolating from nonprobability samples to a target population (adjusting for selection bias). I’ll be interested to read that thread and learn more.
Yup, exactly. Because the definitions differ so widely between fields, yet the topic is so multi-disciplinary, you have to cite several resources and explicitly state your definitions. That is what I am currently doing as well.
And you are exactly right to observe that @Sander alludes to irreducible error and its features throughout his writings without, as far as I can tell, ever explicitly naming it as such. But he can correct me if I missed it somewhere.