Differential misclassification of the outcome (by exposure status)


Hello! I work on a linked health administrative dataset of all known people living with HIV in a given geographical area; an HIV clinical database has been linked to several administrative data sources. As is typically done in these studies, outcome variables (e.g., particular conditions/diseases) are defined by some (potentially validated) set of ICD-9/ICD-10 codes. The study that I work on also includes a random sample of the general (HIV-negative) population in the same area. This random sample is often used to do “HIV+ vs. HIV-” comparisons, usually by calculating incidence rate ratios (incidence rate [HIV+] / incidence rate [HIV-]).
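For concreteness, the IRR is just a ratio of crude rates (events divided by person-time in each group). A minimal sketch with made-up numbers, not figures from the study described above:

```python
# Hypothetical counts and person-years (purely illustrative)
cases_pos, py_pos = 120, 40_000      # HIV+ group: events, person-years
cases_neg, py_neg = 180, 120_000     # HIV- random sample: events, person-years

rate_pos = cases_pos / py_pos        # 0.0030 events per person-year
rate_neg = cases_neg / py_neg        # 0.0015 events per person-year
irr = rate_pos / rate_neg            # incidence rate ratio

print(f"IRR = {irr:.2f}")            # prints "IRR = 2.00"
```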

Side note: Even after age- and sex-adjustment of the IRRs, I would not conclude that a higher incidence of condition X (e.g., emphysema) is being driven by HIV specifically, as any comparison by HIV serostatus (HIV+/HIV-) would undoubtedly be biased by additional confounding factors (unless some analytical wizardry was performed, and certain assumptions were clearly stated). Anyway…

Confounding aside (and also ignoring the myriad other issues that come with using admin data for research purposes), I find myself concerned about misclassification in this setting: specifically, differential misclassification of a hypothetical outcome (as defined by a set of ICD-9/ICD-10 codes) by “exposure” status (HIV+/HIV-).

Adapting Rothman’s words (from here: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_Bias/EP713_Bias6.html) to an HIV-related example:

Suppose a follow-up study were undertaken to compare incidence rates of emphysema (as defined by ICD codes) among HIV+ and HIV- persons. Emphysema is a disease that may go undiagnosed without unusual medical attention. If HIV+ persons, because of concern about health effects of HIV, seek medical attention to a greater degree than HIV- persons, then emphysema might be diagnosed more frequently among HIV+ than among HIV- simply as a consequence of the greater medical attention. Unless steps were taken to ensure comparable follow-up, an information bias would result. An ‘excess’ of emphysema incidence would be found among HIV+ compared with HIV- that is unrelated to any biologic effect of HIV. This is an example of differential misclassification, since the underdiagnosis of emphysema, a misclassification error, occurs more frequently for HIV- than for HIV+.

Essentially, one can only be “measured” in administrative data if one presents to care. Given that HIV-negative people are usually less engaged with the healthcare system than people living with HIV, you are almost always bound to find a higher incidence rate of everything in the HIV+ group; I would wager there are several scenarios where that excess is mostly/entirely due to misclassification rather than to HIV having any impact on the outcome.
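To put numbers on the Rothman-style example: even with identical true incidence in both groups (and assuming perfect specificity of the ICD-code definition), a difference in sensitivity alone produces an apparent excess in the HIV+ group. The sensitivities below are made-up values for illustration:

```python
# Equal true incidence in both groups; only detection differs by serostatus.
true_rate = 0.002                  # true events per person-year, both groups
sens_pos, sens_neg = 0.90, 0.60    # assumed P(true case gets an ICD code)

obs_rate_pos = true_rate * sens_pos
obs_rate_neg = true_rate * sens_neg
observed_irr = obs_rate_pos / obs_rate_neg

print(f"true IRR = 1.00, observed IRR = {observed_irr:.2f}")  # observed IRR = 1.50
```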

First question: does differential misclassification of ICD code-based outcome measures, by HIV serostatus, seem like a potentially major issue in this setting?

Second question: I know that differential misclassification is less predictable than non-differential. Therefore, I am trying to track down some papers (i.e., admin data studies) that have attempted to mitigate this type of bias. Does anyone have any recommendations?

Thank you all so much!


Q1: Yes, this seems like it could be a potentially major issue.

Q2: The first thought that comes to my mind is to use multiple imputation to account for differential misclassification, along the lines of some papers by Jessie Edwards and Steve Cole (i.e., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3983405/). Multiple imputation should be able to handle differential misclassification. The challenge to using this approach, however, is that you need some internal validation data – i.e., you would need to have a subsample of records (both HIV+ and HIV-) where you know the true outcome status. (I guess you might only need it in the HIV-negatives if you’re willing to assume that misclassification is only occurring in the HIV-negatives, which is sort of how you motivated this question.) The bigger the validation subsample, the better, and of course, for this to work you have to be able to obtain the true outcome in the validation subsample, which doesn’t always seem to be the case with EHR data.
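To make the idea concrete, here is a minimal simulated sketch of the validation-subsample imputation approach. Everything here is assumed (the data-generating numbers, the 10% validation fraction, the differential sensitivities), and it is a simplified “improper” MI: within-cell probabilities of the true outcome are estimated from the validation data and treated as fixed across imputations, which understates uncertainty relative to the model-based approach in the Edwards et al. paper. It is meant only to show the mechanics, not to be the authors’ method:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated data (illustrative only) ---
n = 20_000
exposed = rng.integers(0, 2, n)              # 1 = HIV+, 0 = HIV-
true_y = rng.random(n) < 0.05                # same true risk in both groups
# Differential sensitivity: true cases get coded 90% (HIV+) vs 60% (HIV-)
sens = np.where(exposed == 1, 0.90, 0.60)
obs_y = true_y & (rng.random(n) < sens)      # observed ICD-based outcome

# Internal validation subsample where the gold-standard outcome is known
val = rng.random(n) < 0.10

m_imputations = 20
corrected_rrs = []
for _ in range(m_imputations):
    y_imp = obs_y.copy()
    # Within each (exposure, observed-outcome) cell, estimate P(true = 1)
    # from the validation records and impute in the non-validated records.
    for e in (0, 1):
        for o in (False, True):
            cell_val = val & (exposed == e) & (obs_y == o)
            cell_main = ~val & (exposed == e) & (obs_y == o)
            if cell_val.sum() == 0:
                continue
            p = true_y[cell_val].mean()
            y_imp[cell_main] = rng.random(cell_main.sum()) < p
    rr = y_imp[exposed == 1].mean() / y_imp[exposed == 0].mean()
    corrected_rrs.append(rr)

naive_rr = obs_y[exposed == 1].mean() / obs_y[exposed == 0].mean()
print(f"naive RR = {naive_rr:.2f}")                       # inflated, ~1.5
print(f"imputation-corrected RR = {np.mean(corrected_rrs):.2f}")  # ~1.0
```

In this simulation the naive risk ratio is inflated toward sens_pos/sens_neg even though the true ratio is 1, and the imputation pulls it back toward the null, precisely because the validation data let us see how often true cases are missed in each group.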

Good luck!


Hello Bryan! Thanks for the clear and detailed response. I am familiar with MI (for non-response on questionnaires) and validation “exercises” (e.g., ‘gold standard’ measures, validation sub-samples, etc.)… but I have not used either in the context of misclassification. Sadly, while we do have some ‘gold standard’ measures available for the HIV+ participants… we’re basically relying entirely on administrative data for the HIV- participants (which doesn’t bode well for this endeavour). That said, I’ll give the Edwards et al. (2013) paper a solid read and see what can be done!

If anyone else has any suggestions - feel free to share. Thanks again Bryan,