Hi Jiaqi
Thanks for clarifying. I won’t comment on epidemiologic methods themselves since I’m not qualified to do that. But I will say that I’m very skeptical in general about observational studies that try to establish chicken/egg relationships when certain types of clinical events are involved, especially if researchers are relying on administrative databases.
The validity of a lot of observational studies hinges on the accuracy, reliability, and completeness of the diagnostic codes doctors enter into a billing program after seeing a patient in the clinic. But if you were to poll 100 doctors and ask them whether they would recommend that a researcher use their diagnostic codes to generate high-quality research, most would probably laugh hysterically.
Here’s how a family physician in my country (Canada) would code a typical patient appointment:
The patient brings a list of 6 separate health concerns to a 15-minute appointment (this is NOT unusual in family medicine). Those concerns include:
- Follow-up of congestive heart failure;
- Follow-up of type 2 diabetes;
- New urinary frequency;
- Back pain that has been occurring intermittently for the past 15 years;
- Forgetfulness noticed for the first time about 3 years ago by a spouse, but never mentioned to the doctor until now, since the spouse thinks it’s getting worse;
- Headaches that started about 2 years ago, but which had been manageable with Tylenol until the past 3-4 months.
At the end of this visit, the physician will pick ONE diagnostic code to apply to the whole visit. In Canada, the doctor doesn’t get any financial bonus for coding all 6 issues addressed during the appointment. The doctor gets paid about $37 by the government whether he/she addresses 2 issues or 8 issues in a visit (sad, I know…). Since there’s no extra pay for adding more diagnostic codes to the billing, we just randomly pick one of the six problems discussed during the visit when entering the diagnostic code.
As you can see, diagnostic codes (at least in my country) are only a very crude reflection of the issues actually discussed during a patient's appointment. One physician might choose to enter the "diabetes" diagnostic code; another might choose the "congestive heart failure" code; another might choose the "headache" code… you get the picture.

So if you're a researcher combing through an administrative database in Canada, and you find that the patient above has a diagnostic code for "stroke" on a certain date, and you now want to know whether that patient also had a history of migraine headache at some point prior to the stroke, well, good luck with that. Not only might a patient with migraines never actually have a diagnostic code for migraine on record (if they were in the habit of raising multiple issues at each visit, the doctor might have ended up always coding one of the non-migraine issues), but on the off chance that there is a recorded diagnostic code for migraine, the date of that code in the database will often be a very poor indicator of when the migraines actually started (people often wait months or even years before reporting health problems to their doctor).
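To put a rough number on the "one code out of six" problem, here's a toy calculation. This is entirely my own made-up model (not from any real billing data or study): assume a patient raises k concerns at every visit and the physician records one code chosen uniformly at random. Then the chance that the migraine code never appears after n visits is ((k-1)/k)^n.

```python
# Toy model of diagnostic-code missingness (my own assumptions,
# not real billing behaviour): k concerns per visit, one code
# picked uniformly at random per visit.
def p_never_coded(k: int, n: int) -> float:
    """Probability a given concern is never the coded one across n visits."""
    return ((k - 1) / k) ** n

# With 6 concerns and 10 visits, migraine is still absent from the
# database about 16% of the time, despite being raised every visit.
print(round(p_never_coded(6, 10), 3))  # ~0.162
```

And that's the optimistic case, since it assumes the patient mentions the migraines at every single visit; in reality they may not mention them at all for years.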
As an aside, I think a lot of physicians' skepticism (?cynicism) about observational research stems from a view that the people doing this type of research often aren't really in a position to understand the limitations of the data they're working with, simply because they don't work in the health system itself. If you're the one entering diagnostic codes and you know you do a piss-poor job of it, then you're not going to trust a study whose conclusions hinge on analysis of those codes…