I have been asked to provide some advice on a study examining mortality in a cohort of individuals followed from the 1960s to now, with deaths recorded in every year. I want to get a sense of whether the overall mortality rate in my cohort differs from that of the general population. I have considered using death counts from my national census as a comparison sample.

The previous analyst on this project suggested using data from a single census (2010) as the comparison point, comparing the number of deaths in 2010 with the total number of deaths in my cohort. This seems to me to be confounded, because the cohort contains deaths from many different years while the census data contains deaths from only one year. I have tried to find articles or books that discuss how to compare mortality between a study cohort and the general population, but I can find nothing. Can someone please point me to any resources or provide advice? If I am not explaining myself well, I am happy to clarify.

Thanks in advance.

I think a mixed model might be an option. Your cohort is one time trajectory, and the series of mortality rates from successive censuses is another time trajectory.

In the model, the mortality rate is the outcome, and you can include time and an indicator variable that distinguishes your cohort from the census population. You then get the difference between your cohort and the general population by estimating the regression coefficient of the cohort indicator.

This would be my approach: I would calculate standardized mortality ratios (SMRs). The study period is long and mortality rates could have changed over time; therefore, you need to divide the follow-up time into intervals, e.g. 5-year intervals. You need to do the same thing for your general population. I recommend the CDC WONDER website, where you can get deaths that occurred in the general population over several decades. Finally, you fit a Poisson regression to calculate the ratio of observed to expected deaths. If you cannot use R or SAS, there is another software package you can use for such analyses: OCMAP.
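
To make the observed-to-expected idea concrete, here is a minimal sketch in Python of an SMR computed directly from interval totals, without fitting a regression. All numbers, the 5-year interval labels, and the names `cohort`, `reference_rate`, `smr` and `smr_ci` are made up for illustration. In practice the strata would usually also be split by age and sex, and the reference rates would come from a source such as CDC WONDER. The confidence interval uses a simple log-scale approximation, not an exact Poisson interval.

```python
import math

# Hypothetical cohort data: person-years at risk and observed deaths
# in each 5-year calendar interval (invented numbers).
cohort = {
    "1965-1969": {"person_years": 4800.0, "deaths": 30},
    "1970-1974": {"person_years": 4500.0, "deaths": 41},
    "1975-1979": {"person_years": 4100.0, "deaths": 52},
}

# Hypothetical general-population death rates per person-year
# for the same intervals (invented numbers).
reference_rate = {
    "1965-1969": 0.0050,
    "1970-1974": 0.0080,
    "1975-1979": 0.0100,
}


def smr(cohort, reference_rate):
    """Return (observed deaths, expected deaths, SMR = observed / expected)."""
    observed = sum(s["deaths"] for s in cohort.values())
    # Expected deaths: cohort person-years times the population rate, per interval.
    expected = sum(
        s["person_years"] * reference_rate[interval]
        for interval, s in cohort.items()
    )
    return observed, expected, observed / expected


def smr_ci(observed, expected, z=1.96):
    """Approximate 95% CI, taking the SE of log(SMR) as 1 / sqrt(observed)."""
    ratio = observed / expected
    se_log = 1.0 / math.sqrt(observed)
    return ratio * math.exp(-z * se_log), ratio * math.exp(z * se_log)


observed, expected, ratio = smr(cohort, reference_rate)
lower, upper = smr_ci(observed, expected)
print(f"observed={observed}, expected={expected:.1f}, SMR={ratio:.2f}")
print(f"approximate 95% CI: {lower:.2f} to {upper:.2f}")
```

An SMR above 1 means more deaths were observed in the cohort than the population rates would predict for the same person-years. A Poisson regression with the log of expected deaths as an offset would give the same point estimate and also lets you add covariates.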

Thanks to both of you for the replies. This is most helpful. If I may, I would like to further complicate the question. What if my treatment group has multiple waves of entry (i.e. the first wave of people in 1964, the second wave in 1965, and so on…)? Is it sufficient to adjust for this with an indicator variable in the regression model?

I think you can do that.
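
If the interval-based approach suggested earlier in the thread is used, staggered entry can also be handled through the person-time itself: each person contributes person-years only from their own entry date, so later waves simply add nothing to the earlier intervals. The sketch below is one possible way to do that split in Python. The three people, the decimal-year dates, the interval boundaries, and the names `person_years_by_period` and `tabulate` are all hypothetical.

```python
from collections import defaultdict


def person_years_by_period(entry, exit_, period_starts, period_len=5):
    """Split one person's follow-up [entry, exit_) across calendar periods."""
    contributions = {}
    for start in period_starts:
        end = start + period_len
        # Overlap between this person's follow-up and the calendar period.
        overlap = min(exit_, end) - max(entry, start)
        if overlap > 0:
            contributions[start] = overlap
    return contributions


def tabulate(people, period_starts, period_len=5):
    """Sum person-years and deaths per calendar period over all people."""
    person_years = defaultdict(float)
    deaths = defaultdict(int)
    for entry, exit_, died in people:
        split = person_years_by_period(entry, exit_, period_starts, period_len)
        for start, years in split.items():
            person_years[start] += years
        if died:
            # A death is counted in the period that contains the exit date.
            for start in period_starts:
                if start <= exit_ < start + period_len:
                    deaths[start] += 1
    return dict(person_years), dict(deaths)


# Hypothetical people: (entry year, exit year, died at exit?)
people = [
    (1964.0, 1972.5, True),   # first wave, died in mid-1972
    (1965.0, 1980.0, False),  # second wave, alive at end of follow-up
    (1966.5, 1969.0, True),   # later entrant, died at start of 1969
]
period_starts = [1960, 1965, 1970, 1975]

person_years, deaths = tabulate(people, period_starts)
for start in period_starts:
    py = person_years.get(start, 0.0)
    d = deaths.get(start, 0)
    print(f"{start}-{start + 4}: person-years={py}, deaths={d}")
```

The resulting per-interval person-years and deaths are the inputs an SMR or Poisson regression needs. Whether an entry-wave indicator is still needed on top of this depends on whether the waves are thought to differ in ways other than when they entered.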