Pre-post study design with data not missing at random and multiple observations per patient

lacykadasi · November 1, 2022, 12:35pm

I am conducting an observational study with 150K subjects to compare the 30-day hospital readmission rate of two groups in the 6 months before and after an intervention. To have a readmission, a patient has to be admitted to the hospital in the first place, which doesn’t happen for most of the sample. A sample of my data is:

ID	Timeperiod	Readmission Occurred
1	Pre	1
1	Pre	0
1	Pre	0
1	Post	0
2	Post	0
3	Pre	1
4	Post	1

In just the sample above, there were 4 admissions in the pre period and 3 admissions in the post period. In the pre period, 2 out of 4 admissions were followed by a readmission; in the post period, 1 out of 3 was. Another way to present my data is by subject:

ID	Timeperiod	Number of Admissions	Number of Readmissions	Readmission Rate
1	Pre	3	1	.33
1	Post	1	0	0
2	Pre	0	0	NA
2	Post	1	0	0
3	Pre	1	1	1
3	Post	0	0	NA
4	Pre	0	0	NA
4	Post	1	1	1
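
For reference, the by-subject table can be derived from the admission-level rows with a simple tally. This is only a minimal Python sketch on the 7 sample rows; the names `admissions` and `summarize_by_subject` are made up for illustration, and a subject with no admissions in a period gets a rate of `None` (the NA above).

```python
from collections import defaultdict

# Admission-level sample rows copied from the first table: (ID, period, readmitted)
admissions = [
    (1, "Pre", 1),
    (1, "Pre", 0),
    (1, "Pre", 0),
    (1, "Post", 0),
    (2, "Post", 0),
    (3, "Pre", 1),
    (4, "Post", 1),
]

def summarize_by_subject(rows, periods=("Pre", "Post")):
    """Return one (ID, period, admissions, readmissions, rate) row per subject and period."""
    counts = defaultdict(lambda: [0, 0])  # (ID, period) -> [admissions, readmissions]
    for pid, period, readmitted in rows:
        counts[(pid, period)][0] += 1
        counts[(pid, period)][1] += readmitted

    table = []
    for pid in sorted({pid for pid, _, _ in rows}):
        for period in periods:
            n_adm, n_readm = counts[(pid, period)]
            # The rate is undefined (NA) when the subject had no admissions in the period
            rate = n_readm / n_adm if n_adm else None
            table.append((pid, period, n_adm, n_readm, rate))
    return table

by_subject = summarize_by_subject(admissions)
for row in by_subject:
    print(*row, sep="\t")
```

Every subject appears once per period, so the subjects who were never admitted in a period stay visible as NA rows instead of silently dropping out.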

I need to summarize my data to find the readmission rate in each period and then test whether the pre/post difference is statistically significant. The problems I have are:

The observations on each subject are not independent.
Many patients have no observations in the pre period, the post period, or both, because they were never admitted to the hospital.
Do I compare the overall readmission rate (2/4 compared to 1/3) or the average readmission rate per patient (67% compared to 33%, ignoring the NAs from patients with no admissions in that period)?
What statistical test would be recommended to test the difference in proportions?
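
To make the two candidate summaries concrete, here is a minimal Python sketch that computes both from the by-subject sample rows. The names `subjects`, `pooled_rate` and `mean_patient_rate` are hypothetical, and this only reproduces the arithmetic; it is not a significance test and does nothing about the dependence between admissions from the same patient.

```python
# By-subject sample rows copied from the second table:
# (ID, period, number of admissions, number of readmissions)
subjects = [
    (1, "Pre", 3, 1),
    (1, "Post", 1, 0),
    (2, "Pre", 0, 0),
    (2, "Post", 1, 0),
    (3, "Pre", 1, 1),
    (3, "Post", 0, 0),
    (4, "Pre", 0, 0),
    (4, "Post", 1, 1),
]

def pooled_rate(rows, period):
    """Overall rate: total readmissions / total admissions (each admission counts equally)."""
    n_adm = sum(n for _, p, n, _ in rows if p == period)
    n_readm = sum(r for _, p, _, r in rows if p == period)
    return n_readm / n_adm

def mean_patient_rate(rows, period):
    """Average of per-patient rates, skipping patients with no admissions (the NA rows)."""
    rates = [r / n for _, p, n, r in rows if p == period and n > 0]
    return sum(rates) / len(rates)

for period in ("Pre", "Post"):
    print(
        f"{period}: pooled = {pooled_rate(subjects, period):.3f}, "
        f"per-patient mean = {mean_patient_rate(subjects, period):.3f}"
    )
```

The two summaries weight the data differently: the pooled rate gives every admission equal weight, so a patient with 3 admissions counts three times, while the per-patient mean gives every admitted patient equal weight regardless of how often they were admitted. In the pre period that is the difference between 2/4 and (1/3 + 1)/2.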