Cannot fit a logistic regression model correctly

Lin_Caijin · December 12, 2024, 3:32pm

Hi.
I’m trying to fit a logistic regression model using the following formula:

glm(resp_crpr ~ Liver_metastasis * TP53, data = tmp_df, family = 'binomial')

However, the output looks wrong: the estimate, standard error, z value, and p-value for the interaction term (Liver_metastasis1:TP531) are all NA. Here’s the output:

Coefficients: (1 not defined because of singularities)
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)               0.6061     0.5075   1.194    0.232
Liver_metastasis1        -0.7289     0.6196  -1.176    0.239
TP531                     0.6336     0.6122   1.035    0.301
Liver_metastasis1:TP531       NA         NA      NA       NA

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 97.320  on 81  degrees of freedom
Residual deviance: 95.433  on 79  degrees of freedom
AIC: 101.43

Number of Fisher Scoring iterations: 4

To troubleshoot, I also attempted to fit the model using the rms package:

tmp_fit <- lrm(resp_crpr ~ Liver_metastasis * TP53, data = tmp_df)

But it also returned a warning:

In lrm(resp_crpr ~ Liver_metastasis * TP53, data = tmp_df) :
  Unable to fit model using “lrm.fit”

Here’s the contingency table for the tmp_df dataset.

table(tmp_df$TP53, tmp_df$Liver_metastasis)

     0  1
  0 17  0
  1 49 16

It seems there are zero cases for the combination where TP53 = 0 and Liver_metastasis = 1. This might be causing the singularity issue or the NA values in the output.
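A quick way to check this suspicion, as a sketch only: rebuild the predictor layout from the cell counts in the table (the data frame `d` below is a hypothetical reconstruction, not the original `tmp_df`) and look at the design matrix. The outcome is not needed to see the problem.

```r
# Hypothetical reconstruction of the predictors from the cell counts
# 17, 0, 49, 16 shown in the table; this is NOT the original tmp_df.
d <- data.frame(
  TP53             = factor(rep(c(0, 1, 1), times = c(17, 49, 16))),
  Liver_metastasis = factor(rep(c(0, 0, 1), times = c(17, 49, 16)))
)

# Design matrix for the main effects plus interaction
X <- model.matrix(~ Liver_metastasis * TP53, data = d)

ncol(X)      # 4 columns requested
qr(X)$rank   # but only rank 3

# Every patient with liver metastasis also has TP53 = 1, so the
# interaction column is an exact copy of the Liver_metastasis1 column.
all(X[, "Liver_metastasis1"] == X[, "Liver_metastasis1:TP531"])  # TRUE
```

Because two columns are identical, glm() drops the later one, which is why the interaction row is reported as NA.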

Does anyone have suggestions on how to address this issue or alternative methods to fit the model given the sparse data?

Thanks!

f2harrell · December 13, 2024, 6:38pm

You can’t assess an interaction when the data provide no basis for estimating it. Think of the 2x2 table of the two predictors: the smallest cell has to be reasonably large before you can estimate the differential effect, and here one cell is empty.
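To make the point concrete, here is a rough sketch. The per-cell response counts (11/17, 38/49, 10/16) are an inference: they are back-calculated from the coefficients reported in the question, not taken from the original data, and the data frame `d` is hypothetical. With only three non-empty cells, three parameters already reproduce the cell proportions exactly, so the interaction model cannot fit any better than the additive one.

```r
# Reconstructed data (assumption): responders per cell back-calculated
# from the reported coefficients, e.g. log(11/6) = 0.6061.
d <- data.frame(
  TP53             = factor(rep(c(0, 1, 1), times = c(17, 49, 16))),
  Liver_metastasis = factor(rep(c(0, 0, 1), times = c(17, 49, 16))),
  resp_crpr        = c(rep(1:0, c(11, 6)),    # TP53 = 0, no liver metastasis
                       rep(1:0, c(38, 11)),   # TP53 = 1, no liver metastasis
                       rep(1:0, c(10, 6)))    # TP53 = 1, liver metastasis
)

fit_add <- glm(resp_crpr ~ Liver_metastasis + TP53, data = d, family = binomial)
fit_int <- glm(resp_crpr ~ Liver_metastasis * TP53, data = d, family = binomial)

round(coef(fit_add), 4)   # 0.6061 -0.7289 0.6336, as in the posted output
coef(fit_int)             # same three estimates, interaction is NA

# Identical deviance: the interaction term adds no information
all.equal(deviance(fit_add), deviance(fit_int))
```

Note that the additive fit is not a test of "no interaction". The Liver_metastasis coefficient is estimated only from patients with TP53 = 1, and whether it would differ for TP53 = 0 cannot be learned from these data.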

Please add major and minor categories to your post, and tags.