Odds Ratio confidence interval


Thank you for this fantastic set of spreadsheets!


Hello Yossi. I modified my code to also compute the Z and p-values. It confirms that in both tables (the one showing the coefficients and the one showing the odds ratios), Z = B / SE(B). No Z-values are computed using the delta-method SE of the OR. In the second table showing ORs, notice that for age, OR / SE = 29.266509, not -0.66. So I don’t understand why you’re saying the Z-test is done using the delta method.

Here’s the revised code, followed by the output from the final two commands.

webuse lbw, clear
logit low age lwt i.race smoke
matrix tab = r(table)'
matrix list tab
svmat tab
generate double OR = exp(tab1) // tab1=B, the coefficient
generate double SEdm = exp(tab1)*tab2 // tab2 = SE(B)
generate double Z = tab1/tab2 // Z = B / SE(B)
generate double p = 2*(1-normal(abs(Z)))
generate double ORlower = exp(tab5) // tab5 = lower limit of 95% CI for B
generate double ORupper = exp(tab6) // tab6 = upper limit of 95% CI for B
list OR-ORupper if !missing(OR), clean noobs
logit, or noheader

. list OR-ORupper if !missing(OR), clean noobs

           OR        SEdm            Z           p     ORlower     ORupper  
    .97774432   .03340831    -.6587033   .51008631   .91440965   1.0454657  
    .98757614   .00630496   -1.9581977   .05020682   .97529563   1.0000113  
            1           .            .           .           .           .  
    3.4253717   1.7712809    2.3809617   .01726751   1.2432147   9.4377679  
       2.5692   1.0693013    2.2671657   .02338011   1.1363909    5.808555  
    2.8703458   1.0906704    2.7749776   .00552055   1.3629997   6.0446717  
    1.3911444   1.5408414    .29805397   .76566197   .15869942   12.194644  

. logit, or noheader
         low | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
         age |   .9777443   .0334083    -0.66   0.510     .9144097    1.045466
         lwt |   .9875761    .006305    -1.96   0.050     .9752956    1.000011
        race |
      black  |   3.425372   1.771281     2.38   0.017     1.243215    9.437768
      other  |     2.5692   1.069301     2.27   0.023     1.136391    5.808555
       smoke |   2.870346    1.09067     2.77   0.006        1.363    6.044672
       _cons |   1.391144   1.540841     0.30   0.766     .1586994    12.19464
Note: _cons estimates baseline odds.

I hope this clarifies what Stata’s -logit- command is doing.
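For readers without Stata, the same arithmetic can be checked by hand. The following is a minimal Python sketch, not part of the original code: it takes the age row of the output above, backs out B and SE(B) from the printed OR and delta-method SE (those two values are recomputed here because the coefficient table is not shown), and compares B / SE(B) with OR / SE. The variable names are illustrative.

```python
import math

# Values copied from the age row of the Stata output above.
or_age = 0.97774432      # odds ratio, exp(B)
se_dm_age = 0.03340831   # delta-method SE of the OR, exp(B) * SE(B)

# Back out the coefficient and its SE (assumption: recomputed from the
# printed OR and SE, since the coefficient table is not shown above).
b = math.log(or_age)
se_b = se_dm_age / or_age

# Wald test on the coefficient scale: Z = B / SE(B).
z_coef = b / se_b
p_value = math.erfc(abs(z_coef) / math.sqrt(2.0))  # two-sided normal p-value

# What the ratio would be if the OR were divided by its delta-method SE.
ratio_or_scale = or_age / se_dm_age

print(f"B / SE(B) = {z_coef:.4f}, p = {p_value:.4f}")
print(f"OR / SE   = {ratio_or_scale:.4f}")
```

The first line of output should agree with the Z and p columns reported for age, while the second ratio is the 29.27 figure mentioned above.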


Pavlos_Msaouel wrote: Thank you for this fantastic set of spreadsheets!

You’re welcome! Please feel free to share as much as you like - but the source should be acknowledged in any publication, especially any methodological one.


Absolutely - will make sure to cite appropriately. Very much appreciated.


When dealing with confidence intervals, what matters is the coverage probability, and both methods lead to a 95% coverage probability.

The simple coverage example I gave above shows that the delta method does not achieve a true 95% coverage probability in a simple 2x2 case. Because its coverage is off, what is presented as a 95% confidence interval is not one, and it cannot be accepted as a true 95% confidence interval when used as an estimator with a finite sample.

An asymptotic proof is required for any decent estimator, but understanding the behaviour of finite-sample coverage can be just as important.
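Finite-sample coverage of this kind can be checked by simulation. The sketch below is not the simulation referred to in this thread; it is a minimal Python illustration under assumed settings. The group size (50), the event probabilities (0.3 and 0.5), the number of replications, the seed, and the 0.5 added to every cell to avoid zero counts are all illustrative assumptions. It compares the interval obtained by exponentiating the limits for log(OR) with a symmetric interval OR ± z × SE built from the delta-method SE.

```python
import math
import random

Z95 = 1.959964  # 97.5th percentile of the standard normal

def draw_binomial(rng, n, p):
    """Number of events in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

def simulate_coverage(n=50, p_exposed=0.3, p_unexposed=0.5,
                      reps=5000, seed=12345):
    """Return (coverage of log-scale CI, coverage of symmetric delta CI)."""
    rng = random.Random(seed)
    true_or = (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))
    hits_log = 0
    hits_delta = 0
    for _ in range(reps):
        e1 = draw_binomial(rng, n, p_exposed)
        e0 = draw_binomial(rng, n, p_unexposed)
        # Assumption: add 0.5 to every cell so the OR and SE are always defined.
        a, b = e1 + 0.5, n - e1 + 0.5
        c, d = e0 + 0.5, n - e0 + 0.5
        or_hat = (a * d) / (b * c)
        se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
        se_delta = or_hat * se_log                          # delta-method SE of OR
        # Interval computed on the log scale, then exponentiated.
        lo_log = math.exp(math.log(or_hat) - Z95 * se_log)
        hi_log = math.exp(math.log(or_hat) + Z95 * se_log)
        # Symmetric interval on the OR scale using the delta-method SE.
        lo_delta = or_hat - Z95 * se_delta
        hi_delta = or_hat + Z95 * se_delta
        hits_log += lo_log <= true_or <= hi_log
        hits_delta += lo_delta <= true_or <= hi_delta
    return hits_log / reps, hits_delta / reps

cover_log, cover_delta = simulate_coverage()
print(f"log-scale interval coverage:   {cover_log:.3f}")
print(f"delta-method interval coverage: {cover_delta:.3f}")
```

Changing `n` or the two probabilities shows how each interval's coverage moves relative to the nominal 95% as the sample gets smaller or the odds ratio more extreme.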

Oh, and there is one more thing - a narrower confidence interval is better. It is easy to see that the delta method yields a narrower confidence interval when comparing it to the other method.

This statement is internally inconsistent and further proof of the problem. If two intervals both claim 95% coverage and one is narrower than the other, one of them is wrong. They can’t both be right, so one is obviously not a 95% coverage interval. The results of the simulation I posted are sufficient to reject the validity of the delta method for providing a true 95% CI.
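To make the width comparison concrete, here is a small Python sketch using the race = black row of the Stata output above (OR = 3.4253717, delta-method SE = 1.7712809). The function name is illustrative, and the symmetric interval OR ± z × SE is this sketch's reading of "the delta method" interval, not something Stata prints. With a = z × SE(B), the exponentiated interval has width OR × 2 sinh(a) and the symmetric one OR × 2a, so the symmetric interval is narrower for any a > 0.

```python
import math

Z95 = 1.959964  # 97.5th percentile of the standard normal

def both_intervals(odds_ratio, se_delta):
    """Return (log-scale CI, symmetric delta-method CI) for an odds ratio.

    se_delta is the delta-method SE of the OR, i.e. OR * SE(B).
    """
    se_log = se_delta / odds_ratio          # recover SE(B)
    half = Z95 * se_log
    log_ci = (odds_ratio * math.exp(-half), odds_ratio * math.exp(half))
    delta_ci = (odds_ratio - Z95 * se_delta, odds_ratio + Z95 * se_delta)
    return log_ci, delta_ci

# race = black row from the output above.
log_ci, delta_ci = both_intervals(3.4253717, 1.7712809)

print("log-scale CI:    (%.4f, %.4f)" % log_ci)
print("symmetric CI:    (%.4f, %.4f)" % delta_ci)
print("symmetric lower limit below zero:", delta_ci[0] < 0)
```

For this row the symmetric interval's lower limit falls slightly below zero, which is not a possible value for an odds ratio, whereas the exponentiated limits match the interval Stata reports.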

So one method leads you to a non-significant result (and no publication) while the other one is statistically significant. So what would you do?

Picking the one that gives a better p-value for publication is going off the deep end into data dredging. The proper way is to use a method that has good coverage behaviour, and to pick that method a priori and stick with it. Given the validation above of Stata’s approach, as a reviewer I would accept “We used Stata for logistic regression and have the following result …”.