The GRADE approach (Grading of Recommendations Assessment, Development and Evaluation) is a very widely used tool for evaluating the certainty of evidence in healthcare.

The GRADE approach is usually associated with meta-analyses that pool results into a single point estimate.

The grades and their accompanying statements are as follows:

|**High**| = We are very confident that the true effect lies close to that of the estimate of the effect

|**Moderate**| = We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different

|**Low**| = Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect

|**Very low**| = We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

As an example, from [1]:

*There were no differences in CSA of multifidus at the C2–3 level between the chronic NSNP and control groups (two studies[38], [47]; SMD −0.30 [95% CI −1.12, 0.51], P = 0.47) ([Fig. 4]. The GRADE quality of evidence was very low.*

*There were no differences in disc degeneration between chronic NSNP and control, when assessed using a four-point ordinal scale[49] and the five-point Pfirrman scale[50] (two studies[49], [50]; OR 0.84 [95% CI 0.57, 1.24], P = 0.39) ([Fig. 5]. The GRADE quality of evidence was moderate.*

(I have no idea about the scales in the second example, but I guess the ordinal scale was dichotomized into a binary outcome, which is why an OR is used.)

Stating that there is no difference is obviously wrong. In the first example the level of evidence is graded very low, which is reasonable since the data are consistent with a wide range of plausible values.

But then consider the second one. The authors claim that the quality of evidence is moderate, which by definition means "*We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different*".

Again, the CI for the point estimate is consistent with a wide range of values. It is not right to declare "no difference"; a difference might be there. According to the statement, the authors have confidence in "*the effect estimate*". If the point estimate is not *exactly* 1.000, doesn't that mean they believe that one group has lower odds of disc degeneration (OR = 0.84), i.e. that they reject the null?
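To make the "wide range of values" point concrete, here is a rough sanity check of my own (not something taken from the paper or from GRADE). It assumes the reported 95% CIs are ordinary normal-approximation (Wald) intervals, symmetric on the log scale for the OR, and back-calculates the standard error and a two-sided p-value from them. The function name `p_value_from_ci` and the alternative hypothesis OR = 0.60 are just illustrative choices.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z95 = STD_NORMAL.inv_cdf(0.975)  # about 1.96


def p_value_from_ci(estimate, lower, upper, null_value=0.0, log_scale=False):
    """Approximate two-sided p-value for H0: effect == null_value.

    Back-calculated from a reported 95% CI, ASSUMING a normal (Wald)
    interval. For ratio measures (OR, RR) set log_scale=True, because
    those intervals are symmetric on the log scale.
    """
    if log_scale:
        estimate, lower, upper, null_value = (
            math.log(v) for v in (estimate, lower, upper, null_value)
        )
    se = (upper - lower) / (2 * Z95)   # CI width is 2 * 1.96 * SE
    z = (estimate - null_value) / se
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


# First quoted result: SMD -0.30 [95% CI -1.12, 0.51], tested against 0
p_smd = p_value_from_ci(-0.30, -1.12, 0.51)

# Second quoted result: OR 0.84 [95% CI 0.57, 1.24], tested against 1
p_or_null = p_value_from_ci(0.84, 0.57, 1.24, null_value=1.0, log_scale=True)

# Same data, but tested against a clearly non-null value (OR = 0.60)
p_or_060 = p_value_from_ci(0.84, 0.57, 1.24, null_value=0.60, log_scale=True)

print(f"SMD vs 0:   p = {p_smd:.2f}")
print(f"OR vs 1.00: p = {p_or_null:.2f}")
print(f"OR vs 0.60: p = {p_or_060:.2f}")
```

Under these assumptions the back-calculated p-values for the two null hypotheses land close to the reported 0.47 and 0.39, and the hypothesis OR = 0.60 is not rejected at the 5% level either. So the same data that "show no difference" are also compatible with a sizeable reduction in odds.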

On the other hand, if "*the effect estimate*" includes the interpretation of the CI, I don't understand how you can have confidence in both a point estimate and its CI, since the statement IMO implies "*no further studies needed, this is it for the population value*", which would again make the CI redundant.

To conclude, it seems counterintuitive to state "*confidence in the effect estimate*" when that estimate differs from 1.000 (OR, RR) or 0.000 (SMD, MD) while, at the same time, the CI for the point estimate does not exclude 1 (or 0), indicating that the null hypothesis cannot be rejected. Failing to reject the null is of course very often read as "no difference" or "equal", which is itself debatable.

In other words, is there an inherent flaw in the use of the GRADE system, or am I missing the point here?