Interpreting restricted cubic spline contrasts in light of joint null

lachlan · April 23, 2022, 1:56am

I have modeled the association between a continuous exposure X and a continuous outcome Y using linear regression. The X term is modeled using a restricted cubic spline with 3 degrees of freedom.

There are two key questions to answer:

1. Is there any evidence for an association between the exposure X and the outcome Y?
2. What is the estimated mean difference in Y between two meaningful (but not pre-specified) levels of the exposure X, i.e., E(Y | X = a) - E(Y | X = b)?

For the former, a p value of 0.08 indicates weak evidence against the null hypothesis of no association between X and Y. Conversely, for the latter, the confidence interval contains values that range from somewhat small positive (but non-zero) to large positive mean differences.
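To make the relationship between the two quantities concrete, here is a minimal Python sketch on simulated data. It is not the original analysis: the data, the knot placement (4 knots at the 5th, 35th, 65th and 95th percentiles, giving 3 d.f.), the levels a = 8 and b = 2, and the helper names `rcs_basis` and `chi2_3df_sf` are all assumptions made for illustration. It uses a large-sample Wald chi-square test and a normal-approximation interval rather than the exact F and t versions. The point is only that the global test and the contrast are both functions of the same three spline coefficients and their covariance matrix.

```python
import math
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis: x plus (k - 2) nonlinear terms, linear in the tails."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = np.asarray(knots, dtype=float)
    cube = lambda u: np.maximum(u, 0.0) ** 3       # truncated cubic (u)+^3
    gap = t[-1] - t[-2]
    cols = [x]
    for tj in t[:-2]:
        term = (cube(x - tj)
                - cube(x - t[-2]) * (t[-1] - tj) / gap
                + cube(x - t[-1]) * (t[-2] - tj) / gap)
        cols.append(term / (t[-1] - t[0]) ** 2)    # rescale to the units of x
    return np.column_stack(cols)

def chi2_3df_sf(w):
    """Upper tail probability of a chi-square with 3 d.f. (closed form)."""
    return math.erfc(math.sqrt(w / 2)) + math.sqrt(2 * w / math.pi) * math.exp(-w / 2)

# Simulated data (hypothetical; stands in for the real exposure and outcome)
rng = np.random.default_rng(0)
n = 300
x = rng.uniform(0, 10, n)
y = 50 + 8 * x + rng.normal(0, 60, n)
knots = np.quantile(x, [0.05, 0.35, 0.65, 0.95])   # assumed knot placement

# Ordinary least squares fit: intercept + 3 spline columns
X = np.column_stack([np.ones(n), rcs_basis(x, knots)])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
V = (resid @ resid / (n - X.shape[1])) * np.linalg.inv(X.T @ X)
b_s, V_s = beta[1:], V[1:, 1:]                     # spline coefficients only

# Question 1: global 3 d.f. Wald test that all spline coefficients are zero
wald = float(b_s @ np.linalg.solve(V_s, b_s))
p_global = chi2_3df_sf(wald)

# Question 2: contrast E(Y | X = a) - E(Y | X = b); the intercept cancels
a, b = 8.0, 2.0
c = (rcs_basis(a, knots) - rcs_basis(b, knots)).ravel()
est = float(c @ b_s)
se = float(np.sqrt(c @ V_s @ c))
ci = (est - 1.96 * se, est + 1.96 * se)            # pointwise, normal approximation

print(f"global test: chi-square = {wald:.2f}, p = {p_global:.3f}")
print(f"contrast: {est:.1f} (95% CI {ci[0]:.1f} to {ci[1]:.1f})")
```

The global test asks whether the coefficient vector is far from zero in every direction at once, while the contrast looks along one particular direction `c`, which is one way the two can appear to disagree.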

In the report I intend to present the p value for the overall test of association between the X and Y variables and the estimated contrast. My question is: Should the results from the latter be given less weight given the results of the former? I.e., should the conclusions from the estimated contrast be weakened given the weak evidence against the null of no association between X and Y? Particularly given the levels of the contrast were not pre-specified. Thanks!

Vattaka · April 23, 2022, 2:42am

I don’t think the issue is that the first part shows no association. The issue is more about making a decision on the second test based on the result of the first.

You don’t need to do a global test to assess pairwise contrasts. The problem is that, since X is continuous, there are essentially an uncountably infinite number of possible contrasts you could make, and there was no prespecification.

I think maybe a Scheffé or Tukey correction on the second one would be OK since it wasn’t prespecified? Though I’m not sure how to do that for a continuous predictor.

f2harrell · April 23, 2022, 11:13am

Reporting of estimates should be independent of “statistical significance”. Provide the estimated curve and confidence bands. Even better than pointwise bands would be simultaneous confidence bands. The contrast function (full name contrast.rms) in the R rms package will automatically know that with 3 d.f. any contrast after the first 4 contrasts will be linearly redundant. (If the predictor were binary you’d need one distinct contrast; if it were linear, any two contrasts would characterize all possible contrasts. With linear + 2 nonlinear terms you need 4 contrasts.) When you ask contrast for simultaneous confidence intervals the rank of the contrasts should be 4 and you don’t get penalized for computing confidence limits over 200 values of X.
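The point about not being penalized for evaluating many values of X can be illustrated with a rough simulation. The Python sketch below is not what contrast.rms does internally; it is a swapped-in, large-sample Monte Carlo approximation on simulated data, with hypothetical names (`rcs_basis`, `simultaneous_crit`), an assumed reference level of X = 2, and assumed knot locations. It estimates the critical value needed for a simultaneous 95% band over contrasts against the reference level, for grids of 5, 20 and 200 values of X.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis: x plus (k - 2) nonlinear terms, linear in the tails."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    t = np.asarray(knots, dtype=float)
    cube = lambda u: np.maximum(u, 0.0) ** 3
    gap = t[-1] - t[-2]
    cols = [x]
    for tj in t[:-2]:
        term = (cube(x - tj)
                - cube(x - t[-2]) * (t[-1] - tj) / gap
                + cube(x - t[-1]) * (t[-2] - tj) / gap)
        cols.append(term / (t[-1] - t[0]) ** 2)
    return np.column_stack(cols)

# Simulated data and fit (hypothetical; 4 knots give 3 d.f.)
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(0, 10, n)
y = 50 + 8 * x + rng.normal(0, 60, n)
knots = np.quantile(x, [0.05, 0.35, 0.65, 0.95])
X = np.column_stack([np.ones(n), rcs_basis(x, knots)])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
V_s = ((resid @ resid / (n - 4)) * np.linalg.inv(X.T @ X))[1:, 1:]

def simultaneous_crit(m, ref=2.0, draws=10000, seed=2):
    """Monte Carlo 95% critical value for max |z| over m contrasts against `ref`."""
    grid = np.linspace(2.5, 9.5, m)                       # grid excludes the reference
    C = rcs_basis(grid, knots) - rcs_basis(ref, knots)    # one contrast per row
    se = np.sqrt(np.einsum("ij,jk,ik->i", C, V_s, C))
    L = np.linalg.cholesky(V_s)
    sim = np.random.default_rng(seed).standard_normal((draws, 3)) @ L.T
    z = (sim @ C.T) / se                                  # standardized contrast errors
    return float(np.quantile(np.abs(z).max(axis=1), 0.95))

crit = {m: simultaneous_crit(m) for m in (5, 20, 200)}
for m, value in crit.items():
    print(f"{m:>3} grid points: simultaneous critical value = {value:.2f} (pointwise: 1.96)")
```

Under these assumptions the critical value is larger than the pointwise 1.96 but should barely change as the grid gets finer, because neighbouring contrasts are almost perfectly correlated and all of them are built from the same small set of spline coefficients.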

lachlan · April 25, 2022, 2:34am

I agree completely with these points. I fully intend to present the estimated curve and the contrast of interest, whatever the result of the global test. I suppose my question was more ‘should less interpretative weight be given to the contrast, given the weak evidence in the global test’. An alternative way of thinking about this is, as Vattaka puts it, should the contrast be penalized for all the possible contrasts that could have been made (within the limits of redundancy)?

Thank you for introducing the contrast function. The way redundancy is handled clarified this issue for me.

To put it very clearly, I am trying to discern what is the most appropriate (and appropriately humble) way to present these findings. Two possible options are (ignoring discussion of uncontrolled bias):

1. In a global 3 d.f. test, there was weak evidence for an association between exposure X and outcome Y (p = 0.08). An increase in X from X=a to X=b was associated with an estimated increase of Y of +100 units (simultaneous 95% CI = 20, 180). In conclusion, there was weak evidence that increasing X is associated with greater levels of Y, though the findings are ambiguous and further evidence is required.
2. An increase in X from X=a to X=b was associated with an estimated increase of Y of +100 units (simultaneous 95% CI = 20, 180), though there was only weak evidence for an association between X and Y using a global 3 d.f. test (p = 0.08). In conclusion, given the model assumptions, these results are consistent with an increase in X from meaningful a to meaningful b being associated with greater levels of Y.

My concern with approach (2) is that reviewers will note the weak/ambiguous evidence in the global test and use that to say that, in light of that result, the final conclusion regarding the contrast should not have been reached.

I feel that I am stuck between a) not wanting to speak beyond the evidence and b) not wanting to fall into the statistical significance trap (particularly misinterpreting p > 0.05 as ‘no association’).

EDIT: I find presentation of findings quite complicated by the field having one foot in and one foot out of the ‘statistical significance’ water. You want to present findings in a way that NHST people won’t object to (they are still the majority of reviewers) but is simultaneously an honest/appropriate representation of the results. I’m still working on this and still slowly learning.

f2harrell · April 25, 2022, 3:00am

Very well put. I’m not sure whether to mention the statistical test or not. As you said, it will turn off dichotomous thinkers. In general we’d be better off with fewer p-values and more confidence intervals (or even better, with Bayesian posterior distributions for unknown parameters).