Computing Different Standard Errors and Confidence Intervals

rafdoodle1 · May 7, 2025, 4:57am

Hello Dr. Harrell,

Hope you are doing well. To compute confidence intervals for Shap values (or even relative explained variation for that matter), I would need their standard errors if I can recall. The iml R package gives me both the Shap values (phi) and their variances (phi.var). Since the Shap values are computed for a single individual, would the standard error of each of the individual’s Shap values simply be the square root of each variance? I would assume this formula of square rooting the variance would apply to compute the standard error of any individual-level statistic as well, right?

shap_df <- as.data.frame(male_shap_values$results) %>%
  mutate(
    std.err = sqrt(phi.var),                         # Compute standard error (individual)
    feature = reorder(feature, phi)                  # Reorder for plotting
  )

On the contrary and more on a general note, if I were computing standard errors for aggregate summary statistics (e.g., mean Shap value), I would square root the mean variance and then divide that value (i.e., the standard deviation) by the square root of the sample size, correct?

mean_shap <- mean(shap_filtered$phi)
se_shap <- sd(shap_filtered$phi) / sqrt(nrow(shap_filtered))

Finally, can I convert (model predicted) probability confidence intervals into odds ratio confidence intervals by simply turning the bounds of the probability confidence intervals into odds (p / 1 - p), and then dividing both new odds bounds by the reference odds?

preds <- ggpredict(model, terms = c("clc_age", var)) %>%
      group_by(x) %>%
      arrange(group) %>%
      mutate(
        predicted_ref = first(predicted),
        odds = predicted / (1 - predicted),
        odds_ref = predicted_ref / (1 - predicted_ref),
        or = odds / odds_ref,
        or_low = (conf.low / (1 - conf.low)) / odds_ref,
        or_high = (conf.high / (1 - conf.high)) / odds_ref
      )

Thanks for your advice as always! Appreciated!

Raf

f2harrell · May 7, 2025, 11:33am

Please provide a description of how SHAP variances are computing and how they take into account the full uncertainties of feature identification and effect estimation.

For the question at the end, you have to compute the variance of the difference in log odds ratios before getting the correct confidence interval for an odds ratio. Correct in a Wald sense that is. Profile likelihood intervals are better. The R rms::contrast.rms function computes both.

rafdoodle1 · May 7, 2025, 6:19pm

Hello Dr. Harrell,

Thanks for the response! Very much appreciated! According to iml source code on GitHub, each SHAP value is estimated from default 100 Monte Carlo samples and computed as mean(value), with SHAP value variance being computed as var(value). Assuming the SHAP variances correctly account for full uncertainties, would my rationale above for computing standard errors differently for individual-level (not dividing sqrt(variance) by sqrt(n)) and aggregate statistics (dividing sqrt(variance) by sqrt(n)) still be correct? I’m asking this question in general, not for SHAP necessarily.

Also, for the last question, when you say I need to compute the variance of the “difference in log odds ratio”, did you mean to say “difference in log odds” instead? I’m pretty sure the “difference in log odds” equals the log odds ratio (i.e., log(OR)). Would I have to compute this variance manually or can I use rms::contrast.rms on glm models to give me the needed variance for the OR confidence intervals?

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 8, 2025, 11:23am

Thanks for the info. That sounds promising for SHAP (but note SHAP’s deficiency when predictors are correlated) but I can’t see making the variance smaller by dividing by n. The original variance reflects the difficulty of the task at the current sample size.

Yes I should have said difference in log odds. You can do that manually if you form the design matrices for the two conditions to compare and compute what contrast does: \text{var}((X_{1}-X_{2})\hat{\beta}).

rafdoodle1 · May 8, 2025, 4:26pm

Hello Dr. Harrell,

Thanks again for the response! Very much appreciated! I have removed collinear predictors to improve SHAP’s use, though I should have specified earlier that the aggregate statistic I want to compute is a SHAP-adjusted odds ratio for a predictor. It involves exponentiating the difference between the mean SHAP value of the predictor’s risk category/feature and the mean SHAP value of the predictor’s reference category/feature. These means are computed across all individuals in the entire sample size, rather than only one individual. That’s why I am wondering if I should divide sqrt(var) by sqrt(n) for the standard error of this aggregate statistic as I didn’t do so for the standard errors of individual SHAP values for only one individual.

ref_category <- min(unique_values)
ref_shap <- shap_filtered %>% filter(feature.value == ref_category)
ref_mean <- mean(ref_shap$phi)
ref_se <- sd(ref_shap$phi) / sqrt(nrow(ref_shap))
 
cat_shap <- shap_filtered %>% filter(feature.value == cat)
cat_mean <- mean(cat_shap$phi)
cat_se <- sd(cat_shap$phi) / sqrt(nrow(cat_shap))
      
mean_diff <- cat_mean - ref_mean
se_diff <- sqrt(ref_se^2 + cat_se^2)
      
S_OR <- exp(mean_diff)
CI <- exp(c(mean_diff - 1.96 * se_diff, mean_diff + 1.96 * se_diff))
      
results[[as.character(cat)]] <- list(S_OR = S_OR, CI = CI)

Please let me know if the original variance still reflects the difficulty of the task at the current sample size. Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 9, 2025, 11:33am

SHAP should be computed on the basis of a performance metric such as R^2, MSE, c-index, classification accuracy (a poor one), etc.

If you have to remove predictors to make a method work, SHAP is the wrong method.

rafdoodle1 · May 9, 2025, 4:55pm

Hello Dr. Harrell,

Thanks for the reply! I will keep this in mind going forward! Just one last general question, do I need to divide standard deviation (i.e., sqrt(variance)) by sqrt(n) if I am calculating standard error for an aggregate-level statistic (e.g., mean) as opposed to an individual-level statistic (e.g., data point for one person in table)?

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 10, 2025, 10:48am

Variable importance should not be tied to an effect measure such as an odds ratio. It should be tied to an information measure such as a logarithmic accuracy score (similar to deviance in the log likelihood sense). So I don’t see how the issue can arise.

rafdoodle1 · May 10, 2025, 4:44pm

Hello Dr. Harrell,

Thanks for the reply! I will keep this in mind going forward! But my latest question on individual vs. aggregate standard errors had nothing to do with SHAP or variable importance, but rather just basic statistics.

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 10, 2025, 8:10pm

Please restate the question in simplest possible terms without refer to SHAP in any way.

rafdoodle1 · May 10, 2025, 8:25pm

Hello Dr. Harrell,

Thanks for the reply! No worries! Here is the question: do I need to divide standard deviation (i.e., sqrt(variance)) by sqrt(n) if I am calculating standard error for an aggregate-level statistic (e.g., a mean) as opposed to an individual-level statistic (e.g., a data point for one person in table)?

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 11, 2025, 11:35am

If I understand the question, when using ordinary sample means or in your case weighted means with aggregation you use standard errors by dividing by \sqrt{n}. For odds ratios things may be different. Start by stating the estimand in terms of parameters.

rafdoodle1 · May 12, 2025, 2:21am

Hello Dr. Harrell,

You are correct! I am dividing by √n when calculating standard errors for means. When calculating standard error for the mean difference, I do √(SE1^2 + SE2^2).

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 12, 2025, 11:29am

But please state the precise estimand in terms of parameters so we can check things.

rafdoodle1 · May 12, 2025, 5:08pm

The estimand is the population log(OR) (i.e., the mean difference). I am using SHAP values to obtain ORs as SHAP values are log odds. log(SHAP ORs) are always equal to sqrt(mean(category SHAP value) - mean(reference SHAP value)), like how this study does it.

ref_shap <- shap_filtered %>% filter(shap_category == ref_category)
ref_mean <- mean(ref_shap$phi)
ref_se <- sd(ref_shap$phi) / sqrt(nrow(ref_shap)) # Standard error formula for population

results <- list()
    
for (cat in sort(unique(shap_filtered$shap_category [unique(shap_filtered$shap_category) != ref_category])) {
      cat_shap <- shap_filtered %>% filter(shap_category == cat)
      cat_mean <- mean(cat_shap$phi)
      cat_se <- sd(cat_shap$phi) / sqrt(nrow(cat_shap))
      
      mean_diff <- cat_mean - ref_mean
      se_diff <- sqrt(ref_se^2 + cat_se^2)
      
      S_OR <- exp(mean_diff)
      CI <- exp(c(mean_diff - 1.96 * se_diff, mean_diff + 1.96 * se_diff))

      results[[paste0(predictor, "=", cat)]] <- list(S_OR = S_OR, CI = CI)
    }

Thanks again for your advice as always! Appreciated!

Raf

f2harrell · May 12, 2025, 5:59pm

From that I have no idea what is at the heart of what you are interested in estimating, and it appears that those are not valid odds ratio estimates.