Validation of a Simulation-Based Methodology for Power and Sample Size Calculation Using the Wilcoxon Test

Based on the article "Simulation-based power calculation" and the definition of power as

“The probability of detecting a difference if one truly exists”

I applied a similar principle to construct a function that calculates the power of the Wilcoxon test for use in sample size determination. Below is the R script I developed:

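# Estimate the power of the two-sided Wilcoxon rank-sum test by simulation,
# assuming normally distributed outcomes in both groups
calculate_power <- function(n_per_group, mu_group1, mu_group2, sd_group1, sd_group2, iterations = 1000, alpha = 0.05) {
  # Fixed seed: every call reuses the same random stream, so results are
  # reproducible but not independent across sample sizes
  set.seed(2024)
  significant_results <- 0

  for (i in seq_len(iterations)) {
    # Simulate one trial under the assumed alternative
    group1 <- rnorm(n_per_group, mean = mu_group1, sd = sd_group1)
    group2 <- rnorm(n_per_group, mean = mu_group2, sd = sd_group2)
    p_value <- wilcox.test(group1, group2, alternative = "two.sided")$p.value
    # Count rejections at level alpha
    if (p_value < alpha) {
      significant_results <- significant_results + 1
    }
  }

  # Power estimate: proportion of simulated trials that rejected the null
  significant_results / iterations
}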

# Parameters for the simulation
mcid <- 0.8          # Minimum clinically important difference
sd_group1 <- 1   # Standard deviation for group 1
sd_group2 <- 1   # Standard deviation for group 2
alpha <- 0.05      # Significance level
desired_power <- 0.8  # Target power
iterations <- 1000  # Number of simulations per sample size

# Starting values for the sample size search
sample_size <- 10  # Initial sample size per group
power <- 0

# Loop until desired power is achieved
while (power < desired_power) {
  power <- calculate_power(
    n_per_group = sample_size,
    mu_group1 = 3,         # group 1 mean, taken from the reference
    mu_group2 = 3 + mcid,  # group 1 mean plus the MCID specified by the PI
    sd_group1 = sd_group1,
    sd_group2 = sd_group2,
    iterations = iterations,
    alpha = alpha
  )
  cat("Sample size per group:", sample_size, "- Power:", power, "\n")
  if (power < desired_power) {
    sample_size <- sample_size + 1
  }
}

cat("Final sample size per group to achieve power", desired_power, "is:", sample_size, "\n")
cat("Accounting for a 10% dropout rate:", ceiling(sample_size / 0.9), "\n")

This script produced the same sample size as the one calculated with PASS software. I would appreciate your expert feedback on whether my methodology is appropriate and consistent with standard practice and sound statistical reasoning.
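
One point worth keeping in mind when comparing a simulated result with PASS: each power estimate is a proportion of 1000 simulated trials, so it carries Monte Carlo error, and a search that stops at the first sample size crossing 0.8 can stop a little early or late. The sketch below quantifies that uncertainty for a single estimate. The helper name `mc_power_interval` and the input values are illustrative assumptions, not results from the script above.

```r
# Monte Carlo precision of a simulated power estimate.
# NOTE: power_hat = 0.80 and iterations = 1000 are illustrative inputs.
mc_power_interval <- function(power_hat, iterations, conf_level = 0.95) {
  # Number of significant simulations implied by the estimate
  hits <- round(power_hat * iterations)
  # Standard error of a binomial proportion
  se <- sqrt(power_hat * (1 - power_hat) / iterations)
  # Exact (Clopper-Pearson) confidence interval from base R
  ci <- binom.test(hits, iterations, conf.level = conf_level)$conf.int
  list(se = se, lower = ci[1], upper = ci[2])
}

res <- mc_power_interval(power_hat = 0.80, iterations = 1000)
print(res)
```

With 1000 iterations the standard error near a power of 0.8 is about 0.013, so estimates for neighbouring sample sizes can overlap. Raising `iterations` narrows the interval at the cost of run time.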

@f2harrell I deeply value your insightful opinion.

Check it against the methods detailed in BBR.
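
One possible form such a check could take is an analytic calculation to set beside the simulated answer. This is a sketch under stated assumptions, not a reproduction of what BBR prescribes: it computes the two-sample t-test sample size with base R's `power.t.test` and inflates it by the asymptotic relative efficiency of the Wilcoxon test under normality, 3/π. The variable names are illustrative, and the 1/ARE inflation is a large-sample rule of thumb rather than an exact result.

```r
# Analytic cross-check: t-test sample size inflated by the Wilcoxon ARE.
# Inputs mirror the post: difference 0.8, common SD 1, alpha 0.05, power 0.8.
delta        <- 0.8
sd_common    <- 1
alpha        <- 0.05
target_power <- 0.8

# Per-group sample size for the two-sided two-sample t-test (fractional)
n_t <- power.t.test(
  delta = delta, sd = sd_common, sig.level = alpha, power = target_power,
  type = "two.sample", alternative = "two.sided"
)$n

# Asymptotic relative efficiency of Wilcoxon vs. t-test for normal data
are_normal <- 3 / pi

# Approximate Wilcoxon sample size per group (assumption: 1/ARE inflation)
n_wilcoxon <- ceiling(n_t / are_normal)

cat("t-test n per group:", ceiling(n_t), "\n")
cat("Approximate Wilcoxon n per group:", n_wilcoxon, "\n")
```

If the simulated sample size lands within a subject or two of this figure, the gap is plausibly explained by simulation noise and the approximation.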
