Covariates and Power

covariates
power
simulation
Published

May 14, 2013

We run some simple simulations to demonstrate the effect of covariates on power.

Example: Binary covariate associated with outcome

mean_response <- expand.grid(treatment = c('placebo', 'active'),
                         gender = c('male', 'female'))
mean_response$mean_response <- ifelse(mean_response$treatment == 'active', 0.75, 0.25)
mean_response$mean_response <- mean_response$mean_response + ifelse(mean_response$gender == 'female', 0.25, 0)
ggplot(mean_response, aes(treatment, mean_response, fill = gender)) + 
  geom_bar(position = 'dodge', stat = 'identity') +
  ggtitle("Assumed mean response pattern")

set.seed(20130514)
nobs <- 120
sim_data <- cbind(mean_response, data.frame(id = 1:nobs))
sim_data$noise <- rnorm(n=nobs)
sim_data$Y <- with(sim_data, mean_response + noise)
t.test(Y~treatment, data = sim_data)

    Welch Two Sample t-test

data:  Y by treatment
t = -1.8682, df = 116.35, p-value = 0.06425
alternative hypothesis: true difference in means between group placebo and group active is not equal to 0
95 percent confidence interval:
 -0.70109150  0.02046901
sample estimates:
mean in group placebo  mean in group active 
            0.4848632             0.8251744 
summary(lm(Y~treatment + gender, data = sim_data))

Call:
lm(formula = Y ~ treatment + gender, data = sim_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4407 -0.6779  0.1592  0.6109  2.8580 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)  
(Intercept)       0.3263     0.1564   2.086   0.0391 *
treatmentactive   0.3403     0.1806   1.885   0.0620 .
genderfemale      0.3172     0.1806   1.757   0.0816 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.989 on 117 degrees of freedom
Multiple R-squared:  0.05369,   Adjusted R-squared:  0.03751 
F-statistic: 3.319 on 2 and 117 DF,  p-value: 0.03963
ggplot(sim_data, aes(treatment, Y, fill = gender)) + 
  geom_boxplot(position = 'dodge') +
  ggtitle("Example simulated dataset")

set.seed(20130514)
nsim <- 10000
t_test_result <- lm_result <- matrix(NA, nsim, 2)
for(i in 1:nsim){
  sim_data$noise <- rnorm(n=nobs)
  sim_data$Y <- with(sim_data, mean_response + noise)
  tt1 <- t.test(Y~treatment, data = sim_data)
  lm1 <- summary(lm(Y~treatment + gender, data = sim_data))
  t_test_result[i, ] <- c(tt1$estimate %*% c(-1,1), tt1$p.value)
  lm_result[i, ] <- lm1$coef['treatmentactive', c('Estimate', 'Pr(>|t|)')]
}
S=10,000 Simulations Simulated Power Average treatment effect
t-test 77.26 0.5000603
linear model with gender 77.73 0.5000603