Converting test statistics to Cohen's D - Private-Projects237/Statistics GitHub Wiki

Overview

This wiki page shows how to convert test statistics into Cohen's d. It works through examples using t-statistics and F-statistics and notes the limitations of the approach.

t-test (two independent groups)

Step 1: Generate dummy data

# Set seed
set.seed(123)

# Generate dummy data
n1 <- 100
n2 <- 100
mean1 <- 110
mean2 <- 105
sd1 <- 8
sd2 <- 8

group1 <- rnorm(n=n1, mean = mean1, sd = sd1)
group2 <- rnorm(n=n2, mean = mean2, sd = sd2)

dat <- data.frame(group = c(rep("group1",n1), rep("group2",n2)),
                  score = c(group1, group2))

Step 2: Calculate the t-statistics and Cohen's d

We will use built-in functions to do this, but it can also be done by hand.

# Calculate the t-statistic and Cohen's d
library(effsize)
(t_df <- data.frame(t.statistic = t.test(score~group, data = dat, var.equal = TRUE)$statistic,
           cohen_d = cohen.d(score~group, data = dat)$estimate))

Calculated t-statistic and Cohen's d
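
As a minimal sketch of the by-hand route mentioned in Step 2, the block below recomputes both quantities from the group means and the pooled standard deviation. It regenerates the Step 1 data so it runs on its own; the names `sd_pooled`, `d_by_hand`, `t_by_hand`, and `t_builtin` are illustrative and do not appear elsewhere on this page.

```r
# Recreate the Step 1 data so this block runs on its own
set.seed(123)
n1 <- 100
n2 <- 100
group1 <- rnorm(n = n1, mean = 110, sd = 8)
group2 <- rnorm(n = n2, mean = 105, sd = 8)

# Pooled standard deviation: each variance weighted by its degrees of freedom
sd_pooled <- sqrt(((n1 - 1) * var(group1) + (n2 - 1) * var(group2)) / (n1 + n2 - 2))

# Cohen's d: the mean difference expressed in pooled-SD units
d_by_hand <- (mean(group1) - mean(group2)) / sd_pooled

# Student's t: the mean difference divided by its standard error
t_by_hand <- (mean(group1) - mean(group2)) / (sd_pooled * sqrt(1 / n1 + 1 / n2))

# Compare with the built-in equal-variance t-test
t_builtin <- unname(t.test(group1, group2, var.equal = TRUE)$statistic)
c(t_by_hand = t_by_hand, t_builtin = t_builtin, d_by_hand = d_by_hand)
```

The hand-computed t should agree with `t.test(..., var.equal = TRUE)` up to floating-point error, because both use the same pooled standard deviation.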

Step 3: Convert the t-statistic into Cohen's d (when means not reported)

We will use the following formula to do this:


d = t \times \sqrt{\frac{n_1 + n_2}{n_1 \times n_2}}
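
For readers who want to see where this conversion comes from, here is a short derivation sketch. It assumes the equal-variance (Student) t-test, with $s_p$ denoting the pooled standard deviation and $\bar{x}_1$, $\bar{x}_2$ the group means; these symbols are introduced here for illustration.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}

\Rightarrow \quad
d = t \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
  = t \times \sqrt{\frac{n_1 + n_2}{n_1 \times n_2}}
```

The mean difference and $s_p$ cancel, which is why only the t-statistic and the two sample sizes are needed.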

# Calculating Cohen's d from the t-statistic and sample size
t_df$cohen_d_from_t  <- t_df$t.statistic * sqrt((n1 + n2) / (n1 * n2))
t_df

Comparing Cohen's d from the t-statistic calculation to Cohen's d from the means

Step 4: Limitations

Three quantities go into a t-test:

The sample size (from group 1 and group 2)
The mean difference (from group 1 and group 2)
The standard deviations (from group 1 and group 2)

We need to test whether calculating Cohen's d from a t-statistic remains accurate when the groups differ in sample size, mean, or standard deviation. This will tell us where the approach breaks down.
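
One way such a check could be set up is sketched below. `compare_d()` is a hypothetical helper written for this illustration, not part of the original analysis. It simulates two groups, computes Cohen's d from the raw data with the pooled standard deviation, and then applies the Step 3 conversion to both the equal-variance (Student) and the Welch t-statistic.

```r
# Hypothetical helper: compare d from the raw data with d recovered from t
compare_d <- function(n1, n2, sd1, sd2, mean1 = 110, mean2 = 105, seed = 123) {
  set.seed(seed)
  g1 <- rnorm(n1, mean = mean1, sd = sd1)
  g2 <- rnorm(n2, mean = mean2, sd = sd2)

  # Reference value: Cohen's d with the pooled standard deviation
  sd_pooled <- sqrt(((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2))
  d_data <- (mean(g1) - mean(g2)) / sd_pooled

  # Step 3 conversion applied to the Student and the Welch t-statistics
  scale <- sqrt((n1 + n2) / (n1 * n2))
  t_student <- unname(t.test(g1, g2, var.equal = TRUE)$statistic)
  t_welch   <- unname(t.test(g1, g2, var.equal = FALSE)$statistic)

  data.frame(n1, n2, sd1, sd2, d_data,
             d_from_student_t = t_student * scale,
             d_from_welch_t   = t_welch * scale)
}

rbind(
  compare_d(100, 100, 8, 8),   # balanced, equal SDs
  compare_d(30, 170, 8, 8),    # unequal sample sizes
  compare_d(100, 100, 4, 12),  # unequal SDs
  compare_d(30, 170, 4, 12)    # unequal sample sizes and SDs
)
```

With the Student t-statistic the conversion is an algebraic identity, so `d_from_student_t` matches `d_data` in every row. The Welch t-statistic equals the Student one only when the group sizes are equal, so `d_from_welch_t` drifts away once both the sample sizes and the variances differ. Because `t.test()` defaults to `var.equal = FALSE`, it matters which version of the test a reported t-statistic came from.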

... to be continued

f-test (two independent groups)

We will use the same setup as above, except that group 1 now has 50 observations instead of 100.

Step 1: Generate dummy data

# Set seed
set.seed(123)

# Generate dummy data
n1 <- 50
n2 <- 100
mean1 <- 110
mean2 <- 105
sd1 <- 8
sd2 <- 8

group1 <- rnorm(n=n1, mean = mean1, sd = sd1)
group2 <- rnorm(n=n2, mean = mean2, sd = sd2)

dat <- data.frame(group = c(rep("group1",n1), rep("group2",n2)),
                  score = c(group1, group2))

Step 2: Calculate the f-statistics and Cohen's d

We will use built-in functions to do this, but it can also be done by hand.

# Calculate f-statistic and Cohen's d
library(effsize)
(f_df <- data.frame(f.statistic = summary(aov(score~group, data = dat))[[1]]$"F value"[1],
                    cohen_d = cohen.d(score~group, data = dat)$estimate))

Calculated f-statistic and Cohen's d

Step 3: Convert the f-statistic into Cohen's d (when means not reported)

We will use the following formula to do this:


d = \sqrt{\frac{ F \times (n_1 + n_2)}{n_1 \times n_2}}

# Calculating Cohen's d from the f-statistic and sample size
f_df$cohen_d_from_f  <- sqrt(f_df$f.statistic * (n1 + n2) / (n1 * n2))
f_df

Comparing Cohen's d from the F-statistic calculation to Cohen's d from the means
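
The F-statistic conversion works because, with two groups, the one-way ANOVA F equals the square of the equal-variance t-statistic. The sketch below checks this on the same simulated data; the names `t_stat`, `f_stat`, `d_from_t`, and `d_from_f` are illustrative.

```r
# Recreate the two-group data from Step 1 of this section
set.seed(123)
n1 <- 50
n2 <- 100
dat <- data.frame(group = c(rep("group1", n1), rep("group2", n2)),
                  score = c(rnorm(n1, mean = 110, sd = 8),
                            rnorm(n2, mean = 105, sd = 8)))

# Equal-variance t-statistic and one-way ANOVA F-statistic
t_stat <- unname(t.test(score ~ group, data = dat, var.equal = TRUE)$statistic)
f_stat <- summary(aov(score ~ group, data = dat))[[1]]$"F value"[1]

# With two groups, F should equal t squared
c(t_squared = t_stat^2, f = f_stat)

# Both conversions give the same magnitude of d
d_from_t <- t_stat * sqrt((n1 + n2) / (n1 * n2))
d_from_f <- sqrt(f_stat * (n1 + n2) / (n1 * n2))
c(d_from_t = d_from_t, d_from_f = d_from_f)
```

Because F is a squared quantity, the d recovered from it is always non-negative. The direction of the effect has to come from the reported means or the text of the study.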

f-test (three independent groups)

Step 1: Generate dummy data

# Set seed for reproducibility
set.seed(123)

# Step 1: Generate dummy data
n_per_group <- 30  # Sample size per group
mean1 <- 50        # Mean for Group A
mean2 <- 53        # Mean for Group B
mean3 <- 56        # Mean for Group C
sd <- 10           # SD for all groups

# Simulate normally distributed data
group1 <- rnorm(n_per_group, mean = mean1, sd = sd)
group2 <- rnorm(n_per_group, mean = mean2, sd = sd)
group3 <- rnorm(n_per_group, mean = mean3, sd = sd)

# Combine into a data frame
data <- data.frame(
  value = c(group1, group2, group3),
  group = factor(rep(c("A", "B", "C"), each = n_per_group))
)

Step 2: Calculate the f-statistic and Cohen's d (for all group pairs)

# Perform a one-way ANOVA
anova_result <- aov(value ~ group, data = data)
anova_summary <- summary(anova_result)
f_stat <- anova_summary[[1]]$`F value`[1]
df_between <- anova_summary[[1]]$Df[1]
df_within <- anova_summary[[1]]$Df[2]

# Soft code for calculating Cohen's d
groups <- unique(data$group)
pairs <- combn(groups, 2, simplify = FALSE)

# Subset data by group pairs and calculate Cohen's d for each one
results <- lapply(pairs, function(pair) {
  sub_data <- subset(data, group %in% pair)
  sub_data$group <- as.character(sub_data$group )
  d <- cohen.d(value ~ group, sub_data)$estimate
  list(cohen_d = d)
})

# Create a data frame with the statistics we calculated
# (cohen_d is extracted as a plain numeric vector, one value per pair)
(f_df <- data.frame(f.statistic = f_stat,
                   pair_comparison = sapply(pairs, function(x) paste(as.character(x), collapse = "-")),
                   cohen_d = sapply(results, function(r) unname(r$cohen_d))))

Anova Summary Table

Calculated f-statistic and Cohen's d

Step 3: Convert the f-statistic into Cohen's d (when means not reported)

We will use the following two formulas to do this:


\eta^2 = \frac{df_{between} \times F}{df_{between} \times F + df_{within}}

d = \sqrt{\frac{2 \times \eta^2}{1 - \eta^2}} \times \sqrt{\frac{N}{n}}
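
As a sanity check on the first of these two formulas only, the sketch below computes eta squared from the F-statistic and its degrees of freedom, then compares it with the value obtained directly from the sums of squares. It regenerates the three-group data so it runs on its own; `anova_table`, `eta2_from_f`, and `eta2_from_ss` are illustrative names.

```r
# Recreate the three-group data from Step 1 of this section
set.seed(123)
n_per_group <- 30
data <- data.frame(
  value = c(rnorm(n_per_group, mean = 50, sd = 10),
            rnorm(n_per_group, mean = 53, sd = 10),
            rnorm(n_per_group, mean = 56, sd = 10)),
  group = factor(rep(c("A", "B", "C"), each = n_per_group))
)

# One-way ANOVA table
anova_table <- summary(aov(value ~ group, data = data))[[1]]
f_stat <- anova_table$`F value`[1]
df_between <- anova_table$Df[1]
df_within <- anova_table$Df[2]

# Eta squared from F and the degrees of freedom (first formula above)
eta2_from_f <- (df_between * f_stat) / (df_between * f_stat + df_within)

# Eta squared from the sums of squares: SS_between / SS_total
eta2_from_ss <- anova_table$`Sum Sq`[1] / sum(anova_table$`Sum Sq`)

c(eta2_from_f = eta2_from_f, eta2_from_ss = eta2_from_ss)
```

The two values should agree, since F is the ratio of the between-group and within-group mean squares and the degrees of freedom turn those back into sums of squares.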