This tutorial shows how to run a two sample t-test in R. Note that the two sample t-test is also called the independent samples t-test or the unpaired t-test.
Method 1: Vector format
t.test(vector_1, vector_2, var.equal = TRUE)
Method 2: Data frame format
t.test(Y ~ group, data = df_name, var.equal = TRUE)
Data and Hypothesis
Suppose you want to test whether women and men differ in their attitudes toward a brand, with attitude measured on a 7-point scale (1 = Do not like it at all, 7 = Like it a lot).
The following hypothetical data has one column for men’s attitudes and another for women’s attitudes toward the brand.
Men’s Attitudes | Women’s Attitudes |
---|---|
4 | 4 |
6 | 3 |
7 | 4 |
7 | 5 |
6 | 2 |
7 | 1 |
The following are the null and alternative hypotheses for the two sample t-test.
- H0 (Null Hypothesis): Men and women have the same attitudes.
- Ha (Alternative Hypothesis): Men and women do not have the same attitudes.
Example for Method 1
# vectors of men and women
men_data <- c(4, 6, 7, 7, 6, 7)
women_data <- c(4, 3, 4, 5, 2, 1)
# equal variance
res1 <- t.test(men_data, women_data, var.equal = TRUE)
res1
# unequal variance
res2 <- t.test(men_data, women_data, var.equal = FALSE)
res2
The following is the output. The first part, “Two Sample t-test”, assumes equal variances. The p-value (0.002916) is smaller than 0.05, so we reject the null hypothesis and conclude that men and women differ in their attitudes.
The second part, “Welch Two Sample t-test”, does not assume equal variances. The t statistic is the same as in the first part, while the degrees of freedom and the p-value differ slightly. (Note that whenever the two groups have the same sample size, the two versions of the test give the same t statistic; only the degrees of freedom, and therefore the p-value and confidence interval, can differ.)
	Two Sample t-test

data:  men_data and women_data
t = 3.9094, df = 10, p-value = 0.002916
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.290146 4.709854
sample estimates:
mean of x mean of y
 6.166667  3.166667

	Welch Two Sample t-test

data:  men_data and women_data
t = 3.9094, df = 9.5124, p-value = 0.003208
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.27819 4.72181
sample estimates:
mean of x mean of y
 6.166667  3.166667
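To see why the two t statistics match here, it can help to compute them by hand. The following is a minimal sketch using the textbook formulas for the pooled and Welch standard errors; the variable names (`se_pooled`, `se_welch`, `df_welch`, and so on) are made up for illustration and are not part of `t.test()`.

```r
# Data from the example above
men_data <- c(4, 6, 7, 7, 6, 7)
women_data <- c(4, 3, 4, 5, 2, 1)

n1 <- length(men_data)
n2 <- length(women_data)
mean_diff <- mean(men_data) - mean(women_data)

# Equal variance version: pool the two sample variances
pooled_var <- ((n1 - 1) * var(men_data) + (n2 - 1) * var(women_data)) / (n1 + n2 - 2)
se_pooled <- sqrt(pooled_var * (1 / n1 + 1 / n2))

# Welch version: each group keeps its own variance
v1 <- var(men_data) / n1
v2 <- var(women_data) / n2
se_welch <- sqrt(v1 + v2)

# With n1 == n2 the two standard errors coincide, so the t statistics do too
t_pooled <- mean_diff / se_pooled
t_welch <- mean_diff / se_welch

# Degrees of freedom differ: n1 + n2 - 2 versus the Welch-Satterthwaite approximation
df_pooled <- n1 + n2 - 2
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

print(c(t_pooled = t_pooled, t_welch = t_welch))
print(c(df_pooled = df_pooled, df_welch = df_welch))
```

The printed t values should agree with each other and with the `t.test()` output above, while the two degrees of freedom values differ, which is what produces the slightly different p-values and confidence intervals.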
Example for Method 2
This example covers the situation where the data is stored in a data frame. We first create the data frame, and then use the same t.test() function, this time with a formula, to run the two sample t-test in R.
# vectors of men and women
men_data <- c(4, 6, 7, 7, 6, 7)
women_data <- c(4, 3, 4, 5, 2, 1)
# Create a data frame
df_combined <- data.frame(
  group = rep(c("Woman", "Man"), each = 6),
  attitudes = c(women_data, men_data)
)
# Compute two sample t-test in R
results_2 <- t.test(attitudes ~ group, data = df_combined, var.equal = TRUE)
results_2
The following is the output of two sample t-test in R. We can see that the result is the same as in the last example.
	Two Sample t-test

data:  attitudes by group
t = 3.9094, df = 10, p-value = 0.002916
alternative hypothesis: true difference in means between group Man and group Woman is not equal to 0
95 percent confidence interval:
 1.290146 4.709854
sample estimates:
  mean in group Man mean in group Woman
           6.166667            3.166667
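If you need individual numbers from the result rather than the printed summary, the object returned by t.test() can be indexed like a list. The following is a small sketch that rebuilds the data frame from the example above and pulls out the main components; the `report` vector at the end is just an illustrative way to collect them.

```r
# Rebuild the data frame from the example above
men_data <- c(4, 6, 7, 7, 6, 7)
women_data <- c(4, 3, 4, 5, 2, 1)
df_combined <- data.frame(
  group = rep(c("Woman", "Man"), each = 6),
  attitudes = c(women_data, men_data)
)

results_2 <- t.test(attitudes ~ group, data = df_combined, var.equal = TRUE)

# Components of the returned "htest" object
t_value <- unname(results_2$statistic)   # t statistic
df_value <- unname(results_2$parameter)  # degrees of freedom
p_value <- results_2$p.value             # two-sided p-value
ci <- results_2$conf.int                 # 95% confidence interval for the mean difference
group_means <- results_2$estimate        # mean of each group

# Collect the key numbers in one named vector (illustrative only)
report <- c(t = t_value, df = df_value, p = p_value,
            ci_lower = ci[1], ci_upper = ci[2])
print(round(report, 4))
print(group_means)
```

Because the group labels are ordered alphabetically, "Man" is treated as the first group, so the difference is computed as the mean for men minus the mean for women.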