Method 1 of the paired t-test applies when group 1 and group 2 are stored as two separate vectors, whereas Method 2 applies when both groups are stored in the same column of a data frame, with a second column identifying the group.
Method 1:
t.test(group_1, group_2, paired = TRUE)
Method 2:
t.test(y~group, data=my_data, paired = TRUE)
Data and Hypothesis
Suppose we would like to test whether students perform differently on Exam 1 and Exam 2. Because each student is measured twice, at time 1 (Exam 1) and time 2 (Exam 2), the data are paired. The following is the hypothetical data, with one row per student.
Exam 1 | Exam 2 |
---|---|
65 | 75 |
70 | 71 |
75 | 90 |
80 | 98 |
68 | 65 |
95 | 99 |
The following are the null and alternative hypotheses for the paired t-test.
- Null Hypothesis: The mean difference between Exam 1 and Exam 2 scores is zero.
- Alternative Hypothesis: The mean difference between Exam 1 and Exam 2 scores is not zero.
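To make the hypotheses concrete: the paired t-test is a one-sample t-test on the per-student differences. The following is a minimal sketch that computes the test statistic by hand from the table above. The variable names (d, n, t_stat, p_value) are illustrative and not part of the original example.

```r
# Exam 1 and Exam 2 data (same values as the table above)
Exam1 <- c(65, 70, 75, 80, 68, 95)
Exam2 <- c(75, 71, 90, 98, 65, 99)

# Per-student differences (Exam 1 minus Exam 2)
d <- Exam1 - Exam2
n <- length(d)

# t statistic: mean difference divided by its standard error
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

print(c(t = t_stat, df = n - 1, p = p_value))
```

These values should agree with what t.test() reports for the same data when paired = TRUE.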
R code Example for Method 1
# Exam 1 and Exam 2 data
Exam1 <- c(65, 70, 75, 80, 68, 95)
Exam2 <- c(75, 71, 90, 98, 65, 99)
# The following writes Exam1 and Exam2 separately within t.test():
t.test(Exam1, Exam2, paired = TRUE)
Output:
    Paired t-test

data:  Exam1 and Exam2
t = -2.2361, df = 5, p-value = 0.07559
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -16.121994   1.121994
sample estimates:
mean of the differences
                   -7.5
The p-value of 0.076 is greater than 0.05, so we fail to reject the null hypothesis. Therefore, we conclude that Exam 1 and Exam 2 scores are not significantly different at the 0.05 level.
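Rather than reading the numbers off the printed output, the result can be stored and its components inspected directly. The sketch below assumes the same Exam1 and Exam2 vectors. The names res and alpha are introduced here for illustration.

```r
Exam1 <- c(65, 70, 75, 80, 68, 95)
Exam2 <- c(75, 71, 90, 98, 65, 99)

# Store the result instead of only printing it
res <- t.test(Exam1, Exam2, paired = TRUE)

# Components of the returned "htest" object
print(res$p.value)   # p-value of the test
print(res$conf.int)  # 95% confidence interval for the mean difference
print(res$estimate)  # mean of the differences

# Decision at the 0.05 significance level used in the text
alpha <- 0.05
if (res$p.value < alpha) {
  message("Reject the null hypothesis")
} else {
  message("Fail to reject the null hypothesis")
}
```

The confidence interval contains 0, which is consistent with failing to reject the null hypothesis at the 0.05 level.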
R code Example for Method 2
We are going to use the same data as in the Method 1 example, but in a different data format. We again use t.test(), this time with a formula that refers to the columns score and group in the data frame Exam_data.
Note that Exam_data is a data frame created in the following code.
# Exam 1 and Exam 2 data
Exam1 <- c(65, 70, 75, 80, 68, 95)
Exam2 <- c(75, 71, 90, 98, 65, 99)
# Create a data frame for paired t-test in R
Exam_data <- data.frame(
  group = rep(c("Exam1", "Exam2"), each = 6),
  score = c(Exam1, Exam2)
)
# Print the data frame:
print(Exam_data)
# The following uses the data frame format:
t.test(score ~ group, data = Exam_data, paired = TRUE)
Output of the data frame:
   group score
1  Exam1    65
2  Exam1    70
3  Exam1    75
4  Exam1    80
5  Exam1    68
6  Exam1    95
7  Exam2    75
8  Exam2    71
9  Exam2    90
10 Exam2    98
11 Exam2    65
12 Exam2    99
The following is the output of the paired t-test. Since we use the same data and get the same result, the interpretation and conclusion are exactly the same as in the Method 1 example.
    Paired t-test

data:  score by group
t = -2.2361, df = 5, p-value = 0.07559
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -16.121994   1.121994
sample estimates:
mean of the differences
                   -7.5
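One caveat is worth checking against your R version. To the best of our knowledge, R 4.4.0 and later reject paired = TRUE in the formula method of t.test(), so the formula call in Method 2 may stop with an error there. The sketch below shows two alternatives that should give the same result:

- subsetting the long-format column back into two vectors;
- using the Pair() helper (available since R 4.0.0) on a wide-format data frame.

Exam_wide, res_split, and res_pair are hypothetical names introduced for illustration. Both approaches assume the rows are in the same student order within each exam.

```r
Exam1 <- c(65, 70, 75, 80, 68, 95)
Exam2 <- c(75, 71, 90, 98, 65, 99)
Exam_data <- data.frame(
  group = rep(c("Exam1", "Exam2"), each = 6),
  score = c(Exam1, Exam2)
)

# Alternative 1: split the long-format column into two vectors
res_split <- with(
  Exam_data,
  t.test(score[group == "Exam1"], score[group == "Exam2"], paired = TRUE)
)

# Alternative 2: wide format with Pair() (requires R >= 4.0.0)
Exam_wide <- data.frame(Exam1 = Exam1, Exam2 = Exam2)  # hypothetical name
res_pair <- t.test(Pair(Exam1, Exam2) ~ 1, data = Exam_wide)

print(res_split)
print(res_pair)
```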