This tutorial explains the definition of One Variable Chi-square, and provides examples for it.
Two types of one variable chi-square
There are two possible ways of using chi-square tests:
- Chi-square goodness of fit test: It tests whether the observed counts differ from the expected counts.
- Chi-square test of equal frequencies: It tests whether the number of observations is equal across the categories of a single grouping variable.
The test of equal frequencies is a special case of the Chi-square goodness of fit test, in which every category has the same expected count. In this tutorial, we focus on this special case.
Examples of one variable chi-square
An instructor would like to see whether the number of male students is equal to the number of female students. Here, the grouping variable is gender, which has two categories, male and female.
As another example, a school offers 3 courses, and you would like to see whether the number of enrolled students is equal across these 3 courses. Here, the grouping variable is Course, which has 3 categories, one per course.
Null and alternative hypotheses for one variable chi-square
We would like to see whether the number of students enrolled in each course is equal across these 3 courses.
Course Names | Student Counts |
---|---|
Math | 25 |
Computer | 50 |
English | 30 |
The following shows the null and alternative hypotheses for this one variable chi-square test.
- Null Hypothesis: All 3 courses have the same number of enrolled students.
- Alternative Hypothesis: At least one course has an enrollment number that differs from the expected enrollment number (i.e., the mean of the 3 enrollment numbers).
Formula and manual calculation of one variable chi-square
The following is the formula of one variable chi-square.
\[ \chi^2=\sum \frac{ (O_i-E_i)^2}{E_i} \]
Where,
\( O_i \): observed values (actual values)
\( E_i \): expected values
Under the null hypothesis, every course is expected to have the same number of students, so the expected value for each course is the mean of the observed counts.
\[ E=\frac{25+50+30}{3}=35 \]
Thus, we can expand the table above.
Course Names | Student Count (Observed Value) | Expected Value | \( O_i-E \) | \( (O_i-E)^2 \) | \( \frac{(O_i-E)^2}{E} \) |
---|---|---|---|---|---|
Math | 25 | 35 | -10 | 100 | 100/35 |
Computer | 50 | 35 | 15 | 225 | 225/35 |
English | 30 | 35 | -5 | 25 | 25/35 |
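The columns of the expanded table can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (`observed`, `expected`, `rows`) are illustrative choices, and `Fraction` is used so that each term stays exact instead of being rounded.

```python
from fractions import Fraction

# Observed enrollment counts from the table above.
observed = {"Math": 25, "Computer": 50, "English": 30}

# Under the null hypothesis every course has the same expected count:
# the total enrollment divided by the number of courses.
expected = Fraction(sum(observed.values()), len(observed))

rows = []
for course, count in observed.items():
    diff = count - expected
    # Fraction keeps each contribution exact (100/35 is stored as 20/7).
    rows.append((course, count, expected, diff, diff**2, diff**2 / expected))

# Print one line per course: O, E, O - E, (O - E)^2, (O - E)^2 / E.
for course, count, exp, diff, sq, contrib in rows:
    print(f"{course:<8} {count:>3} {exp!s:>3} {diff!s:>4} {sq!s:>4} {contrib!s:>5}")
```

The last column prints reduced fractions, so 100/35 appears as 20/7, 225/35 as 45/7, and 25/35 as 5/7.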
Thus, we can sum them up and get the following:
\[ \chi^2=\sum \frac{ (O_i-E)^2}{E} = \frac{100}{35} +\frac{225}{35}+\frac{25}{35} = 10 \]
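The same sum can be wrapped in a small helper. The function name `chi_square_statistic` and its optional `expected` argument are assumptions made for this sketch, not part of any library. When `expected` is omitted, the helper falls back to equal expected counts, which is the case covered in this tutorial.

```python
def chi_square_statistic(observed, expected=None):
    """Return sum((O_i - E_i)**2 / E_i) over all categories.

    If `expected` is omitted, equal expected counts are assumed,
    i.e. every category gets the mean of the observed counts.
    """
    if expected is None:
        mean = sum(observed) / len(observed)
        expected = [mean] * len(observed)
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Enrollment counts for Math, Computer, and English.
statistic = chi_square_statistic([25, 50, 30])
print(round(statistic, 3))
```

Because the helper works with floating-point numbers, the result may differ from 10 by a tiny rounding error, which is why the example rounds before printing.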
For a one variable chi-square test, the degrees of freedom (df) equal the number of categories minus one (i.e., df = k - 1, where k is the number of categories). Thus, we have 3 - 1 = 2 degrees of freedom.
Looking up a chi-square table with 2 degrees of freedom and an alpha level of 0.05 gives a critical chi-square value of 5.991. Since the calculated value of 10 is greater than 5.991, we can reject the null hypothesis.
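The table lookup can also be checked numerically. The sketch below relies on a property that holds only for 2 degrees of freedom: the upper-tail probability of the chi-square distribution is exp(-x/2), so the critical value is -2 ln(alpha). The function names are illustrative; for any other number of degrees of freedom, a chi-square table or a statistics library would be needed instead.

```python
import math


def chi_square_sf_df2(x):
    """Upper-tail probability P(X >= x) for a chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi_square_critical_df2(alpha):
    """Critical value c such that P(X >= c) = alpha, for 2 degrees of freedom only."""
    return -2 * math.log(alpha)


alpha = 0.05
statistic = 10.0  # chi-square value calculated above

critical = chi_square_critical_df2(alpha)  # about 5.991
p_value = chi_square_sf_df2(statistic)     # exp(-5), about 0.0067
reject_null = statistic > critical

print(f"critical value = {critical:.3f}, p-value = {p_value:.4f}, reject = {reject_null}")
```

Comparing the statistic with the critical value and comparing the p-value with alpha lead to the same decision.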
We can then conclude that the enrollment numbers are not equal across these 3 courses. Note that the test itself does not tell us which courses differ from each other.