This tutorial shows how to write the null and alternative hypotheses for a two-way ANOVA. It extends my other tutorial on the same topic.
1. Introduction
A two-way ANOVA tests whether the mean of a dependent variable differs across the levels of two categorical independent variables (factors), and whether the effect of one factor depends on the level of the other. For instance, below there are two categorical variables, namely city (City 1 and City 2) and store (Store 1 and Store 2). Suppose that the dependent variable is sales. Crossing these two independent variables gives 4 cells.
        | City 1  | City 2  |
Store 1 | Sales11 | Sales12 |
Store 2 | Sales21 | Sales22 |
2. Null and Alternative Hypotheses for Two-Way ANOVA
The following are the non-directional hypotheses for the factors City and Store and for their interaction City*Store.
2.1 For the factor of City (i.e., main effect of City):
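Null Hypothesis (H0): $\text{Mean}_{\text{city 1}} = \text{Mean}_{\text{city 2}}$ (or, $\text{Mean}_{\text{city 1}} - \text{Mean}_{\text{city 2}} = 0$).
Alternative Hypothesis (H1): $\text{Mean}_{\text{city 1}} \neq \text{Mean}_{\text{city 2}}$ (or, $\text{Mean}_{\text{city 1}} - \text{Mean}_{\text{city 2}} \neq 0$).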
2.2 For the factor of Store (i.e., main effect of Store):
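Null Hypothesis (H0): $\text{Mean}_{\text{store 1}} = \text{Mean}_{\text{store 2}}$ (or, $\text{Mean}_{\text{store 1}} - \text{Mean}_{\text{store 2}} = 0$).
Alternative Hypothesis (H1): $\text{Mean}_{\text{store 1}} \neq \text{Mean}_{\text{store 2}}$ (or, $\text{Mean}_{\text{store 1}} - \text{Mean}_{\text{store 2}} \neq 0$).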
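To make the two main-effect hypotheses concrete, the marginal means can be written in terms of the four cell means from the table in the introduction. The sketch below assumes that each marginal mean is the unweighted average of the two cell means in its column or row. The symbol $\mu_{ij}$ is not used elsewhere in this tutorial; it is introduced here to denote the population mean of sales for Store $i$ in City $j$ (the quantity behind Sales$ij$ in the table).

$$
\begin{aligned}
\text{Mean}_{\text{city 1}} &= \tfrac{1}{2}(\mu_{11} + \mu_{21}), & \text{Mean}_{\text{city 2}} &= \tfrac{1}{2}(\mu_{12} + \mu_{22}), \\
\text{Mean}_{\text{store 1}} &= \tfrac{1}{2}(\mu_{11} + \mu_{12}), & \text{Mean}_{\text{store 2}} &= \tfrac{1}{2}(\mu_{21} + \mu_{22}).
\end{aligned}
$$

Under this assumption, the main effect of City compares the two column averages, and the main effect of Store compares the two row averages.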
2.3 For the interaction effect City*Store:
There are two different ways to write the null and alternative hypotheses for the interaction effect. Essentially, they test the same thing; which version to write typically depends on your theory.
        | City 1                  | City 2                  |                         |
Store 1 | Sales11                 | Sales12                 | Md_S1 = Sales11-Sales12 |
Store 2 | Sales21                 | Sales22                 | Md_S2 = Sales21-Sales22 |
        | Md_C1 = Sales11-Sales21 | Md_C2 = Sales12-Sales22 |                         |
Version 1:
Null Hypothesis (H0): $\text{Md}_{S1} = \text{Md}_{S2}$ (or, $\text{Md}_{S1} - \text{Md}_{S2} = 0$).
Alternative Hypothesis (H1): $\text{Md}_{S1} \neq \text{Md}_{S2}$ (or, $\text{Md}_{S1} - \text{Md}_{S2} \neq 0$).
Version 2:
Null Hypothesis (H0): $\text{Md}_{C1} = \text{Md}_{C2}$ (or, $\text{Md}_{C1} - \text{Md}_{C2} = 0$).
Alternative Hypothesis (H1): $\text{Md}_{C1} \neq \text{Md}_{C2}$ (or, $\text{Md}_{C1} - \text{Md}_{C2} \neq 0$).
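The claim that the two versions test the same thing can be checked with one line of algebra. Writing each difference in terms of the cell entries from the table, the left side below is the difference between the two row (store) differences, and the right side is the difference between the two column (city) differences:

$$
(\text{Sales}_{11} - \text{Sales}_{12}) - (\text{Sales}_{21} - \text{Sales}_{22})
= \text{Sales}_{11} - \text{Sales}_{12} - \text{Sales}_{21} + \text{Sales}_{22}
= (\text{Sales}_{11} - \text{Sales}_{21}) - (\text{Sales}_{12} - \text{Sales}_{22})
$$

So the difference between the two row differences is zero exactly when the difference between the two column differences is zero, which is why either version can serve as the interaction hypothesis.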
Note that if you write the null and alternative hypotheses for the interaction in the way shown above, you need to include the table as well. Otherwise, readers cannot tell what the row differences (the Md_S terms) and the column differences (the Md_C terms) mean. The interaction effect is always a bit complicated, and there is some nuance in how to state and interpret it. You can also write version 1 and version 2 in words. They are quite a mouthful, but you get the basic idea: an interaction effect is about a difference of differences.
Version 1 can be written in words as follows.
Null Hypothesis (H0): The difference in store 1 sales between city 1 and city 2 is equal to the difference in store 2 sales between city 1 and city 2.
Alternative Hypothesis (H1): The difference in store 1 sales between city 1 and city 2 is not equal to the difference in store 2 sales between city 1 and city 2.
Version 2 can be written in words as follows.
Null Hypothesis (H0): The difference in city 1 sales between store 1 and store 2 is equal to the difference in city 2 sales between store 1 and store 2.
Alternative Hypothesis (H1): The difference in city 1 sales between store 1 and store 2 is not equal to the difference in city 2 sales between store 1 and store 2.
3. Example of Testing Hypothesis
Suppose we have the following data. For the complete process of running a two-way ANOVA in Python, please refer to this tutorial.
   cities  stores  sales
0   City1  store1     10
1   City1  store2     20
2   City1  store1     20
3   City1  store2     50
4   City1  store1     30
5   City2  store2     10
6   City2  store1      5
7   City2  store2      4
8   City2  store1     12
9   City2  store2      4
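Before fitting the model, it can help to look at the sample cell means and at the "difference of differences" that the interaction hypothesis is about. The following is a minimal sketch using pandas; the variable names `cell_means`, `md_s1`, `md_s2`, `md_c1`, and `md_c2` are illustrative choices that mirror the row and column differences in the table from section 2.3, and `md_c2` stands for the City 2 column difference.

```python
import pandas as pd

# Data copied from the table above.
df_x = pd.DataFrame({
    "cities": ["City1"] * 5 + ["City2"] * 5,
    "stores": ["store1", "store2"] * 5,
    "sales": [10, 20, 20, 50, 30, 10, 5, 4, 12, 4],
})

# Rows are stores, columns are cities, entries are sample cell means.
cell_means = df_x.pivot_table(
    index="stores", columns="cities", values="sales", aggfunc="mean"
)
print(cell_means)

# Version 1: city difference within each store.
md_s1 = cell_means.loc["store1", "City1"] - cell_means.loc["store1", "City2"]
md_s2 = cell_means.loc["store2", "City1"] - cell_means.loc["store2", "City2"]

# Version 2: store difference within each city.
md_c1 = cell_means.loc["store1", "City1"] - cell_means.loc["store2", "City1"]
md_c2 = cell_means.loc["store1", "City2"] - cell_means.loc["store2", "City2"]

print(md_s1 - md_s2)  # -17.5
print(md_c1 - md_c2)  # -17.5
```

Both versions give the same sample difference of differences, which is the quantity the interaction test asks about; whether it differs from zero by more than chance is what the ANOVA decides.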
We conduct the two-way ANOVA in Python.
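import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
# rebuild the data shown above (df_x is the name used in the model call)
df_x = pd.DataFrame({'cities': ['City1'] * 5 + ['City2'] * 5,
                     'stores': ['store1', 'store2'] * 5,
                     'sales': [10, 20, 20, 50, 30, 10, 5, 4, 12, 4]})
# fit a linear model with both main effects and their interaction
model = ols('sales ~ C(cities) + C(stores) + C(cities):C(stores)', data=df_x).fit()
# request the Type III ANOVA table (typ=3)
aov_table = sm.stats.anova_lm(model, typ=3)
print(aov_table)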
The following is the output:
                      sum_sq   df          F    PR(>F)
Intercept            1200.00  1.0  10.307802  0.018354
C(cities)             158.70  1.0   1.363207  0.287277
C(stores)             270.00  1.0   2.319256  0.178611
C(cities):C(stores)   183.75  1.0   1.578382  0.255694
Residual              698.50  6.0        NaN       NaN
The following is the hypothesis testing process for the two-way ANOVA, using a significance level of 0.05.
- For the factor of City (i.e., main effect of City): Since the p-value is 0.287, which is greater than 0.05, we fail to reject the null hypothesis.
- For the factor of Store (i.e., main effect of Store): Since the p-value is 0.179, which is greater than 0.05, we fail to reject the null hypothesis.
- For the interaction effect City*Store: Since the p-value is 0.256, which is greater than 0.05, we fail to reject the null hypothesis.
Further Reading
- How to write null and alternative hypotheses
- Type 1, Type 2, and Type 3 ANOVA
- How to Perform Two-Way ANOVA (Python, R)