How to Write Null and Alternative Hypothesis for Two-Way ANOVA

This tutorial shows how to write the null and alternative hypotheses for a two-way ANOVA. It is an extension of my other tutorial on the same topic.

1. Introduction

A two-way ANOVA is used to test whether the mean of a dependent variable differs across the levels of two categorical independent variables (factors), and whether the two factors interact. For instance, below, there are two categorical variables, namely city (city 1 and city 2) and store (store 1 and store 2). Suppose that the dependent variable is sales. With these two independent variables, each having two levels, there are 4 cells.

          City 1    City 2
Store 1   Sales11   Sales12
Store 2   Sales21   Sales22

2. Null and Alternative Hypothesis for Two-Way ANOVA

The following are the non-directional hypotheses for the factor City, the factor Store, and their interaction City*Store.

2.1 For the factor of City (i.e., main effect of City):

  • Null Hypothesis (H0): Mean_city1 = Mean_city2 (or, Mean_city1 - Mean_city2 = 0).
  • Alternative Hypothesis (H1): Mean_city1 ≠ Mean_city2 (or, Mean_city1 - Mean_city2 ≠ 0).

2.2 For the factor of Store (i.e., main effect of Store):

  • Null Hypothesis (H0): Mean_store1 = Mean_store2 (or, Mean_store1 - Mean_store2 = 0).
  • Alternative Hypothesis (H1): Mean_store1 ≠ Mean_store2 (or, Mean_store1 - Mean_store2 ≠ 0).
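
To make the two main-effect hypotheses concrete, here is a minimal sketch that computes the raw group means for each city and each store, using the sample data from Section 3. The names `rows` and `group_means` are illustrative and not part of the original tutorial. These are plain observed averages; the exact contrast an ANOVA routine tests can differ with unbalanced data, depending on the sums-of-squares type and coding.

```python
from statistics import mean

# Data copied from the example in Section 3: (city, store, sales)
rows = [
    ("City1", "store1", 10), ("City1", "store2", 20), ("City1", "store1", 20),
    ("City1", "store2", 50), ("City1", "store1", 30), ("City2", "store2", 10),
    ("City2", "store1", 5), ("City2", "store2", 4), ("City2", "store1", 12),
    ("City2", "store2", 4),
]


def group_means(rows, position):
    """Average sales for each level of the factor stored at `position`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[position], []).append(row[2])
    return {level: mean(values) for level, values in groups.items()}


city_means = group_means(rows, 0)   # factor City
store_means = group_means(rows, 1)  # factor Store

print("City means:", city_means)
print("Store means:", store_means)
# The quantities that H0 sets to zero for each main effect
print("Mean_city1 - Mean_city2 =", city_means["City1"] - city_means["City2"])
print("Mean_store1 - Mean_store2 =", store_means["store1"] - store_means["store2"])
```

A nonzero difference in the sample does not by itself reject H0; that decision comes from the F test in Section 3.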

2.3 For the interaction effect City*Store:

There are two different ways to write the null and alternative hypotheses for the interaction effect. Essentially, they test the same thing. Which version to write typically depends on your theory.

          City 1                      City 2
Store 1   Sales11                     Sales12                     Md_S1 = Sales11 - Sales12
Store 2   Sales21                     Sales22                     Md_S2 = Sales21 - Sales22
          Md_C1 = Sales11 - Sales21   Md_C2 = Sales12 - Sales22

Version 1:

  • Null Hypothesis (H0): Md_S1 = Md_S2 (or, Md_S1 - Md_S2 = 0).
  • Alternative Hypothesis (H1): Md_S1 ≠ Md_S2 (or, Md_S1 - Md_S2 ≠ 0).

Version 2:

  • Null Hypothesis (H0): Md_C1 = Md_C2 (or, Md_C1 - Md_C2 = 0).
  • Alternative Hypothesis (H1): Md_C1 ≠ Md_C2 (or, Md_C1 - Md_C2 ≠ 0).
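
The claim that both versions test the same thing follows from simple algebra: (Sales11 - Sales12) - (Sales21 - Sales22) equals (Sales11 - Sales21) - (Sales12 - Sales22). The sketch below checks this numerically with the cell means of the sample data from Section 3. The variable names are illustrative, and the difference for the City 2 column is written `md_c2` here.

```python
from statistics import mean

# Data copied from the example in Section 3: (city, store, sales)
rows = [
    ("City1", "store1", 10), ("City1", "store2", 20), ("City1", "store1", 20),
    ("City1", "store2", 50), ("City1", "store1", 30), ("City2", "store2", 10),
    ("City2", "store1", 5), ("City2", "store2", 4), ("City2", "store1", 12),
    ("City2", "store2", 4),
]

# Collect the sales values in each (store, city) cell and average them
cells = {}
for city, store, sales in rows:
    cells.setdefault((store, city), []).append(sales)
cell_mean = {key: mean(values) for key, values in cells.items()}

# Labels follow the table: first index = store, second index = city
sales11 = cell_mean[("store1", "City1")]
sales12 = cell_mean[("store1", "City2")]
sales21 = cell_mean[("store2", "City1")]
sales22 = cell_mean[("store2", "City2")]

# Version 1: city differences within each store
md_s1 = sales11 - sales12
md_s2 = sales21 - sales22
# Version 2: store differences within each city
md_c1 = sales11 - sales21
md_c2 = sales12 - sales22

version1 = md_s1 - md_s2
version2 = md_c1 - md_c2
print("Version 1 (Md_S1 - Md_S2):", version1)
print("Version 2 (Md_C1 - Md_C2):", version2)
```

Both lines print the same number, which is why the choice between the two versions is a matter of framing rather than of statistics.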

Note that, if you write the null and alternative hypotheses for the interaction in the way shown above, you need to include the table as well. Otherwise, readers will not know what Md_S1, Md_S2, Md_C1, and Md_C2 mean. The interaction effect is always a bit complicated, and there is some nuance in how to state and interpret it. Versions 1 and 2 can also be written out in words, as follows. As you can see, they are quite a mouthful, but you get the basic idea: an interaction effect is about a difference of differences.

Alternatively, you can write version 1 in the following way.

  • Null Hypothesis (H0): There is no difference between the difference in store 1 sales between city 1 and city 2 and the difference in store 2 sales between city 1 and city 2.
  • Alternative Hypothesis (H1): There is a difference between the difference in store 1 sales between city 1 and city 2 and the difference in store 2 sales between city 1 and city 2.

Alternatively, you can write version 2 in the following way.

  • Null Hypothesis (H0): There is no difference between the difference in city 1 sales between store 1 and store 2 and the difference in city 2 sales between store 1 and store 2.
  • Alternative Hypothesis (H1): There is a difference between the difference in city 1 sales between store 1 and store 2 and the difference in city 2 sales between store 1 and store 2.

3. Example of Testing Hypothesis

Suppose we have the following data. For the complete process of running the analysis in Python, please refer to this tutorial.

  cities  stores  sales
0  City1  store1     10
1  City1  store2     20
2  City1  store1     20
3  City1  store2     50
4  City1  store1     30
5  City2  store2     10
6  City2  store1      5
7  City2  store2      4
8  City2  store1     12
9  City2  store2      4

We conduct the two-way ANOVA in Python.

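import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
# rebuild the data shown above as the data frame df_x
df_x = pd.DataFrame({
    'cities': ['City1'] * 5 + ['City2'] * 5,
    'stores': ['store1', 'store2'] * 5,
    'sales': [10, 20, 20, 50, 30, 10, 5, 4, 12, 4]})
# model with both main effects and their interaction
model = ols('sales ~ C(cities) + C(stores) + C(cities):C(stores)', data=df_x).fit()
# typ=3 requests Type III sums of squares
aov_table = sm.stats.anova_lm(model, typ=3)
print(aov_table)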

The following is the output:

                      sum_sq   df          F    PR(>F)
Intercept            1200.00  1.0  10.307802  0.018354
C(cities)             158.70  1.0   1.363207  0.287277
C(stores)             270.00  1.0   2.319256  0.178611
C(cities):C(stores)   183.75  1.0   1.578382  0.255694
Residual              698.50  6.0        NaN       NaN
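
As a quick check on how to read this table, each F value is the term's mean square (sum_sq / df) divided by the residual mean square. The sketch below recomputes the F column from the sum_sq and df values printed above; the numbers are copied from the output, and the variable names are illustrative.

```python
# sum_sq and df values copied from the ANOVA output above
terms = {
    "C(cities)": (158.70, 1.0),
    "C(stores)": (270.00, 1.0),
    "C(cities):C(stores)": (183.75, 1.0),
}
resid_ss, resid_df = 698.50, 6.0

# Residual mean square: the denominator of every F ratio
ms_resid = resid_ss / resid_df

f_values = {}
for term, (sum_sq, df) in terms.items():
    f_values[term] = (sum_sq / df) / ms_resid
    print(f"{term}: F = {f_values[term]:.6f}")
```

The recomputed values agree with the F column of the output up to rounding.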

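The following are the hypothesis testing decisions based on the output above, using a significance level of 0.05. Note that with the default treatment coding used here, the Type III rows for C(cities) and C(stores) compare the two cities within store1 and the two stores within City1 (the reference levels), rather than averaging over the other factor.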

  • For the factor of City (i.e., main effect of City): Since the p-value is 0.287, which is greater than 0.05, we fail to reject the null hypothesis.
  • For the factor of Store (i.e., main effect of Store): Since the p-value is 0.179, which is greater than 0.05, we fail to reject the null hypothesis.
  • For the interaction effect City*Store: Since the p-value is 0.256, which is greater than 0.05, we fail to reject the null hypothesis.
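
The decision rule used in the three bullets above can be sketched in a few lines. The p-values are copied from the PR(>F) column of the output; the 0.05 threshold comes from the text, and the names `ALPHA` and `decide` are illustrative.

```python
ALPHA = 0.05  # significance level used in the text

# p-values copied from the PR(>F) column of the ANOVA output
p_values = {
    "C(cities)": 0.287277,
    "C(stores)": 0.178611,
    "C(cities):C(stores)": 0.255694,
}


def decide(p_value, alpha=ALPHA):
    """Reject H0 only when the p-value falls below the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


decisions = {term: decide(p) for term, p in p_values.items()}
for term, decision in decisions.items():
    print(f"{term}: p = {p_values[term]:.3f} -> {decision}")
```

Failing to reject H0 means the data do not provide enough evidence of an effect; it does not show that the effect is zero.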
