Why Type I ANOVA is Sequential Sum of Squares (R code example)

Type I ANOVA is also called sequential sum of squares because each term's sum of squares depends on the order in which the factors enter the model: every term is adjusted only for the terms listed before it. If you change the order of the factors, the results can differ. The following R example illustrates this.

Step 1: Prepare the data

The data used in this example has two independent variables (IVs), cities and stores, both of which are categorical. The dependent variable (DV) is sales.

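# Data used in the examples below
x_1 <- rep(c('City1', 'City2'), each = 5)   # first 5 rows City1, last 5 rows City2
x_2 <- rep(c('store1', 'store2'), 5)        # stores alternate row by row
sales <- c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)

df <- data.frame(cities = x_1,
                 stores = x_2,
                 sales = sales)
print(df)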

Output:

   cities stores sales
1   City1 store1    10
2   City1 store2    20
3   City1 store1    20
4   City1 store2    50
5   City1 store1    30
6   City2 store2    10
7   City2 store1     5
8   City2 store2     4
9   City2 store1    12
10  City2 store2     4
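Before fitting anything, it can help to check how many observations fall in each city-by-store cell, since unequal cell counts are what make the order of factors matter. The following is a minimal sketch that rebuilds the same data and cross-tabulates the two factors; the name `cell_counts` is only illustrative.

```r
# Rebuild the example data from Step 1 so this block runs on its own
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)

# Count the observations in each cities x stores cell
cell_counts <- table(df$cities, df$stores)
print(cell_counts)
```

City1 has three store1 rows and two store2 rows, while City2 has two and three, so the design is unbalanced.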

Step 2: Use aov() to run a Type I ANOVA in R

Model 1: cities entering the model first

  • SS(cities) for factor cities
  • SS(stores | cities) for factor stores
  • SS(cities*stores | cities, stores) for interaction cities*stores

# Model 1: cities entering the model first
result_model1 <- aov(sales ~ cities * stores, data = df)
summary(result_model1)

Output:

              Df Sum Sq Mean Sq F value Pr(>F)  
cities         1  902.5   902.5   7.752 0.0318 *
stores         1   93.8    93.8   0.805 0.4041  
cities:stores  1  183.7   183.7   1.578 0.2557  
Residuals      6  698.5   116.4         
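One way to see what "sequential" means is to rebuild the Sum Sq column by hand: each term's sum of squares is the drop in residual sum of squares when that term is added to the model containing the terms before it. The sketch below assumes the same data as Step 1 and uses `lm()` with `deviance()` (which returns the residual sum of squares for a linear model); the object names `m0` to `m3` and `seq_ss` are only illustrative.

```r
# Rebuild the example data from Step 1 so this block runs on its own
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)

# Nested models, adding one term at a time in the order cities, stores, interaction
m0 <- lm(sales ~ 1, data = df)                 # intercept only
m1 <- lm(sales ~ cities, data = df)            # + cities
m2 <- lm(sales ~ cities + stores, data = df)   # + stores
m3 <- lm(sales ~ cities * stores, data = df)   # + cities:stores

# Residual sum of squares of each model
rss <- sapply(list(m0, m1, m2, m3), deviance)

# Sequential SS = reduction in RSS at each step
seq_ss <- -diff(rss)
names(seq_ss) <- c("cities", "stores | cities", "cities:stores | cities, stores")
print(seq_ss)
```

Up to rounding, the three values should match the Sum Sq column for Model 1, and the residual sum of squares of `m3` should match the Residuals row.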

Model 2: stores entering the model first

  • SS(stores) for factor stores
  • SS(cities | stores) for factor cities
  • SS(stores*cities | stores, cities) for interaction stores*cities

# Model 2: stores entering the model first
result_model2 <- aov(sales ~ stores * cities, data = df)
summary(result_model2)

Output:

              Df Sum Sq Mean Sq F value Pr(>F)  
stores         1   12.1    12.1   0.104 0.7581  
cities         1  984.1   984.1   8.454 0.0271 *
stores:cities  1  183.7   183.7   1.578 0.2557  
Residuals      6  698.5   116.4              
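Comparing the two tables, the main-effect rows change but their total does not: the order only changes how the variation explained by the two main effects is split between cities and stores. The sketch below checks this with `anova()` on the equivalent `lm()` fits, which also reports sequential (Type I) sums of squares; the helper `seq_ss()` is a hypothetical convenience function, not part of R.

```r
# Rebuild the example data from Step 1 so this block runs on its own
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)

# Hypothetical helper: sequential sums of squares as a named vector
seq_ss <- function(formula, data) {
  tab <- anova(lm(formula, data = data))
  setNames(tab[["Sum Sq"]], rownames(tab))
}

ss1 <- seq_ss(sales ~ cities * stores, df)  # cities first
ss2 <- seq_ss(sales ~ stores * cities, df)  # stores first

# Total explained by the two main effects under each order
main_total_1 <- ss1[["cities"]] + ss1[["stores"]]
main_total_2 <- ss2[["stores"]] + ss2[["cities"]]
print(c(cities_first = main_total_1, stores_first = main_total_2))

# The interaction and residual rows do not depend on the order
print(c(ss1[["cities:stores"]], ss2[["stores:cities"]]))
print(c(ss1[["Residuals"]], ss2[["Residuals"]]))
```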

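As we can see, the sums of squares for cities and stores differ between the two models because the factors enter in a different order, while the interaction and residual rows are identical. The order matters here because the design is unbalanced: City1 has three store1 observations and two store2 observations, and City2 has the reverse, so the two factors overlap in the variation they explain. That is why Type I ANOVA is also called sequential sum of squares.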
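As a contrast, the order dependence should disappear when every city-by-store cell has the same number of observations. The sketch below uses a hypothetical balanced dataset (two observations per cell, with made-up sales values that are not from the example above) and compares the sequential sums of squares under both orders; `balanced` and `seq_ss()` are illustrative names.

```r
# Hypothetical balanced data: 2 cities x 2 stores x 2 replicates (made-up sales)
balanced <- expand.grid(
  cities    = c("City1", "City2"),
  stores    = c("store1", "store2"),
  replicate = 1:2
)
balanced$sales <- c(10, 5, 20, 10, 30, 12, 50, 4)

# Hypothetical helper: sequential sums of squares as a named vector
seq_ss <- function(formula, data) {
  tab <- anova(lm(formula, data = data))
  setNames(tab[["Sum Sq"]], rownames(tab))
}

ss_cities_first <- seq_ss(sales ~ cities * stores, balanced)
ss_stores_first <- seq_ss(sales ~ stores * cities, balanced)

# With equal cell counts, each main effect gets the same SS in either order
same_cities <- isTRUE(all.equal(ss_cities_first[["cities"]], ss_stores_first[["cities"]]))
same_stores <- isTRUE(all.equal(ss_cities_first[["stores"]], ss_stores_first[["stores"]]))
print(c(cities = same_cities, stores = same_stores))
```

In a balanced design the two factors are orthogonal, so adjusting one for the other changes nothing and the sequential sums of squares no longer depend on the order.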

