This tutorial shows how to calculate the Sum of Squares Total (SST) in R. The hypothetical data used below has two categorical independent variables (IVs), cities and stores, and one dependent variable (DV), sales.
Note that, although the data has two IVs, calculating SST does not use them. SST depends only on the DV: it is the sum of the squared deviations of each sales value from the overall mean of sales.
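# Two categorical IVs: five rows per city, stores alternating row by row
x_1 <- rep(c('City1', 'City2'), each = 5)
x_2 <- rep(c('store1', 'store2'), 5)
# DV: sales
sales <- c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
df <- data.frame(cities = x_1,
                 stores = x_2,
                 sales = sales)
print(df)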
Output:
   cities stores sales
1   City1 store1    10
2   City1 store2    20
3   City1 store1    20
4   City1 store2    50
5   City1 store1    30
6   City2 store2    10
7   City2 store1     5
8   City2 store2     4
9   City2 store1    12
10  City2 store2     4
Example 1: Use built-in functions
sum((sales - mean(sales))^2)
Output:
[1] 1878.5
Thus, we can conclude that the Sum of Squares Total (SST) is 1878.5.
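As a cross-check on the result above, SST can also be recovered from the sample variance, since the sample variance is SST divided by n - 1. The following is a minimal sketch that assumes the same sales vector as in this tutorial; the names n and sst_var are introduced here only for illustration.

```r
# Same sales values as in the tutorial's data
sales <- c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)

# Sample variance divides SST by (n - 1), so multiplying back recovers SST
n <- length(sales)
sst_var <- var(sales) * (n - 1)

print(sst_var)  # prints [1] 1878.5
```

This gives the same value as the direct formula, which is a quick way to confirm the calculation.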
Example 2: Use intercept
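Another way to calculate SST is to fit a model whose only predictor is the intercept (1). Because such a model predicts the mean of sales for every row, its residual sum of squares equals SST.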
# model with intercept only in R
sales_intercept <- lm(sales ~ 1, data = df)
anova(sales_intercept)
Output:
Response: sales
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  9 1878.5  208.72
The Sum Sq value in the Residuals row is 1878.5, so we can again conclude that the Sum of Squares Total (SST) is 1878.5, matching Example 1.
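The value in the ANOVA table can also be pulled out as a number rather than read off the printout. The sketch below is one possible way to do that, assuming the same data as above: it uses deviance() on the intercept-only model, then checks the earlier note that SST does not depend on the IVs by confirming that the Sum Sq column of a model with both IVs adds up to the same total. The names fit_null, fit_full, sst, and ss_parts are illustrative and not part of the original tutorial.

```r
# Rebuild the tutorial's data so the block runs on its own
df <- data.frame(
  cities = rep(c("City1", "City2"), each = 5),
  stores = rep(c("store1", "store2"), 5),
  sales  = c(10, 20, 20, 50, 30, 10, 5, 4, 12, 4)
)

# Intercept-only model: its residual sum of squares is SST
fit_null <- lm(sales ~ 1, data = df)
sst <- deviance(fit_null)  # same as sum(residuals(fit_null)^2)

# Model with both IVs: the Sum Sq column (cities, stores, Residuals)
# splits SST into parts, so the parts should add back up to SST
fit_full <- lm(sales ~ cities + stores, data = df)
ss_parts <- anova(fit_full)[["Sum Sq"]]

print(sst)
print(all.equal(sum(ss_parts), sst))
```

Adding predictors changes how the total is divided between the model terms and the residuals, but not the total itself.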