What is One-way ANOVA? Formula and Example

One-way ANOVA compares the means of two or more groups to test whether the differences among them are statistically significant. For instance, suppose you would like to compare the average household size of three cities. You can collect one sample from each city and conduct a one-way ANOVA to test whether the mean household sizes differ.

Formulas of One-way ANOVA

The full name of ANOVA is Analysis of Variance. Thus, ANOVA is about partitioning the total variation in the data into different parts. Sum of Squares Total (SST) is the total variation of all the observations around the overall mean. SST can be separated into Sum of Squares Between (SSB) and Sum of Squares Error (SSE).

\[SST=SSB+SSE\]

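The formulas of SSB and SSE are as follows. Here \(k\) is the number of groups, \(n_i\) is the number of observations in group \(i\), \(x_{ij}\) is the \(j\)-th observation in group \(i\), \(\bar{x_i}\) is the mean of group \(i\), and \(\bar{x}\) is the overall mean of all observations.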

\[SSB=\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2\]

\[SSE=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2\]
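The partition can be checked numerically. The following is a minimal sketch in Python, not a definitive implementation: the function name `sum_of_squares` and the toy data are assumptions made for illustration, and SST is computed from its standard definition (the squared distance of every observation from the overall mean), which the text above does not spell out.

```python
from statistics import fmean

def sum_of_squares(groups):
    """Return (SSB, SSE, SST) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    group_means = [fmean(g) for g in groups]

    # Between-group part: group size times the squared distance
    # of the group mean from the overall mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))

    # Error (within-group) part: squared distance of each
    # observation from its own group mean.
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means)
              for x in g)

    # Total: squared distance of each observation from the overall mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    return ssb, sse, sst

# Hypothetical toy data: three groups of unequal size.
toy_groups = [[1, 2, 3], [2, 4, 6, 8], [5, 5]]
ssb, sse, sst = sum_of_squares(toy_groups)
print(ssb, sse, sst)
```

For these toy groups, SSB and SSE add up to SST (up to floating-point rounding), which is the identity SST = SSB + SSE.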

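We also need to consider the degrees of freedom. Dividing each sum of squares by its degrees of freedom gives the mean squares, namely Mean Square Between (MSB) and Mean Square Error (MSE). SSB has \(k-1\) degrees of freedom and SSE has \(n-k\), where \(n\) is the total number of observations across all groups.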

\[MSB=\frac{SSB}{k-1}\]

\[MSE=\frac{SSE}{n-k}\]

Finally, the F value is the ratio of MSB to MSE.

\[F(k-1,n-k)=\frac{MSB}{MSE}=\frac{\frac{SSB}{k-1}}{\frac{SSE}{n-k}}=\frac{\frac{\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2}{k-1}}{\frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2}{n-k}}\]
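As a rough illustration of the last step, the sketch below turns two sums of squares into the mean squares and the F value. The function name `f_statistic` and the input numbers are hypothetical and chosen only for the example.

```python
def f_statistic(ssb, sse, k, n):
    """Return (MSB, MSE, F) given the sums of squares,
    the number of groups k, and the total sample size n."""
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    msb = ssb / (k - 1)  # between-group mean square, k - 1 degrees of freedom
    mse = sse / (n - k)  # error mean square, n - k degrees of freedom
    return msb, mse, msb / mse

# Hypothetical inputs: SSB = 18, SSE = 22, 3 groups, 9 observations in total.
msb, mse, f_value = f_statistic(18, 22, k=3, n=9)
print(msb, mse, f_value)
```

Note that the sketch does not handle the degenerate case SSE = 0 (no variation within any group), where the ratio is undefined.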

Manual Calculation Example

Suppose we would like to see whether 3 cities differ in terms of mean household size. We sample 5 households from each city, so there are 3 groups and 15 observations in total. The null hypothesis and alternative hypothesis for one-way ANOVA are as follows.

\[H_0: \mu_{city1}=\mu_{city2}=\mu_{city3}\]

\[H_1: \mu_{city1},\mu_{city2},\mu_{city3} \text{ are not all equal.}\]

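Group | Household Size | Group Mean | Overall Mean
City 1 | 6 | 4 | 3.4
City 1 | 2 | 4 | 3.4
City 1 | 3 | 4 | 3.4
City 1 | 4 | 4 | 3.4
City 1 | 5 | 4 | 3.4
City 2 | 2 | 3 | 3.4
City 2 | 1 | 3 | 3.4
City 2 | 3 | 3 | 3.4
City 2 | 4 | 3 | 3.4
City 2 | 5 | 3 | 3.4
City 3 | 4 | 3.2 | 3.4
City 3 | 1 | 3.2 | 3.4
City 3 | 2 | 3.2 | 3.4
City 3 | 4 | 3.2 | 3.4
City 3 | 5 | 3.2 | 3.4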
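The group means and the overall mean in the table can be reproduced from the raw household sizes. This is a small sketch; the variable names are illustrative assumptions, and the data are the 15 household sizes listed above.

```python
from statistics import fmean

# Household sizes from the table above, keyed by city.
household_sizes = {
    "City 1": [6, 2, 3, 4, 5],
    "City 2": [2, 1, 3, 4, 5],
    "City 3": [4, 1, 2, 4, 5],
}

# Mean of each group, and mean of all 15 observations pooled together.
group_means = {city: fmean(sizes) for city, sizes in household_sizes.items()}
overall_mean = fmean([x for sizes in household_sizes.values() for x in sizes])

for city, mean in group_means.items():
    print(f"{city}: {mean:.1f}")
print(f"Overall: {overall_mean:.1f}")
```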

\[SSB=\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2=5 \times(4-3.4)^2+5\times(3-3.4)^2+5 \times (3.2-3.4)^2=2.8\]

\[\begin{aligned}
SSE=\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2= & (6-4)^2+(2-4)^2+(3-4)^2+(4-4)^2+(5-4)^2+\\
&(2-3)^2+(1-3)^2+(3-3)^2+(4-3)^2+(5-3)^2+\\
&(4-3.2)^2+(1-3.2)^2+(2-3.2)^2+(4-3.2)^2+(5-3.2)^2 \\
= &30.8
\end{aligned}\]

\[MSB=\frac{SSB}{k-1}=\frac{2.8}{3-1}=1.4\]

\[MSE=\frac{SSE}{n-k}=\frac{30.8}{15-3}\approx 2.57\]

Finally, we can calculate the F-value by calculating the ratio of MSB and MSE.

\[F(k-1,n-k)=F(2,12)=\frac{MSB}{MSE}=\frac{1.4}{2.57}\approx 0.55\]

We can then look up the critical value of F(2,12) in an F table; at the 0.05 significance level it is 3.89. Since the calculated F(2,12) = 0.55 is smaller than 3.89, we fail to reject the null hypothesis. Thus, we do not have sufficient evidence to conclude that the three cities differ in mean household size.
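The whole manual calculation can be double-checked with a short script. This is a sketch under stated assumptions rather than a general-purpose routine: the helper `f_critical_df1_2` is a hypothetical name, and it relies on a closed form for the upper-tail critical value that holds only when the numerator degrees of freedom equal 2, as they do here with three groups. The 0.05 significance level is also an assumption, chosen to match the 3.89 table value.

```python
from statistics import fmean

groups = [
    [6, 2, 3, 4, 5],  # City 1
    [2, 1, 3, 4, 5],  # City 2
    [4, 1, 2, 4, 5],  # City 3
]
k = len(groups)                   # number of groups
n = sum(len(g) for g in groups)   # total number of observations
grand_mean = fmean([x for g in groups for x in g])
group_means = [fmean(g) for g in groups]

# Sums of squares, following the SSB and SSE formulas.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# Mean squares and the F value.
f_value = (ssb / (k - 1)) / (sse / (n - k))

def f_critical_df1_2(df2, alpha=0.05):
    """Upper-tail critical value of F(2, df2).

    Closed form valid ONLY for numerator degrees of freedom = 2,
    where P(F > x) = (1 + 2x/df2) ** (-df2/2).
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

critical = f_critical_df1_2(n - k)
reject = f_value > critical
print(f"F = {f_value:.2f}, critical = {critical:.2f}, reject H0: {reject}")
```

Run on the data above, this should report an F value of about 0.55 and a critical value of about 3.89, so the null hypothesis is not rejected. For other numerator degrees of freedom, a statistics library or an F table is needed for the critical value.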