There are three assumptions for ANOVA:
- Normality – The responses for each factor level have a normal population distribution.
- Equal variances (Homogeneity of Variance) – These distributions have the same variance.
- Independence – The data are independent.
You can use R to test the assumptions of normality and equal variances (the two tutorials below show how). Independence, in contrast, is judged from the study design.
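Although the tutorials use R, the same two checks can be sketched in Python. This is a minimal illustration on simulated data, assuming numpy and scipy are available: Shapiro-Wilk for normality of the residuals and Levene's test for equal variances.

```python
import numpy as np
from scipy import stats

# Simulated example: three groups drawn from normal distributions
# with equal variance, so both assumptions should hold here.
rng = np.random.default_rng(42)
groups = [rng.normal(loc=m, scale=2.0, size=30) for m in (5.0, 5.5, 6.0)]

# Normality: Shapiro-Wilk on the residuals x_ij - xbar_i
# (center each group by its own mean before pooling).
residuals = np.concatenate([g - g.mean() for g in groups])
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Equal variances: Levene's test across the groups.
levene_stat, levene_p = stats.levene(*groups)

print(f"Shapiro-Wilk p = {shapiro_p:.3f}")  # large p: no evidence against normality
print(f"Levene p = {levene_p:.3f}")         # large p: no evidence against equal variances
```

Note that the normality test is applied to the pooled group-centered residuals, not to the raw pooled data; the reason is discussed below.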
Below, I provide a theoretical discussion of the ANOVA assumptions, focusing on why the normality assumption is needed.
Short Discussion: Why Normality Assumption for ANOVA
The following is the F-test formula that you will often see for one-way ANOVA.
\[ \frac{MSB}{MSE}=\frac{\frac{SSB}{k-1}}{\frac{SSE}{n-k}}=\frac{\frac{\sum_{i=1}^kn_i(\bar{x_i}-\bar{x})^2}{k-1}}{\frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})^2}{n-k}} \sim F(k-1,n-k)\]
It can be rewritten as follows.
\[ \frac{ \frac{Q_B}{k-1}}{ \frac{Q_E}{n-k}} \sim F(k-1,n-k) \]
Where,
\( Q_B \) and \( Q_E \), once divided by \( \sigma^2 \), follow chi-square distributions with \( k-1 \) and \( n-k \) degrees of freedom, respectively.
Finally, a chi-square distribution with \( r \) degrees of freedom is the distribution of a sum of \( r \) squared independent standard normal variables, and that is the connection between the normality assumption and the ANOVA test.
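This connection is easy to check empirically with a short simulation (a sketch assuming numpy is available): summing the squares of \( r \) independent standard normals yields draws from a chi-square distribution with \( r \) degrees of freedom, whose mean is \( r \) and variance is \( 2r \).

```python
import numpy as np

# Empirical check: the sum of r squared independent standard normals
# is a chi-square(r) draw; chi-square(r) has mean r and variance 2r.
rng = np.random.default_rng(0)
r = 3
z = rng.standard_normal((200_000, r))
q = (z ** 2).sum(axis=1)  # each entry is one chi-square(r) draw

print(q.mean())  # close to 3
print(q.var())   # close to 6
```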
Theoretical Background: Cochran’s Theorem
However, to fully understand why ANOVA needs the normality assumption, we need a basic idea of Cochran's Theorem.
Let \( X_1, X_2, \ldots, X_n \) be independent \( N(0, \sigma^2) \) distributed random variables, and suppose that:
\[ \sum_{i=1}^n X_i^2 = Q_1 + Q_2 + \cdots + Q_k \]
Where, \( Q_1, Q_2, \ldots, Q_k \) are positive semi-definite quadratic forms in the random variables \( X_1, X_2, \ldots, X_n \), that is,
\[ Q_i = X' A_i X, \quad i = 1, 2, \ldots, k \]
Set Rank \( A_i = r_i, \; i = 1, 2, \ldots, k \).
If
\[ r_1 + r_2 + … + r_k = n \]
then,
- \( Q_1, Q_2, \ldots, Q_k \) are independent.
- \( Q_i \sim \sigma^2 \chi ^2 (r_i) \)
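A concrete instance of the theorem, checked by simulation (a sketch assuming numpy, with \( \sigma = 1 \)): for \( X_1, \ldots, X_n \) iid \( N(0,1) \), the identity \( \sum X_i^2 = n\bar{X}^2 + \sum (X_i - \bar{X})^2 \) splits the total into a rank-1 and a rank-\( (n-1) \) quadratic form, which are independent chi-square variables with 1 and \( n-1 \) degrees of freedom.

```python
import numpy as np

# Cochran's theorem instance with sigma = 1:
#   Q_1 = n * Xbar^2            (rank-1 quadratic form)  ~ chi-square(1)
#   Q_2 = sum (X_i - Xbar)^2    (rank n-1 quadratic form) ~ chi-square(n-1)
rng = np.random.default_rng(1)
n, reps = 5, 200_000
x = rng.standard_normal((reps, n))
xbar = x.mean(axis=1)
q1 = n * xbar ** 2
q2 = ((x - xbar[:, None]) ** 2).sum(axis=1)

# The decomposition holds exactly for every sample, and the empirical
# means match the chi-square degrees of freedom.
print(np.allclose(q1 + q2, (x ** 2).sum(axis=1)))  # True
print(q1.mean())  # close to 1
print(q2.mean())  # close to 4 (= n - 1)
```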
Theoretical Background: Assumptions for One-way ANOVA
With Cochran's Theorem in hand, we can further discuss the theoretical background of the assumptions for one-way ANOVA.
Let’s assume that we have a data sample \( x_{ij} \), where
- \( i \) is from 1 to \( k \), namely we have \( k \) groups.
- \( j \) is from 1 to \( n_i \), such that each \( i \) group has \( n_i \) observations.
Thus, we can get the total sample size as follows.
\[ n=\sum_{i=1}^k n_i \]
Assume that the \( i^{th} \) group follows the normal distribution \( N(b_i, \sigma^2) \). Here, we are assuming that all \( k \) groups are independent and have the same variance.
Side Note:
Here, you see why ANOVA has the assumptions of equal variances and independence.
Thus, the null hypothesis for one-way ANOVA is:
\[ H_0 : b_1 = b_2 = \cdots = b_k \]
We can rewrite \( b_i \) as follows.
\[ b_i = \mu +a_i \]
where,
\[ \mu = \frac{1}{n} \sum_{i=1}^k n_i b_i \]
\[ a_i = b_i - \mu \]
\( a_i \) is the difference between the \( i^{th} \) group mean and the overall mean \( \mu \). Since \( \sum_{i=1}^k n_i b_i = n\mu \) by the definition of \( \mu \), we also get the following.
\[ \sum_{i=1}^k n_i a_i =0 \]
With the notation above, we can write the following.
\[ x_{ij} = \mu +a_i + \epsilon_{ij} \]
Where,
\[ \epsilon_{ij} \sim N(0, \sigma^2) \]
Side Note:
Here, you see why there is a normality assumption for ANOVA. Further, you should also be aware that the normality test is performed on the residuals \( \epsilon_{ij} =x_{ij}-\bar{x_i} \) rather than on the original data \(x_{ij} \).
We can then use least squares to estimate the parameters.
\[ \sum_{i=1}^k \sum_{j=1}^{n_i} \epsilon_{ij} ^2 = \sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\mu-a_i)^2 \]
Thus, the least squares estimates are
\[ \hat{\mu} =\bar{x}\]
\[ \hat{a_i} =\bar{x_i}-\bar{x}\]
where,
\[ \bar{x} =\frac{1}{n}\sum_{i=1}^k \sum_{j=1}^{n_i} x_{ij} \]
\[ \bar{x_i} =\frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij} \]
Thus, we can get the following.
\[ \sum_{i=1}^k \sum_{j=1}^{n_i} \epsilon_{ij} ^2 = \sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\mu-a_i)^2=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x_i})^2 \]
Thus, we can get SSE as follows.
\[ SSE=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x_i})^2 \]
We now go back to the original idea of decomposing the variation, starting with the total sum of squares \( Q_T \).
\[ Q_T=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2 \]
We can expand \( Q_T \) into two components, the between-groups sum of squares (SSB) and the within-groups sum of squares (SSE); the cross term in the expansion vanishes because \( \sum_{j=1}^{n_i}(x_{ij}-\bar{x_i})=0 \) within each group.
\[ \begin{aligned} Q_T &=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2 \\ &= \sum_{i=1}^k \sum_{j=1}^{n_i} [(x_{ij}-\bar{x_i})+(\bar{x_i}-\bar{x})]^2 \\ &=\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x_i})^2 +\sum_{i=1}^k \sum_{j=1}^{n_i} (\bar{x_i}-\bar{x})^2 \\ &= \sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij}-\bar{x_i})^2 + \sum_{i=1}^k n_i (\bar{x_i}-\bar{x})^2 \\ &= SSE + SSB \\ &= Q_E +Q_B \end{aligned} \]
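The decomposition \( Q_T = SSE + SSB \) is a purely algebraic identity, so it holds for any data, normal or not. A quick numerical check (a sketch assuming numpy, on arbitrary simulated data):

```python
import numpy as np

# Verify Q_T = SSE + SSB on arbitrary (non-normal) data with
# unequal group sizes; the identity is algebraic, not distributional.
rng = np.random.default_rng(7)
groups = [rng.uniform(0, 10, size=n_i) for n_i in (4, 6, 5)]

x_all = np.concatenate(groups)
grand_mean = x_all.mean()

q_t = ((x_all - grand_mean) ** 2).sum()
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

print(np.isclose(q_t, sse + ssb))  # True
```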
Based on the Fisher–Cochran theorem, under the null hypothesis we can get the following.
\[ Q_E \sim \sigma^2 \chi ^2 (n-k) \]
\[ Q_B \sim \sigma^2 \chi ^2 (k-1) \]
Given that these two quantities follow (scaled) chi-square distributions and are independent, we can write the following.
\[ \frac{MSB}{MSE}=\frac{\frac{SSB}{k-1}}{\frac{SSE}{n-k}} = \frac{\frac{Q_B}{k-1}}{\frac{Q_E}{n-k}} \sim F(k-1, n-k)\]
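To make the formula concrete, here is a sketch (assuming numpy and scipy) that computes the F statistic from the SSB/SSE formulas above and checks it against `scipy.stats.f_oneway` on the same simulated data:

```python
import numpy as np
from scipy import stats

# One-way ANOVA F statistic computed "by hand" from SSB and SSE,
# then compared against scipy's built-in implementation.
rng = np.random.default_rng(3)
groups = [rng.normal(loc=m, scale=1.0, size=12) for m in (0.0, 0.5, 1.0)]
k = len(groups)
n = sum(len(g) for g in groups)

x_all = np.concatenate(groups)
grand_mean = x_all.mean()
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_manual = (ssb / (k - 1)) / (sse / (n - k))         # MSB / MSE
p_manual = stats.f.sf(f_manual, k - 1, n - k)        # upper tail of F(k-1, n-k)

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"manual: F = {f_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  F = {f_scipy:.4f}, p = {p_scipy:.4f}")
```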
Side Note:
(1) Here, you can see the connection between one-way ANOVA and Cochran's theorem. That is, under the null hypothesis (all group means are equal), \( x_{ij}-\bar{x} \sim N(0, \sigma^2)\). (However, you can NOT test whether \( x_{ij}-\bar{x} \sim N(0, \sigma^2)\), as this holds only under the null hypothesis. If the group means are actually not equal, \( x_{ij}-\bar{x} \) does not follow \( N(0, \sigma^2)\).)
(2) Further, if the residuals \( x_{ij}-\bar{x_i} \) are normally distributed, then \(\bar{x_i}-\bar{x} \) is normally distributed as well. Thus, you only need to test whether the residuals \( \epsilon_{ij} =x_{ij}-\bar{x_i} \) are normally distributed. There is a discussion on Stack Exchange about this; the link is in the references below.
References
- ANOVA assumption normality / normal distribution of residuals (stats.stackexchange.com)
- ANOVA Assumptions (Penn State website)
- ANOVA Assumptions (slides from the University of Alberta)