How to Calculate Mean Squared Deviation in R

Mean Squared Deviation (MSD) is often used synonymously with Mean Squared Error (MSE). MSD measures how far a model's estimated values are from the observed values.

MSD can be computed in two ways: unbiased and biased. Both are valid; they differ only in the denominator. In the formulas below, SSR is the sum of squared residuals.

Method 1: Unbiased MSD

\[ MSD = \frac{SSR}{n-p-1} = \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n-p-1} \]

Method 2: Biased MSD

\[ MSD = \frac{SSR}{n} = \frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n} \]

Why and when to distinguish biased and unbiased MSD?

n-p-1 is the residual degrees of freedom of the model. Dividing SSR by n gives a biased estimate of the error variance, whereas dividing by n-p-1 gives an unbiased one (see the discussions on Stack Exchange).

When n is large relative to p, unbiased MSD and biased MSD give very similar results. You can use either one, as long as you report to your readers which one you used.
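To see how quickly the two versions converge, note that the unbiased MSD equals the biased MSD multiplied by n/(n-p-1). The sketch below computes that ratio for a model with one predictor (p = 1); the sample sizes are illustrative values chosen for this example, not taken from a real dataset.

```r
# Ratio of unbiased MSD to biased MSD: n / (n - p - 1)
# p and the sample sizes below are illustrative assumptions
p <- 1
n <- c(32, 100, 1000, 10000)

ratio <- n / (n - p - 1)

# the ratio is always above 1 and shrinks toward 1 as n grows
data.frame(n = n, ratio = round(ratio, 4))
```

With only 32 observations the unbiased value is several percent larger than the biased one, while with thousands of observations the difference is negligible.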

p stands for the number of parameters you estimate in the model, excluding the intercept. If the model has an intercept only, p is zero and the unbiased denominator becomes n-1.
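As a quick check of this bookkeeping, the sketch below counts n and p for a simple regression on the built-in mtcars data and compares n-p-1 with the residual degrees of freedom that lm() stores. It assumes the model includes an intercept and that no coefficient is dropped because of collinearity.

```r
# simple regression with one predictor on the built-in mtcars data
fit <- lm(mpg ~ hp, data = mtcars)

n <- nobs(fit)               # number of observations used in the fit
p <- length(coef(fit)) - 1   # estimated coefficients, excluding the intercept

# n - p - 1 should match the residual degrees of freedom stored by lm()
c(n = n, p = p, df = n - p - 1)
fit$df.residual
```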

How to Calculate MSD in R

R can be used to calculate MSD. The following is the core syntax for both biased MSD and unbiased MSD.

Method 1: Unbiased MSD

sum(residuals(fit)^2)/fit$df.residual

Method 2: Biased MSD

mean(residuals(fit)^2)
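If you need both versions repeatedly, the two expressions can be wrapped in a small helper. The function name msd() and its unbiased argument are hypothetical names chosen for this sketch; they are not part of base R or any package.

```r
# msd() is a hypothetical helper, not a base R function
msd <- function(fit, unbiased = TRUE) {
  res <- residuals(fit)
  if (unbiased) {
    # divide the sum of squared residuals by the residual degrees of freedom
    sum(res^2) / df.residual(fit)
  } else {
    # divide the sum of squared residuals by n
    mean(res^2)
  }
}

# usage with a fitted linear model
fit <- lm(mpg ~ hp, data = mtcars)
msd(fit)
msd(fit, unbiased = FALSE)
```

The helper relies only on residuals() and df.residual(), so it should also work for other model objects that provide both methods, although that is not tested here.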

The following two examples show how to calculate unbiased and biased MSD, respectively, for a linear regression model in R.

Example 1: Unbiased MSD

mtcars is a built-in sample dataset in R. We fit a linear regression model with mpg as the dependent variable (DV) and hp as the independent variable (IV), using lm() to estimate the regression coefficients.

After getting the fit, we use sum(residuals(fit)^2)/fit$df.residual to calculate the unbiased MSD.

# use lm() to estimate regression coefficients
fit <- lm(mpg ~ hp, data = mtcars)

# calculate unbiased Mean Squared Deviation (MSD)
sum(residuals(fit)^2) / fit$df.residual

Output:

[1] 14.92248

Thus, the unbiased Mean Squared Deviation (MSD) for the regression model is 14.92.

Example 2: Biased MSD

Using the same model, we use mean(residuals(fit)^2) to calculate the biased MSD.

# use lm() to estimate regression coefficients
fit <- lm(mpg ~ hp, data = mtcars)

# calculate biased Mean Squared Deviation (MSD)
mean(residuals(fit)^2)

Output:

[1] 13.98982

Thus, the biased Mean Squared Deviation (MSD) for the regression model is 13.99. It is smaller than the unbiased value because SSR is divided by n = 32 rather than by n-p-1 = 30.
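As a sanity check, the two results can be related to each other and to the residual standard error that R reports. The sketch below refits the same model so that it runs on its own; it relies on sigma(), which is available in the stats package from R 3.3.0 onward.

```r
# refit the model so this block is self-contained
fit <- lm(mpg ~ hp, data = mtcars)

unbiased <- sum(residuals(fit)^2) / fit$df.residual
biased   <- mean(residuals(fit)^2)
n        <- nobs(fit)

# rescaling the biased MSD by n / (n - p - 1) recovers the unbiased MSD
all.equal(unbiased, biased * n / fit$df.residual)   # TRUE

# the unbiased MSD is the square of the residual standard error
all.equal(unbiased, sigma(fit)^2)                   # TRUE
```

The second check means the unbiased MSD can also be read off the "Residual standard error" line of summary(fit) by squaring it.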

