MSE stands for Mean Squared Error. It measures how far a model's estimated values of Y (the dependent variable, DV) are from the observed values of Y.
This tutorial shows how to calculate biased and unbiased MSE in Python using 4 examples.
Biased MSE and unbiased MSE
The following are the formulas for biased MSE and unbiased MSE.
Biased MSE
\[ MSE=\frac{\sum_{i=1}^{n} (\hat{y_i}-y_i)^2 }{n}\]
Unbiased MSE
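\[ MSE=\frac{\sum_{i=1}^{n} (\hat{y_i}-y_i)^2 }{n-p-1}\]
Here n is the number of observations and p is the number of predictors in the model (the extra 1 accounts for the intercept), so n-p-1 is the residual degrees of freedom.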
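Both formulas share the same numerator (the sum of squared errors), so the unbiased MSE is the biased MSE rescaled. As a short derivation, with n observations and p predictors, and with the subscripts added here only as labels to tell the two quantities apart:
\[ MSE_{unbiased}=\frac{\sum_{i=1}^{n} (\hat{y_i}-y_i)^2 }{n-p-1}=\frac{n}{n-p-1}\cdot\frac{\sum_{i=1}^{n} (\hat{y_i}-y_i)^2 }{n}=\frac{n}{n-p-1}\, MSE_{biased}\]
This is why a function that returns only the biased MSE can still be used to get the unbiased one: multiply its result by n/(n-p-1).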
How to Calculate MSE in Python
Method 1: Use Python Numpy
- Biased MSE: np.square(np.subtract(Y_Observed,Y_Estimated)).mean()
- Unbiased MSE: sum(np.square(np.subtract(Y_Observed,Y_Estimated)))/(n-p-1)
Method 2: Use sklearn.metrics
- Biased MSE: mean_squared_error(Y_Observed,Y_Estimated)
- Unbiased MSE: (n/(n-p-1))*mean_squared_error(Y_Observed,Y_Estimated)
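The one-liners above can be wrapped in small reusable functions. The following is a minimal sketch, not a library API: the names biased_mse and unbiased_mse are hypothetical, and p is assumed to be the number of predictors excluding the intercept.

```python
import numpy as np

def biased_mse(y_observed, y_estimated):
    """Sum of squared errors divided by n (hypothetical helper)."""
    residuals = np.asarray(y_observed, dtype=float) - np.asarray(y_estimated, dtype=float)
    return float(np.mean(residuals ** 2))

def unbiased_mse(y_observed, y_estimated, p):
    """Sum of squared errors divided by n - p - 1 (hypothetical helper).

    p is assumed to be the number of predictors, excluding the intercept.
    """
    residuals = np.asarray(y_observed, dtype=float) - np.asarray(y_estimated, dtype=float)
    n = residuals.size
    dof = n - p - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("n - p - 1 must be positive")
    return float(np.sum(residuals ** 2) / dof)
```

The guard on the degrees of freedom is a design choice for this sketch: with too few observations the denominator would be zero or negative, and the result would be meaningless.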
Example 1: Use Numpy for biased MSE
The following Python code calculates biased MSE using Numpy.
import numpy as np
# Observed values
Y_Observed = [5,4,3,5,1,4,5]
# Estimated values
Y_Estimated = [4.4,5.2,2.5,4.5,2,4,4.5]
# Use Numpy to calculate biased Mean Squared Error (MSE)
np.square(np.subtract(Y_Observed,Y_Estimated)).mean()
Output:
0.5071428571428571
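To see where this number comes from, the calculation can be broken into steps. The sketch below reuses the data from Example 1; the intermediate variable names (residuals, squared_errors, sse) are introduced here for illustration.

```python
import numpy as np

Y_Observed = [5, 4, 3, 5, 1, 4, 5]
Y_Estimated = [4.4, 5.2, 2.5, 4.5, 2, 4, 4.5]

# Residuals: observed minus estimated, one per observation
residuals = np.subtract(Y_Observed, Y_Estimated)

# Squared errors (rounded only for display)
squared_errors = np.square(residuals)
print(np.round(squared_errors, 2))

# Sum of squared errors, then divide by the number of observations
sse = squared_errors.sum()
n = len(Y_Observed)
biased_mse = sse / n
print(biased_mse)
```

The squared errors add up to about 3.55, and 3.55/7 is approximately 0.5071, which matches the output above.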
Example 2: Use Numpy for unbiased MSE
Suppose the model estimates only 1 parameter besides the intercept (p = 1). With n = 7 observations, the degrees of freedom are 7-1-1=5. The following Python code calculates unbiased MSE using Numpy.
import numpy as np
# Observed values
Y_Observed = [5,4,3,5,1,4,5]
# Estimated values
Y_Estimated = [4.4,5.2,2.5,4.5,2,4,4.5]
# Use Numpy to calculate unbiased Mean Squared Error (MSE)
sum(np.square(np.subtract(Y_Observed,Y_Estimated)))/(7-1-1)
Output:
0.71
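In this example, p = 1 is simply assumed. As an illustration of where n and p come from in practice, the sketch below fits a straight line with np.polyfit and then applies the unbiased formula. The X values are hypothetical (they do not appear in this tutorial), so the fitted values and the resulting MSE differ from the Y_Estimated values used above.

```python
import numpy as np

# Hypothetical predictor values, assumed for illustration only
X = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
Y_Observed = np.array([5, 4, 3, 5, 1, 4, 5], dtype=float)

# Fit Y = intercept + slope * X by ordinary least squares
slope, intercept = np.polyfit(X, Y_Observed, deg=1)
Y_Fitted = intercept + slope * X

n = Y_Observed.size   # number of observations
p = 1                 # one predictor (X); the intercept is the extra 1
dof = n - p - 1       # residual degrees of freedom

sse = np.sum((Y_Observed - Y_Fitted) ** 2)
biased_mse = sse / n
unbiased_mse = sse / dof
print(biased_mse, unbiased_mse)
```

Because the denominator n-p-1 is smaller than n, the unbiased MSE is always larger than the biased MSE whenever the sum of squared errors is positive.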
Example 3: Use sklearn.metrics for biased MSE
The following Python code uses mean_squared_error from sklearn.metrics to calculate biased MSE.
from sklearn.metrics import mean_squared_error
import numpy as np
# Observed values
Y_Observed = [5,4,3,5,1,4,5]
# Estimated values
Y_Estimated = [4.4,5.2,2.5,4.5,2,4,4.5]
# Use mean_squared_error from sklearn.metrics to calculate biased MSE
mean_squared_error(Y_Observed,Y_Estimated)
Output:
0.5071428571428571
Example 4: Use sklearn.metrics for unbiased MSE
Suppose the model estimates only 1 parameter besides the intercept (p = 1). With n = 7 observations, the degrees of freedom are 7-1-1=5. The following Python code uses mean_squared_error to calculate unbiased MSE.
from sklearn.metrics import mean_squared_error
import numpy as np
# Observed values
Y_Observed = [5,4,3,5,1,4,5]
# Estimated values
Y_Estimated = [4.4,5.2,2.5,4.5,2,4,4.5]
# Use mean_squared_error from sklearn.metrics to calculate unbiased MSE
(7/5)*mean_squared_error(Y_Observed,Y_Estimated)
Output:
0.71
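The Numpy and sklearn.metrics approaches should agree, but floating-point arithmetic can make the last printed digits differ slightly between the two. A minimal sketch of a cross-check, assuming p = 1 as in the examples above and comparing with np.isclose rather than exact equality:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

Y_Observed = [5, 4, 3, 5, 1, 4, 5]
Y_Estimated = [4.4, 5.2, 2.5, 4.5, 2, 4, 4.5]

n = len(Y_Observed)  # number of observations
p = 1                # assumed number of predictors, as in the examples

# Unbiased MSE computed directly with Numpy
numpy_unbiased = np.sum(np.square(np.subtract(Y_Observed, Y_Estimated))) / (n - p - 1)

# Unbiased MSE obtained by rescaling sklearn's biased MSE
sklearn_unbiased = (n / (n - p - 1)) * mean_squared_error(Y_Observed, Y_Estimated)

# Compare with a tolerance instead of ==
print(np.isclose(numpy_unbiased, sklearn_unbiased))
```

Deriving n from the data instead of hard-coding 7 and 5 also keeps the calculation correct if the lists change length.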
Reference
- Errors and residuals (Wikipedia)
- Mean squared error and the residual sum of squares function (Stack Exchange)