Maximum Likelihood Estimation for Linear Regression in R

To use maximum likelihood estimation (MLE) for linear regression in R, you can use either the optim() function or the mle() function from the stats4 package. The following provides detailed R code examples.

Data and Model

Suppose we want to test the effect of teaching method (new vs. old) on students’ test scores. The data are shown below. In the R code, 0 represents ‘Old’ and 1 represents ‘New’ for Teaching_Method.

Student ID   Teaching Method   Test Score
 1           Old                90
 2           Old                90
 3           New               100
 4           Old                70
 5           New                90
 6           New                95
 7           New                75
 8           New                75
 9           Old                60
10           Old                65
11           Old                65
12           New                70

Data Example for Linear Regression using MLE in R


The following is the model statement.

Test Score = β0 + β1 * Teaching Method + ϵ

where:

  • β0 and β1 are the coefficients representing the intercept and the effect of the teaching method, respectively.
  • ϵ is the error term.

Note that we typically assume ϵ follows a normal distribution with mean 0 and variance σ², namely N(0, σ²). We can write this as follows.

ϵ ∼ N(0, σ²)

Based on that, we can get the following.

Test Score ∼ N(β0 + β1 * Teaching Method, σ²)
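
This distributional form is what both methods below rely on: the log likelihood of the sample is the sum of normal log densities, one per student, and maximizing it is equivalent to minimizing the negative log likelihood. As a small sketch (using illustrative trial values β0 = 70, β1 = 10, σ = 12, which are arbitrary rather than estimates), that quantity can be computed with dnorm().

# a sketch: negative log likelihood at illustrative trial values
x <- c(0,0,1,0,1,1,1,1,0,0,0,1)              # Teaching_Method
y <- c(90,90,100,70,90,95,75,75,60,65,65,70) # Test_Score
-sum(dnorm(y, mean = 70 + 10 * x, sd = 12, log = TRUE))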


Method 1: optim()

In Method 1, we first define a function that calculates the negative log likelihood. Then, we use the optim() function to minimize it.


# input data for x and y
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

# Define the negative log likelihood using the dnorm() function
lm_MLE <- function(par, y, x){
  
  beta_0 <- par[1]
  beta_1 <- par[2]
  sigma <- par[3]
  
  -sum(dnorm(y, mean = beta_0 + beta_1 * x, sd = sigma, log = TRUE))
}
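
# quick check (an optional sketch): evaluating the function once at the
# starting values should return a finite number before we hand it to optim()
lm_MLE(par = c(beta_0 = 1, beta_1 = 1, sigma = 19), y = Test_Score, x = Teaching_Method)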

# use optim() for MLE in the linear regression model
# (optim() takes the starting values first, then the function to minimize)
result_0 <- optim(par = c(beta_0 = 1, beta_1 = 1, sigma = 19),
                  fn = lm_MLE, y = Test_Score, x = Teaching_Method)

# print out the estimated parameters
result_0$par

# value output from optim()
result_0$value

# Explain the meaning of "value" from optim() output
-sum(dnorm(Test_Score, result_0$par["beta_0"]+result_0$par["beta_1"]*Teaching_Method,  result_0$par["sigma"], log = TRUE))

# print out the convergence status: 0 indicates successful completion
result_0$convergence

The following is the output, including the estimated parameters, the calculated value, and the convergence status. The value from optim() is the minimized objective, i.e., the negative log likelihood defined in lm_MLE() evaluated at the estimated parameters. The convergence code indicates whether the iterations converged, and 0 indicates successful completion.

# print out the estimated parameters
> result_0$par
  beta_0   beta_1    sigma 
73.33608 10.82952 11.73821 
 
# value output from optim()
> result_0$value
[1] 46.58414

# Explain the meaning of "value" from optim() output
> -sum(dnorm(Test_Score, result_0$par["beta_0"]+result_0$par["beta_1"]*Teaching_Method,  result_0$par["sigma"], log = TRUE))
[1] 46.58414
 
# print out the convergence status: 0 indicates successful completion
> result_0$convergence
[1] 0
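
As a quick sanity check, for a normal linear model the MLE of β0 and β1 coincides with ordinary least squares, and the MLE of σ is the residual standard deviation computed with a divisor of n rather than n − 2. The following sketch, which assumes the data vectors defined above, compares the optim() results with lm().

# sanity check (optional): compare with ordinary least squares via lm()
fit_ols <- lm(Test_Score ~ Teaching_Method)
coef(fit_ols)                    # should be close to beta_0 and beta_1 above
sqrt(mean(residuals(fit_ols)^2)) # MLE-style sigma (divides by n, not n - 2)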

Method 2: mle()

We can also use mle() from the stats4 package to estimate the regression coefficients and the residual variance. The R code includes two parts.

First, define the negative log likelihood using function(). Second, pass it to the mle() function. Because mle() estimates σ rather than σ², we manually square the estimated sigma to obtain the estimated variance.

# input data for x and y
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

# Define the negative log likelihood using the dnorm() function
LL <- function(beta_0, beta_1, sigma){
  -sum(dnorm(Test_Score, mean = beta_0 + beta_1 * Teaching_Method, sd = sigma, log = TRUE))
}

# use mle() function to calculate regression coefficients 
library(stats4)
result_1 <- mle(LL, start = list(beta_0 = 1, beta_1=1, sigma = 19))
summary(result_1)

# calculate sigma squared
(result_1@coef[3])^2
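
As an aside, stats4 also provides the standard accessors for the fitted mle object; the following is a brief optional sketch.

# optional accessors for the fitted mle object
vcov(result_1)    # variance-covariance matrix of the estimates
logLik(result_1)  # maximized log likelihood; -2 times this is the "-2 log L" in summary()
confint(result_1) # profile-likelihood confidence intervals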

We can see that, for linear regression, optim() and mle() produce similar results. In particular, when using optim() (i.e., in Method 1), the estimated parameters are β0 (73.34), β1 (10.83), and σ (11.74). When using mle(), the estimated parameters are β0 (73.43), β1 (10.30), and σ (11.93); see below. The small remaining differences come from the numerical optimizers rather than the model: by default, optim() uses Nelder-Mead while mle() uses BFGS, and each stops at its own tolerance.

> summary(result_1)
Maximum likelihood estimation

Call:
mle(minuslogl = LL, start = list(beta_0 = 1, beta_1 = 1, sigma = 19))

Coefficients:
       Estimate Std. Error
beta_0 73.42516   4.871012
beta_1 10.30336   6.892004
sigma  11.93112   2.495832

-2 log L: 93.18286 

> # calculate sigma squared
> (result_1@coef[3])^2
   sigma 
142.3517 
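
If you prefer to stay with optim(), you can also recover approximate standard errors comparable to mle()'s Std. Error column: ask optim() for the numerical Hessian of the negative log likelihood and invert it. The following is a sketch that assumes the lm_MLE() function and data from Method 1.

# optional: approximate standard errors from optim() via the numerical Hessian
result_h <- optim(par = c(beta_0 = 1, beta_1 = 1, sigma = 19),
                  fn = lm_MLE, y = Test_Score, x = Teaching_Method,
                  hessian = TRUE)
sqrt(diag(solve(result_h$hessian)))  # approximate standard errors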

Further Reading

If you have questions regarding how to derive the normal distribution for Y in MLE, you can refer to my following two tutorials.

Further, you can also refer to the following two pages. The first one is about the optim() function and the second one provides a tutorial for it.
