# Maximum Likelihood Estimation for Linear Regression in R

To perform maximum likelihood estimation (MLE) for linear regression in R, you can use either the optim() function or the mle() function from the stats4 package. The following sections provide detailed R code examples.

## Data and Model

Suppose we want to test the effect of teaching method (new vs. old) on students' test scores. The data appear in the R code below, where 0 represents 'old' and 1 represents 'new' for Teaching_Method.

The following is the model statement.

Test Score = β0 + β1 * Teaching Method + ϵ

where:

• β0 and β1 are the coefficients representing the intercept and the effect of the teaching method, respectively.
• ϵ is the error term.

Note that we typically assume ϵ follows a normal distribution with mean 0 and variance σ², written as follows.

ϵ ∼ N(0, σ²)

It follows that the outcome is also normally distributed.

Test Score ∼ N(β0 + β1 * Teaching Method, σ²)
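
In R, dnorm() with log = TRUE returns the log of this normal density, which is the building block of the likelihood code in both methods below. As a quick sketch (the numbers here are arbitrary illustration values, not estimates):

```r
# dnorm(y, mu, sigma, log = TRUE) equals the normal log-density formula
y <- 90; mu <- 73; sigma <- 12   # illustrative values only

manual <- -0.5 * log(2 * pi * sigma^2) - (y - mu)^2 / (2 * sigma^2)
all.equal(manual, dnorm(y, mu, sigma, log = TRUE))  # TRUE
```

Summing these log-density terms over all observations gives the log-likelihood; both methods below minimize its negative.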

## Method 1: optim()

In Method 1, we first define a function that calculates the negative log-likelihood. Then we use the optim() function to minimize it.

```
# input data for x and y
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

# Define the negative log-likelihood using the dnorm() function
lm_MLE <- function(par, y, x){
  beta_0 <- par[1]
  beta_1 <- par[2]
  sigma  <- par[3]

  -sum(dnorm(y, mean = beta_0 + beta_1 * x, sd = sigma, log = TRUE))
}

# use optim() for MLE in the linear regression model
result_0 <- optim(par = c(beta_0 = 1, beta_1 = 1, sigma = 19),
                  fn = lm_MLE, y = Test_Score, x = Teaching_Method)

# print out the estimated parameters
result_0$par

# value output from optim()
result_0$value

# "value" is the negative log-likelihood evaluated at the estimates
-sum(dnorm(Test_Score,
           result_0$par["beta_0"] + result_0$par["beta_1"] * Teaching_Method,
           result_0$par["sigma"], log = TRUE))

# check convergence: 0 indicates successful completion
result_0$convergence
```

The following is the output, including the estimated parameters, the objective value, and the convergence status. The "value" from optim() is the final value of the function being minimized, i.e., the negative log-likelihood at the estimates. The "convergence" code indicates whether the iterations converged; 0 indicates successful completion.

```
# print out the estimated parameters
> result_0$par
  beta_0   beta_1    sigma
73.33608 10.82952 11.73821

# value output from optim()
> result_0$value
[1] 46.58414

# "value" is the negative log-likelihood evaluated at the estimates
> -sum(dnorm(Test_Score, result_0$par["beta_0"] + result_0$par["beta_1"] * Teaching_Method, result_0$par["sigma"], log = TRUE))
[1] 46.58414

# check convergence: 0 indicates successful completion
> result_0$convergence
[1] 0
```
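
As a sanity check, the MLE point estimates of the coefficients should match ordinary least squares, since the two coincide for a normal linear model. The following sketch compares against lm() using the same data:

```r
# same data as above
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

# OLS fit for comparison with the optim() estimates
fit_lm <- lm(Test_Score ~ Teaching_Method)
coef(fit_lm)
# (Intercept) ~ 73.33 and Teaching_Method ~ 10.83, matching optim() above
```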

## Method 2: mle()

We can also use mle() from the stats4 package to estimate the regression coefficients and the residual standard deviation. The R code has two parts: first, define the negative log-likelihood with function(); second, pass it to mle(). To get the estimated variance, we square the estimate of sigma manually.

```
# input data for x and y
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

# Define the negative log-likelihood using the dnorm() function
LL <- function(beta_0, beta_1, sigma){
  -sum(dnorm(Test_Score, beta_0 + beta_1 * Teaching_Method, sigma, log = TRUE))
}

# use the mle() function to estimate the regression coefficients
library(stats4)
result_1 <- mle(LL, start = list(beta_0 = 1, beta_1 = 1, sigma = 19))
summary(result_1)

# calculate sigma squared
(result_1@coef[3])^2
```

We can see that for linear regression, optim() and mle() produce similar results. With optim() (Method 1), the estimates are β0 = 73.34, β1 = 10.83, and σ = 11.74. With mle() (Method 2), the estimates are β0 = 73.43, β1 = 10.30, and σ = 11.93; see below. The small differences arise because the two calls use different optimizers (optim() defaults to Nelder-Mead, while mle() uses BFGS) and stop at slightly different points.

```
> summary(result_1)
Maximum likelihood estimation

Call:
mle(minuslogl = LL, start = list(beta_0 = 1, beta_1 = 1, sigma = 19))

Coefficients:
       Estimate Std. Error
beta_0 73.42516   4.871012
beta_1 10.30336   6.892004
sigma  11.93112   2.495832

-2 log L: 93.18286

> # calculate sigma squared
> (result_1@coef[3])^2
   sigma
142.3517
```
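
One caveat when comparing the estimated variance with lm(): the MLE of σ² divides the residual sum of squares by n, whereas lm() reports the unbiased estimate that divides by n − 2, so the two will not agree even at the exact optimum. The sketch below re-derives both from lm() residuals (the mle() value of 142.35 above lands between them because BFGS stopped slightly short of the exact optimum):

```r
# same data as above
Teaching_Method <- c(0,0,1,0,1,1,1,1,0,0,0,1)
Test_Score <- c(90,90,100,70,90,95,75,75,60,65,65,70)

fit <- lm(Test_Score ~ Teaching_Method)
n   <- length(Test_Score)
sse <- sum(resid(fit)^2)

sse / n        # exact MLE of sigma^2 (biased): ~137.85
sse / (n - 2)  # lm()'s unbiased estimate:      ~165.42
```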