Consider an i.i.d. sample of size n from a N(μ, σ²) distributed random variable X.
(a) Determine the maximum likelihood estimator for μ under the assumption that σ²= 1.
(b) Now determine the maximum likelihood estimator for μ for an arbitrary σ².
(c) What is the maximum likelihood estimate for σ²?
(a) The probability density function of a normal distribution is
f\left(x\right)=\frac{1}{\sigma \sqrt{2\pi } } \exp\left(-\frac{\left(x-\mu \right)^{2} }{2\sigma^{2} } \right)
with −∞ < x < ∞, −∞ < μ < ∞, σ² > 0. The likelihood function of the i.i.d. sample is therefore the product of the n individual densities,
L\left(x_{1},x_{2}, \ldots , x_{n}\mid \mu ,\sigma ^{2}\right)=\left(\frac{1}{\sqrt{2\pi \sigma ^{2}} } \right)^{n}\exp \left(-\sum\limits_{i=1}^{n}{\frac{\left(x_{i}-\mu \right)^{2}}{2\sigma^{2}} } \right) .
To find the maximum of L\left(x_{1},x_{2}, \ldots , x_{n}\mid \mu ,\sigma ^{2}\right), it is again easier to work with the log-likelihood function, which is
l =\ln L\left(x_{1},x_{2}, \ldots , x_{n}\mid \mu ,\sigma ^{2}\right)= -\frac{n}{2} \ln 2\pi -\frac{n}{2} \ln \sigma^{2}-\sum\limits_{i=1}^{n}{\left(\frac{\left(x_{i}-\mu \right)^{2}}{2\sigma^{2} } \right) }.
Assuming σ² to be 1, differentiating the log-likelihood function with respect to μ, and equating it to zero gives us
\frac{\partial l}{\partial \mu } =\sum\limits_{i=1}^{n}{\left(x_{i}-\mu \right) } =0 \Leftrightarrow n\mu =\sum\limits_{i=1}^{n}{x_{i}}.
The ML estimate is therefore \hat{\mu } =\bar{x}.
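This result can be checked numerically. The following is a minimal sketch (with an arbitrarily chosen true mean of 2 and a fixed random seed, both assumptions for illustration): we evaluate the log-likelihood with σ² = 1 on a grid of candidate values for μ and confirm that the maximizer coincides with the arithmetic mean of the sample.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=2.0, scale=1.0, size=1000)  # i.i.d. N(mu=2, sigma^2=1) sample

# Log-likelihood in mu with sigma^2 = 1; additive constants are dropped
# since they do not affect the location of the maximum
def log_lik(mu, x):
    return -0.5 * np.sum((x - mu) ** 2)

# Evaluate on a fine grid around the sample mean and pick the maximizer
grid = np.linspace(x.mean() - 1, x.mean() + 1, 10001)
mu_hat = grid[np.argmax([log_lik(m, x) for m in grid])]

print(mu_hat, x.mean())  # grid maximizer agrees with the arithmetic mean
```

Up to the grid spacing, the numerical maximizer matches \bar{x}, as the analytic derivation predicts.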
(b) For arbitrary σ², differentiating the log-likelihood with respect to μ gives \frac{\partial l}{\partial \mu } =\frac{1}{\sigma^{2}}\sum\limits_{i=1}^{n}{\left(x_{i}-\mu \right)} =0. Since σ² > 0 cancels out of this equation, the ML estimate of μ is always the arithmetic mean, no matter whether σ² is 1 or any other value.
(c) Differentiating the log-likelihood function from (a) with respect to σ² yields
\frac{\partial l}{\partial \sigma^{2} } =-\frac{n}{2} \frac{1}{\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum\limits_{i=1}^{n}{\left(x_{i}-\mu\right)^{2} } =0.
Plugging in \hat{\mu } =\bar{x} and solving \frac{\partial l}{\partial \sigma^{2} }=0 for σ² yields
\hat{\sigma }^{2} =\frac{1}{n} \sum\limits_{i=1}^{n}{\left(x_{i}-\hat{\mu }\right)^{2} } =\frac{1}{n} \sum\limits_{i=1}^{n}{\left(x_{i}-\bar{x} \right)^{2} }.
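A short numerical check of this estimate (the true parameters μ = 5 and σ² = 4 below are assumptions chosen for illustration): the ML estimate divides the sum of squared deviations by n, which is what NumPy's default (biased) variance computes, and which differs from the unbiased sample variance with divisor n − 1.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=500)  # i.i.d. N(5, 4) sample

mu_hat = x.mean()
sigma2_hat = np.mean((x - mu_hat) ** 2)  # ML estimate: (1/n) * sum of squared deviations

# Identical to np.var with ddof=0; strictly smaller than the
# unbiased sample variance, which divides by n - 1 (ddof=1)
print(sigma2_hat, x.var(ddof=0), x.var(ddof=1))
```

The ML estimator \hat{\sigma }^{2} is therefore biased, underestimating σ² by the factor (n − 1)/n, though the bias vanishes as n grows.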
Since the parameter θ = (μ, σ²) is two-dimensional, one needs to solve two ML equations, where \hat{\mu }_{ML} is estimated first and \hat{σ}^{2}_{ML} second (as we did above). To confirm that these estimates yield a maximum rather than a minimum, one would further have to check that the matrix of second-order derivatives of the log-likelihood is negative definite (equivalently, that the so-called information matrix is positive definite). However, we omit this lengthy and time-consuming task here.