
# Likelihood

Definition 7.5
Let $X_1, X_2, \ldots, X_n$ be a random sample from $f(x; \theta)$ and $x_1, x_2, \ldots, x_n$ the corresponding observed values. The likelihood of the sample is the joint probability function (or the joint probability density function, in the continuous case) evaluated at $x_1, x_2, \ldots, x_n$, and is denoted by $L(\theta; \mathbf{x})$.

Now the notation $L(\theta; \mathbf{x})$ emphasizes that, for a given sample $\mathbf{x}$, the likelihood is a function of $\theta$. Of course

$$L(\theta; \mathbf{x}) = \prod_{i=1}^{n} f(x_i; \theta).$$

The likelihood function is a statistic, depending on the observed sample $\mathbf{x}$. A statistical inference or procedure should be consistent with the assumption that the best explanation of a set of data is provided by $\hat{\theta}$, a value of $\theta$ that maximizes the likelihood function. This value of $\theta$ is called the maximum likelihood estimate (mle). The relationship of a sufficient statistic $T$ for $\theta$ to the mle $\hat{\theta}$ for $\theta$ is contained in the following theorem.
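As an illustration (the code and the Bernoulli example are not from the original notes), a simple grid search over $\theta$ recovers the mle numerically; for a Bernoulli sample the maximizing value agrees with the closed-form mle, the sample mean:

```python
import math

def bernoulli_log_likelihood(theta, xs):
    """Log-likelihood of a Bernoulli(theta) sample xs of 0s and 1s."""
    s = sum(xs)
    n = len(xs)
    return s * math.log(theta) + (n - s) * math.log(1 - theta)

xs = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # 7 successes in 10 trials

# Grid search for the maximizing theta; the closed-form mle is the
# sample mean, here 0.7.
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda t: bernoulli_log_likelihood(t, xs))
print(mle)  # -> 0.7
```

Maximizing the log-likelihood rather than the likelihood itself is numerically safer and gives the same maximizer, since $\log$ is monotone.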

Theorem 7.2
Let $X_1, X_2, \ldots, X_n$ be a random sample from $f(x; \theta)$. If a sufficient statistic $T = t(X_1, \ldots, X_n)$ for $\theta$ exists, and if a maximum likelihood estimate $\hat{\theta}$ of $\theta$ also exists uniquely, then $\hat{\theta}$ is a function of $T$.

Proof
Let $g(t; \theta)$ be the pdf of $T$. Then by the definition of sufficiency, the likelihood function can be written

$$L(\theta; x_1, \ldots, x_n) = g\big(t(x_1, \ldots, x_n); \theta\big)\, h(x_1, \ldots, x_n) \qquad (7.5)$$

where $h(x_1, \ldots, x_n)$ does not depend on $\theta$. So $L$ and $g$, as functions of $\theta$, are maximized simultaneously. Since there is one and only one value of $\theta$ that maximizes $L$, and hence $g\big(t(x_1, \ldots, x_n); \theta\big)$, that value must be a function of $t(x_1, \ldots, x_n)$. Thus the mle $\hat{\theta}$ is a function of the sufficient statistic $T$.
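As a concrete instance of this factorization (this example is not in the original text), take a random sample from $N(\theta, 1)$, for which $\bar{X}$ is sufficient:

$$L(\theta; \mathbf{x}) = (2\pi)^{-n/2} \exp\Big\{-\tfrac{1}{2}\sum_{i=1}^{n}(x_i - \theta)^2\Big\} = \underbrace{\exp\big\{n\bar{x}\theta - n\theta^2/2\big\}}_{g(\bar{x};\,\theta)} \cdot \underbrace{(2\pi)^{-n/2}\exp\Big\{-\tfrac{1}{2}\sum_{i=1}^{n}x_i^2\Big\}}_{h(\mathbf{x})},$$

and maximizing $g(\bar{x}; \theta)$ over $\theta$ gives $\hat{\theta} = \bar{x}$, a function of the sufficient statistic, as the theorem asserts.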

Sometimes we cannot find the maximum likelihood estimator by differentiating the likelihood (or the logarithm of the likelihood) with respect to $\theta$ and setting the derivative equal to zero. Two possible problems are:

(i) The likelihood is not differentiable throughout the range space;
(ii) The likelihood is differentiable, but there is a terminal maximum (that is, at one end of the range space).

For example, consider the uniform distribution on $(0, \theta)$. The likelihood, using a random sample of size $n$, is

$$L(\theta; \mathbf{x}) = \begin{cases} \theta^{-n}, & 0 < x_i \le \theta \text{ for } i = 1, \ldots, n, \\ 0, & \text{otherwise.} \end{cases} \qquad (7.6)$$

Now $\theta^{-n}$ is decreasing in $\theta$ over the range of positive values. Hence it will be maximized by choosing $\theta$ as small as possible while still satisfying $x_i \le \theta$ for all $i$. That is, we choose $\hat{\theta}$ equal to $\max(x_1, \ldots, x_n)$, or $x_{(n)}$, the largest order statistic.
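This terminal-maximum behaviour is easy to verify numerically (illustrative code, not part of the original notes): the likelihood in Eq. 7.6 is zero until $\theta$ reaches $x_{(n)}$ and decreases thereafter, so a grid search picks out the largest order statistic:

```python
def uniform_likelihood(theta, xs):
    """Likelihood theta^{-n} on 0 < x_i <= theta, and 0 otherwise (Eq. 7.6)."""
    if theta <= 0 or max(xs) > theta:
        return 0.0
    return theta ** (-len(xs))

xs = [0.8, 2.3, 1.1, 3.7, 0.4]

# L is zero for theta < max(xs) and decreasing for theta >= max(xs),
# so a grid search lands on the largest order statistic x_(n).
grid = [i / 100 for i in range(1, 1001)]  # 0.01, 0.02, ..., 10.00
mle = max(grid, key=lambda t: uniform_likelihood(t, xs))
print(mle)  # -> 3.7, which is max(xs)
```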

Example 7.8
Consider the truncated exponential distribution with pdf

$$f(x; \theta) = e^{-(x - \theta)}, \qquad x \ge \theta.$$

The likelihood is

$$L(\theta; \mathbf{x}) = \begin{cases} e^{-\sum_{i=1}^{n}(x_i - \theta)} = e^{n\theta}\, e^{-\sum_{i=1}^{n} x_i}, & \theta \le x_i \text{ for } i = 1, \ldots, n, \\ 0, & \text{otherwise.} \end{cases}$$

Hence the likelihood is increasing in $\theta$ and we choose $\hat{\theta}$ as large as possible, that is, equal to $\min(x_1, \ldots, x_n) = x_{(1)}$, the smallest order statistic.
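A similar sketch (again illustrative, assuming the truncated-exponential pdf $f(x; \theta) = e^{-(x-\theta)}$ for $x \ge \theta$) shows the terminal maximum at the other end of the range: the log-likelihood rises linearly in $\theta$ up to $x_{(1)}$:

```python
import math

def trunc_exp_log_likelihood(theta, xs):
    """Log-likelihood n*theta - sum(x_i) when theta <= min(xs), else -inf."""
    if theta > min(xs):
        return -math.inf
    return len(xs) * theta - sum(xs)

xs = [2.9, 4.1, 3.3, 5.0]

# The log-likelihood increases linearly in theta up to min(xs),
# so the mle is the smallest order statistic x_(1).
grid = [i / 100 for i in range(0, 601)]  # 0.00, 0.01, ..., 6.00
mle = max(grid, key=lambda t: trunc_exp_log_likelihood(t, xs))
print(mle)  # -> 2.9, which is min(xs)
```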

Further use is made of the concept of likelihood in Hypothesis Testing (Chapter 3), but here we will define the term likelihood ratio, and in particular monotone likelihood ratio.

Definition 7.6
Let $\theta_1$ and $\theta_2$ be two competing values of $\theta$ in the density $f(x; \theta)$, where a sample of values $\mathbf{X}$ leads to the likelihood $L(\theta; \mathbf{X})$. Then the likelihood ratio is

$$\lambda = \frac{L(\theta_2; \mathbf{X})}{L(\theta_1; \mathbf{X})}.$$

This ratio can be thought of as comparing the relative merits of the two possible values of $\theta$, in the light of the data $\mathbf{X}$. Large values of $\lambda$ would favour $\theta_2$ and small values of $\lambda$ would favour $\theta_1$. Sometimes a statistic $T(\mathbf{X})$ has the property that for each pair of values $\theta_1$, $\theta_2$, where $\theta_1 < \theta_2$, the likelihood ratio is a monotone function of $T(\mathbf{x})$. If it is monotone increasing, then large values of $T(\mathbf{x})$ tend to be associated with the larger of the two parameter values. This idea is often used in an intuitive approach to hypothesis testing where, for example, a large value of $T(\mathbf{x})$ would support the larger of two possible values of $\theta$.

Definition 7.7
A family of distributions $f(x; \theta)$ indexed by a real parameter $\theta$ is said to have a monotone likelihood ratio if there is a statistic $T(\mathbf{X})$ such that for each pair of values $\theta_1$ and $\theta_2$, where $\theta_1 < \theta_2$, the likelihood ratio

$$\lambda = \frac{L(\theta_2; \mathbf{x})}{L(\theta_1; \mathbf{x})}$$

is a non-decreasing function of $T(\mathbf{x})$.

Example 7.9
Let $X_1, \ldots, X_n$ be a random sample from a Poisson distribution with parameter $\theta$. Determine whether $f(x; \theta)$ has a monotone likelihood ratio (mlr).

Here the likelihood of the sample is

$$L(\theta; \mathbf{x}) = \prod_{i=1}^{n} \frac{e^{-\theta} \theta^{x_i}}{x_i!} = \frac{e^{-n\theta}\, \theta^{\sum x_i}}{\prod_{i=1}^{n} x_i!}.$$

Let $\theta_1$, $\theta_2$ be values of $\theta$ with $\theta_1 < \theta_2$. Then for given $\mathbf{x}$,

$$\lambda = \frac{L(\theta_2; \mathbf{x})}{L(\theta_1; \mathbf{x})} = e^{-n(\theta_2 - \theta_1)} \left(\frac{\theta_2}{\theta_1}\right)^{\sum x_i}.$$

Note that $\theta_2 / \theta_1 > 1$, so this ratio is increasing as $\sum x_i$ increases. Hence $f(x; \theta)$ has a monotone likelihood ratio in $T(\mathbf{x}) = \sum_{i=1}^{n} x_i$.
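A quick numerical check (illustrative code, not from the original notes) confirms the mlr property: for fixed $\theta_1 < \theta_2$, the log of the likelihood ratio is non-decreasing as $\sum x_i$ grows across samples:

```python
import math

def poisson_log_likelihood(theta, xs):
    """Log-likelihood of a Poisson(theta) sample, constant term included."""
    return (-len(xs) * theta + sum(xs) * math.log(theta)
            - sum(math.lgamma(x + 1) for x in xs))

def log_ratio(theta1, theta2, xs):
    """log lambda = log L(theta2; x) - log L(theta1; x)."""
    return poisson_log_likelihood(theta2, xs) - poisson_log_likelihood(theta1, xs)

# Samples ordered by T(x) = sum(x): T = 1, 4, 9, 15.
samples = [[0, 1, 0], [1, 1, 2], [3, 2, 4], [5, 4, 6]]

# With theta1 < theta2, the log-ratio should be non-decreasing in T.
ratios = [log_ratio(1.0, 2.5, xs) for xs in samples]
print(ratios == sorted(ratios))  # -> True: mlr in sum(x)
```

The $x_i!$ terms cancel in the ratio, which is why only $\sum x_i$ matters, matching the algebraic form of $\lambda$ above.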

Bob Murison 2000-10-31