One strategy for interpreting data without stringent assumptions is to use order statistics.
Read HC 4.6. In the following, we will use the notation of HC, where the pdf of the random variable is denoted by $f(x)$ rather than $f_X(x)$.
Let $X_1$, $X_2$, ..., $X_n$ denote a random sample from a continuous distribution with pdf $f(x)$. Let $Y_1$ be the smallest of these, $Y_2$ the next in order of magnitude, etc. Then $Y_i$ is called the $i$th order statistic of the sample, and $(Y_1, Y_2, \ldots, Y_n)$ the vector of order statistics. We may write $Y_1 < Y_2 < \cdots < Y_n$. The following alternative notation is also common: $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$.
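As a small illustration of the definition, the sketch below sorts a sample to obtain its order statistics. The function name `order_statistics`, the seed, and the choice of a standard normal sample of size 5 are assumptions made only for this example.

```python
import random

def order_statistics(sample):
    """Return the sample sorted in increasing order (smallest first)."""
    return sorted(sample)

# Hypothetical sample: 5 draws from a standard normal distribution.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(5)]
y = order_statistics(x)

print("sample           :", x)
print("order statistics :", y)
print("minimum (1st)    :", y[0])
print("maximum (nth)    :", y[-1])
```

Because the distribution is continuous, ties occur with probability zero, which is why the ordering can be written with strict inequalities.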
Order statistics are non-parametric and rely only upon the weak assumption that the data are samples from a continuous distribution. We pick up information by ordering the data. If we know the underlying distribution, we can combine that knowledge with the rank of the order statistic of interest. For instance, if the underlying distribution is normal, $Y_{51}$ from a sample of size 101 will have a higher probability of being near the median than $Y_1$ or $Y_{101}$. But without ordering, the same could not be said for $X_{51}$. So the ordering gives us extra information, and we shall now explore the densities of order statistics, denoted $g_1(y_1)$, $g_2(y_2)$, etc.
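The claim about a normal sample of size 101 can be checked by simulation. The sketch below is illustrative only: the standard normal distribution (median 0), the tolerance of 0.25, the number of replications, and the function name `simulate` are all assumptions chosen for the example.

```python
import random

def simulate(n=101, reps=2000, width=0.25, seed=1):
    """Estimate how often selected values fall within `width` of the
    population median (0 for a standard normal)."""
    rng = random.Random(seed)
    mid = n // 2  # zero-based index of the 51st value when n = 101
    hits = {"Y_min": 0, "Y_mid": 0, "Y_max": 0, "X_mid": 0}
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # unordered sample
        y = sorted(x)                                # order statistics
        hits["Y_min"] += abs(y[0]) < width    # smallest value
        hits["Y_mid"] += abs(y[mid]) < width  # middle order statistic
        hits["Y_max"] += abs(y[-1]) < width   # largest value
        hits["X_mid"] += abs(x[mid]) < width  # 51st unordered observation
    return {name: count / reps for name, count in hits.items()}

freq = simulate()
for name, p in freq.items():
    print(f"{name}: fraction within 0.25 of the median = {p:.3f}")
```

Under these assumptions the middle order statistic should land near the median in most replications, the unordered observation only occasionally, and the extremes almost never.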
Example 1 Suppose you were required to assess the ability to handle a crowd at a railway station with regard to stair width, staff, etc. The statistic of interest is the maximum, $Y_n$.
Example 2 An oil product freezes at C and the company ponders whether it should market it in a cold climate. We would require the density of the minimum order statistic, $Y_1$, to assess the risk of the product failing.
Other situations where an order statistic is of interest are listed below.
Order statistics are useful for summarizing data but may be limited for detailed descriptions of some process which has been measured. Order statistics are also ingredients for higher level statistical procedures.
Figure 5.1 shows the sample cdf as a step function increasing by $1/n$ at each order statistic. We can make statements about individual order statistics by borrowing information provided by the entire set. Remember that all we assumed about the original data was that they were continuous; there were no assumptions about the distribution. But now that the data are ordered, we can use the extra information provided by the ordering to derive density functions.
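The step-function behaviour of the sample cdf can be sketched in a few lines. The helper name `ecdf` and the four data values are hypothetical and chosen only to make the jumps easy to see.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the sample cdf F_n, where F_n(t) is the fraction of
    observations less than or equal to t."""
    y = sorted(sample)  # the order statistics
    n = len(y)

    def F_n(t):
        # bisect_right counts how many sorted values are <= t
        return bisect_right(y, t) / n

    return F_n

# Hypothetical data with n = 4, so each jump has height 1/4.
data = [2.3, 0.7, 1.9, 3.1]
F_n = ecdf(data)
for t in [0.5, 0.7, 1.0, 1.9, 2.3, 3.1]:
    print(f"F_n({t}) = {F_n(t)}")
```

The function is flat between consecutive sorted values and jumps by one over the sample size at each of them, reaching 1 at the largest observation.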
The data, $X_1, \ldots, X_n$, might be independent, but the ordered data, $Y_1, \ldots, Y_n$, are not.
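The loss of independence after ordering can be seen numerically. The sketch below assumes pairs of independent Uniform(0, 1) draws (sample size 2) and compares the correlation of the raw pair with that of the ordered pair (smaller value, larger value). The helper `pearson`, the seed, and the number of pairs are choices made for the illustration.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(2)
pairs = [(rng.random(), rng.random()) for _ in range(20000)]

x1 = [p[0] for p in pairs]   # first raw observation
x2 = [p[1] for p in pairs]   # second raw observation
y1 = [min(p) for p in pairs] # smaller of the two
y2 = [max(p) for p in pairs] # larger of the two

r_x = pearson(x1, x2)
r_y = pearson(y1, y2)
print(f"correlation of raw pair     : {r_x:.3f}")
print(f"correlation of ordered pair : {r_y:.3f}")
```

The raw pair should show a correlation near zero, while the ordered pair is clearly positively correlated (the theoretical value for this uniform case is 0.5), since knowing the smaller value restricts where the larger one can lie.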
To begin our study of order statistics, we first want to find the joint distribution of $Y_1, Y_2, \ldots, Y_n$.