
Resistant statistics

Observations often contain rogue outlier values that lie far away from the bulk of the data. These can be caused by measurement or recording errors, or they can be genuine freak events. Especially in small samples, outliers can bias the previous summary statistics away from values that are representative of the majority of the sample.

This problem can be avoided either by eliminating or downweighting the outlier values in the sample (quality control), or by using statistics that are resistant to the presence of outliers. Note that the word robust should not be used to mean resistant: in statistics, robust refers to insensitivity to the choice of probability model or estimator rather than to individual data values. Because the range is based on the extreme minimum and maximum values in the sample, it is a good example of a statistic that is not at all resistant to the presence of an outlier (and so should be interpreted very carefully!).
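
The effect of a single outlier on the range can be illustrated with a small numerical sketch. The sample values below are invented for illustration only, and the function name summarize is hypothetical. The median is included as a commonly cited example of a resistant statistic; that choice is an assumption of this sketch rather than something stated in the text above.

```python
import statistics


def summarize(sample):
    """Return the mean, median and range of a sample (hypothetical helper)."""
    return {
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "range": max(sample) - min(sample),
    }


# Invented data: a small, well-behaved sample.
clean = [9.0, 10.0, 10.0, 11.0, 12.0]

# The same sample with the last value mis-recorded as 120.0 instead of 12.0.
contaminated = clean[:-1] + [120.0]

clean_stats = summarize(clean)
rogue_stats = summarize(contaminated)

# Print each statistic side by side for the two samples.
for name in clean_stats:
    print(f"{name:>6}: clean={clean_stats[name]:g}  with outlier={rogue_stats[name]:g}")
```

In this invented sample the single rogue value moves the range from 3 to 111 and the mean from 10.4 to 32, while the median stays at 10. Small samples make the shift in the mean especially pronounced, since each value carries a large share of the total.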

David Stephenson 2005-09-30