Example 2: The sample variance

The sample variance $ s^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\overline{x})^2$ underestimates the population variance $ \sigma^2$ on average. Using the same approach as in the previous example (try it!), it is possible to show that $ E(s^2)=\sigma^2(n-1)/n$, and therefore that the bias is $ E(s^2)-\sigma^2=-\sigma^2/n$. This underestimate of the true population variance is greatest when the sample size is very small; for example, the mean sample variance is only 2/3 of the true population variance when $ n=3$. To obtain an unbiased variance estimate, the sample variance is sometimes defined with $ n-1$ in the denominator instead of $ n$, i.e. $ \hat{s}^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\overline{x})^2$. However, it should be noted that this larger estimator also has a larger variance than $ s^2$ (by a factor of $ n^2/(n-1)^2$, since $ \hat{s}^2=\frac{n}{n-1}s^2$), and is therefore a less efficient estimator. It is also worth noting that although $ \hat{s}^2$ gives by design an unbiased estimate of the population variance, its square root $ \hat{s}$ still remains a biased (under)estimate of the population standard deviation, because the square root is a concave function and so $ E(\hat{s})\le\sqrt{E(\hat{s}^2)}=\sigma$.
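The size of these biases can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: it assumes normally distributed data with $ \sigma=1$ and $ n=3$, and the function names `variance` and `mean_estimates`, the number of trials and the random seed are all illustrative choices.

```python
import random


def variance(xs, ddof=0):
    """Sum of squared deviations from the sample mean, divided by n - ddof."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - ddof)


def mean_estimates(n=3, sigma=1.0, trials=100_000, seed=0):
    """Average three estimators over many independent samples of size n.

    Returns the mean of the n-denominator variance, the mean of the
    (n-1)-denominator variance, and the mean of the square root of the latter.
    Assumption: samples are drawn from a normal distribution N(0, sigma^2).
    """
    rng = random.Random(seed)
    sum_var_n = 0.0
    sum_var_n1 = 0.0
    sum_sd_n1 = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        var_n1 = variance(xs, ddof=1)
        sum_var_n += variance(xs, ddof=0)
        sum_var_n1 += var_n1
        sum_sd_n1 += var_n1 ** 0.5
    return sum_var_n / trials, sum_var_n1 / trials, sum_sd_n1 / trials


var_n, var_n1, sd_n1 = mean_estimates()
print(f"mean variance, denominator n:     {var_n:.3f}")
print(f"mean variance, denominator n - 1: {var_n1:.3f}")
print(f"mean sqrt of the n - 1 estimate:  {sd_n1:.3f}")
```

With these settings the first average should land close to 2/3 and the second close to 1, while the average square root of the $ n-1$ estimate should fall noticeably below $ \sigma=1$. Exact figures depend on the seed and the number of trials.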



David Stephenson 2005-09-30