Next: Confidence intervals Up: Parameter estimation Previous: Sampling distributions   Contents

Sampling errors

As the sample size tends to infinity, the spread of the sampling distribution (the sampling uncertainty) tends to zero. For finite samples, however, there is always uncertainty in the estimate due to the finite spread of the sampling distribution (except in the unlikely event that the sample is the whole finite population).
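
The shrinking spread can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be standard normal, and the function name `spread_of_sample_means`, the number of replicates, and the seed are all hypothetical choices, not part of the original notes. It draws many samples of size $ n$, computes the mean of each, and reports the standard deviation of those means.

```python
import random
import statistics

def spread_of_sample_means(n, replicates=500, mu=0.0, sigma=1.0, seed=0):
    """Estimate the spread of the sampling distribution of the mean.

    Assumption: the population is N(mu, sigma^2); this is an illustrative
    choice only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(replicates)
    ]
    # Standard deviation of the replicate means approximates the
    # spread of the sampling distribution for this sample size.
    return statistics.stdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}
for n, spread in spreads.items():
    print(n, round(spread, 3))
```

Under these assumptions the printed spread should fall roughly as $ 1/\sqrt{n}$, so it decreases as the sample grows but is never exactly zero for a finite sample.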

The traditional physicist's approach to estimating this sampling uncertainty is to quote the standard error, which is defined as the standard deviation $ s_t$ of a sample statistic $ t$ (i.e. the spread of the sampling distribution). For example, the heights of meteorologists in Table 2.1 have a sample mean of 174.3cm and a sample standard deviation of 7.9cm, and therefore an estimate of the population mean would be 174.3cm with a standard error of 2.4cm ( $ s_{\overline{x}}=s/\sqrt{n}$). Physicists write this succinctly as $ t\pm s_t$, e.g. $ 174.3\pm 2.4$cm. The interval $ [t-s_t,t+s_t]$ is known as an error bar, and it is often stated that a "measurement without an error bar is meaningless". In other words, to interpret an estimate meaningfully you need to have an idea of how uncertain the estimate may be due to sampling.
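
The calculation of $ s_{\overline{x}}=s/\sqrt{n}$ and the resulting error bar can be sketched in a few lines of Python. This is a minimal sketch, not the notes' own code: the function name `standard_error_of_mean` is hypothetical, and the list `heights_cm` is made-up illustrative data, not the values from Table 2.1.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Return s / sqrt(n), the standard error of the sample mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical heights in cm, for illustration only (NOT Table 2.1).
heights_cm = [165.0, 170.0, 172.0, 175.0, 178.0, 180.0, 183.0]

t = statistics.mean(heights_cm)           # sample statistic: the mean
s_t = standard_error_of_mean(heights_cm)  # its standard error
error_bar = (t - s_t, t + s_t)            # the interval [t - s_t, t + s_t]

print(f"{t:.1f} +/- {s_t:.1f} cm")
```

The same function applies to any sample; only the choice of statistic (here the mean) determines which formula for the standard error is appropriate.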

Sampling errors of linear combinations of independent random variables can easily be estimated by summing the weighted sampling variances. If random variable $ Z$ is a linear combination $ aX+bY$ of two independent and normally distributed variables $ X\sim N(\mu_X, \sigma^2_X)$ and $ Y\sim N(\mu_Y, \sigma^2_Y)$, then $ Z$ is also normally distributed, $ Z\sim N(\mu_Z, \sigma^2_Z)$, with mean $ \mu_Z=a\mu_X+b\mu_Y$ and variance $ \sigma_Z^2=a^2\sigma_X^2+b^2\sigma_Y^2$. Therefore, the standard error $ s_Z$ of $ Z=aX+bY$ is $ \sqrt{a^2 s_X^2+b^2s_Y^2}$. For example, the standard error of the difference of two sample statistics $ Z=X-Y$ (where $ a=1$ and $ b=-1$) is simply $ \sqrt{s_X^2+s_Y^2}$, the quadrature sum of the standard errors of $ X$ and $ Y$. Note that the variances add even when the statistics are subtracted, because the coefficients enter as squares.
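
The formula $ s_Z=\sqrt{a^2 s_X^2+b^2s_Y^2}$ translates directly into code. The sketch below assumes, as in the text, that $ X$ and $ Y$ are independent; the function name `se_linear_combination` and the numerical standard errors are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

def se_linear_combination(a, s_x, b, s_y):
    """Standard error of Z = a*X + b*Y, assuming X and Y are independent."""
    return math.sqrt((a * s_x) ** 2 + (b * s_y) ** 2)

# Hypothetical standard errors of two independent sample statistics.
s_x, s_y = 3.0, 4.0

s_diff = se_linear_combination(1, s_x, -1, s_y)  # Z = X - Y
s_sum = se_linear_combination(1, s_x, 1, s_y)    # Z = X + Y

print(s_diff, s_sum)
```

With these values the sum and the difference have the same standard error, since the sign of $ b$ disappears on squaring. If $ X$ and $ Y$ were correlated, a covariance term would be needed and this sketch would no longer apply.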


David Stephenson 2005-09-30