An error bar is a simple example of what statisticians refer to as an interval estimate. Instead of estimating a single point value for a parameter, the sample data are used to construct a range of values that is likely to cover the true population parameter with a prespecified probability known as the confidence level.
A confidence interval (C.I.) $[\theta_{\rm lo}, \theta_{\rm up}]$ contains the true value $\theta$ of the population parameter with probability $1-\alpha$ (the confidence level). The interval is defined by lower and upper confidence limits $\theta_{\rm lo}$ and $\theta_{\rm up}$, which are functions of the data. In other words, if C.I.s were calculated for many different samples drawn from the full population, then a fraction $1-\alpha$ of the C.I.s would cover the true population value. These intervals are shown schematically in Fig. 5.1. To be precise, if $F(\hat{\theta}; \theta)$ is the sampling cumulative distribution function of the estimator $\hat{\theta}$, which depends on $\theta$, then
$$P(a \le \hat{\theta} \le b) = F(b; \theta) - F(a; \theta) = 1 - \alpha,$$
and the two inequalities $a \le \hat{\theta}$ and $\hat{\theta} \le b$ can be rearranged to give
$$P(\theta_{\rm lo} \le \theta \le \theta_{\rm up}) = 1 - \alpha$$
for some $\theta_{\rm lo}$ and $\theta_{\rm up}$. In classical (but not
Bayesian) statistics, the true population parameter is considered to
be a fixed constant and not a random variable, hence it is the C.I.s
that randomly overlap the population parameter rather than the
population parameter that falls randomly in the C.I.
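This frequentist coverage property can be checked numerically. The sketch below is a minimal illustration (the population parameters, sample size, and trial count are arbitrary choices, not taken from the text): it draws many samples from a normal population with known standard deviation, builds the standard 95% interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ for each, and counts how often the interval covers the fixed true mean.

```python
import random
from statistics import NormalDist, mean

# Arbitrary illustrative choices: normal population with known sigma.
mu, sigma, n, trials, alpha = 10.0, 2.0, 25, 10_000, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for a 95% C.I.

rng = random.Random(42)
covered = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = mean(sample)
    half = z * sigma / n ** 0.5           # half-width of the interval
    lo, up = xbar - half, xbar + half     # the confidence limits
    covered += lo <= mu <= up             # does this C.I. cover the truth?

print(f"empirical coverage: {covered / trials:.3f}")  # close to 0.95
```

Note that it is the interval $[\theta_{\rm lo}, \theta_{\rm up}]$ that changes from sample to sample while `mu` stays fixed, exactly as in the classical interpretation above.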
Statisticians most often quote 95% confidence intervals,
which should cover the true value in all but 5% of repeated
samples. For normally distributed sample statistics,
the 95% confidence interval is about twice as wide
as the error bar used by physicists (see example below).
The $\pm 1\sigma$ error bar corresponds to the 68.3% confidence interval for normally distributed sample statistics.
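The correspondence between multiples of $\sigma$ and confidence levels can be read directly off the normal cumulative distribution function. A small check, using Python's standard-library `statistics.NormalDist` (the multipliers 1.00 and 1.96 are the standard ones, not taken from the text):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def coverage(k):
    """Probability that a normal variate lies within k sigma of its mean."""
    return Z.cdf(k) - Z.cdf(-k)

print(f"+/- 1.00 sigma -> {coverage(1.0):.4f}")   # the physicist's error bar
print(f"+/- 1.96 sigma -> {coverage(1.96):.4f}")  # the 95% confidence interval
```

The first line gives 0.6827 and the second 0.9500, so the 95% interval is indeed about twice ($1.96\times$) as wide as the $\pm 1\sigma$ error bar.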
In addition to its more precise probabilistic definition, another advantage of the C.I. over the error bar is that it is easily extended to statistics with skewed sampling distributions, such as the sample variance.
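For small samples the sampling distribution of the sample variance is strongly right-skewed, so its confidence interval is asymmetric about the point estimate, which a symmetric error bar cannot express. The sketch below is illustrative only: it estimates the quantiles of the pivot $(n-1)s^2/\sigma^2$ (chi-square with $n-1$ degrees of freedom for normal data) by Monte Carlo rather than from tables, and all numerical choices are arbitrary.

```python
import random
from statistics import variance

rng = random.Random(0)
n, alpha = 10, 0.05

# Monte Carlo estimate of the pivot distribution Q = (n-1) s^2 / sigma^2,
# which is chi-square with n-1 degrees of freedom for normal samples.
pivots = sorted(
    (n - 1) * variance([rng.gauss(0, 1) for _ in range(n)])
    for _ in range(50_000)
)
q_lo = pivots[int(len(pivots) * alpha / 2)]        # ~2.5% quantile (about 2.70)
q_up = pivots[int(len(pivots) * (1 - alpha / 2))]  # ~97.5% quantile (about 19.0)

# One observed sample (arbitrary illustrative data) and its variance estimate:
data = [rng.gauss(5.0, 2.0) for _ in range(n)]
s2 = variance(data)

# Inverting q_lo <= (n-1) s^2 / sigma^2 <= q_up gives the C.I. for sigma^2:
lo, up = (n - 1) * s2 / q_up, (n - 1) * s2 / q_lo
print(f"s^2 = {s2:.2f}, 95% C.I. = [{lo:.2f}, {up:.2f}]")
print("asymmetric:", up - s2 > s2 - lo)  # prints True: the upper arm is longer
```

Because the pivot quantiles straddle $n-1$ unevenly, the upper arm of the interval is several times longer than the lower arm, whereas a single error bar would force the two arms to be equal.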