Models provide compact ways of summarising observed relationships
and are essential for making predictions and inferences.
Models can be considered to be maps (representations) of reality and, like any map,
cannot possibly describe everything in reality (nor do they have to!).
This is nicely summarised in the famous quotation:
"All models are wrong, but some are useful" - G.E.P. Box
Good modellers are aware of their models' strengths and weaknesses
and use the models appropriately.
The process of choosing the most appropriate models is very
complex and involves the following stages:
- Identification
By analysing the data critically using descriptive techniques
and thinking about underlying processes, a class of potentially
suitable models can be identified for further investigation.
It is wise in this stage to consider first the simplest possible
models (e.g. linear with few parameters) that may be appropriate
before progressing to more complex models (e.g. neural networks).
- Estimation
By fitting the model to the sample data, the model parameters and
their confidence intervals are estimated, usually by either
least-squares or maximum likelihood methods. Estimation is
generally easier and more precise for parsimonious models,
i.e. those that have the fewest parameters.
- Evaluation
The model fit is critically assessed by carefully analysing
the residuals (errors) of the fit and other diagnostics.
This is sometimes referred to as validation and/or verification
by the atmospheric science community.
- Prediction
The model is used to make predictions in new situations (i.e. for data that are
independent of those used in making the fit). The model predictions are then
verified to test whether the model has any real skill.
Predictive skill is the ultimate test of any model. There
is no guarantee that a model which provides a good fit will also
produce good predictions. For example, non-parsimonious models
having many parameters that provide excellent fits to the original
data often fail to give good predictions when applied to new data
(over-fitting).
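The estimation, evaluation and prediction stages above can be illustrated
with a minimal sketch. Everything in it is an assumption made for
illustration only: the "true" relationship y = 1 + 2x, the fixed
alternating perturbations of size 0.5 that stand in for noise, the
noise-free new data, and the helper names fit_line, interpolate and rmse.
It compares a parsimonious straight line (two parameters, estimated by
least squares) with a polynomial that passes through every training point
(one parameter per observation).

```python
import math


def fit_line(xs, ys):
    """Least-squares estimates (intercept, slope) for y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through all points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def truth(x):
    """Assumed underlying relationship (hypothetical, for illustration)."""
    return 1.0 + 2.0 * x


# Training data: the assumed truth plus fixed, made-up perturbations.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [truth(x) + 0.5 * (-1) ** i for i, x in enumerate(x_train)]

# New data, independent of the fit (kept noise-free for simplicity).
x_new = [0.5, 1.5, 2.5, 3.5, 4.5]
y_new = [truth(x) for x in x_new]

# Estimation: fit the two-parameter line.
intercept, slope = fit_line(x_train, y_train)

# Evaluation: size of the residuals on the data used for fitting.
line_fit_rmse = rmse([intercept + slope * x for x in x_train], y_train)
poly_fit_rmse = rmse([interpolate(x_train, y_train, x) for x in x_train], y_train)

# Prediction: errors on the new data.
line_new_rmse = rmse([intercept + slope * x for x in x_new], y_new)
poly_new_rmse = rmse([interpolate(x_train, y_train, x) for x in x_new], y_new)

print("fit RMSE:  line", line_fit_rmse, " polynomial", poly_fit_rmse)
print("new RMSE:  line", line_new_rmse, " polynomial", poly_new_rmse)
```

In this constructed case the polynomial reproduces the training data
exactly, yet its errors on the new data are larger than those of the
straight line. A perfect fit has been bought at the cost of predictive
skill, which is the over-fitting behaviour described above.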
By iterating at any stage in this process, it is possible with much
skill and patience to find the most appropriate models.
David Stephenson
2005-09-30