MSc Projects
The module descriptor for the project is here.
Please see
me or send me mail at
R.M.Everson@exeter.ac.uk if you would like more details or want to talk about a completely different project.
Clustering is an important unsupervised method for visualising,
modelling and understanding data. There is a wide variety of methods
for obtaining clusters from data, though many of them are very ad
hoc. When the data themselves change with time the problem is harder
as clusters may move, split and merge. This project aims to produce
principled methods of clustering non-stationary data, probably using
particle filter methods. Particular application areas include clustering
neuronal spike trains and segmentation of image sequences.
The project would suit someone who is comfortable with
mathematical manipulations.
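As a rough illustration of the particle filter idea, here is a minimal sketch that tracks the centre of a single one-dimensional cluster as it drifts over time. It is a toy under stated assumptions, not the method the project would develop: the function name `track_cluster_centre`, the Gaussian random-walk drift, the noise levels and the synthetic data are all hypothetical choices made for the example, and splitting or merging of clusters is not handled.

```python
import math
import random


def track_cluster_centre(observations, n_particles=500, drift_sd=0.3,
                         noise_sd=0.5, seed=0):
    """Bootstrap particle filter for one 1-D cluster centre that drifts as a
    Gaussian random walk (an assumed model). Returns one estimate per step."""
    rng = random.Random(seed)
    # Initial hypotheses for the centre, spread around the first observation.
    particles = [observations[0] + rng.gauss(0.0, 1.0)
                 for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: let every hypothesised centre drift a little.
        particles = [x + rng.gauss(0.0, drift_sd) for x in particles]
        # Weight: Gaussian likelihood of the observation under each hypothesis.
        weights = [math.exp(-0.5 * ((y - x) / noise_sd) ** 2)
                   for x in particles]
        total = sum(weights)
        if total == 0.0:  # every weight underflowed; fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Estimate: weighted mean of the hypotheses.
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample: keep hypotheses in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Synthetic data: one noisy point per time step from a cluster whose
# centre moves steadily from 0 to 5.
data_rng = random.Random(1)
true_centres = [5.0 * t / 49 for t in range(50)]
observations = [c + data_rng.gauss(0.0, 0.5) for c in true_centres]
estimates = track_cluster_centre(observations)
```

The estimates should follow the moving centre with a small lag, because the assumed random-walk model does not know about the steady drift.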
GTM is a principled alternative to Self Organising Maps useful for
modelling and visualising high-dimensional data. The parameters of
the GTM model are conventionally learned using the
Expectation-Maximisation algorithm to find a maximum likelihood solution.
This project would develop a Bayesian Markov Chain Monte Carlo method
for the GTM model, bringing the benefits of automatic model
selection, robustness and confidence intervals. Although useful for
visualisation, the main application is to nonlinear Independent
Component Analysis. This project provides the opportunity of
learning and applying Bayesian MCMC methods to deal with nonlinear
problems. The project would suit someone who is comfortable with
mathematical manipulations.
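To give a flavour of what MCMC provides, the following is a minimal random-walk Metropolis sketch. It samples from a toy one-dimensional Gaussian "posterior" rather than from the GTM model, so the target density, the step size and the names `metropolis` and `toy_log_posterior` are all assumptions made for illustration. The point is only that the samples yield a mean and an interval estimate directly.

```python
import math
import random


def metropolis(log_density, start, n_samples, step_sd=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_sd)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def toy_log_posterior(theta):
    # Unnormalised log density of a Gaussian with mean 2 and unit variance.
    return -0.5 * (theta - 2.0) ** 2


samples = metropolis(toy_log_posterior, start=0.0, n_samples=20000)
kept = sorted(samples[2000:])  # discard the first 2000 samples as burn-in
posterior_mean = sum(kept) / len(kept)
# Central 95% interval read off the sorted samples.
interval = (kept[int(0.025 * len(kept))], kept[int(0.975 * len(kept))])
```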
It is often important to visualise high-dimensional data and a number
of tools, such as the Generative Topographic Mapping, Self Organising
Maps, Locally Linear Embedding and Isomap have been developed to map
manifolds in high dimensions down into a few dimensions.
Additionally one would often like to know the curvature of
the manifold as it provides information on how rapidly quantities
vary.
This project is to investigate and develop methods of quantifying
the curvature of manifolds specified only by data points in a high
dimensional space. It provides an opportunity to learn about and
apply various visualisation methods and to investigate ideas about
curvature on manifolds.
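One very simple starting point, for the special case of a curve (a one-dimensional manifold), is the Menger curvature: the reciprocal of the radius of the circle through three nearby data points. The sketch below is an illustration only, not the approach the project would necessarily take; the function name `menger_curvature` is hypothetical, and noisy data or higher-dimensional manifolds would need something more careful.

```python
import math


def menger_curvature(a, b, c):
    """Curvature of the circle through three points, each given as a
    coordinate sequence of the same length (any dimension).
    Returns 0.0 for collinear or coincident points."""
    u = [q - p for p, q in zip(a, b)]  # b - a
    v = [q - p for p, q in zip(a, c)]  # c - a
    w = [q - p for p, q in zip(b, c)]  # c - b
    uu = sum(x * x for x in u)
    vv = sum(x * x for x in v)
    ww = sum(x * x for x in w)
    uv = sum(x * y for x, y in zip(u, v))
    # Twice the triangle area, from the Gram determinant of u and v.
    twice_area = math.sqrt(max(uu * vv - uv * uv, 0.0))
    side_product = math.sqrt(uu * vv * ww)
    # Menger curvature is 4 * area / (product of the three side lengths).
    return 0.0 if side_product == 0.0 else 2.0 * twice_area / side_product


# Three points on a circle of radius 2 embedded in five dimensions.
radius = 2.0
points = [(radius * math.cos(t), radius * math.sin(t), 0.0, 0.0, 0.0)
          for t in (0.0, 0.1, 0.2)]
kappa = menger_curvature(*points)  # should be close to 1 / radius
```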
Python is a
relatively new, free, object-oriented scripting language. It has
the ability to easily integrate components from other languages and is
developing a wide base of users. Building on the numerical
capabilities of Python, this project would build a suite of
Pattern Recognition tools for use in research. The
object-oriented, modular nature of Python should permit a very
powerful toolbox. I anticipate that the project would include
analogues of the routines in Netlab together with some more
advanced methods.
Decision trees are widely used as classifiers that are easily
implemented, rapidly trained and allow some insight to be gained
into the way in which a classification is reached.
Classifiers can be combined to give an average answer by
voting or by some method that gives each classifier a
vote proportional to how reliable the classifier thinks
it is. This project would explore ways of combining
decision trees using the Bayesian evidence.
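The two combination schemes mentioned above can be sketched as follows. The function name `weighted_vote`, the labels and the reliability scores are hypothetical; in the project the weights would come from the Bayesian evidence for each tree, which this sketch does not compute.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative reliability score per classifier.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


predictions = ["cat", "dog", "dog"]
# Equal weights reduce to a plain majority vote.
simple = weighted_vote(predictions, [1.0, 1.0, 1.0])
# Made-up reliabilities: one confident classifier outvotes two weak ones.
weighted = weighted_vote(predictions, [0.9, 0.3, 0.4])
```

With equal weights the majority label "dog" wins; with the made-up reliabilities the single confident classifier carries the vote for "cat".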
Swarm intelligence (SI) is an artificial intelligence technique
involving the study of collective behaviour in decentralised
systems. Such systems are made up of a population of simple agents
interacting locally with one another and with their
environment. Swarm-like algorithms, such as particle swarm
optimisation (PSO), have already been applied successfully to solve
real-world optimisation problems, and have many features in common
with evolutionary algorithms (EA).
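For readers new to PSO, here is a minimal sketch of a basic single-objective, global-best swarm. The inertia and acceleration coefficients are commonly used values rather than anything prescribed here, and the function names `pso` and `sphere`, the search bounds and the test function are assumptions chosen for the example.

```python
import random


def pso(objective, dim, n_particles=30, n_iters=200, bounds=(-5.0, 5.0),
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's best so far
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # the swarm's best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move, keeping the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def sphere(x):
    # Simple test function with its minimum of 0 at the origin.
    return sum(v * v for v in x)


solution, value = pso(sphere, dim=2)
```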
The PSO heuristic has traditionally been applied to the optimisation of
single-objective problems; in 2002, however, a number of papers
introduced extensions to the general model to allow search in
multi-objective domains. In a multi-objective problem
there is no longer a single "best" solution for the
particles to fly towards, but rather a set representing a curve or
surface. Current work in the field is greatly concerned with
ensuring coverage of this set and investigating the most efficient
way to alter a particle's trajectory in order to promote rapid
convergence to the "best" solutions. This project aims to
investigate the area of multi-objective PSO by examining a
number of current points of concern: Should a particle fly toward fit
areas in objective space that are close in decision space or
further away, or a combination of the two? Should particles be concerned
with flying away from unfit regions of decision space as well as
flying towards fit regions? How does the search progress with
different swarm sizes in relation to convergence speed?
This project would be in conjunction with Jonathan Fieldsend.
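The set of "best" solutions in a multi-objective problem is the set of non-dominated points. A minimal sketch of how it can be extracted from a collection of objective vectors is given below, assuming every objective is to be minimised; the function names and the example values are made up for illustration.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimisation):
    `a` is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up objective vectors for five candidate solutions.
objectives = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = non_dominated(objectives)
```

Here (3.0, 4.0) is dominated by (2.0, 3.0) and (5.0, 5.0) by (1.0, 5.0), so three points remain. A multi-objective swarm has to choose its guides from such a set instead of from a single global best.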