A variable metric probabilistic k-nearest-neighbours classifier
R.M. Everson and J.E. Fieldsend
In: Proceedings of the Fifth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL'04), Z.R. Yang, R. Everson and H. Yin (Eds.), 3177, 659-664, Springer, 2004.
Abstract
The k-nearest neighbour (k-nn) model is a simple, popular classifier. Probabilistic k-nn is a more powerful variant in which the model is cast in a Bayesian framework using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainty over the model parameters. The k-nn classifier depends crucially on the metric used to determine distances between data points. However, scalings between features, and indeed whether some subset of features is redundant, are seldom known a priori. Here we introduce a variable metric extension to the probabilistic k-nn classifier, which permits averaging over all rotations and scalings of the data. In addition, the method permits automatic rejection of irrelevant features. Examples are provided on synthetic data, illustrating how the method can deform feature space and select salient features, and also on real-world data.
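To illustrate why the choice of metric matters, the following is a minimal sketch of an ordinary majority-vote k-nn classifier whose distances are measured after rotating and rescaling the features. It is not the method of the paper: there is no Bayesian averaging or MCMC here, the metric is fixed by hand, and the function name `knn_predict`, the parameters `scales` and `rotation`, and the toy data are all assumptions made for illustration. Setting a scale to zero removes that direction from the distance, which mimics rejecting an irrelevant feature.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k, scales, rotation=None):
    """Majority-vote k-nn under a metric defined by a rotation and per-axis scales.

    Hypothetical helper for illustration only; the metric is fixed, not sampled.
    """
    d = X_train.shape[1]
    Q = np.eye(d) if rotation is None else np.asarray(rotation)
    # Rotate into the new axes, then scale each axis.
    # A zero scale makes that axis invisible to the distance.
    A = np.diag(scales) @ Q.T
    diffs = (X_train - x) @ A.T
    dist2 = np.sum(diffs ** 2, axis=1)
    nearest = np.argsort(dist2, kind="stable")[:k]
    votes = np.bincount(y_train[nearest])
    return int(np.argmax(votes))

# Toy data: feature 0 separates the classes, feature 1 is irrelevant
# but has a much larger spread.
X_train = np.array([[0.0, 0.0],
                    [0.1, 1.0],
                    [1.0, 5.0],
                    [0.9, 6.0]])
y_train = np.array([0, 0, 1, 1])
x_query = np.array([0.05, 5.2])

# Plain Euclidean metric: the irrelevant feature dominates the distances.
euclidean_label = knn_predict(X_train, y_train, x_query, k=3, scales=[1.0, 1.0])

# Zero scale on feature 1: only the informative feature is used.
rescaled_label = knn_predict(X_train, y_train, x_query, k=3, scales=[1.0, 0.0])

print(euclidean_label, rescaled_label)  # prints "1 0"
```

In this toy case the query is assigned to class 1 under the Euclidean metric and to class 0 once the second feature is scaled away, so the predicted label depends entirely on a scaling that would seldom be known in advance.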
Gzipped postscript (266 kb) PDF (340 kb)