Charles Mallah, James Cope, and James Orwell
Pattern Recognition, Plant Leaves Classification, k-Nearest Neighbours, Density Estimators, Combining Features
Plant species classification using leaf samples is a challenging and important problem. This paper introduces a new data set of sixteen samples from each of one hundred plant species, and describes a method designed to work with small training sets and possibly incomplete feature extraction. These conditions motivate separate processing of three feature types (shape, texture, and margin), which are then combined in a probabilistic framework. The texture and margin features use histogram accumulation, while the shape is represented by a normalised description of the contour. Two previously published density estimators, both based on the k-Nearest Neighbour apparatus, are used to generate a separate posterior probability vector for each feature. The combined posterior estimates produce the final classification, and any missing feature can simply be omitted from the combination. We show that both density estimators achieve a 96% mean classification accuracy when the three features are combined in this way (training on 15 samples, with cross-validation on unseen samples). In addition, the framework can provide an upper bound on the Bayes Risk of the classification problem, and thereby assess the accuracy of the density estimators. Lastly, the method performs well with small training sets: 91% accuracy is observed with only four training samples.
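To illustrate the idea of combining per-feature posterior vectors while tolerating missing features, here is a minimal Python sketch. It is not the paper's implementation. The function names (`knn_posterior`, `combine_posteriors`), the smoothing constant `eps`, the Euclidean distance, and the product-rule combination (which assumes conditionally independent features and equal class priors) are all assumptions made for illustration. The abstract does not specify the two density estimators or the exact combination rule.

```python
import numpy as np

def knn_posterior(train_X, train_y, x, k, n_classes, eps=1e-3):
    """Assumed estimator: smoothed class proportions among the k nearest samples.

    eps is a hypothetical smoothing constant that keeps every class
    probability non-zero, so that a product over features stays defined.
    """
    dists = np.linalg.norm(train_X - x, axis=1)   # Euclidean distance (assumption)
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    counts = np.bincount(train_y[nearest], minlength=n_classes).astype(float)
    post = counts + eps
    return post / post.sum()

def combine_posteriors(posteriors):
    """Assumed combination: product rule over the available features.

    A feature that could not be extracted is passed as None and skipped.
    """
    available = [p for p in posteriors if p is not None]
    if not available:
        raise ValueError("at least one feature posterior is required")
    # Sum logs rather than multiplying directly, for numerical stability.
    log_sum = np.sum([np.log(p) for p in available], axis=0)
    log_sum -= log_sum.max()
    combined = np.exp(log_sum)
    return combined / combined.sum()
```

In use, one posterior vector would be computed per feature type (shape, texture, margin), each from its own training matrix, and the predicted species would be the `argmax` of the combined vector. Passing `None` for a feature drops it from the product, which mirrors the abstract's remark that missing features can be omitted.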