Ensemble Classifier based on Misclassified Streaming Data

J.W. Ryu, M. Kantardzic, and C. Walgampaya (USA)


Ensemble, Streaming data, Classifier, Misclassified data, Concept drift


Classifying streaming data requires labeling samples whose underlying distribution may change continuously. In this paper, we propose an ensemble classifier method that maintains its ensemble dynamically according to the distribution of the streaming data. The proposed method builds each new classifier from misclassified streaming data. Whenever a new streaming sample is classified, the method evaluates the output of each classifier in the ensemble and then combines these outputs. We experimented with the KDD Cup 1999 intrusion detection data and compared the results with three existing ensemble methods: simple voting, averaging probability, and weighted ensemble. These existing methods use the data within a predefined time interval as the training set for generating a new classifier for the ensemble. On average, the proposed method produced 6% higher accuracy than the existing methods for the same configurations. For a similar accuracy, the proposed method built about 99% fewer new classifiers for the ensemble than the existing methods, and the learning time was reduced by about 77%.
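The idea of growing an ensemble only from misclassified samples can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy nearest-centroid base learner, the names `CentroidClassifier` and `MisclassifiedEnsemble`, the parameters `buffer_size` and `max_members`, and the use of plain majority voting to combine outputs. The abstract does not specify the base learner, the trigger for building a new classifier, or the combination rule.

```python
from collections import Counter


class CentroidClassifier:
    """Toy base learner (assumed): predicts the label of the nearest class centroid."""

    def __init__(self, samples):
        # samples: list of (feature_list, label) pairs
        sums, counts = {}, Counter()
        for x, y in samples:
            if y not in sums:
                sums[y] = [0.0] * len(x)
            sums[y] = [s + v for s, v in zip(sums[y], x)]
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(self, x):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda y: squared_distance(self.centroids[y]))


class MisclassifiedEnsemble:
    """Hypothetical ensemble that adds a member only after enough errors accumulate."""

    def __init__(self, initial_samples, buffer_size=4, max_members=5):
        self.members = [CentroidClassifier(initial_samples)]
        self.buffer = []                # misclassified samples seen so far
        self.buffer_size = buffer_size  # assumed trigger for a new classifier
        self.max_members = max_members  # assumed cap; oldest members are dropped

    def predict(self, x):
        # Assumed combination rule: majority vote over all members.
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]

    def update(self, x, y):
        """Classify x, then keep it for training only if the ensemble was wrong."""
        predicted = self.predict(x)
        if predicted != y:
            self.buffer.append((x, y))
            if len(self.buffer) >= self.buffer_size:
                # Train a new member on misclassified data only.
                self.members.append(CentroidClassifier(self.buffer))
                self.members = self.members[-self.max_members:]
                self.buffer = []
        return predicted


# Usage: the initial concept is learned, then the labels swap (a simulated drift).
ensemble = MisclassifiedEnsemble([([0.0], 0), ([1.0], 0), ([10.0], 1), ([11.0], 1)])
for x, y in [([0.0], 1), ([10.0], 0)] * 6:
    ensemble.update(x, y)
```

In this sketch no classifier is built while the ensemble keeps predicting correctly, which is the intuition behind needing fewer new classifiers than interval-based retraining. The specific counts in this toy run say nothing about the figures reported in the abstract.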
