Hang Yang, Simon Fong, Andy Ip, and Sabah Mohammed
Decision Tree, Stream Mining, Biomedical Classification
Two major families of biomedical data exist in bioinformatics: case-based data, which are archived historical records, and stream-based data, which are dynamic signals usually collected from sensors or monitors. Traditional decision tree classification has proven useful in data mining over static case-based data for revealing interesting patterns. However, data mining over biomedical data streams has not been explored by previous researchers. Biomedical signal processing techniques have existed for decades, but they mainly target pattern detection rather than prediction or classification. In this paper, we shed light on the impact of data stream mining techniques on biomedical data streams. We illustrate the two different workflows of case-based and stream-based data mining for biomedical classification. To compare the two mining techniques, we programmed a simulation and ran experiments over both types of biomedical data. The results show that case-based classification achieves higher accuracy but runs more slowly because it scans the database multiple times. Stream-based classification runs fast but achieves relatively lower accuracy unless the data set grows sufficiently large. As a novel contribution, we propose a method to solve the problem of the long boosting step in stream mining.
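To make the stream-based workflow concrete, the following is a minimal sketch of single-pass, test-then-train evaluation. It is not the method proposed in the paper: the `OnePassStump` class, the `prequential_accuracy` helper, and the toy stream of discretised signal levels are all hypothetical names and data introduced here for illustration only.

```python
from collections import Counter, defaultdict


class OnePassStump:
    """Hypothetical incremental classifier: keeps class counts per feature
    value and updates them one record at a time, never rescanning old data."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # feature value -> class counts
        self.overall = Counter()            # class counts over all records

    def predict(self, x):
        seen = self.counts.get(x)  # .get avoids creating an empty entry
        if seen:
            return seen.most_common(1)[0][0]
        if self.overall:
            # Unseen feature value: fall back to the majority class so far.
            return self.overall.most_common(1)[0][0]
        return None  # nothing learned yet

    def learn(self, x, y):
        self.counts[x][y] += 1
        self.overall[y] += 1


def prequential_accuracy(stream, model):
    """Test-then-train: predict each record first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        model.learn(x, y)
        total += 1
    return correct / total if total else 0.0


# Toy synthetic stream (an assumption, not data from the paper):
# a discretised signal level paired with a class label.
stream = [("high", "abnormal"), ("low", "normal")] * 50
acc = prequential_accuracy(iter(stream), OnePassStump())
print(acc)
```

In this toy run the only errors occur on the first two records, before any counts have accumulated. That loosely mirrors the observation above that stream-based accuracy lags until enough data has arrived. A case-based learner would instead hold all records in memory and scan them repeatedly while building the tree.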