S. Draghici, K. Uygun, X. Wang, M. Chatterjee, and M.A. Tainsky (USA)
Keywords: proteomic profile, data mining, neural networks, cancer prediction, protein microarrays.
In this work, data mining techniques were applied to proteomics data collected from two groups of people: cancer patients and healthy control subjects. The ultimate goal was to construct a classifier that can predict cancer by analyzing a person's proteomic profile of tumor antigenicity. The analysis may also enhance our understanding of the relationship between each antigen's gene and the etiology of cancer. Because the number of measured features is large relative to the number of subjects, dimensionality reduction was necessary to address the curse of dimensionality. Three classifiers were constructed and tested: a pruned decision tree, a voted perceptron, and a neural network. These machine learning algorithms were assessed on a variety of performance tests, and the results were compared statistically. The results suggest that neural networks can be used effectively to predict cancer from proteomic data, with an accuracy of around 95% on validation data. This work differs from previous studies of molecular profiling of cancer in that here we classify individuals prior to the onset of cancer, when there is no apparent tumor to sample. The methods provide a novel approach to the early detection of cancer, which could greatly improve a patient's chances of survival.
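Of the three classifiers named above, the voted perceptron is the simplest to illustrate. The following is a minimal sketch of the usual formulation of the algorithm, not the authors' implementation. The function names, the two-feature toy data, and the label convention (+1 for a cancer patient, -1 for a control) are all assumptions made for illustration; the real data would contain many antigen reactivity features per subject, after dimensionality reduction.

```python
def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wj * xj for wj, xj in zip(w, x))


def sign(v):
    """Map a real value to a class label in {+1, -1}."""
    return 1 if v > 0 else -1


def train_voted_perceptron(X, y, epochs=10):
    """Train a voted perceptron.

    X is a list of feature vectors and y a list of labels in {+1, -1}.
    Returns a list of (weights, bias, votes) tuples, where votes counts
    how many training examples each weight vector survived.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    votes = 0
    perceptrons = []
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) <= 0:
                # Mistake: retire the current vector with its vote count,
                # then apply the standard perceptron update.
                perceptrons.append((w[:], b, votes))
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                votes = 1
            else:
                votes += 1
    perceptrons.append((w[:], b, votes))
    return perceptrons


def predict_voted(perceptrons, x):
    """Classify x by a vote-weighted majority over all stored vectors."""
    total = sum(c * sign(dot(w, x) + b) for w, b, c in perceptrons)
    return sign(total)


# Hypothetical toy data: two made-up features per subject.
X = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.9],
     [-2.1, -1.7], [-1.6, -2.3], [-2.4, -2.0]]
y = [1, 1, 1, -1, -1, -1]

model = train_voted_perceptron(X, y, epochs=10)
predictions = [predict_voted(model, xi) for xi in X]
```

Each intermediate weight vector votes in proportion to how long it went without a mistake, which tends to make the final decision more stable than that of a single perceptron when the data are noisy.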