A Workflow for Preprocessing and Proteomic Biomarker Identification on Mass-Spectrometry Data

Michael Handler, Johannes D. Pallua, Georg Schäfer, Michael Netzer, Melanie Osl, Michael Seger, Bernhard Pfeifer, Michael Becker, Stephan Meding, Sandra Rauser, Axel Walch, Helmut Klocker, Georg Bartsch, Christian W. Huck, Christian Baumgartner, and Günther K. Bonn


Proteomics, biomarker discovery, mass spectrometry, data preprocessing, data mining, human disease


A core technology in proteomics is mass spectrometry (MS), which permits the simultaneous measurement of thousands of proteins and peptides. Sophisticated data mining methods are necessary to identify highly predictive, disease-specific proteomic biomarker candidates in the generated mass spectra. Before analysis can start, however, the raw mass spectra must be preprocessed, mainly because they contain background signals such as electrical and chemical noise. In this work we present a new data mining workflow for the identification of proteomic biomarker candidates in mass spectrometry data. The workflow comprises two major steps: 1) preprocessing of the raw spectra, and 2) identification of highly discriminating candidate masses using a 3-step feature selection approach that combines the advantages of efficient filter and effective wrapper techniques. With the proposed workflow we were able to identify putative candidate biomarkers in a life-threatening human disease using matrix-assisted laser desorption/ionization imaging MS (MALDI-IMS).
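
The general shape of such a workflow can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the preprocessing algorithms or the three feature selection steps, so everything below is an assumption chosen for illustration. The sketch assumes moving-average smoothing, a crude constant baseline, total-ion-current normalization, a Welch t-statistic as the filter, and greedy forward selection around a nearest-centroid classifier as the wrapper. It shows only a generic filter stage followed by a wrapper stage, and all function names are hypothetical.

```python
import numpy as np


def preprocess(spectrum, window=5):
    """Smooth, baseline-correct, and normalize one raw spectrum (illustrative only)."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(spectrum, kernel, mode="same")  # moving-average smoothing
    corrected = np.clip(smoothed - smoothed.min(), 0.0, None)  # crude constant baseline
    total = corrected.sum()
    return corrected / total if total > 0 else corrected  # total-ion-current normalization


def filter_rank(X, y, k):
    """Filter stage (assumed): keep the k masses with the largest |Welch t-statistic|."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)
    return [int(i) for i in np.argsort(t)[::-1][:k]]


def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier on two classes."""
    n = len(y)
    hits = 0
    for i in range(n):
        train = np.arange(n) != i  # hold out sample i
        c0 = X[train & (y == 0)].mean(axis=0)
        c1 = X[train & (y == 1)].mean(axis=0)
        pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
        hits += int(pred == y[i])
    return hits / n


def wrapper_forward(X, y, candidates, max_features=3):
    """Wrapper stage (assumed): greedy forward selection over the filtered masses."""
    selected, best = [], 0.0
    for _ in range(max_features):
        scores = [(loo_accuracy(X[:, selected + [c]], y), c)
                  for c in candidates if c not in selected]
        if not scores:
            break
        score, c = max(scores)
        if score <= best:  # stop when no candidate improves accuracy
            break
        selected.append(c)
        best = score
    return selected, best
```

In this arrangement the inexpensive filter reduces thousands of mass channels to a short candidate list, so that the costlier wrapper, which retrains a classifier for every candidate subset, only has to search a small space. Rows of `X` are assumed to be preprocessed spectra aligned on a common mass axis, and `y` holds binary class labels (0 = control, 1 = disease).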
