S. Van Looy, J. Meeus, B. Wyns, B. Vander Cruyssen, F. De Keyser, and L. Boullart (Belgium)
Feature selection, Support vector machine, Recursive feature elimination, Real life data, Rheumatoid arthritis, TNF-α blockers
Rheumatoid arthritis (RA) is a chronic inflammatory joint disease that leads to irreversible joint destruction. To prevent this, new biological therapies, such as Infliximab, have been developed. The present analysis is based on an expanded access program in which 511 RA patients with chronic refractory disease were treated with Infliximab. They received a standard dose of 3 mg/kg at weeks 0, 6, 14 and 22. At week 22, the treating rheumatologist evaluated the progress of every patient and decided whether the current dose should be increased. To predict this decision, 76 features were measured. Four of these features have been found to predict the rheumatologist's decision well, a conclusion that was based on specific clinical knowledge. This paper shows that a well-performing feature subset can also be selected without any domain knowledge, using a method based on recursive feature elimination (RFE). Three feature subsets are proposed, and classification performance using these sets is compared to the performance using the 4 original features. The impact of missing data, a very common phenomenon in real-life data, on the RFE method is also evaluated. Missing data have little influence on the RFE method in this application; however, a tendency towards selecting features with a low proportion of missing values was observed.
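The abstract does not give the details of the RFE variant used, so the following is only a minimal sketch of the general idea, under stated assumptions: a linear support vector machine is trained, the feature with the smallest squared weight is dropped, and the procedure repeats until the requested number of features remains. The squared-weight ranking criterion, the subgradient-descent trainer, all function names and hyperparameters, and the synthetic data are assumptions made for illustration; none of them come from the paper or its patient data.

```python
import random

def train_linear_svm(X, y, epochs=100, lam=0.01, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.
    Illustrative trainer only; hyperparameters are arbitrary assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:                # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain on the surviving features and
    drop the one with the smallest squared weight until n_keep remain."""
    active = list(range(len(X[0])))       # original column indices still in play
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active

# Synthetic stand-in data (hypothetical): 6 features, of which only
# columns 0 and 3 depend on the class label; the rest are pure noise.
rng = random.Random(0)
y = [rng.choice([-1, 1]) for _ in range(200)]
X = [[(2.0 * yi if j in (0, 3) else 0.0) + rng.gauss(0.0, 1.0) for j in range(6)]
     for yi in y]

selected = rfe(X, y, n_keep=2)
print(sorted(selected))
```

On this synthetic data, where only columns 0 and 3 carry signal, the loop is expected to retain those two columns. Real clinical data with missing values would additionally need an imputation or exclusion step before training, which this sketch omits.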