Two-stage hybrid feature selection algorithms for diagnosing erythemato-squamous diseases
1 School of computer science, Shaanxi Normal University, Xi’an 710062, China
2 School of Information Engineering, Shenzhen University, Shenzhen 518060, China
3 CAS Research Centre of Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing 100080, China
4 School of Information systems, Computing and Mathematics, Brunel University, London UB8 3PH, UK
Health Information Science and Systems 2013, 1:10 doi:10.1186/2047-2501-1-10Published: 30 May 2013
This paper proposes two-stage hybrid feature selection algorithms to build the stable and efficient diagnostic models where a new accuracy measure is introduced to assess the models. The two-stage hybrid algorithms adopt Support Vector Machines (SVM) as a classification tool, and the extended Sequential Forward Search (SFS), Sequential Forward Floating Search (SFFS), and Sequential Backward Floating Search (SBFS), respectively, as search strategies, and the generalized F-score (GF) to evaluate the importance of each feature. The new accuracy measure is used as the criterion to evaluated the performance of a temporary SVM to direct the feature selection algorithms. These hybrid methods combine the advantages of filters and wrappers to select the optimal feature subset from the original feature set to build the stable and efficient classifiers. To get the stable, statistical and optimal classifiers, we conduct 10-fold cross validation experiments in the first stage; then we merge the 10 selected feature subsets of the 10-cross validation experiments, respectively, as the new full feature set to do feature selection in the second stage for each algorithm. We repeat the each hybrid feature selection algorithm in the second stage on the one fold that has got the best result in the first stage. Experimental results show that our proposed two-stage hybrid feature selection algorithms can construct efficient diagnostic models which have got better accuracy than that built by the corresponding hybrid feature selection algorithms without the second stage feature selection procedures. Furthermore our methods have got better classification accuracy when compared with the available algorithms for diagnosing erythemato-squamous diseases.