The Impact Of Feature Selection On Diabetes Prediction


  • Mohammed Shantal Computer Science Department, College of Technology Science, Sebha, Libya
  • Almahdi Alshareef Computer Science Department, Sebha University, Sebha, Libya
  • Omar Ahmid Information Development Center, Sebha University, Sebha, Libya


Keywords: feature selection, diabetes prediction, filter selection, wrapper selection, f_classif, chi2, RFE


Diabetes is a chronic metabolic condition characterized by abnormal blood glucose levels caused by either ineffective insulin utilization or inadequate insulin production. It has prompted numerous academic efforts to build dependable prediction models using machine learning (ML) algorithms. Removing redundant features from large datasets is important for improving the efficiency of data-driven predictive models. This work studies the impact of feature selection (FS) methods on classifier training and accuracy. Three FS methods, f_classif, chi2, and recursive feature elimination (RFE), were compared with one another and with the full dataset. Nine classifiers (Naïve Bayes, k-NN, Decision Tree, Random Forest, Support Vector Machine, Gradient Boosting, Multilayer Perceptron, AdaBoost, and ExtraTrees) were used to evaluate the accuracy obtained with each FS method. RFE gave the best accuracy among the FS strategies: 5 of the 9 classifiers obtained their best results with RFE.
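The comparison described in the abstract can be sketched with scikit-learn, where f_classif, chi2, and RFE are available in `sklearn.feature_selection`. This is only an illustrative sketch, not the authors' pipeline: the synthetic data, the number of retained features `K`, the logistic regression estimator inside RFE, the min-max scaling, the 5-fold cross-validation, and the single Naïve Bayes classifier are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, chi2, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): 8 features, binary target.
# The dataset used in the paper is not reproduced here.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, n_redundant=2, random_state=0
)

K = 5  # number of features to keep (assumption, not taken from the paper)

# "full" is the baseline without feature selection.
selectors = {
    "full": None,
    "f_classif": SelectKBest(f_classif, k=K),   # filter: ANOVA F-test
    "chi2": SelectKBest(chi2, k=K),             # filter: chi-squared test
    "RFE": RFE(LogisticRegression(max_iter=1000), n_features_to_select=K),  # wrapper
}


def evaluate(selector, classifier):
    """Return mean 5-fold cross-validated accuracy for one selector/classifier pair."""
    # chi2 requires non-negative inputs, so every pipeline scales to [0, 1] first.
    steps = [MinMaxScaler()]
    if selector is not None:
        steps.append(selector)
    steps.append(classifier)
    pipeline = make_pipeline(*steps)
    return cross_val_score(pipeline, X, y, cv=5, scoring="accuracy").mean()


results = {name: evaluate(sel, GaussianNB()) for name, sel in selectors.items()}
for name, acc in results.items():
    print(f"{name:10s} accuracy = {acc:.3f}")
```

Placing the selector inside the pipeline means it is refitted on each training fold, which avoids leaking information from the held-out fold into the feature choice. To mirror the nine-classifier comparison, the same `evaluate` function could be called in a loop over other scikit-learn classifiers.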




How to Cite

Mohammed Shantal, Almahdi Alshareef, & Omar Ahmid. (2024). The Impact Of Feature Selection On Diabetes Prediction. African Journal of Advanced Pure and Applied Sciences (AJAPAS), 373–377. Retrieved from