The Impact Of Feature Selection On Diabetes Prediction
Keywords:
Feature selection, diabetes prediction, filter selection, wrapper selection, f_classif, chi2, RFE
Abstract
Diabetes is a chronic metabolic condition characterized by abnormal blood glucose levels caused by either ineffective insulin utilization or inadequate insulin production. It has prompted numerous academic efforts to build dependable prediction models using machine learning (ML) algorithms. Removing redundant features from massive datasets is of paramount importance in improving the efficiency of data-driven predictive models. This work studies the impact of feature selection (FS) methods on classifier training and accuracy. Three FS methods, f_classif, chi2, and RFE, were compared against a baseline that uses the full dataset. Nine classifiers (Naïve Bayes, k-NN, Decision Tree, Random Forest, Support Vector Machine, Gradient Boosting, Multilayer Perceptron, AdaBoost, and ExtraTrees) were employed to evaluate the accuracy of each FS method. In the results, RFE achieved the best accuracy among the FS strategies: 5 of the 9 classifiers obtained their best results with RFE.
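The comparison described in the abstract can be sketched with scikit-learn, where f_classif and chi2 are filter scoring functions used through SelectKBest and RFE is a wrapper around a base estimator. This is a minimal sketch under stated assumptions, not the study's actual pipeline: the data are synthetic, the number of retained features `K`, the logistic-regression estimator inside RFE, the min-max scaling step, the 5-fold cross-validation, and the two classifiers shown are all illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, chi2, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): not the diabetes dataset from the study.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, n_redundant=2, random_state=0
)

K = 4  # number of features to keep (hypothetical choice)

# "full" is the baseline with no feature selection.
selectors = {
    "full": None,
    "f_classif": SelectKBest(f_classif, k=K),
    "chi2": SelectKBest(chi2, k=K),
    # The RFE base estimator is an assumption; the abstract does not name one.
    "RFE": RFE(LogisticRegression(max_iter=1000), n_features_to_select=K),
}

# Two of the nine classifier families, to keep the sketch short.
classifiers = {
    "k-NN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for fs_name, selector in selectors.items():
    for clf_name, clf in classifiers.items():
        # chi2 requires non-negative inputs, so features are scaled to [0, 1].
        steps = [MinMaxScaler()]
        if selector is not None:
            steps.append(selector)
        steps.append(clf)
        pipe = make_pipeline(*steps)
        # Selection is fitted inside each fold, so test folds do not leak into it.
        scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
        results[(fs_name, clf_name)] = scores.mean()

for (fs_name, clf_name), acc in sorted(results.items()):
    print(f"{fs_name:10s} {clf_name:14s} accuracy={acc:.3f}")
```

Placing the selector inside the pipeline means it is refitted on each training fold, which avoids the optimistic bias that can arise when features are selected on the full dataset before cross-validation.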