A MACHINE LEARNING BASED ENSEMBLE MODELING APPROACH FOR HEART DISEASE PREDICTION
Keywords:
Heart disease prediction, Random Forest, ensemble learning, feature importance, ROC, clinical data
Abstract
Heart disease is a serious global health problem, and early detection and treatment can help prevent severe complications. This study investigates how ensemble machine learning techniques can be used to predict heart disease from clinical data. Four ensemble models were developed: Random Forest, Gradient Boosting, AdaBoost, and a Voting Classifier. All four were trained on 200 patient records, each described by 12 clinical features. Preprocessing applied a StandardScaler within a ColumnTransformer pipeline. The Random Forest classifier performed best, with an accuracy of 0.99 and a ROC-AUC of 0.9995. Feature importance analysis identified slope, resting blood pressure, chest pain type, and number of major vessels as the most important predictors. The findings highlight the efficacy of ensemble methods for clinical risk prediction and point to validation on larger multisite datasets as a next step.
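The workflow summarized above can be illustrated with a minimal scikit-learn sketch. It is not the study's code: the data are synthetic (generated with `make_classification` to match only the stated 200 records and 12 features), and the hyperparameters, the 75/25 split, the soft-voting choice, and the names `make_pipeline_for`, `results`, and `importances` are all assumptions made for illustration. The scores it produces therefore say nothing about the reported results.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

N_FEATURES = 12

# Synthetic stand-in for the clinical data (assumption): 200 rows, 12 features.
X, y = make_classification(
    n_samples=200, n_features=N_FEATURES, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def make_pipeline_for(model):
    """Wrap a classifier in a pipeline that standardizes every column first."""
    scaler = ColumnTransformer(
        [("scale", StandardScaler(), list(range(N_FEATURES)))]
    )
    return Pipeline([("preprocess", scaler), ("model", model)])


# Hyperparameters below are illustrative defaults, not the study's settings.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
    "voting": VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
            ("gb", GradientBoostingClassifier(random_state=0)),
            ("ada", AdaBoostClassifier(random_state=0)),
        ],
        voting="soft",  # average predicted probabilities (assumed choice)
    ),
}

results = {}
fitted = {}
for name, model in models.items():
    pipe = make_pipeline_for(model).fit(X_train, y_train)
    proba = pipe.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pipe.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, proba),
    }
    fitted[name] = pipe

# Impurity-based feature importances from the fitted Random Forest.
importances = fitted["random_forest"].named_steps["model"].feature_importances_

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, roc_auc={scores['roc_auc']:.3f}")
```

Fitting the scaler inside the pipeline, rather than on the full dataset beforehand, keeps test-set statistics out of training. With only 200 records, a single held-out split gives a noisy estimate, so cross-validation would be a natural extension of this sketch.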