Comparative Performance Analysis of Ensemble Models for Breast Cancer Classification
Machine Learning for Breast Cancer Classification
DOI:
https://doi.org/10.70774/ijist.v2i2.27
Keywords:
Breast Cancer Classification, Machine Learning (ML), Ensemble Learning, Data Mining in Healthcare, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression, Random Forest
Abstract
Breast cancer remains a major global health issue, accounting for approximately 2.3 million new cases and 670,000 deaths worldwide in 2022. Early detection and accurate diagnosis are crucial to improving patient outcomes, because delayed identification can lead to severe complications. Advances in machine learning (ML) have improved cancer diagnosis, with a range of algorithms raising predictive accuracy. This study proposes a novel ensemble model for breast cancer classification that uses 31 features from the University of Wisconsin Breast Cancer dataset. We applied six algorithms (K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression, Random Forest, Gradient Boosting, and XGBoost) and combined them through Hard Voting, an ensemble technique in which each classifier casts one vote per sample and the majority class becomes the prediction. The model was evaluated on standard classification metrics and achieved improvements in accuracy, precision, recall, and F1 score. The results indicate that the proposed ensemble outperforms the individual classifiers and the other ensembles tested, showing potential as a reliable tool for early breast cancer detection.
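As an illustration of the hard voting rule and the four evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the three toy classifiers, their predictions, the labels "M" (malignant) and "B" (benign), the helper names hard_vote and binary_metrics, and the tie-breaking rule (smallest label wins) are all assumptions made for the example. With an even number of voters, such as the six classifiers in the study, ties can occur, so some tie-breaking rule is needed.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote across classifiers.

    predictions: list of label lists, one list per classifier, equal lengths.
    Ties go to the smallest label (an assumption made for this sketch).
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(clf_preds[i] for clf_preds in predictions)
        top = max(votes.values())
        # Among the labels with the highest vote count, pick the smallest.
        voted.append(min(label for label, n in votes.items() if n == top))
    return voted


def binary_metrics(y_true, y_pred, positive="M"):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and predictions from three toy classifiers.
y_true = ["M", "B", "M", "B", "M", "B"]
clf_a = ["M", "B", "M", "B", "B", "M"]
clf_b = ["M", "B", "B", "B", "M", "M"]
clf_c = ["B", "B", "M", "B", "M", "B"]

ensemble = hard_vote([clf_a, clf_b, clf_c])
metrics = binary_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

In this toy data each classifier makes at least one error, and the vote corrects those errors wherever the other two classifiers agree on the right label. It still fails on the last sample, where two of the three classifiers are wrong together.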