An approach to enhance the performance of the Xgboost Classifier

dc.contributor.author Muhwezi, Raymond Mugisha
dc.date.accessioned 2025-12-29T15:03:25Z
dc.date.available 2025-12-29T15:03:25Z
dc.date.issued 2025
dc.description A dissertation submitted to the Directorate of Graduate Training in partial fulfillment of the requirements for the award of the Degree of Master of Statistics of Makerere University
dc.description.abstract XGBoost is a dominant machine learning model for prediction and classification tasks. It is an ensemble algorithm that often outperforms other machine learning models owing to its predictive performance, efficiency, and regularization techniques that help prevent overfitting. However, its heavy reliance on hyperparameter tuning is computationally costly, because traditional methods such as grid search and random search are resource-intensive. Furthermore, the raw features used for classification may contain complex, non-linear relationships that are not explicitly captured by XGBoost's base learners. This study proposed an improved alternative that combines k-means clustering with Bayesian-optimized XGBoost. To validate the approach, the study used the red wine dataset from the UCI data repository. We first derived objective quality clusters from physicochemical attributes (such as acidity, sugar, and alcohol content) using k-means. Two hyperparameter tuning approaches were then compared: (1) traditional hyperparameters and (2) Bayesian optimization. The cluster-based model with Bayesian optimization achieved an accuracy of 97.9%, an F1-score of 97.4%, and a recall of 98.05%, whereas the baseline model achieved an accuracy of 93.1%, an F1-score of 96.18%, and a recall of 97.2%. These results demonstrate that integrating k-means clustering with Bayesian optimization significantly improves the classification accuracy of the XGBoost classifier compared with traditional hyperparameters. Consequently, we recommend deploying this validated model in real-world applications, such as automated wine quality grading, as well as in other industrial domains that require scalable and accurate classification solutions.
dc.identifier.citation Muhwezi, R. M. (2025). An approach to enhance the performance of the Xgboost Classifier; Unpublished Masters dissertation, Makerere University, Kampala
dc.identifier.uri https://makir.mak.ac.ug/handle/10570/16037
dc.language.iso en
dc.publisher Makerere University
dc.title An approach to enhance the performance of the Xgboost Classifier
dc.type Other
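The abstract describes deriving quality clusters from physicochemical attributes with k-means before any classifier is trained. The following is a minimal sketch of that clustering step only, written as plain Lloyd's algorithm in standard-library Python. The function name `kmeans`, its parameters, and the six sample rows are hypothetical illustrations and are not taken from the dissertation or the UCI dataset; the XGBoost classifier and the Bayesian hyperparameter search are not covered here.

```python
import math
import random


def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm (hypothetical helper): returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise the centroids from k distinct rows of the data.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [-1] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest centroid (Euclidean).
        new_labels = [
            min(range(k), key=lambda j: math.dist(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its member rows.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up rows of (acidity, sugar, alcohol); NOT values from the wine dataset.
samples = [
    (7.4, 1.9, 9.4), (7.8, 2.6, 9.8), (7.6, 2.1, 9.5),
    (5.2, 6.0, 12.8), (5.0, 6.4, 13.0), (5.4, 5.8, 12.5),
]
labels, centroids = kmeans(samples, k=2)
print(labels)
```

The resulting cluster labels can then serve as the target classes for a downstream classifier. With real data, the attributes would normally be standardised first, since k-means distances are sensitive to feature scale.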
Files
Name: MUHWEZI-COBAMS-Masters-2025.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format
Description: Masters dissertation