An interpretable machine learning approach to predict loss to follow-up among people living with HIV

Date
2025
Authors
Ssevvume, Solomon
Publisher
Makerere University
Abstract
The advancement of AI and machine learning has led to wide adoption across fields and sectors such as education, engineering, and health. Multiple countries have adopted machine learning techniques and algorithms to improve patient care and health service delivery. Uganda has piloted machine learning models, especially in health, to detect conditions such as cervical cancer and to screen for malaria, among others. Loss to follow-up is one of the major challenges affecting HIV service provision, especially in low- and middle-income countries such as Uganda. HIV clients in Uganda are lost to follow-up for a number of reasons: clients are highly mobile and keep moving from one location to another, some treatment centers are located far from clients who may lack transport to and from facilities, and clients face adverse psychosocial issues without the necessary support. Unfortunately, these reasons are usually known only after the client has been lost to follow-up, which makes the current approach reactive. An algorithm that predicts a client's likelihood of dropping out would help improve patient care and treatment outcomes, because the client can be followed up before they are lost. This research implements an interpretable predictive algorithm that predicts the patient outcome and explains why that outcome was predicted. This work differs from existing implementations by explaining the predictions of models traditionally treated as black boxes. Longitudinal client-level data, including socio-demographic information and patient medical history, was collected and used to find patterns that inform the predicted outcome. The collected data was augmented using three major data augmentation techniques to address the class imbalance typical of medical data.
The first technique was random under-sampling, which randomly removes instances of the majority class. The second was random over-sampling, where additional samples, randomly drawn from the existing ones, were added to the minority class to balance the dataset. The third was the Synthetic Minority Over-sampling Technique (SMOTE), which balances the dataset by adding synthetic minority-class samples. The models were trained on the original dataset and on each of the three resampled datasets. Interpretability was then added using LIME, and the results are presented. The research shows that an XGBoost model trained on the over-sampled dataset produced the best results for classifying clients who are lost to follow-up. This research provides an interpretable machine learning model that predicts which clients are likely to drop out of care and explains why. It provides a new standard for the use and adoption of artificial intelligence by justifying each predicted outcome.
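As an illustration only, the three resampling strategies named in the abstract can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, the two toy features, and every data value below are hypothetical, and `smote_like` is a simplified stand-in for SMOTE that interpolates between a minority sample and one of its `k` nearest same-class neighbours.

```python
import random
from collections import Counter


def split_by_class(X, y):
    """Group feature rows by their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def _unzip(pairs, rng):
    """Shuffle (row, label) pairs and return them as two parallel lists."""
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [l for _, l in pairs]


def random_under_sample(X, y, rng):
    """Randomly drop rows so every class matches the smallest class."""
    groups = split_by_class(X, y)
    n = min(len(rows) for rows in groups.values())
    pairs = [(r, lab) for lab, rows in groups.items() for r in rng.sample(rows, n)]
    return _unzip(pairs, rng)


def random_over_sample(X, y, rng):
    """Randomly duplicate rows so every class matches the largest class."""
    groups = split_by_class(X, y)
    n = max(len(rows) for rows in groups.values())
    pairs = []
    for lab, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(n - len(rows))]
        pairs.extend((r, lab) for r in rows + extra)
    return _unzip(pairs, rng)


def smote_like(X, y, rng, k=3):
    """Simplified SMOTE: add interpolated rows until classes are balanced.

    Assumes numeric features and at least two rows in every class.
    """
    groups = split_by_class(X, y)
    n = max(len(rows) for rows in groups.values())
    pairs = []
    for lab, rows in groups.items():
        synthetic = []
        while len(rows) + len(synthetic) < n:
            base = rng.choice(rows)
            # k nearest same-class neighbours by squared Euclidean distance
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
        pairs.extend((r, lab) for r in rows + synthetic)
    return _unzip(pairs, rng)


# Made-up toy data (NOT from the study): [age, km_to_clinic];
# label 1 = lost to follow-up (minority), 0 = retained (majority).
X = [[34, 2.0], [41, 5.5], [29, 1.0], [52, 3.2], [38, 4.1], [45, 2.7],
     [33, 6.0], [47, 1.5], [22, 30.0], [25, 42.0], [31, 38.0]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

rng = random.Random(0)
X_under, y_under = random_under_sample(X, y, rng)
X_over, y_over = random_over_sample(X, y, rng)
X_smote, y_smote = smote_like(X, y, rng)

print(Counter(y), Counter(y_under), Counter(y_over), Counter(y_smote))
```

In this sketch, under-sampling shrinks the majority class to the size of the minority class, while the two over-sampling variants grow the minority class to the size of the majority class. A classifier would then be trained separately on the original data and on each resampled copy, which mirrors the comparison the abstract describes.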
Description
A dissertation submitted to the School of Computing and Informatics Technology in partial fulfilment of the requirements for the award of the Degree of Master of Science in Computer Science of Makerere University.
Citation
Ssevvume, S. (2025). An interpretable machine learning approach to predict loss to follow-up among people living with HIV (Unpublished master’s dissertation). Makerere University, Kampala, Uganda.