Show simple item record

dc.contributor.author: Nansereko, Brendah
dc.date.accessioned: 2022-12-19T09:57:27Z
dc.date.available: 2022-12-19T09:57:27Z
dc.date.issued: 2022
dc.identifier.citation: Nansereko, B. (2022). Predicting switch to second-line antiretroviral therapy regimen: A comparison of the traditional linear classification methods and advance nonlinear machine learning algorithms. (Unpublished master's dissertation). Makerere University, Kampala, Uganda. [en_US]
dc.identifier.uri: http://hdl.handle.net/10570/11152
dc.description: A dissertation submitted to the School of Public Health in partial fulfillment of the requirements for the Degree of Master of Biostatistics, Makerere University, Kampala. [en_US]
dc.description.abstract: Introduction: Due to changes in data patterns, self-learning approaches, commonly known as Machine Learning (ML), have been adopted in research. ML has been used previously to predict health outcomes such as early virological among others. These deterministic methodologies use a wide array of features to identify hidden patterns in the data to predict health outcomes. These methodologies incorporate chance and variation arising from fluctuations in the environment, including factors not explicitly included in the model. On the other hand, the classical linear methods have been associated with several limitations, such as the assumption of linearity and the failure to fully incorporate heterogeneity of effects, and they are limited by the growing dimensionality of data since they cannot accommodate a large number of predictors. This study aimed to compare linear and advanced nonlinear classification algorithms for predicting a switch to a second-line Antiretroviral Therapy (ART) regimen. Objectives: The objective of this study was to compare logistic regression, a parametric linear method, with non-parametric advanced ML algorithms, namely random forests (RF) and K nearest neighbor (KNN), in correctly classifying patients switching to second-line ART regimens. Methods: This study used secondary data on HIV patients from 15 HIV clinics under RHSP. We used R, STATA, and Python for data management and analysis. Logistic regression, random forest, and K nearest neighbor models were fitted. The models were compared by assessing their discriminative ability. They were also evaluated on average performance metrics, which included Area under Curve (AUC), sensitivity, F1 score, and overall accuracy. Results: The majority of the patients (62.4%) were female, and most of the patients (52.1%) were aged 20-34 years at enrollment.
Out of the 7818 patients, 5% had switched to a second-line ART regimen. Results from the comparison of the fitted models indicated that all the models performed better with balanced data than with imbalanced data. The AUC for the balanced data logistic classifier, 68.8% (95% CI 68.0 – 69.2), was significantly higher than that of the RF, 56.9% (95% CI 53.4 – 58.6), and the KNN balanced data models, 65.1% (95% CI 64.3 – 65.6). There was no statistically significant difference in the F1 measure across the three models. However, the balanced data logistic classifier had the highest AUC and recall score of all the models. Conclusion: This study indicated that parametric linear classifiers such as logistic regression are good predictors of the switch to a second-line ART regimen when appropriate resampling strategies, such as the Synthetic Minority Oversampling Technique (SMOTE), are applied to balance the data across classes when the data is imbalanced. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Makerere University [en_US]
dc.subject: Antiretroviral Therapy [en_US]
dc.subject: Regimen [en_US]
dc.subject: Traditional Linear [en_US]
dc.subject: classification Methods [en_US]
dc.subject: nonlinear [en_US]
dc.subject: Machine Learning [en_US]
dc.subject: Algorithms [en_US]
dc.subject: HIV/AIDS [en_US]
dc.title: Predicting switch to second-line antiretroviral therapy regimen: A comparison of the traditional linear classification methods and advance nonlinear machine learning algorithms [en_US]
dc.type: Thesis [en_US]
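
The abstract evaluates classifiers on AUC, sensitivity, F1 score, and overall accuracy for an outcome that occurs in about 5% of patients. The following is a minimal sketch, in plain Python, of how those four metrics can be computed and why accuracy alone can mislead on imbalanced data. The labels, scores, 0.5 threshold, and function names are hypothetical illustrations and are not taken from the dissertation; only the roughly 5% positive rate mirrors the record above.

```python
# Illustrative sketch only: toy labels and scores, NOT data from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def sensitivity(y_true, y_pred):
    """Recall: share of true positives that the classifier detects."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical cohort of 20 patients, 1 of whom switched (5% positive).
y_true = [1] + [0] * 19

# A classifier that never predicts a switch.
always_negative = [0] * 20

# Hypothetical risk scores from some fitted model, thresholded at 0.5.
scores = [0.6, 0.7, 0.4] + [0.1] * 17
thresholded = [1 if s >= 0.5 else 0 for s in scores]

for name, pred in [("always negative", always_negative), ("thresholded", thresholded)]:
    print(name, accuracy(y_true, pred), sensitivity(y_true, pred), f1_score(y_true, pred))
print("AUC of scores:", auc(y_true, scores))
```

In this toy cohort both predictors reach the same overall accuracy, yet the one that never predicts a switch has zero sensitivity and zero F1. That is one reason threshold-free measures such as AUC, and class-balancing strategies such as the SMOTE resampling mentioned in the abstract, are commonly used with imbalanced outcomes.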

