Developing a prediction model to detect the likelihood of early-stage breast cancer using machine learning techniques at the Uganda Cancer Institute

Kinene, Andrew

dc.contributor.author	Kinene, Andrew
dc.date.accessioned	2021-09-23T08:54:16Z
dc.date.available	2021-09-23T08:54:16Z
dc.date.issued	2021-06-02
dc.identifier.citation	Kinene, A. (2021). Developing a prediction model to detect the likelihood of early-stage breast cancer using machine learning techniques at the Uganda Cancer Institute (Unpublished master’s dissertation). Makerere University, Kampala, Uganda.	en_US
dc.identifier.uri	http://hdl.handle.net/10570/8916
dc.description	A dissertation submitted to MAKSPH in partial fulfillment of the requirements for the Master’s Degree in Health Informatics of Makerere University.	en_US
dc.description.abstract	Background: Breast cancer is the most common malignancy affecting women worldwide, with over one million cases occurring annually. It is the second most common cause of cancer-related death in the world. In Uganda, it is the second most common cancer among women after cancer of the cervix. Most women present after developing incurable, metastatic tumors. Although early diagnosis increases the chances of survival, screening facilities are limited. With advances in computing technology, Machine Learning (ML) has been extended to the biomedical field to predict various health outcomes. Therefore, this study aimed to develop a web-based application to predict a woman's likelihood of developing breast cancer using machine learning. Methods: This was a retrospective study that involved retrieval and review of 1897 patients' files with 22 variables at the Uganda Cancer Institute. The six-stage Cross Industry Standard Process for Data Mining (CRISP-DM) methodology was adopted and applied to twenty-five different classification algorithms using the Weka tool. The classifier categories used were Bayes, Meta, Functions, Lazy, Trees, and Rules. Models were quantified and compared based on performance. The best-performing model was integrated into a web application to make predictions on breast cancer. Results: The experimental results showed that random forest and Logistic Model Tree (LMT) had comparable results. However, when the models were further evaluated on accuracy, F-score and ROC curve metrics using 10-fold cross-validation (CV), random forest outperformed the other models, scoring 99.68%, 0.997 and 1.0 on the respective metrics, while LMT scored 99.47%, 0.995 and 0.997 on the same metrics. The Trees category performed better than the other classifier categories, since both the random forest and LMT algorithms belong to it. The random forest algorithm was integrated into a web application to enhance screening of women at risk of developing breast cancer. Conclusions: ML techniques are essential in the medical field because they enhance early identification of high-risk individuals based on known clinical risk factors. Therefore, the random forest model can be integrated into health care to help health workers during breast cancer patient management and when assigning a therapy.	en_US
dc.language.iso	en	en_US
dc.publisher	Makerere University	en_US
dc.subject	Prediction model	en_US
dc.subject	Breast cancer	en_US
dc.subject	Machine learning techniques	en_US
dc.title	Developing a prediction model to detect the likelihood of early-stage breast cancer using machine learning techniques at the Uganda Cancer Institute	en_US
dc.type	Thesis	en_US
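
The abstract reports that models were compared on accuracy and F-score under 10-fold cross-validation. The study itself used the Weka tool, so the following is not the author's code; it is only a minimal Python sketch of what those two ideas mean. The function names `k_fold_indices` and `accuracy_and_f1` are hypothetical, the fold assignment is a simple unshuffled interleave, and the F-score is computed for a single positive class labeled 1, which may differ from the averaged F-measure a given tool reports.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, test) index pairs.

    Assumption: records are assigned to folds by interleaving
    (record i goes to fold i % k), with no shuffling or stratification.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training indices are every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F-score) for binary labels, treating 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Illustrative labels only, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)

# Ten folds over a dataset the size reported in the abstract (1897 records).
splits = k_fold_indices(1897, k=10)
print(len(splits), len(splits[0][1]))
```

In a full evaluation, a model would be trained on each `train` index set and scored on the matching `test` set, and the per-fold metrics would then be averaged.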

Files in this item

Name: Kinene-chs-mphi.pdf
Size: 1.943Mb
Format: PDF
Description: Master's Dissertation

This item appears in the following Collection(s)

School of Public Health (Public-Health) Collections