Classification techniques for clinical diagnosis of lung cancer: A Case of Uganda Cancer Institute- Mulago

dc.contributor.author Kisawuzi, Ian
dc.date.accessioned 2021-05-18T14:33:31Z
dc.date.available 2021-05-18T14:33:31Z
dc.date.issued 2019-07
dc.description A dissertation submitted to the directorate of graduate research training in partial fulfilment of the requirements for the award of a Degree of Master of Statistics of Makerere University en_US
dc.description.abstract This study evaluated classification techniques for predicting lung cancer from signs, symptoms and risk factors. It used 354 patient records from the Uganda Cancer Institute. Entropy and information gain were used to identify high-risk factors for lung cancer (such as persistent cough, dry cough, difficulty in breathing and family history of cancer). To detect lung cancer, three data mining techniques, Decision Trees (DT), Naïve Bayes (NB) and Classification Rules (CR), were applied using the WEKA data mining tool. The reliability of each technique for clinical diagnosis of lung cancer was assessed from its confusion matrix and its number of correctly classified instances. The high-risk factors identified, with their respective information gains, were chest pain (0.4), persistent cough (0.3), pleural effusion (0.3), diabetes (0.3), allergy (0.2), difficulty in breathing (0.2), family history of cancer (0.2) and night sweats (0.2). All three techniques performed well, though Naïve Bayes performed best, with 97% of instances correctly classified compared with 96% for Decision Trees and 95% for Classification Rules. Naïve Bayes also had the best accuracy rate, 0.97, compared with 0.96 for Decision Trees and 0.95 for Classification Rules. The study therefore recommends the Naïve Bayes data mining technique for clinical diagnosis of lung cancer. In future, this technique could be automated into a computerized system for clinical diagnosis of lung cancer. Further research could examine how to merge the three data mining techniques into a single robust algorithm for comparison purposes. en_US
dc.identifier.citation Kisawuzi, I. (2019). Classification techniques for clinical diagnosis of lung cancer, a case of Uganda Cancer Institute – Mulago. Unpublished master’s thesis, Makerere University. en_US
dc.identifier.uri http://hdl.handle.net/10570/8642
dc.language.iso en en_US
dc.publisher Makerere University en_US
dc.subject Clinical diagnosis en_US
dc.subject Lung cancer en_US
dc.subject Mulago en_US
dc.subject Uganda Cancer Institute en_US
dc.title Classification techniques for clinical diagnosis of lung cancer: A Case of Uganda Cancer Institute- Mulago en_US
dc.type Thesis en_US
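The abstract ranks risk factors by information gain and compares classifiers by the share of correctly classified instances read off a confusion matrix. The following is a minimal Python sketch of those two calculations, not the thesis's WEKA workflow. The function names, the toy `chest_pain` and `diagnosis` lists, and the `toy_confusion` matrix are all hypothetical and are not drawn from the Uganda Cancer Institute data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def accuracy(confusion):
    """Share of correctly classified instances: diagonal sum over grand total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Hypothetical toy records, invented for illustration only.
chest_pain = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
diagnosis = ["cancer", "cancer", "none", "none",
             "cancer", "none", "none", "cancer"]

# Hypothetical 2x2 confusion matrix: rows are actual classes, columns predicted.
toy_confusion = [[45, 5],
                 [3, 47]]

if __name__ == "__main__":
    print("information gain:", round(information_gain(chest_pain, diagnosis), 4))
    print("accuracy:", accuracy(toy_confusion))
```

A feature with higher information gain separates the diagnosis classes more cleanly, which is the sense in which the abstract lists chest pain (0.4) ahead of the other factors.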