dc.contributor.author: Bukombi, Brian
dc.date.accessioned: 2021-04-14T06:51:47Z
dc.date.available: 2021-04-14T06:51:47Z
dc.date.issued: 2021-02
dc.identifier.citation: Bukombi, B. (2021). Malware detection using static analysis with PCA, mRMR and machine learning (Unpublished master’s dissertation). Makerere University, Kampala, Uganda. [en_US]
dc.identifier.uri: http://hdl.handle.net/10570/8331
dc.description: A dissertation submitted to the Directorate of Research and Graduate Training in partial fulfillment of the requirements for the award of the Degree of Master of Science in Computer Science of Makerere University. [en_US]
dc.description.abstract: Malicious software (malware) is software that harbors malicious intent and is harmful to computer systems. The volume of malware being developed is increasing rapidly, and despite the use of anti-malware software, the timely detection of malware remains a challenge, with disastrous consequences that may result in losses valued in millions of dollars. Most anti-malware software today uses signature-based detection techniques to protect legitimate users from malware attacks. Signatures are byte sequences that uniquely identify malicious software. However, this method fails to detect new types of malware, and new variants of existing malware, for which no signatures exist in the signature databases. To address the shortcomings of signature-based detection, researchers have proposed statistical-based detection, which utilizes statistical properties of program features, and dynamic-based detection, which monitors the behavior of programs during execution. These techniques are used in conjunction with machine learning models that are trained on the selected features. Selecting individually good features does not necessarily translate into optimal classification results. There is therefore a need to select optimal sets of features for building the machine learning models used to detect unknown malware. In this research, we evaluate two dimensionality reduction algorithms, Principal Component Analysis (PCA) and Maximum Relevance and Minimum Redundancy (mRMR), for selecting optimal feature sets with which to build machine learning models for the detection of unknown malware. We evaluate different sets of features to determine the most parsimonious model with the lowest classification error. We show that the highest area under the receiver operating characteristic (ROC) curve was 91%, achieved with the Decision Tree classifier using 20 features selected by mRMR. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Makerere University [en_US]
dc.subject: Malware [en_US]
dc.subject: Malicious software [en_US]
dc.title: Malware detection using static analysis with PCA, mRMR and machine learning [en_US]
dc.type: Thesis [en_US]
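
Illustrative note: the abstract's point that individually good features need not form a good set is the idea behind the mRMR selection named in the title. The record does not give the dissertation's implementation, dataset, or feature list, so the following is only a minimal sketch under stated assumptions: features are discrete, relevance and redundancy are both measured by mutual information, and the greedy "relevance minus mean redundancy" criterion (one common mRMR variant) is used. The feature names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy mRMR sketch: at each step pick the feature whose relevance to the
    label, minus its mean redundancy with the features already selected, is highest.

    `features` maps a feature name to its column of discrete values.
    """
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so that ties break deterministically

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = malicious, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "api_calls":     [1, 1, 1, 0, 0, 0, 0, 0],  # strongly relevant
    "api_calls_dup": [1, 1, 1, 0, 0, 0, 0, 0],  # equally relevant, fully redundant
    "packed":        [1, 1, 0, 1, 0, 0, 1, 0],  # weaker, but adds new information
    "noise":         [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}
selected = mrmr_select(features, labels, 2)
print(selected)
```

In this toy setting a ranking by relevance alone would place the duplicated feature second, whereas the redundancy penalty steers the second pick toward a feature that carries different information. A fuller pipeline would then train a classifier on the selected columns and compare the area under the ROC curve across feature-set sizes.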

