Hybridizing machine learning and static malware detection using the PE header

Kipsang, Jacob

Master of Science in Computer Science thesis (1.079 MB)

Date

2021-11-23

Abstract

Cybercrime today often involves demanding payment after infecting an organization's computers with ransomware, or impairing its operations through a distributed denial-of-service attack; both significantly affect the confidentiality, integrity, and availability of data. Recent research shows that hybrid techniques can effectively distinguish malware from benign files. Our research provides an experimental study on hybridizing machine learning and signature-based techniques to detect malware based on PE header information. The dataset was randomly split into training (80%) and testing (20%) sets. We trained and tested three classifiers: Random Forest, Gradient Boosting, and AdaBoost. We evaluated the models on metrics including accuracy. Results showed high overall accuracy, ranging from 99.70% to 99.77% on the cleaned dataset and from 93.83% to 96.83% on the uncleaned dataset. The VirusTotal file report API had a high average detection rate on the uncleaned dataset, ranging from 0.00% to 12.57%, and a low average detection rate of 0.00% on the cleaned dataset. Random Forest emerged as the best classifier for both the cleaned and uncleaned datasets, with an average detection rate for static analysis of 0.00%.
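
To illustrate what "PE header information" can look like as classifier input, here is a minimal Python sketch. It is not the thesis's feature extractor, and the abstract does not say which header fields the study used. The sketch reads the seven COFF file-header fields defined by the PE format into a dictionary. The function names parse_coff_header and build_sample_header are hypothetical, and the sample bytes are synthetic rather than drawn from the study's dataset.

```python
import struct

# Field names of the 20-byte COFF file header, in on-disk order.
COFF_FIELDS = (
    "Machine",
    "NumberOfSections",
    "TimeDateStamp",
    "PointerToSymbolTable",
    "NumberOfSymbols",
    "SizeOfOptionalHeader",
    "Characteristics",
)


def parse_coff_header(data: bytes) -> dict:
    """Return the COFF file-header fields of a PE image as a feature dict."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 24:
        raise ValueError("not a PE file: truncated header")
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header follows the 4-byte signature, little-endian.
    values = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return dict(zip(COFF_FIELDS, values))


def build_sample_header() -> bytes:
    """Build a synthetic, header-only PE prefix for demonstration (assumption)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0x61000000, 0, 0, 0xF0, 0x0022)
    return bytes(dos) + b"PE\x00\x00" + coff


if __name__ == "__main__":
    for name, value in parse_coff_header(build_sample_header()).items():
        print(f"{name}: {value:#x}")
```

Each dictionary could serve as one row of a feature table, which would then be split 80/20 and passed to classifiers such as those named in the abstract.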

URI

http://hdl.handle.net/10570/10125

Collections

Academic submissions (CoCIS)