School of Computing and Informatics Technology (CIT) Collection
Recent Submissions
1 - 5 of 570
-
Explainable ensemble machine learning for SQL injection attack detection (Makerere University, 2025)

SQL injection (SQLi) remains a major cybersecurity threat that exploits weaknesses in database-driven web applications to gain unauthorized access to sensitive data. Existing detection systems often rely on static rule sets and opaque machine learning models that lack interpretability, adaptability, and robustness against new attack variations. To address these limitations, this study developed an explainable hybrid ensemble machine learning model for SQL injection detection. The proposed framework integrates transformer-based semantic understanding with statistical query profiling to enhance both accuracy and interpretability. A dataset of 22,470 SQL queries collected from two production systems at Makerere University, namely the Makerere University E-Learning Environment (MUELE) and the Electronic Human Resource Management System (EHRMS), was used for model development and evaluation. The dataset included six major SQLi categories: tautology-based, union query, piggy-backed, comment-based, illegal/logically incorrect, and blind SQLi, allowing for comprehensive performance analysis across diverse attack types. Feature engineering played a central role in the model's success. Contextual features were extracted using Bidirectional Encoder Representations from Transformers (BERT), capturing the semantic meaning of SQL syntax and revealing obfuscated injection patterns undetectable by traditional methods. These semantic embeddings were combined with handcrafted statistical indicators such as query length, special-character frequency, and keyword density, enabling detection of structural anomalies indicative of SQL injection behavior. This hybrid representation provided a multidimensional understanding of both syntactic and semantic query characteristics, improving model sensitivity and interpretability.
Multiple classifiers including Decision Tree, Random Forest, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Naïve Bayes, Gradient Boosting, LightGBM, and CatBoost were trained and evaluated. Ensemble techniques such as bagging, boosting, and voting were applied to enhance generalization performance. The proposed boosting-based ensemble model achieved an accuracy of 99.49%, with balanced F1-scores of 96.87% for benign queries and 99.72% for malicious queries. Explainability was incorporated through SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations). SHAP analysis revealed that BERT embeddings contributed approximately 45% of the model's predictive power, while features such as tautological conditions and comment-based patterns were key indicators of SQLi attacks. The final model was deployed as a RESTful FastAPI microservice, capable of processing over 10 queries per second with average response times of 150–200 ms. The study demonstrates that combining semantic embeddings with statistical features in an explainable ensemble framework yields a robust, interpretable, and production-ready solution for SQL injection detection.

Keywords: Machine learning, SQL Injection Attack Detection
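The handcrafted statistical indicators the abstract describes (query length, special-character frequency, keyword density) can be sketched as follows. This is a minimal illustration, not the study's actual feature extractor: the keyword list and special-character set are illustrative assumptions, and the BERT embedding step, which the real system concatenates with these features, is omitted.

```python
# Hedged sketch of the handcrafted statistical features; the keyword
# list below is illustrative only, not the one used in the study.
SQL_KEYWORDS = {"select", "union", "or", "and", "insert", "drop", "--", "sleep"}

def statistical_features(query: str) -> dict:
    tokens = query.lower().split()
    specials = sum(1 for c in query if c in "'\";=()#-")
    return {
        "length": len(query),                                # raw query length
        "special_char_freq": specials / max(len(query), 1),  # quotes, comments, etc.
        "keyword_density": sum(t in SQL_KEYWORDS for t in tokens) / max(len(tokens), 1),
    }

benign = statistical_features("SELECT name FROM staff WHERE id = 7")
tautology = statistical_features("SELECT * FROM users WHERE '1' = '1' --")
```

On this toy pair, the tautology-based query scores higher on both special-character frequency and keyword density, which is the structural signal the ensemble's handcrafted features are meant to capture.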
-
An interpretable machine learning approach to predict loss to follow-up among people living with HIV (Makerere University, 2025)

The advancement of AI and machine learning has led to wide adoption in different fields and sectors such as education, engineering, and health. Given this adoption, multiple countries have adopted different machine-learning techniques and algorithms to improve patient care and health service delivery. Uganda has piloted the use of different machine learning models, especially in health, to detect conditions such as cervical cancer and to screen for malaria, among others. Loss to follow-up is one of the major challenges affecting HIV service provision, especially in low- and middle-income countries such as Uganda. HIV clients in Uganda get lost to follow-up for a number of reasons: the high mobility of clients who keep moving from one location to another, treatment centers located far from clients who may not have transport facilitation to and from facilities, and adverse psychosocial issues affecting clients without the necessary support. These reasons are unfortunately only known after the client is lost to follow-up, making this a reactive approach. A prediction algorithm that estimates a client's likelihood of dropping out will help improve patient care and treatment outcomes, so that the client is followed up before they are lost. This research implements an interpretable predictive algorithm that predicts the patient outcome and provides insight into why that prediction was made. This work differs from existing implementations by explaining the traditional black-box models used for prediction. Longitudinal client-level data, including sociodemographic information as well as patient medical history, was collected and used in this research to find patterns that inform the prediction outcome.
The collected data was augmented using three major data augmentation techniques to eliminate the class bias typical of medical data. These techniques included random under-sampling, which randomly reduces the instances of the majority class; random over-sampling, in which existing minority-class samples are duplicated to balance the dataset; and the Synthetic Minority Over-sampling Technique (SMOTE), which adds synthetic minority-class samples. The models were trained on the three augmented datasets as well as the original dataset. Interpretability using LIME was then added, and the results are presented. The research shows that the XGBoost model trained on the over-sampled dataset produced the best results for classifying clients who are lost to follow-up. This research provides an interpretable machine learning model that predicts clients likely to drop out, together with an explanation or insight into why they are likely to drop out of care. It provides a new standard for the use and adoption of artificial intelligence by providing justifications for the outcome.
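Of the three balancing techniques described above, random over-sampling (which the best-performing XGBoost model used) is the simplest to illustrate. The sketch below is a hedged, dependency-free version; the record fields and class labels are illustrative assumptions, and production pipelines would typically use a library implementation such as imbalanced-learn's.

```python
import random

# Hedged sketch of random over-sampling: duplicate minority-class
# records at random until every class matches the majority-class count.
# Records are (features, label) pairs; field names are illustrative only.
def random_oversample(records, seed=42):
    rng = random.Random(seed)
    by_label = {}
    for rec in records:
        by_label.setdefault(rec[1], []).append(rec)
    target = max(len(recs) for recs in by_label.values())
    balanced = []
    for label, recs in by_label.items():
        balanced.extend(recs)
        # sample with replacement from the minority class to close the gap
        balanced.extend(rng.choice(recs) for _ in range(target - len(recs)))
    return balanced

# Toy imbalanced cohort: 90 retained clients vs 10 lost to follow-up.
data = [({"visits": v}, "retained") for v in range(90)] \
     + [({"visits": v}, "lost_to_follow_up") for v in range(10)]
balanced = random_oversample(data)
```

After balancing, both classes contribute 90 records, removing the majority-class bias before model training.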
-
A computer vision approach towards glare mitigation and image quality enhancement in license plate recognition (Makerere University, 2025)

License Plate Recognition (LPR) systems play a crucial role in Intelligent Transportation Systems (ITS), facilitating automated vehicle identification for applications such as traffic monitoring, law enforcement, and toll collection. However, these systems often suffer from glare-induced distortions caused by intense light sources such as sunlight, vehicle headlights, and reflections. These distortions obscure license plate details, leading to reduced Optical Character Recognition (OCR) accuracy and compromised system reliability. This research addresses this critical challenge by developing a unified computer vision framework that integrates Autoencoders (AE) and Noise2Clean Generative Adversarial Networks (N2C-GAN) to mitigate glare and improve image quality. The study pursued four key objectives: accessing and utilizing an existing dataset of glare-induced license plate images, pre-processing the images, implementing the model, and rigorously evaluating it. The proposed model demonstrated significant advances in glare mitigation, achieving a Peak Signal-to-Noise Ratio (PSNR) of 38.8 dB, a Structural Similarity Index Measure (SSIM) of 0.987, and a Visual Information Fidelity (VIF) of 0.8896. Furthermore, the model improved OCR accuracy to 99.9% using Google Cloud Vision OCR, underscoring its effectiveness in restoring license plate readability under glare conditions. Computational efficiency was a key focus, with a compact model size of 298 kB and a runtime of 0.7263 s, making it scalable for real-world deployment. Despite encountering limitations such as dataset bias and computational constraints, this research provides valuable insights and lays the groundwork for future advances in glare mitigation, image processing, and machine learning-based LPR enhancements.
The findings have broad implications for transportation management, public safety, and automated enforcement, offering a robust solution to improve the performance and reliability of LPR systems in diverse real-world applications.
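The primary quality metric reported above, PSNR, has a standard definition that can be sketched directly: 10·log10(MAX²/MSE), where MAX is the peak pixel value and MSE is the mean squared error between reference and restored images. The sketch below operates on flat lists of 8-bit pixel values for simplicity; the toy pixel values are illustrative, and real pipelines would use numpy arrays or a library such as scikit-image.

```python
import math

# Hedged sketch of Peak Signal-to-Noise Ratio (PSNR), the image quality
# metric on which the proposed model scored 38.8 dB. Higher is better.
def psnr(reference, restored, max_val=255.0):
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images: no distortion at all
    return 10 * math.log10(max_val ** 2 / mse)

clean = [100, 120, 140, 160]      # toy reference plate pixels
deglared = [101, 119, 141, 159]   # small residual error after restoration
```

With a uniform per-pixel error of 1 on 8-bit data, the MSE is 1 and the PSNR comes out just above 48 dB, so the 38.8 dB reported for real glare-corrupted plates corresponds to a visibly larger but still small residual error.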
-
A model-driven serious game generator: a case of an Android quiz game (Makerere University, 2025)

Computer games are used in several areas encompassing fun and serious contexts. Game development, however, is a complex technical undertaking, with the resulting financial costs exploding into millions of dollars for the most complex games. As a result, Model Driven Development (MDD) has been used in several studies to simplify software development and develop domain-specific artefacts. This study explored the use of MDD in serious game development with a focus on quiz games. Quiz games are a simple game domain that can easily be modelled, and corresponding graphical and other tooling can be developed to ease game development. With existing evidence of the effectiveness of games in effecting positive learning outcomes, and policy shifts allowing learners to have digital devices in schools, serious games hold great potential to shape positive learning outcomes. This study identified serious game features, identified the features missing from existing quiz game modelling languages, modelled select missing features, and developed a model editor to facilitate a no-code approach. The study found that while there were existing modelling languages for quiz games, they did not incorporate several serious game features despite the applicability of those features to the quiz game domain. With the new features added to the existing modelling language AskMe, five games were prototyped to demonstrate the applicability of the features to quiz games. A formal verification tool (Microsoft Z3) was also used to check the generated models for correctness, covering both positive and negative outcomes. Finally, this study developed a model editor for the extended language.
Through analysis of quiz game requirements specified in existing serious quiz games, this study found that the extended language covered the specified requirements in a more robust manner. The major contributions of this study include the extension of the quiz game modelling language AskMe to make it more suitable for serious game applications. This study introduced progress tracking and level-based game challenges, among other features missing from AskMe. It also introduced a graphical model editor as an additional tool to enable the creation of quiz game models with little to no programming skill.
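The model-driven approach described above, a declarative quiz game model checked for correctness before generation, can be sketched as follows. This is a hypothetical illustration only: the field names, the level-based challenge feature, and the lightweight well-formedness check stand in for the actual AskMe metamodel and the Microsoft Z3 verification the study used.

```python
from dataclasses import dataclass, field

# Hedged sketch of a declarative quiz game model in the spirit of AskMe;
# field names and the `level` extension are illustrative assumptions.
@dataclass
class Question:
    text: str
    options: list
    answer: int          # index into options
    level: int = 1       # level-based challenge, one of the added features

@dataclass
class QuizModel:
    title: str
    questions: list = field(default_factory=list)

def check_model(model: QuizModel) -> bool:
    """Lightweight well-formedness check; the study used Microsoft Z3
    to verify generated models, covering positive and negative outcomes."""
    return all(0 <= q.answer < len(q.options) and q.level >= 1
               for q in model.questions)

model = QuizModel("Road Safety", [
    Question("A red light means?", ["Go", "Stop"], answer=1, level=1),
])
```

In a full MDD pipeline, a model that passes the check would then be fed to a generator that emits the Android quiz game; a model whose answer index falls outside its options would be rejected before any code is generated.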
-
Enterprise architecture approach to standardising digital health across Uganda's health system (Makerere University, 2025)

Digital Health (DH) has the potential to transform healthcare in Low- and Middle-Income Countries (LMICs) like Uganda by improving access, enhancing security, and reducing costs. Despite this promise, Uganda's health system faces numerous DH challenges, including fragmentation, duplication of efforts, lack of interoperability among systems, and concerns regarding data security and privacy. These challenges can partly be attributed to the lack of a standardised DH ecosystem. Enterprise Architecture (EA), which promotes a systems-thinking approach to digitalisation, is a vital approach to standardisation. Thus, the main goal of this study was to leverage the EA approach to facilitate the standardisation of DH across Uganda's health system. It focused on four objectives: (1) determining the EA approach and framework applicable to standardising the DH ecosystem across Uganda's health system, (2) establishing the EA requirements for standardising the DH ecosystem, (3) designing an EA framework to facilitate the standardisation of the DH ecosystem, and (4) validating the designed framework. The study adopted a pragmatism research paradigm and utilised the Design Science Research methodology to create a Digital Health Enterprise Architecture Framework (DHEAF), anchored on TOGAF 9.2 and aligned with the WHO-ITU eHealth model. A landscape analysis of Uganda's DH ecosystem identified key issues related to governance, human resources, data quality, system interoperability, and infrastructure, which led to the derivation of 58 EA requirements. These requirements were validated by DH stakeholders in Uganda's health system, and over 85% affirmed their relevance and importance in guiding the development of an EA framework for DH.
The DHEAF was designed and refined through three stakeholder workshops using the Delphi technique, involving enterprise architects, ICT specialists, policymakers, DH practitioners, and health informaticians. Validation by this diverse group confirmed that over 80% viewed DHEAF as comprehensive, adaptable, and user-friendly, capable of standardising DH interventions across Uganda's health system. The DHEAF shows promise for broader application in similar LMIC contexts. Future research should explore integrating emerging technologies such as Artificial Intelligence (AI), Machine Learning (ML), and the Internet of Things (IoT). Additionally, studies should examine the impact of DHEAF on health policies and regulations in Uganda and develop policy recommendations to support its adoption and implementation.

Keywords: Digital health, Health system