
    Gender bias mitigation and evaluation in Luganda to English machine translation models

    Masters Dissertation (4.523Mb)
    Date
    2023
    Author
    Suuna, Conrad
    Abstract
    Recent research highlights gender bias as a significant concern in machine translation (MT) models, characterised by gender-stereotyped and discriminatory translations. Given the wide application of MT across domains, addressing this issue is crucial. While various approaches have been explored to mitigate gender bias in MT models, further understanding and solutions are needed. Our research treated gender bias as a domain adaptation problem. To achieve this, we artificially crafted a parallel dataset of 446 occupation sentences in the format "the [occupation] finished [his/her] work" and used it to debias the AI-Lab-Makerere/lg en (original) model. We also collected and annotated data for six Personally Identifiable Information (PII) entities: userid, person, location, norp, org, and date. This data was used to develop, evaluate, and compare six named entity recognition models for PII anonymisation. Afro-xlmr-base outperformed the other models, with a 0.81 F1 score and 95% accuracy. We integrated this model into the Microsoft Presidio pipeline and used it to sanitise the gender-bias test data for the MT model. We debiased the original model by fine-tuning it on the occupation dataset, adjusting hyperparameters and applying Knowledge Distillation to control catastrophic forgetting. Evaluated on the sanitised test set, the final distilled model translated gendered sentences better, with BLEU and Translation Gender Bias Index scores higher by +0.3 and +0.27 respectively. The results suggest that our approach is a promising way to mitigate gender bias in MT while requiring less data collection.
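
    The debiasing step described in the abstract (fine-tuning the original model on the occupation template data while distilling from a frozen copy of the original model to limit catastrophic forgetting) can be sketched roughly as below. This is an illustrative sketch only, not the author's released code: the model identifier spelling, loss weight, temperature, and the example sentence pair are assumptions.

    import torch
    import torch.nn.functional as F
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumed spelling of the original model's Hugging Face identifier.
    MODEL_NAME = "AI-Lab-Makerere/lg_en"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    teacher = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME).eval()  # frozen original model
    student = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)         # copy being debiased

    optimizer = torch.optim.AdamW(student.parameters(), lr=2e-5)
    alpha, temperature = 0.5, 2.0  # assumed distillation weight and softmax temperature

    def distillation_step(src_texts, tgt_texts):
        """One update: cross-entropy on the occupation data plus a KL term that
        keeps the student close to the frozen teacher's output distribution."""
        batch = tokenizer(src_texts, text_target=tgt_texts,
                          padding=True, truncation=True, return_tensors="pt")
        labels = batch.pop("labels")
        labels = labels.masked_fill(labels == tokenizer.pad_token_id, -100)

        student_out = student(**batch, labels=labels)
        with torch.no_grad():
            teacher_out = teacher(**batch, labels=labels)

        # KL divergence between temperature-softened student and teacher distributions.
        kd_loss = F.kl_div(
            F.log_softmax(student_out.logits / temperature, dim=-1),
            F.softmax(teacher_out.logits / temperature, dim=-1),
            reduction="batchmean",
        ) * temperature ** 2

        loss = (1 - alpha) * student_out.loss + alpha * kd_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # Illustrative call with one template pair; the Luganda source is a placeholder.
    distillation_step(["<Luganda occupation sentence>"], ["the teacher finished her work"])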
    URI
    http://hdl.handle.net/10570/13185
    Collections
    • School of Computing and Informatics Technology (CIT) Collection
