Topic classification for radio monitoring using deep neural networks

Mukiibi, Jonathan

Master's Dissertation (3.070Mb)

Date

2023

Abstract

In rural communities in Uganda, radio is the main medium of public communication and information exchange. Radio is characterised by talk shows and phone-ins in which people at the grassroots level voice their opinions. Analysing these opinions, especially those from marginalised groups, can help various stakeholders, including the government, to make inclusive policies. As a first step towards analysing radio data, the United Nations and Makerere AI Lab have spearheaded efforts to build Speech-To-Text (STT) and Keyword Spotter (KWS) models for under-resourced languages for radio monitoring. These have been used to analyse radio broadcasts for use cases such as crop disease surveillance, health service delivery, and monitoring of humanitarian activities. However, building usable STT models requires transcribed training data, and for deep learning-based STT models this means hundreds or thousands of hours of speech. KWS models capture only a few keyword mentions, which may not be representative enough for a use case of high interest. In this research, we propose a deep neural topic classification model for Luganda radio data, based on Mel spectrogram features and a ResNet architecture. We describe how we collected and annotated over 204 hours of radio data across 5 different topics of interest. We used a transfer learning approach, taking several pre-trained ResNet variants and fine-tuning them on Mel spectrogram features extracted from the radio data. Our best-performing model was a ResNet18 with an accuracy of 0.97 on a held-out test set. We deployed the topic classification model on radio data from 3 popular radio stations. We obtained precision-at-k values of 0.25, 0.05 and 0.3 on data from the 3 stations, with k values of 31, 40 and 40, respectively. Compared to past research discussed in the literature, our results show that topic classification models for speech are a good alternative to existing systems used in radio monitoring.
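
The abstract reports precision-at-k for the deployed model without defining it. The following is a minimal sketch of the conventional definition, assuming that segments are ranked by the classifier's confidence for a topic and that each of the top k segments is then judged relevant or not. The function name `precision_at_k` and the toy data are hypothetical illustrations, not taken from the dissertation.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], relevant: Sequence[bool], k: int) -> float:
    """Fraction of the k highest-scoring items that are marked relevant."""
    if len(scores) != len(relevant):
        raise ValueError("scores and relevant must have the same length")
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank item indices by descending model score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    # Divide by k, the conventional denominator, even if fewer than k items exist.
    hits = sum(1 for i in top_k if relevant[i])
    return hits / k


# Hypothetical toy data: one confidence score and one human judgement per audio segment.
toy_scores = [0.91, 0.15, 0.78, 0.66, 0.42]
toy_relevant = [True, False, False, True, True]
toy_p_at_3 = precision_at_k(toy_scores, toy_relevant, k=3)
```

Under this definition, a precision-at-k of 0.3 with k = 40 would correspond to 12 of the 40 top-ranked segments being judged relevant.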

URI

http://hdl.handle.net/10570/11390

Collections

School of Computing and Informatics Technology (CIT) Collection