Application of random forest regressor algorithm to predict PM2.5 concentration levels in Kampala

Wabinyai, Fidel Raja

Files

Wabinyai-CoCIS-MSc.pdf (6.73 MB)

Date

2018-12-06

Authors

Wabinyai, Fidel Raja

Abstract

As in every society, everybody wishes to live in a clean and fresh environment. This may not be achievable in daily life, but the level of pollution can at least be controlled. Air pollution is one of the leading global public health risks, but its magnitude in many developing countries is not known. As in many African cities, the concentration of fine particulate matter (PM2.5) is dangerously high in Kampala. This thesis uses data mining algorithms to build a predictive model for the PM2.5 concentration level of the following days. Predicting pollutant concentrations can be a powerful tool for taking preventive measures such as reducing emissions and alerting the affected population. This thesis presents a forecasting model to predict the daily average concentration of PM2.5 for the next few days (i.e. 3 to 5 days). The proposed model is Random Forests regression. The Random Forests regressor was compared with four other regression models, namely Extra Trees Regressor, Gaussian Process Regressor, XGBoost, and ElasticNet. Performance was estimated using the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and R-squared (R2). The results demonstrated that the Random Forests regressor outperformed the other models. Six pollution monitoring stations in Kampala measuring PM2.5 were selected. We found that the mean concentration of PM2.5 pollution was three times higher than the level recommended by the World Health Organization (WHO).
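
The abstract names three evaluation metrics without defining them. As a minimal sketch, assuming the standard textbook definitions, they could be computed as shown here. The function names and the sample PM2.5 values are hypothetical and chosen only for illustration; they are not taken from the thesis or its data.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R-squared: 1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily mean PM2.5 values in µg/m³ (invented, not thesis data).
observed = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 63.0, 67.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("R2:  ", r_squared(observed, predicted))
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is the sense in which one regressor can be said to outperform another on the same test data.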

Description

A dissertation submitted to the Directorate of Research and Graduate Training in partial fulfillment of the requirements for the award of the Degree of Master of Science in Computer Science of Makerere University.

Keywords

Air pollution PM 2.5, Random forest regression algorithm, Air quality index

Citation

Wabinyai, F. R. (2018). Application of random forest regressor algorithm to predict PM2.5 concentration levels in Kampala. Unpublished master’s thesis, Makerere University, Kampala, Uganda.

URI

http://hdl.handle.net/10570/7258

Collections

Academic submissions (CoCIS)
