Maize yield prediction using earth observation data at different phenological phases using machine learning. A case study of Lugore prison farm-Gulu District
Date
2026
Authors
Adero, Lydia
Publisher
Makerere University
Abstract
Accurate crop yield prediction is of great importance to global food production; however,
achieving it remains a persistent and critical challenge for the agricultural sector. The increasing
availability of satellite-based earth observation data plays a pivotal role in crop yield prediction,
providing spatially extensive and temporal insights that enable early and accurate yield
prediction. Currently, most maize yield predictions are statistical in nature and aggregate all
the seasonal variables required for prediction. They therefore do not account for phase-specific
dynamics, even though each growth phase is characterized by unique physiological processes
and environmental sensitivities that ultimately determine yield potential. This research
therefore aims to explore the use of earth observation data to predict maize yield specifically
during the vegetative and reproductive phases of maize growth, using machine learning models
and a case study of Lugore Prison Farm from 2018 to 2024. The research utilised Sentinel-2
data for vegetation indices, MODIS
data for temperature and CHIRPS data for precipitation. NDVI time series curves smoothed
with the Savitzky–Golay filter were used with the relative threshold method to determine the
timing of the vegetative and reproductive phases. Three machine learning algorithms (Random
Forest, Gaussian Process Regression and Extreme Gradient Boosting) were then used to
predict maize yield at the vegetative and reproductive phases. The results
revealed that the vegetative phase was longer than the reproductive phase, with interannual
variations in the onset, duration and end of each phase that depended mainly on the
prevailing meteorological conditions. For maize yield estimation, the Extreme Gradient Boosting
model performed best, with Root Mean Square Error (RMSE) values of 50,010 kg and 4,270 kg
in the vegetative and reproductive phases of season one, respectively, and 3 kg and 5 kg in
the vegetative and reproductive phases of season two, respectively. The Gaussian
Process Regression model had the least accurate results with RMSE of 127,264 kg in the
vegetative phase and 127,924 kg in the reproductive phase of season one, and 74,163 kg in the
vegetative phase and 66,681 kg in the reproductive phase of season two. The study
demonstrates the potential of leveraging earth observation data and machine learning models
for accurate, phase-specific prediction of maize yield. The results can support strategic
planning by policy makers and farmers, especially the vegetative-phase predictions, since
these are available well before harvest. Further research could apply the models to other
crops and geographic locations, and evaluate other machine learning and deep learning
models for maize yield prediction.
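
The phase-detection step described in the abstract (Savitzky–Golay smoothing of an NDVI curve followed by the relative threshold method) can be illustrated with a minimal sketch. This is not the dissertation's implementation. The function names, the window length, the polynomial order, the 20% threshold fraction and the synthetic 5-day NDVI series are all assumptions made for illustration, as is the choice to treat onset-to-peak as the vegetative phase and peak-to-end as the reproductive phase.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_ndvi(ndvi, window_length=7, polyorder=2):
    """Smooth an NDVI series with a Savitzky-Golay filter.

    window_length and polyorder are assumed values, not taken from the study.
    """
    return savgol_filter(np.asarray(ndvi, dtype=float), window_length, polyorder)


def phase_boundaries(days, ndvi_smooth, fraction=0.2):
    """Relative threshold method on a smoothed NDVI curve.

    Returns (onset_day, peak_day, end_day). The threshold on each limb is
    minimum + fraction * amplitude of that limb; fraction=0.2 is an assumption.
    """
    days = np.asarray(days)
    ndvi_smooth = np.asarray(ndvi_smooth, dtype=float)
    peak = int(np.argmax(ndvi_smooth))

    # Rising limb: first sample at or above the relative threshold.
    rising = ndvi_smooth[: peak + 1]
    rise_thr = rising.min() + fraction * (rising.max() - rising.min())
    onset = int(np.argmax(rising >= rise_thr))

    # Falling limb: first sample after the peak that drops below the threshold.
    falling = ndvi_smooth[peak:]
    fall_thr = falling.min() + fraction * (falling.max() - falling.min())
    below = np.flatnonzero(falling < fall_thr)
    end = peak + int(below[0]) if below.size else len(ndvi_smooth) - 1

    return int(days[onset]), int(days[peak]), int(days[end])


# Synthetic single-season NDVI curve sampled every 5 days (illustrative only).
days = np.arange(0, 120, 5)
ndvi = 0.2 + 0.6 * np.exp(-(((days - 60) / 20.0) ** 2))

onset_day, peak_day, end_day = phase_boundaries(days, smooth_ndvi(ndvi))
print("assumed vegetative phase:", onset_day, "to", peak_day)
print("assumed reproductive phase:", peak_day, "to", end_day)
```

Under these assumptions, predictors such as vegetation indices, temperature and precipitation would be aggregated separately over each detected phase before being passed to a regression model.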
Description
A Dissertation Submitted to the Directorate of Research and Graduate Training for the
Award of Master of Science in Geo-Information Science and Technology (MSGT) of
Makerere University.
Citation
Adero, Lydia. (2026). Maize yield prediction using earth observation data at different phenological phases using machine learning. A case study of Lugore prison farm-Gulu District.