Modelling haemoglobin among pregnant women as a continuous variable: are the conclusions on its factors influenced by dichotomizing it?
Abstract
Risk factors for anaemia are commonly identified with blood haemoglobin measured and modelled on a binary rather than a continuous scale. This approach leads to information loss and reduced statistical efficiency. In this study, we modelled blood haemoglobin as a normally distributed continuous response variable and examined whether inferences on its associated factors among pregnant women (15-49 years) in Uganda are influenced by dichotomizing it. Data were sourced from the 2016 Uganda Demographic and Health Survey (UDHS). Descriptive statistics and data visualization were used to examine the distribution of blood haemoglobin and to summarize the characteristics of the pregnant women. A simple linear regression model was then fitted for each covariate to assess its association with blood haemoglobin and to identify explanatory variables for the multivariable analysis. At the multivariable level, the multiple linear regression (MLR) model for blood haemoglobin on a continuous scale was compared with the binary logistic regression model for blood haemoglobin as a binary outcome, and the more suitable model was selected on the basis of minimum deviance and the Bayesian Information Criterion (BIC). Factors associated with blood haemoglobin among pregnant women in Uganda were then assessed at the 0.05 level of significance. The MLR model identified risk factors associated with blood haemoglobin more precisely than the binary logistic regression model. In the MLR analysis, older age, having had more than one pregnancy (multigravida) and having at least three meals a day had a significant positive relationship with blood haemoglobin level, whereas being overweight had a significant negative relationship with it.
The findings show that attention to distributional analysis and the use of the MLR model rather than the binary logistic regression model improve the precision with which risk factors associated with blood haemoglobin among pregnant women are identified. Body mass index (BMI) is a modifiable risk factor for blood haemoglobin concentration that can be addressed through lifestyle interventions. We recommend the use of the MLR model for modelling the risk factors for blood haemoglobin.
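The loss of statistical efficiency from dichotomizing a continuous outcome can be illustrated with a minimal simulation. The sketch below does not use the UDHS data or the study's covariates: the haemoglobin values are synthetic, `x` is a hypothetical standardized covariate, the true slope of 0.5 g/dL and the noise level are arbitrary, and the 11 g/dL cut-off is assumed here as a commonly used anaemia threshold in pregnancy. It fits a linear model to the continuous outcome and a logistic model to the dichotomized outcome, then compares the size of the test statistic for the same covariate.

```python
import numpy as np

def ols_slope_t(x, y):
    """Least-squares fit of y = a + b*x; returns (b, t statistic of b)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

def logit_slope_z(x, d, n_iter=25):
    """Newton-Raphson fit of logit P(d=1) = a + b*x; returns (b, Wald z of b)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])          # Fisher information matrix
        beta = beta + np.linalg.solve(info, X.T @ (d - p))
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

# Synthetic data only (assumed values, not estimates from the study).
rng = np.random.default_rng(0)
n = 5000
x = rng.standard_normal(n)                     # hypothetical covariate
hb = 11.5 + 0.5 * x + rng.normal(0.0, 1.5, n)  # haemoglobin in g/dL
anaemic = (hb < 11.0).astype(float)            # assumed cut-off of 11 g/dL

b_ols, t_ols = ols_slope_t(x, hb)
b_logit, z_logit = logit_slope_z(x, anaemic)

print(f"linear model:   slope = {b_ols:.3f}, |t| = {abs(t_ols):.1f}")
print(f"logistic model: slope = {b_logit:.3f}, |z| = {abs(z_logit):.1f}")
```

Under these assumptions the covariate is related to the outcome in both models, but the test statistic from the dichotomized outcome is expected to be smaller in magnitude, which is the efficiency loss the abstract refers to. The logistic slope is negative because the binary outcome codes anaemia (low haemoglobin) as 1.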