Prediction of Cerebral Palsy in Very Preterm Infants
Abstract
Background: The prevalence of cerebral palsy (CP) is ten times higher in preterm than in term infants. Accurate and early identification of preterm infants at risk for CP would enable early referral to intervention programs with the potential to improve their functional mobility and quality of life. Large population-based studies of CP in preterm infants have reported only measures of association; they neither developed prediction models of CP nor assessed the diagnostic properties of such models. Furthermore, all of these studies used conventional logistic regression for their models. Machine learning may provide more accurate predictions than logistic regression because it can better handle complex relationships between predictors and the outcome. Machine learning methods have not yet been used to predict CP from clinical predictors in former preterm infants.
Objectives: The objective of this study was to develop prediction models for CP in very preterm infants (<31 weeks’ gestation) using the random forest (RF) ensemble method and logistic regression and to compare their accuracy in predicting CP.
Study Design: I used a population-based cohort of 777 very preterm survivors from the AC Allen Provincial Perinatal Follow-Up Program Database born between 2000 and 2014 in Nova Scotia. The sample was randomly split into training and testing datasets using a 70:30 ratio. Clinical and demographic data from the infants and their mothers were then used to develop prediction models of CP at three time points (prenatal, perinatal, and postnatal) in the training dataset using RF and logistic regression. The two methods were compared on their discriminative ability, measured as the area under the receiver operating characteristic curve (AUC), in the testing dataset.
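The workflow described above (a random 70:30 split, both methods fitted on the training data, AUC compared on the testing data) can be illustrated with a minimal Python sketch. It assumes scikit-learn, which the abstract does not name as the software used. The data are synthetic stand-ins, and the number of predictors, the outcome model, and all hyperparameters are hypothetical choices made only so the example runs; none of them come from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study data): 777 rows to mirror the cohort
# size, with six hypothetical numeric predictors.
rng = np.random.default_rng(0)
n_infants, n_predictors = 777, 6
X = rng.normal(size=(n_infants, n_predictors))

# Hypothetical outcome model: two predictors carry signal, the rest are noise.
logit = -2.6 + 1.2 * X[:, 0] + 0.8 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Random 70:30 split into training and testing datasets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

# Hyperparameters below are illustrative assumptions, not the study's settings.
models = {
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)                   # fit on training data only
    p_test = model.predict_proba(X_test)[:, 1]    # predicted probability of the outcome
    auc[name] = roc_auc_score(y_test, p_test)     # discrimination in the testing data
    print(f"{name}: AUC = {auc[name]:.2f}")
```

Repeating the same steps with the predictor set restricted to what is known at each time point would give one pair of AUC values per time point.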
Results: In this cohort, 86 infants (11%) developed CP. Predictive performance of the models at the prenatal and perinatal time points was poor, regardless of the method used. At the postnatal time point, both RF and logistic regression provided good discrimination of children with and without CP (AUC 0.84 [95% CI 0.74, 0.94] and AUC 0.81 [95% CI 0.74, 0.95], respectively).
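To make the reported quantities concrete, the sketch below computes an AUC and a 95% confidence interval for a toy set of labels and scores. The abstract does not state how the intervals were obtained; the percentile bootstrap used here is one common choice and is an assumption. The function names and the toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen case scores higher than
    a randomly chosen non-case, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:                           # skip single-class resamples
            stats.append(auc(y, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = outcome present, 0 = outcome absent.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.5, 0.9]

point = auc(toy_labels, toy_scores)
lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"AUC {point:.3f} [95% CI {lower:.3f}, {upper:.3f}]")
```

In the toy data, 14 of the 16 case and non-case pairs are ranked correctly, so the point estimate is 0.875. With so few observations the interval is very wide, which illustrates why the interval width matters when comparing two methods on a modest number of CP cases.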
Conclusion: Using clinical predictors, logistic regression was comparable to the RF ensemble method in predicting CP in a population-based cohort of very preterm children. Both methods can be used to predict CP in former very preterm infants at the time of discharge.