PREDICTING THE OUTCOME OF KIDNEY TRANSPLANTS USING MACHINE LEARNING METHODS
Abstract
Prediction of kidney graft survival is based on the matching of kidney donors and recipients. Machine learning can be used effectively to analyze the relevant donor-recipient attributes in a high-dimensional transplantation dataset when developing prediction models. In this study, we analyzed 52,827 deceased-donor cases from 2000 to 2017 drawn from a large dataset of kidney transplant recipients. We divided the patients into three time cohorts: patients with graft failure in year 1, in years 2 to 5, and after more than 5 years. The intent was to investigate how the significance of patient attributes for graft success changes across these time periods. We applied machine learning approaches to two tasks: predicting the status of the graft as either failed or survived in each of the three time cohorts, and predicting the risk of graft failure as high, medium, or low following kidney transplant surgery. We experimented with five classification algorithms: random forest, adaptive boosting, artificial neural network, logistic regression, and support vector machine. In addition to developing the prediction models, we analyzed the changes in the significance of the features over the course of the study. Our results indicate that support vector machine and adaptive boosting, combined with the synthetic minority over-sampling technique (SMOTE), provided the best area under the receiver operating characteristic curve (AUROC). The cross-validated AUROC scores for predicting graft status were 85%, 66%, and 84% in the 1st, 2nd, and 3rd cohorts, respectively, whereas the F1-micro score for the risk of graft failure was 62%. Feature importance scores were calculated using Gini impurity and permutation-based techniques to identify the important predictors and to analyze how their contribution to the predictions changes across the three time cohorts. We noted a change in the significance of attributes across the cohorts; for example, the number of years on dialysis before transplant was an important attribute only in the 1st and 2nd time cohorts, whereas the recipient's age and the recipient's diabetes status were important only in the 3rd cohort.
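The two evaluation metrics reported above, AUROC for the binary graft-status task and F1-micro for the three-level risk task, can be illustrated with a minimal pure-Python sketch. The function names `auroc` and `f1_micro` and all labels and scores below are hypothetical toy values chosen for illustration; they are not taken from the study's data or code.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_micro(y_true, y_pred):
    """Micro-averaged F1: pool true positives, false positives and
    false negatives over all classes, then compute a single F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 1 = graft failed, 0 = graft survived.
graft_failed = [1, 0, 1, 0, 0, 1]
failure_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

# Hypothetical toy data for the three-level risk task.
risk_true = ["high", "low", "medium", "low", "high", "medium"]
risk_pred = ["high", "low", "low", "low", "medium", "medium"]

auc_value = auroc(graft_failed, failure_score)
f1_value = f1_micro(risk_true, risk_pred)
print(round(auc_value, 3))  # 0.889
print(round(f1_value, 3))   # 0.667
```

For a single-label multiclass task such as the high/medium/low risk prediction, micro-averaged F1 equals plain accuracy, which is why the toy example yields 4 correct out of 6.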