A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

Majouni, Sheida

A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

Files

SheidaMajouni2026.pdf (8.36 MB)

Date

2026-03-10

Authors

Majouni, Sheida

Abstract

Many decision-support tasks require determining optimal correspondences between heterogeneous entities, such as services to clients or donors to recipients. These heterogeneous feature set matching (FSM) problems involve complex relationships, multiple criteria, and domain constraints that challenge traditional machine learning methods based on flat feature vectors. This thesis presents a graph-based representation-learning framework for predicting optimal matches. We construct a bipartite knowledge graph integrating domain knowledge, temporal dynamics, and multi-relational interactions, and introduce an edge-pruning strategy to improve scalability in dense graphs. FSM is formulated as a link prediction problem using a Graph Neural Network (GNN) that learns embeddings capturing both attribute similarity and relational context. A temporal-aware graph partitioning strategy prevents information leakage during evaluation. The framework is validated on donor-recipient matching in kidney transplantation using the SRTR dataset (27,000 patients and 49 million relationships), outperforming traditional ML models. Interpretability analysis and simulation-based waitlist evaluation further demonstrate the framework’s ability to reveal compatibility patterns and support real-world allocation decisions.

Description

Many decision-support tasks require determining optimal correspondences between two heterogeneous sets of entities, services to clients, and donors to recipients. Heterogeneous feature set matching (FSM) problems involve complex relationships, multiple criteria, and constraints that challenge traditional machine learning (ML) methods. These models usually use flat feature vectors and overlook interaction networks. We present a novel graph-based representation-learning framework that predicts optimal matches. The primary contribution is the construction of a bipartite knowledge graph integrating structured domain knowledge, temporal dynamics, and multi-relational interactions as diverse multi-typed edges. To address scalability and training instability in large, dense graphs, we introduce an edge-pruning strategy that preserves the connections. We formulate FSM as link prediction and develop a Graph Neural Network (GNN) architecture that learns embeddings for heterogeneous entities. The model captures attribute-level similarity and relational context, enabling predictions, rankings, and interpretable matching. A key component is a novel, temporal-aware graph partitioning technique that addresses limitations of standard graph splitting, preventing information leakage. We validate this framework through a large-scale case study on donor-recipient matching (DRM) problem in kidney transplantation (KT), involving 27,000 patients and 49 million relationships, using the Scientific Registry of Transplant Recipients (SRTR) dataset. The framework outperforms well-established ML models (e.g., XGBoost, random forest, logistic regression) and shows the ability to discover hidden compatibility structure. Beyond predictive accuracy, this thesis emphasizes interpretability. By integrating domain knowledge into the graph structure and using attention-based message passing, our framework examines how compatibility is acquired. Through layer-wise embedding analysis, attention weight inspection, and neighborhood influence tracing, we demonstrate which donor and recipient features contribute most to the model’s predictions. This provides transparency into the learned matching function, revealing compatibility patterns that align with domain expectations. Additionally, we design a simulation-based waitlist evaluation protocol to replicate real allocation scenarios by reconstructing counterfactual waitlists and ranking candidate matches for each donor. These simulated waitlists enable evaluating whether the model suggests better matches than historical allocations. Overall, this work introduces a temporally aware framework for complex, constrained FSM by contributing: (i) a generalizable multi-relational graph construction framework; (ii) a scalable and temporally consistent GNN-based matching model; and (iii) a comprehensive evaluation methodology that combines predictive accuracy, interpretability, and real-world inference tasks. This provides a novel solution for decision support in healthcare and other critical domains.

Keywords

Graph Neural Networks, Donor Recipient Matching, Kidney Transplantation, Graph Attention Networks, Edge Prediction, Knowledge-Driven Graph

URI

https://hdl.handle.net/10222/85853

Collections

Faculty of Graduate Studies Online Theses

Full item page

A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

Files

Date

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Description

Keywords

Citation

URI

Collections