A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

Majouni, Sheida

A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

dc.contributor.author	Majouni, Sheida
dc.contributor.copyright-release	Not Applicable
dc.contributor.degree	Doctor of Philosophy
dc.contributor.department	Faculty of Computer Science
dc.contributor.ethics-approval	Not Applicable
dc.contributor.external-examiner	Dr. Jose Manuel Juarez Herrero
dc.contributor.manuscripts	Not Applicable
dc.contributor.thesis-reader	Dr. Hassan Sajjad
dc.contributor.thesis-reader	Dr. Samina Abidi
dc.contributor.thesis-supervisor	Dr. Syed Sibte Raza Abidi
dc.contributor.thesis-supervisor	Dr. Karthik Tennankore
dc.date.accessioned	2026-03-11T12:29:35Z
dc.date.available	2026-03-11T12:29:35Z
dc.date.defence	2026-03-06
dc.date.issued	2026-03-10
dc.description	Many decision-support tasks require determining optimal correspondences between two heterogeneous sets of entities, services to clients, and donors to recipients. Heterogeneous feature set matching (FSM) problems involve complex relationships, multiple criteria, and constraints that challenge traditional machine learning (ML) methods. These models usually use flat feature vectors and overlook interaction networks. We present a novel graph-based representation-learning framework that predicts optimal matches. The primary contribution is the construction of a bipartite knowledge graph integrating structured domain knowledge, temporal dynamics, and multi-relational interactions as diverse multi-typed edges. To address scalability and training instability in large, dense graphs, we introduce an edge-pruning strategy that preserves the connections. We formulate FSM as link prediction and develop a Graph Neural Network (GNN) architecture that learns embeddings for heterogeneous entities. The model captures attribute-level similarity and relational context, enabling predictions, rankings, and interpretable matching. A key component is a novel, temporal-aware graph partitioning technique that addresses limitations of standard graph splitting, preventing information leakage. We validate this framework through a large-scale case study on donor-recipient matching (DRM) problem in kidney transplantation (KT), involving 27,000 patients and 49 million relationships, using the Scientific Registry of Transplant Recipients (SRTR) dataset. The framework outperforms well-established ML models (e.g., XGBoost, random forest, logistic regression) and shows the ability to discover hidden compatibility structure. Beyond predictive accuracy, this thesis emphasizes interpretability. By integrating domain knowledge into the graph structure and using attention-based message passing, our framework examines how compatibility is acquired. Through layer-wise embedding analysis, attention weight inspection, and neighborhood influence tracing, we demonstrate which donor and recipient features contribute most to the model’s predictions. This provides transparency into the learned matching function, revealing compatibility patterns that align with domain expectations. Additionally, we design a simulation-based waitlist evaluation protocol to replicate real allocation scenarios by reconstructing counterfactual waitlists and ranking candidate matches for each donor. These simulated waitlists enable evaluating whether the model suggests better matches than historical allocations. Overall, this work introduces a temporally aware framework for complex, constrained FSM by contributing: (i) a generalizable multi-relational graph construction framework; (ii) a scalable and temporally consistent GNN-based matching model; and (iii) a comprehensive evaluation methodology that combines predictive accuracy, interpretability, and real-world inference tasks. This provides a novel solution for decision support in healthcare and other critical domains.
dc.description.abstract	Many decision-support tasks require determining optimal correspondences between heterogeneous entities, such as services to clients or donors to recipients. These heterogeneous feature set matching (FSM) problems involve complex relationships, multiple criteria, and domain constraints that challenge traditional machine learning methods based on flat feature vectors. This thesis presents a graph-based representation-learning framework for predicting optimal matches. We construct a bipartite knowledge graph integrating domain knowledge, temporal dynamics, and multi-relational interactions, and introduce an edge-pruning strategy to improve scalability in dense graphs. FSM is formulated as a link prediction problem using a Graph Neural Network (GNN) that learns embeddings capturing both attribute similarity and relational context. A temporal-aware graph partitioning strategy prevents information leakage during evaluation. The framework is validated on donor-recipient matching in kidney transplantation using the SRTR dataset (27,000 patients and 49 million relationships), outperforming traditional ML models. Interpretability analysis and simulation-based waitlist evaluation further demonstrate the framework’s ability to reveal compatibility patterns and support real-world allocation decisions.
dc.identifier.uri	https://hdl.handle.net/10222/85853
dc.language.iso	en
dc.subject	Graph Neural Networks
dc.subject	Donor Recipient Matching
dc.subject	Kidney Transplantation
dc.subject	Graph Attention Networks
dc.subject	Edge Prediction
dc.subject	Knowledge-Driven Graph
dc.title	A Graph-Based Machine Learning Approach for Feature Set Matching: Towards Multi-Factorial Donor-Recipient Matching for Kidney Transplantation

Files

Original bundle

Now showing 1 - 1 of 1

Name:: SheidaMajouni2026.pdf
Size:: 8.36 MB
Format:: Adobe Portable Document Format

Download

License bundle

Now showing 1 - 1 of 1

Name:: license.txt
Size:: 2.12 KB
Format:: Item-specific license agreed upon to submission
Description:

Download

Collections

Faculty of Graduate Studies Online Theses