Data-centric Prediction Explanation and Model Editing for Deep Neural Networks
Abstract
Over the past decade, complex black-box models have excelled at a wide range of tasks, but their lack of transparency undermines trust in their predictions. This work contributes to Explainable AI (XAI) by introducing data-centric post-hoc explainers. We present two frameworks, FEHAN and DICTA, that locally explain text classifiers through interpretable surrogate models; experimental evaluations on four datasets demonstrate their effectiveness while simplifying the explanation process. We further explore the explainability of Graph Convolutional Networks (GCNs) applied to molecular structures, offering multiple perspectives on their predictions. We also introduce HD-Explain, a post-hoc, model-aware, example-based explanation method for neural classifiers that uses Kernelized Stein Discrepancy (KSD) to identify influential training data points and potential distribution mismatches. Together, these contributions advance the understanding of how individual data points shape machine learning models and address the emerging challenge of Machine Unlearning (MU) by leveraging insights into data-model interactions.