Interpreting Deep Learning Models
Model interpretability is a requirement in many applications where users make crucial decisions based on a model's outputs. The recent movement for “algorithmic fairness” also calls for explainability, and therefore for interpretability of learning models. The most notable example is the “right to explanation” associated with a widely discussed provision of the European Union General Data Protection Regulation (GDPR), which became enforceable on 25 May 2018. Yet the most successful contemporary machine learning approaches, Deep Neural Networks, produce models that are highly non-interpretable. Deep Neural Networks have achieved great success across a wide spectrum of applications, from language modeling and computer vision to speech recognition. However, good performance alone is no longer sufficient for practical deployment, where interpretability is demanded in cases involving ethics and in mission-critical applications. The complexity of Deep Neural Network models makes it hard to understand and reason about their predictions, which hinders their further progress. In this thesis, we address this challenge by presenting two methodologies that demonstrate superior interpretability results on experimental data and one method for leveraging interpretability to refine neural nets. The first methodology, named CNN-INTE, interprets deep Convolutional Neural Networks (CNNs) via meta-learning. In this work, we interpret a specific hidden layer of the deep CNN model on the MNIST image dataset. We use a clustering algorithm in a two-level structure to find the meta-level training data and Random Forests as base learning algorithms to generate the meta-level test data. The interpretation results are displayed visually in diagrams that clearly indicate how a specific test instance is classified.
In the second methodology, we apply the Knowledge Distillation technique to distill Deep Neural Networks into decision trees in order to attain good performance and interpretability simultaneously. The experiments demonstrate that the student model achieves significantly higher accuracy (about 1% to 5%) than conventional decision trees of the same tree depth. Finally, we propose a new method, Quantified Data Visualization (QDV), to leverage interpretability for refining deep neural nets. Our experiments show empirically why VGG19 achieves better classification accuracy than AlexNet on the CIFAR-10 dataset, through quantitative and qualitative analyses of each of their hidden layers. This approach could be applied to refine the architectures of deep neural nets when their parameters are altered and adjusted.
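As an illustration of the Knowledge Distillation idea used in the second methodology, the sketch below shows how a teacher network's outputs are commonly softened into "soft targets" for a student model. It assumes the standard temperature-scaled softmax formulation of distillation; the abstract does not state which variant the thesis uses. The function name `soften`, the example logits, and the temperature values are hypothetical and chosen only for illustration.

```python
import math


def soften(logits, temperature=1.0):
    """Temperature-scaled softmax (hypothetical helper, not from the thesis).

    A higher temperature flattens the distribution, exposing how the
    teacher ranks the non-predicted classes.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative teacher logits for a three-class problem (made-up values).
teacher_logits = [4.0, 1.0, 0.5]

hard = soften(teacher_logits, temperature=1.0)  # ordinary softmax output
soft = soften(teacher_logits, temperature=4.0)  # softened targets for a student
```

In a distillation setup of this kind, a student such as a decision tree would be fitted to targets like `soft` rather than to one-hot labels, so that it can learn from the teacher's relative confidence across classes.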