Active Neural Learners for Text with Dual Supervision
Shama Sastry, Chandramouli
Dual supervision for text classification and information retrieval, in which the machine is trained on class labels augmented with text annotations indicative of the class, has been shown to provide significant improvements both in and beyond active learning (AL) settings. In their simplest form, annotations are highlighted portions of the text that indicate the class; they range from unranked document-specific phrases to ranked corpus-level class-indicating terms, and they offer an easy way to better engage users in the training process. In this work, we aim to identify and realize the full potential of unsupervised pre-trained word embeddings for text-related tasks in AL settings by training neural networks -- specifically, Convolutional and Recurrent Neural Networks -- through dual supervision. The proposed solution uses gradient-based feature attributions to constrain the machine to follow the user annotations; further, we discuss methods for overcoming the architecture-specific challenges in the optimization. Our results on a sentiment classification task show that one annotated and labeled document can be worth up to seven labeled documents, yielding accuracies of up to 70% with as few as 10 labeled and annotated documents, and they show promise in significantly reducing user effort for the total-recall information retrieval task in Systematic Literature Reviews.
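The core idea of constraining a model with gradient-based feature attributions can be illustrated with a minimal sketch. The example below is not the thesis's neural architecture: it uses a plain logistic-regression classifier in NumPy, where the input-gradient attribution of feature j in document i reduces to x_ij * w_j, and it adds a hypothetical penalty term that suppresses attribution mass on features the user did not annotate. All names (`train_dual_supervised`, `annot_mask`, `lam`) are illustrative assumptions, not the author's API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dual_supervised(X, y, annot_mask, lam=1.0, lr=0.5, steps=500):
    """Logistic regression with an attribution penalty (illustrative sketch).

    annot_mask[i, j] = 1 if feature j of document i was highlighted by the
    user as class-indicative.  The input-gradient attribution of feature j
    for the logit w.x is x_ij * w_j; we penalize attribution mass falling
    on *non*-annotated features, steering the model toward user rationales.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ w)
        # gradient of the mean cross-entropy loss w.r.t. w
        g_ce = X.T @ (p - y) / n
        # per-(document, feature) attribution: x_ij * w_j
        attr = X * w                       # shape (n, d)
        off = attr * (1 - annot_mask)      # attributions outside annotations
        # gradient of 0.5 * lam * ||off||^2 w.r.t. w
        g_pen = lam * (off * X).sum(axis=0) / n
        w -= lr * (g_ce + g_pen)
    return w

# Toy corpus: two equally predictive features, but only feature 0 is
# annotated in the positive documents; the penalty should make the model
# lean on feature 0.
X = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
annot_mask = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
w = train_dual_supervised(X, y, annot_mask)
```

With annotations concentrated on feature 0, the learned weight on the annotated feature dominates the weight on the unannotated one, which is the qualitative effect dual supervision is after.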