Structural Embedding of Constituency Trees in the Attention-based Model for Machine Comprehension
Date
2023-08-18
Authors
Anand, Mayank
Abstract
Incorporating hierarchical structure into models for various Natural Language Processing (NLP)
tasks, by training them with the syntactic information of constituency
trees, has been shown to be very effective. In their simplest form, constituency trees
are graph representations of sentences that capture the hierarchical syntactic
structure of a sentence by showing how words are grouped into constituents.
However, most research in NLP using Deep Learning to incorporate structural
information has been conducted on recurrent models, which are effective but
operate sequentially. To the best of our knowledge, no such work has targeted
attention-based models for the reading comprehension task. In this work, we aim to
include the syntactic information of constituency trees in QANet, a model based
on self-attention and specifically designed for the Machine Reading Comprehension task.
The proposed solution uses “Hierarchical Accumulation” to encode
constituency trees in self-attention in parallel time. Our model, QATnet,
achieved competitive results compared to the baseline QANet model. Furthermore, by
analyzing context-question pair examples, we demonstrated that the hierarchically
structured model retains contextual information over
longer distances and pays closer attention to punctuation and other grammatical
intricacies.
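
To make the accumulation step concrete, below is a minimal sketch of the core idea, not the thesis implementation: each constituency-tree node is represented by pooling the embeddings of the leaves it dominates, and because the pooling is a single masked matrix product, all nodes are computed in parallel rather than by a sequential tree traversal. The flat (start, end) span encoding of the tree, the mean pooling, and all function names here are illustrative assumptions; the full Hierarchical Accumulation method in the literature is more elaborate (e.g., cumulative accumulation up the tree with learned weighting).

import numpy as np

def build_membership(spans, n_leaves):
    # M[k, i] = 1 iff leaf i lies inside node k's half-open (start, end) span.
    # The flat span list is a hypothetical encoding of the constituency tree.
    M = np.zeros((len(spans), n_leaves))
    for k, (start, end) in enumerate(spans):
        M[k, start:end] = 1.0
    return M

def hierarchical_accumulation(leaf_emb, spans):
    # Pool each node's leaf embeddings with one masked matrix product, so
    # every node is computed in parallel instead of by a tree traversal.
    M = build_membership(spans, leaf_emb.shape[0])
    counts = M.sum(axis=1, keepdims=True)  # number of leaves under each node
    return (M @ leaf_emb) / counts         # (n_nodes, d): mean over each span

# Toy tree for "the cat sat": NP = (0, 2), VP = (2, 3), S = (0, 3).
leaf_emb = np.random.randn(3, 8)  # 3 leaf tokens, embedding dimension 8
node_emb = hierarchical_accumulation(leaf_emb, [(0, 2), (2, 3), (0, 3)])
print(node_emb.shape)  # (3, 8): one pooled vector per tree node

The resulting node vectors can then be exposed to the self-attention layers alongside the token representations, which is how the tree structure reaches the attention computation.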
Keywords
natural language processing, language modeling, large language models (LLMs), Deep Learning, constituency trees, Question Answering, SQuAD, SQuAD2.0, transformers, multi-head attention, self-attention, Machine Reading Comprehension, computer science