UNSUPERVISED PARAPHRASE GENERATION FROM HIERARCHICAL LANGUAGE MODELS
Date
2018-12-14T16:19:32Z
Authors
Traynor, Michael
Abstract
Paraphrase generation is a challenging problem that requires a semantic
representation of language. Language models implemented with deep
neural networks (DNNs) can transform text into a real-valued vector
space that captures useful semantic information.
In light of this, the present work employs hierarchical language modeling
to produce semantic representations of sentences. An encoder-decoder
model with four components is used: a word encoder, a sentence
encoder, a sentence decoder, and a word decoder. These components hierarchically
convert a sentence from characters through word representations to a fixed-size sentence representation, then back down through words to characters (a minimal sketch of this pipeline is given after the abstract).
Many types of neural network are suitable for each component, and a
number of them are compared in this work, including a novel architecture,
the Self Attentive Recurrent Array (SARAh). The SARAh is shown to perform at least
as well as Gated Recurrent Units (GRUs) and Transformers on language modeling
tasks while requiring fewer parameters. These language models are trained
on a large and diverse dataset, but this work also shows that it is possible
to fine-tune such models to a particular domain, such as the works of a
single author. These fine-tuned models leverage information learned on the
larger dataset in order to perform better on the target domain.
Finally, a language model is trained to produce semantic representations of
sentences that are subsequently used to generate paraphrases in a completely
unsupervised setting. The language model, which is trained to predict
the sentence most likely to follow the input sentence, is fine-tuned to
instead autoencode the input sentence. Because the sentence encoder
produces a semantic representation, a number of techniques can be used
to encourage the decoder to generate a paraphrase rather than reconstruct
the exact input sentence. These techniques include adding noise to the
sentence representation and sampling characters from the model's output
layer (both illustrated in the second sketch below).
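
A minimal sketch of the hierarchical encoder-decoder described above, assuming GRU components and hypothetical dimensions (the thesis also compares Transformers and the proposed SARAh for each component); the class, parameter names, and sizes here are illustrative assumptions, not taken from the thesis.

```python
# Illustrative sketch only: GRU-based word encoder, sentence encoder,
# sentence decoder, and word decoder, mapping characters -> words ->
# fixed-size sentence vector -> words -> character logits.
import torch
import torch.nn as nn

class HierarchicalAutoencoder(nn.Module):
    def __init__(self, n_chars=256, char_dim=64, word_dim=256, sent_dim=512):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.word_encoder = nn.GRU(char_dim, word_dim, batch_first=True)  # characters -> word vector
        self.sent_encoder = nn.GRU(word_dim, sent_dim, batch_first=True)  # word vectors -> sentence vector
        self.sent_decoder = nn.GRU(sent_dim, word_dim, batch_first=True)  # sentence vector -> word vectors
        self.word_decoder = nn.GRU(word_dim, char_dim, batch_first=True)  # word vector -> character states
        self.char_out = nn.Linear(char_dim, n_chars)                      # character logits

    def forward(self, chars):
        # chars: (batch, n_words, chars_per_word) tensor of character ids
        b, w, c = chars.shape
        x = self.char_emb(chars.view(b * w, c))
        _, word_vecs = self.word_encoder(x)                  # (1, b*w, word_dim)
        word_vecs = word_vecs.view(b, w, -1)
        _, sent_vec = self.sent_encoder(word_vecs)           # fixed-size sentence representation
        dec_in = sent_vec.transpose(0, 1).repeat(1, w, 1)    # sentence vector fed at every word step
        dec_words, _ = self.sent_decoder(dec_in)
        char_in = dec_words.reshape(b * w, 1, -1).repeat(1, c, 1)
        dec_chars, _ = self.word_decoder(char_in)
        return self.char_out(dec_chars).view(b, w, c, -1)    # logits over characters
```

In this sketch the decoders simply receive repeated copies of the higher-level vector so that the data flow between the four components stays visible; the actual models would typically decode autoregressively.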
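A similarly hedged illustration of the two perturbation techniques mentioned in the abstract: Gaussian noise added to the sentence representation, and sampling characters from the output distribution rather than taking the argmax. The noise level and sampling temperature are assumed values for illustration, not settings reported in the thesis.

```python
import torch

def perturb_sentence_vector(sent_vec, noise_std=0.1):
    # Add Gaussian noise to the fixed-size sentence representation so the
    # decoder reconstructs a nearby sentence rather than the exact input.
    # noise_std is an assumed illustrative value.
    return sent_vec + noise_std * torch.randn_like(sent_vec)

def sample_characters(char_logits, temperature=0.8):
    # Sample characters from the model's output layer instead of taking the
    # argmax, introducing further variation in the generated surface form.
    probs = torch.softmax(char_logits / temperature, dim=-1)
    return torch.distributions.Categorical(probs=probs).sample()
```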
Keywords
Deep Learning, Natural Language Processing, Semantic Representation, Language Modeling, Paraphrase