
DisSent: Sentence Representation Learning from Explicit Discourse Relations

2019-05-14
Allen Nie, Erin D. Bennett, Noah D. Goodman

Abstract

Learning effective representations of sentences is one of the core missions of natural language understanding. Existing models either train on a vast amount of text, or require costly, manually curated sentence relation datasets. We show that with dependency parsing and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. We show that our curated dataset provides an excellent signal for learning vector representations of sentence meaning, representing relations that can only be determined when the meanings of two sentences are combined. We demonstrate that the automatically curated corpus allows a bidirectional LSTM sentence encoder to yield high quality sentence embeddings and can serve as a supervised fine-tuning dataset for larger models such as BERT. We evaluate our sentence embeddings on a variety of transfer tasks, including SentEval. We achieve state-of-the-art results on the Penn Discourse Treebank implicit relation prediction task.
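
To make the curation idea concrete, here is a minimal sketch of turning explicit discourse markers into (sentence 1, marker, sentence 2) training examples. It is an illustration only: the abstract describes dependency parsing with rule-based rubrics, whereas this sketch substitutes a plain regular expression, and the marker list and the function name extract_pairs are assumptions made for the example, not taken from the paper.

```python
import re

# Hypothetical marker list for illustration; not the paper's actual marker set.
MARKERS = ["because", "although", "but", "so"]


def extract_pairs(text):
    """Return (s1, marker, s2) triples for sentences that contain a
    discourse marker with non-empty text on both sides.

    This regex rule is a simplified stand-in for the dependency-parse
    rules mentioned in the abstract.
    """
    pairs = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for marker in MARKERS:
            # Marker must be whitespace-delimited; an optional comma may
            # precede it, and trailing punctuation is dropped from s2.
            pattern = rf"^(.+?),?\s+{re.escape(marker)}\s+(.+?)[.!?]?$"
            match = re.match(pattern, sentence, flags=re.IGNORECASE)
            if match:
                pairs.append((match.group(1).strip(), marker, match.group(2).strip()))
                break  # keep at most one pair per sentence
    return pairs


if __name__ == "__main__":
    sample = "She stayed home because it was raining. I was tired, but I kept working."
    for s1, marker, s2 in extract_pairs(sample):
        print(f"{marker}: [{s1}] / [{s2}]")
```

Each triple can then be treated as a classification example: a sentence encoder embeds the two clauses and a classifier predicts the marker that joined them. A surface-level rule like this one will miss sentence-initial markers and can split at the wrong clause boundary, which is the kind of case a dependency parse is better suited to handle.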

URL

http://arxiv.org/abs/1710.04334

PDF

http://arxiv.org/pdf/1710.04334

