
Good-Enough Compositional Data Augmentation

2019-04-21
Jacob Andreas

Abstract

We propose a simple data augmentation protocol aimed at providing a compositional inductive bias in conditional and unconditional sequence models. Under this protocol, synthetic training examples are constructed by taking real training examples and replacing (possibly discontinuous) fragments with other fragments that appear in at least one similar environment. The protocol is model-agnostic and useful for a variety of tasks. Applied to neural sequence-to-sequence models, it reduces relative error rate by up to 87% on problems from the diagnostic SCAN tasks and 16% on a semantic parsing task. Applied to n-gram language modeling, it reduces perplexity by roughly 1% on small datasets in several languages.
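The fragment-substitution idea described above can be illustrated with a deliberately simplified sketch. In this sketch, fragments are contiguous unigrams and bigrams and an "environment" is the rest of the sentence with the fragment removed; the paper also handles discontinuous fragments and looser notions of shared environments. All function and variable names below are illustrative, not the author's released code.

```python
from collections import defaultdict

def augment(dataset, max_new=1000):
    """Simplified GECA-style augmentation over whitespace-tokenized strings.

    Two fragments are treated as interchangeable if they fill the same gap in
    at least one shared environment; synthetic examples are built by swapping
    interchangeable fragments inside other training sentences.
    """
    sentences = [tuple(s.split()) for s in dataset]

    # Map each environment (sentence with one n-gram gapped out) to the set
    # of fragments observed filling that gap in the training data.
    env_to_fragments = defaultdict(set)
    for sent in sentences:
        for n in (1, 2):
            for i in range(len(sent) - n + 1):
                fragment = sent[i:i + n]
                environment = (sent[:i], sent[i + n:])
                env_to_fragments[environment].add(fragment)

    # Fragments that share at least one environment may substitute for
    # one another.
    interchangeable = defaultdict(set)
    for fragments in env_to_fragments.values():
        for a in fragments:
            for b in fragments:
                if a != b:
                    interchangeable[a].add(b)

    # Generate synthetic examples by swapping fragments in other sentences,
    # keeping only strings not already present in the training data.
    synthetic = set()
    for sent in sentences:
        for a, substitutes in interchangeable.items():
            n = len(a)
            for i in range(len(sent) - n + 1):
                if sent[i:i + n] == a:
                    for b in substitutes:
                        new = sent[:i] + b + sent[i + n:]
                        if new not in sentences:
                            synthetic.add(" ".join(new))
                        if len(synthetic) >= max_new:
                            return sorted(synthetic)
    return sorted(synthetic)


print(augment(["the cat sang", "the dog sang", "the cat slept"]))
```

On this toy dataset, "cat" and "dog" share the environment "the ___ sang", so the sketch synthesizes "the dog slept" from "the cat slept"; the real protocol applies the same principle with discontinuous fragments and at much larger scale.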

URL

http://arxiv.org/abs/1904.09545

PDF

http://arxiv.org/pdf/1904.09545

