Low-Resource Syntactic Transfer with Unsupervised Source Reordering

Abstract

We describe a cross-lingual transfer method for dependency parsing that accounts for word order differences between source and target languages. Our model relies only on the Bible, a considerably smaller parallel corpus than the parallel data commonly used in transfer methods. We train on the concatenation of projected trees from the Bible corpus and the gold-standard treebanks of multiple source languages, along with cross-lingual word representations. We demonstrate that reordering the source treebanks before training on them for a target language improves accuracy for languages outside the European language family. Our experiments on 68 treebanks (38 languages) in the Universal Dependencies corpus achieve high accuracy for all languages. Among them, our experiments on 16 treebanks of 12 non-European languages achieve an average absolute UAS improvement of 3.3% over a state-of-the-art method.
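For readers unfamiliar with the metric, the unlabeled attachment score (UAS) mentioned in the abstract is the fraction of tokens whose predicted head matches the gold head, and an "absolute improvement" is a difference in percentage points. The following is a minimal sketch of that arithmetic, not the paper's evaluation code: the function name `uas`, the head-index encoding (0 for the root), and the toy head sequences are all assumptions made for illustration, and the sketch does not handle conventions such as excluding punctuation.

```python
from typing import Sequence


def uas(gold_heads: Sequence[int], pred_heads: Sequence[int]) -> float:
    """Fraction of tokens whose predicted head index equals the gold head index."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


# Toy data (hypothetical, not from the paper): head index per token, 0 = root.
gold = [2, 0, 2, 5, 3]
baseline_pred = [2, 0, 2, 3, 4]   # 3 of 5 heads correct
reordered_pred = [2, 0, 2, 5, 4]  # 4 of 5 heads correct

baseline_uas = uas(gold, baseline_pred)
reordered_uas = uas(gold, reordered_pred)

# Absolute improvement in percentage points.
absolute_improvement = (reordered_uas - baseline_uas) * 100
print(f"baseline UAS: {baseline_uas:.1%}")
print(f"reordered UAS: {reordered_uas:.1%}")
print(f"absolute improvement: {absolute_improvement:.1f} points")
```

In a corpus-level evaluation the counts would normally be pooled over all tokens in a treebank rather than computed per sentence; the sketch scores a single sentence only to keep the arithmetic visible.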

URL

http://arxiv.org/abs/1903.05683

PDF

http://arxiv.org/pdf/1903.05683
