Abstract
In recent years, Neural Machine Translation (NMT) has been proven to get impressive results. While some additional linguistic features of input words improve word-level NMT, any additional character features have not been used to improve character-level NMT so far. In this paper, we show that the radicals of Chinese characters (or kanji), as a character feature information, can be easily provide further improvements in the character-level NMT. In experiments on WAT2016 Japanese-Chinese scientific paper excerpt corpus (ASPEC-JP), we find that the proposed method improves the translation quality according to two aspects: perplexity and BLEU. The character-level NMT with the radical input feature’s model got a state-of-the-art result of 40.61 BLEU points in the test set, which is an improvement of about 8.6 BLEU points over the best system on the WAT2016 Japanese-to-Chinese translation subtask with ASPEC-JP. The improvements over the character-level NMT with no additional input feature are up to about 1.5 and 1.4 BLEU points in the development-test set and the test set of the corpus, respectively.
Abstract (translated by Google)
URL
https://arxiv.org/abs/1805.02937