Ancient-Modern Chinese Translation with a Large Training Dataset

Abstract

Ancient Chinese carries the wisdom and spiritual culture of the Chinese nation. Automatic translation from ancient Chinese to modern Chinese helps to inherit and carry forward the quintessence of the ancients. In this paper, we propose an Ancient-Modern Chinese clause alignment approach and apply it to create a large-scale Ancient-Modern Chinese parallel corpus containing about 1.24M bilingual pairs. To the best of our knowledge, this is the first large, high-quality Ancient-Modern Chinese dataset. Furthermore, we train SMT and various NMT-based models on this dataset and provide a strong baseline for this task.

URL

https://arxiv.org/abs/1808.03738

PDF

https://arxiv.org/pdf/1808.03738
