
Real-time Automatic Word Segmentation for User-generated Text

2019-05-07
Won Ik Cho, Sung Jun Cheon, Woo Hyun Kang, Ji Won Kim, Nam Soo Kim

Abstract

For readability and possibly for disambiguation, appropriate word segmentation is recommended for written text. In this paper, we propose a real-time assistive technology that utilizes automatic segmentation. The language investigated is Korean, a head-final language with various morpho-syllabic blocks as characters. The training scheme is fully neural network-based and straightforward. In addition, we show how the proposed system can be used for web-based real-time revision of user-generated text. Through qualitative and quantitative comparisons with widely used text processing toolkits, we show the reliability of the proposed system and how well it fits conversation-style and non-canonical texts. The demonstration is available online.
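
To make the task concrete: a common way to frame automatic spacing is as per-character binary labeling, where each character is tagged with whether a space should follow it. The sketch below illustrates only that data framing, under the assumption that it applies here; it is not taken from the paper and does not reproduce its network or training scheme. The function names `to_labels` and `apply_labels` are hypothetical.

```python
def to_labels(spaced: str) -> tuple[str, list[int]]:
    """Turn correctly spaced text into (unspaced characters, space flags).

    labels[i] == 1 means "a space follows chars[i]".
    This is an illustrative framing, not the paper's implementation.
    """
    chars: list[str] = []
    labels: list[int] = []
    for token in spaced.split():
        for i, ch in enumerate(token):
            chars.append(ch)
            # Mark the last character of each token as followed by a space.
            labels.append(1 if i == len(token) - 1 else 0)
    if labels:
        labels[-1] = 0  # no trailing space after the final character
    return "".join(chars), labels


def apply_labels(chars: str, labels: list[int]) -> str:
    """Rebuild spaced text from characters and (predicted) space flags."""
    out: list[str] = []
    for ch, flag in zip(chars, labels):
        out.append(ch)
        if flag:
            out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    chars, labels = to_labels("나는 학교에 간다")
    print(chars, labels)
    print(apply_labels(chars, labels))
```

In such a setup, a trained model would predict the flags from the unspaced characters, and `apply_labels` would turn those predictions back into segmented text.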

URL

http://arxiv.org/abs/1810.13113

PDF

http://arxiv.org/pdf/1810.13113

