Abstract
Recognition of Handwritten Mathematical Expressions (HMEs) is a challenging problem because of the ambiguity and complexity of two-dimensional handwriting. Moreover, the lack of large training data is a serious issue, especially for academic recognition systems. In this paper, we propose pattern generation strategies that generate shape and structural variations to improve the performance of recognition systems based on a small training set. For data generation, we employ the public databases: CROHME 2014 and 2016 of online HMEs. The first strategy employs local and global distortions to generate shape variations. The second strategy decomposes an online HME into sub-online HMEs to get more structural variations. The hybrid strategy combines both these strategies to maximize shape and structural variations. The generated online HMEs are converted to images for offline HME recognition. We tested our strategies in an end-to-end recognition system constructed from a recent deep learning model: Convolutional Neural Network and attention-based encoder-decoder. The results of experiments on the CROHME 2014 and 2016 databases demonstrate the superiority and effectiveness of our strategies: our hybrid strategy achieved classification rates of 48.78% and 45.60%, respectively, on these databases. These results are competitive compared to others reported in recent literature. Our generated datasets are openly available for research community and constitute a useful resource for the HME recognition research in future.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1901.06763