Abstract
With the current upsurge in the usage of social media platforms, the trend of using short text (microtext) in place of standard words has seen a significant rise. The usage of microtext poses a considerable performance issue in concept-level sentiment analysis, since models are trained on standard words. This paper discusses the impact of coupling sub-symbolic (phonetics) with symbolic (machine learning) Artificial Intelligence to transform the out-of-vocabulary concepts into their standard in-vocabulary form. The phonetic distance is calculated using the Sorensen similarity algorithm. The phonetically similar invocabulary concepts thus obtained are then used to compute the correct polarity value, which was previously being miscalculated because of the presence of microtext. Our proposed framework increases the accuracy of polarity detection by 6% as compared to the earlier model. This also validates the fact that microtext normalization is a necessary pre-requisite for the sentiment analysis task.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1905.01967