
SentiCap: Generating Image Descriptions with Sentiments

2015-12-13
Alexander Mathews, Lexing Xie, Xuming He

Abstract

Recent progress in image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from current systems. One such style is description with emotion, which is commonplace in everyday communication and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases, the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions, 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.
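
The abstract names a switching recurrent neural network but does not spell out how the switching works. As a rough illustration only, the sketch below assumes the switch is a per-word probability that blends two next-word distributions, one from a factual stream and one from a sentiment stream. The function names, the toy vocabulary, the scores, and the mixture form itself are assumptions made for this example, not details taken from the paper, and word-level regularization is not modeled here.

```python
import math

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def switched_word_distribution(factual_scores, sentiment_scores, switch_prob):
    """Mix two next-word distributions with a per-word switch probability.

    switch_prob is a hypothetical probability that the next word comes
    from the sentiment stream rather than the factual stream.
    """
    if not 0.0 <= switch_prob <= 1.0:
        raise ValueError("switch_prob must lie in [0, 1]")
    if len(factual_scores) != len(sentiment_scores):
        raise ValueError("both streams must score the same vocabulary")
    p_fact = softmax(factual_scores)
    p_sent = softmax(sentiment_scores)
    # Convex combination: the result is still a valid distribution.
    return [(1.0 - switch_prob) * f + switch_prob * s
            for f, s in zip(p_fact, p_sent)]

# Toy vocabulary and made-up scores, for illustration only.
vocab = ["dog", "park", "happy", "lovely"]
mixed = switched_word_distribution(
    [2.0, 1.5, 0.1, 0.1],  # factual stream favours content words
    [0.5, 0.5, 2.0, 2.0],  # sentiment stream favours emotional words
    0.7,                   # assumed switch probability for this step
)
best_word = vocab[max(range(len(vocab)), key=mixed.__getitem__)]
```

With a switch probability near 0 the mixture reduces to the factual distribution, and near 1 it follows the sentiment stream, which is one plausible way a small sentiment corpus could steer a captioner trained mostly on factual captions.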


URL

https://arxiv.org/abs/1510.01431

PDF

https://arxiv.org/pdf/1510.01431
