Rich Image Captioning in the Wild

2016-03-31
Kenneth Tran, Xiaodong He, Lei Zhang, Jian Sun, Cornelia Carapcea, Chris Thrasher, Chris Buehler, Chris Sienkiewicz

Abstract

We present an image captioning system that addresses new challenges of automatically describing images in the wild. The challenges include achieving high caption quality with respect to human judgments, handling out-of-domain data, and meeting the low latency required in many applications. Building on a state-of-the-art framework, we developed a deep vision model that detects a broad range of visual concepts, an entity recognition model that identifies celebrities and landmarks, and a confidence model for the caption output. Experimental results show that our caption engine significantly outperforms previous state-of-the-art systems on both an in-domain dataset (i.e., MS COCO) and out-of-domain datasets.

URL

https://arxiv.org/abs/1603.09016

PDF

https://arxiv.org/pdf/1603.09016

