
Generation and Comprehension of Unambiguous Object Descriptions

2016-04-11
Junhua Mao, Jonathan Huang, Alexander Toshev, Oana Camburu, Alan Yuille, Kevin Murphy

Abstract

We propose a method that can generate an unambiguous description (known as a referring expression) of a specific object or region in an image, and which can also comprehend or interpret such an expression to infer which object is being described. We show that our method outperforms previous methods that generate descriptions of objects without taking into account other potentially ambiguous objects in the scene. Our model is inspired by recent successes of deep learning methods for image captioning, but while image captioning is difficult to evaluate, our task allows for easy objective evaluation. We also present a new large-scale dataset for referring expressions, based on MS-COCO. We have released the dataset and a toolbox for visualization and evaluation.
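To make the "easy objective evaluation" claim concrete, here is a minimal sketch of how a comprehension result could be scored. It rests on assumptions the abstract does not state: that comprehension means picking the highest-scoring box from a set of candidate regions, and that a pick counts as correct when its intersection-over-union (IoU) with the ground-truth box reaches a threshold of 0.5. The names `iou`, `comprehend`, `comprehension_accuracy`, and the `score` callable are hypothetical and are not taken from the released toolbox.

```python
from typing import Callable, Sequence, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]
# Hypothetical stand-in for a trained model: higher means a better match
# between the expression and the region.
ScoreFn = Callable[[str, Box], float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def comprehend(expression: str, candidates: Sequence[Box], score: ScoreFn) -> Box:
    """Pick the candidate region that best matches the expression."""
    return max(candidates, key=lambda box: score(expression, box))


def comprehension_accuracy(
    examples: Sequence[Tuple[str, Sequence[Box], Box]],
    score: ScoreFn,
    threshold: float = 0.5,  # assumed cutoff, not given in the abstract
) -> float:
    """Fraction of examples whose chosen box overlaps the ground truth enough."""
    if not examples:
        return 0.0
    hits = sum(
        iou(comprehend(expr, cands, score), truth) >= threshold
        for expr, cands, truth in examples
    )
    return hits / len(examples)
```

Because the output is a single box that is compared against a known ground truth, the metric needs no human judgment, which is the contrast with caption evaluation that the abstract draws.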


URL

https://arxiv.org/abs/1511.02283

PDF

https://arxiv.org/pdf/1511.02283

