Sort Story: Sorting Jumbled Images and Captions into Stories

Abstract

Temporal common sense has applications in AI tasks such as QA, multi-document summarization, and human-AI communication. We propose the task of sequencing: given a jumbled set of aligned image-caption pairs that belong to a story, the task is to sort them so that the output sequence forms a coherent story. We present multiple approaches, via unary (position) and pairwise (order) predictions and their ensemble-based combinations, achieving strong results on this task. We use both text-based and image-based features, which yield complementary improvements. Using qualitative examples, we demonstrate that our models have learnt interesting aspects of temporal common sense.
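To make the idea of combining unary (position) and pairwise (order) predictions concrete, here is a minimal sketch in Python. It is not the paper's implementation: the function name `sort_story`, the score matrices, the `weight` parameter, and the brute-force search over permutations are all illustrative assumptions. The scores here are made-up numbers standing in for the outputs of trained models.

```python
from itertools import permutations


def sort_story(unary, pairwise, weight=0.5):
    """Return the ordering of story elements with the highest combined score.

    unary[i][p]    : hypothetical score for element i appearing at position p
    pairwise[i][j] : hypothetical score for element i appearing before element j
    weight         : assumed mixing weight between the two score types
    """
    n = len(unary)
    best_order, best_score = None, float("-inf")
    # Exhaustive search is only practical for short stories (n! orderings).
    for order in permutations(range(n)):
        # Unary term: how well each element fits the slot it is placed in.
        unary_score = sum(unary[item][pos] for pos, item in enumerate(order))
        # Pairwise term: how well each ordered pair agrees with the order model.
        pair_score = sum(
            pairwise[order[a]][order[b]]
            for a in range(n)
            for b in range(a + 1, n)
        )
        score = weight * unary_score + (1 - weight) * pair_score
        if score > best_score:
            best_order, best_score = order, score
    return list(best_order)


# Made-up scores for three jumbled image-caption pairs.
unary = [
    [0.1, 0.2, 0.7],  # element 0 most likely last
    [0.8, 0.1, 0.1],  # element 1 most likely first
    [0.1, 0.7, 0.2],  # element 2 most likely in the middle
]
pairwise = [
    [0.0, 0.1, 0.2],
    [0.9, 0.0, 0.9],
    [0.8, 0.1, 0.0],
]

predicted_order = sort_story(unary, pairwise)
print(predicted_order)
```

Setting `weight` to 1.0 or 0.0 reduces the sketch to a purely position-based or purely order-based predictor, which is one simple way to compare the two signals before combining them.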

URL

https://arxiv.org/abs/1606.07493

PDF

https://arxiv.org/pdf/1606.07493
