Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog

2019-02-01

Zhe Gan, Yu Cheng, Ahmed El Kholy, Linjie Li, Jingjing Liu, Jianfeng Gao

arXiv_CV

arXiv_CV Attention


Abstract

This paper presents the Recurrent Dual Attention Network (ReDAN) for visual dialog, which uses multi-step reasoning to answer a series of questions about an image. In each turn of the dialog, ReDAN infers the answer progressively through multiple reasoning steps. In each step, a recurrently updated semantic representation of the (refined) query is used for iterative reasoning over both the image and the previous dialog history. Experimental results on the VisDial v1.0 dataset show that the proposed ReDAN model outperforms prior state-of-the-art approaches across multiple evaluation metrics. Visualization of the iterative reasoning process further demonstrates that ReDAN can locate context-relevant visual and textual clues that lead to the correct answers step by step.
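
The multi-step loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the dot-product attention, the tanh update rule, the weight matrix `W`, the feature sizes, and the function names `attend` and `multi_step_reasoning` are all assumptions chosen for brevity. It only shows the general idea of a query state that is refined over several steps by attending to both image features and dialog-history features.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, memory):
    # Dot-product attention (an assumption, not the paper's attention form):
    # score each memory row against the query, then return the weighted sum.
    weights = softmax(memory @ query)
    return weights @ memory, weights

def multi_step_reasoning(query, image_feats, history_feats, W, num_steps=3):
    # Refine a query state over several steps using two attention reads,
    # one over image features and one over dialog-history features.
    state = query
    trace = []
    for _ in range(num_steps):
        visual, image_weights = attend(state, image_feats)
        textual, history_weights = attend(state, history_feats)
        # Hypothetical recurrent update: a single tanh layer over the
        # concatenated state and the two attended context vectors.
        state = np.tanh(W @ np.concatenate([state, visual, textual]))
        trace.append((image_weights, history_weights))
    return state, trace

# Random stand-in data: 5 image regions, 4 history turns, 8-dim features.
rng = np.random.default_rng(0)
d = 8
query = rng.normal(size=d)
image_feats = rng.normal(size=(5, d))
history_feats = rng.normal(size=(4, d))
W = rng.normal(size=(d, 3 * d)) * 0.1

final_state, trace = multi_step_reasoning(query, image_feats, history_feats, W)
```

The per-step attention weights collected in `trace` are the kind of quantity one would plot to inspect how attention shifts across reasoning steps, in the spirit of the visualization the abstract mentions.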

URL

http://arxiv.org/abs/1902.00579

PDF

http://arxiv.org/pdf/1902.00579
