Deep Reason: A Strong Baseline for Real-World Visual Reasoning

2019-05-24

Chenfei Wu, Yanzhao Zhou, Gen Li, Nan Duan, Duyu Tang, Xiaojie Wang

arXiv_CV

arXiv_CV QA Inference

Abstract

This paper presents a strong baseline for real-world visual reasoning (GQA) that achieves 60.93% in the GQA 2019 challenge, placing sixth. GQA is a large dataset with 22M questions involving spatial understanding and multi-step inference. To help further research in this area, we identify three crucial components that improve performance: multi-source features, a fine-grained encoder, and a score-weighted ensemble. We provide a series of analyses of their impact on performance.

URL

http://arxiv.org/abs/1905.10226

PDF

http://arxiv.org/pdf/1905.10226
