VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering

2016-12-12
Marc Bolaños, Álvaro Peris, Francisco Casacuberta, Petia Radeva
     

Abstract

In this paper, we address the problem of visual question answering by proposing a novel model, called VIBIKNet. Our model is based on integrating Kernelized Convolutional Neural Networks and Long Short-Term Memory (LSTM) units to generate an answer given a question about an image. We prove that VIBIKNet is an optimal trade-off between accuracy and computational load, in terms of memory and time consumption. We validate our method on the VQA challenge dataset and compare it to the top-performing methods in order to illustrate its performance and speed.


URL

https://arxiv.org/abs/1612.03628

PDF

https://arxiv.org/pdf/1612.03628

