Leveraging Video Descriptions to Learn Video Question Answering

2016-12-19
Kuo-Hao Zeng, Tseng-Hung Chen, Ching-Yao Chuang, Yuan-Hong Liao, Juan Carlos Niebles, Min Sun

Abstract

We propose a scalable approach to learning video-based question answering (QA): answering a “free-form natural language question” about video content. Our approach automatically harvests a large number of videos and descriptions freely available online. A large number of candidate QA pairs are then generated automatically from the descriptions rather than annotated manually. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN (Sukhbaatar et al. 2015), VQA (Antol et al. 2015), SA (Yao et al. 2015), and SS (Venugopalan et al. 2015). To handle imperfect candidate QA pairs, we propose a self-paced learning procedure that iteratively identifies them and mitigates their effects in training. Finally, we evaluate performance on manually generated video-based QA pairs. The results show that our self-paced learning procedure is effective, and that the extended SS model outperforms various baselines.
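The abstract does not spell out the self-paced learning procedure, so the following is only a generic illustration of the idea, not the authors' method. It assumes a toy setting in which a one-parameter linear model stands in for a QA model and a few corrupted labels stand in for imperfect QA pairs. The names fit_slope, self_paced_fit, and keep_fraction are hypothetical. Each round fits the model on the currently trusted samples, scores every sample by its loss, and keeps only the lowest-loss fraction for the next round.

```python
def fit_slope(xs, ys, weights):
    """Weighted least-squares fit of y = slope * x (line through the origin)."""
    num = sum(w * x * y for x, y, w in zip(xs, ys, weights))
    den = sum(w * x * x for x, w in zip(xs, weights))
    return num / den


def self_paced_fit(xs, ys, keep_fraction=0.8, rounds=5):
    """Generic self-paced loop (an assumption, not the paper's exact procedure).

    Alternates between fitting on trusted samples and re-selecting the
    samples whose loss is among the lowest keep_fraction of the data.
    """
    n = len(xs)
    weights = [1.0] * n  # start by trusting every sample
    keep = max(1, round(keep_fraction * n))
    for _ in range(rounds):
        slope = fit_slope(xs, ys, weights)
        losses = [(y - slope * x) ** 2 for x, y in zip(xs, ys)]
        threshold = sorted(losses)[keep - 1]
        # High-loss samples are treated as likely noise and given zero weight.
        weights = [1.0 if loss <= threshold else 0.0 for loss in losses]
    return fit_slope(xs, ys, weights), weights


# Toy data: y = 2x, with two deliberately corrupted labels (indices 3 and 8).
xs = list(range(1, 11))
ys = [2 * x for x in xs]
ys[3] = -8
ys[8] = 0

slope, weights = self_paced_fit(xs, ys)
print(slope, weights)
```

In this toy run the two corrupted samples end up with zero weight and the fitted slope recovers the clean value. A real video QA system would replace the least-squares fit with training a neural model and would need its own criterion for deciding which candidate pairs to down-weight.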

URL

https://arxiv.org/abs/1611.04021

PDF

https://arxiv.org/pdf/1611.04021

