
Learning Visual Features from Large Weakly Supervised Data

2015-11-06
Armand Joulin, Laurens van der Maaten, Allan Jabri, Nicolas Vasilache

Abstract

Convolutional networks trained on large supervised datasets produce visual features that form the basis for the state of the art in many computer-vision problems. Further improvements of these visual features will likely require even larger manually labeled datasets, which severely limits the pace at which progress can be made. In this paper, we explore the potential of leveraging massive, weakly labeled image collections for learning good visual features. We train convolutional networks on a dataset of 100 million Flickr photos and captions, and show that these networks produce features that perform well in a range of vision problems. We also show that the networks appropriately capture word similarity and learn correspondences between different languages.


URL

https://arxiv.org/abs/1511.02251

PDF

https://arxiv.org/pdf/1511.02251

