
Detecting unseen visual relations using analogies

2019-04-15
Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic

Abstract

We seek to detect visual relations in images in the form of triplets t = (subject, predicate, object), such as “person riding dog”, where training examples of the individual entities are available but their combinations are unseen at training time. This is an important set-up due to the combinatorial nature of visual relations: collecting sufficient training data for all possible triplets would be very hard. The contributions of this work are three-fold. First, we learn a representation of visual relations that combines (i) individual embeddings for subject, object and predicate together with (ii) a visual phrase embedding that represents the relation triplet. Second, we learn how to transfer visual phrase embeddings from existing training triplets to unseen test triplets using analogies between relations that involve similar objects. Third, we demonstrate the benefits of our approach on three challenging datasets: on HICO-DET, our model achieves significant improvement over a strong baseline on both frequent and unseen triplets, and we confirm this improvement on the retrieval of unseen triplets with out-of-vocabulary predicates on COCO-a, as well as on the challenging unusual triplets of the UnRel dataset.
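The analogy-based transfer described above can be sketched as follows. This is a hypothetical illustration with toy embeddings, not the authors' implementation: the idea is to estimate the visual phrase embedding of an unseen triplet (e.g. "person ride dog") by shifting the embedding of a seen, analogous triplet ("person ride horse") by the difference between the target and source object embeddings.

```python
import numpy as np

def transfer_phrase_embedding(w_source, obj_source, obj_target):
    """Analogy transfer of a visual phrase embedding (illustrative sketch).

    Shifts a seen triplet's phrase embedding by the displacement between
    the target and source object embeddings, e.g.
        w("person ride dog") ~ w("person ride horse") + w(dog) - w(horse)
    """
    return w_source + (obj_target - obj_source)

# Toy 4-d embeddings (hypothetical; in the paper these are learned jointly
# from visual and language features).
rng = np.random.default_rng(0)
w_ride_horse = rng.normal(size=4)  # phrase embedding of seen "person ride horse"
w_horse = rng.normal(size=4)       # object embedding of "horse"
w_dog = rng.normal(size=4)         # object embedding of "dog"

# Estimated phrase embedding for the unseen triplet "person ride dog".
w_ride_dog = transfer_phrase_embedding(w_ride_horse, w_horse, w_dog)
```

At test time, such a transferred embedding could be scored against candidate image regions in place of a phrase embedding that was never trained; the paper additionally selects and weights source triplets by how similar their objects are to the target's.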

URL

http://arxiv.org/abs/1812.05736

PDF

http://arxiv.org/pdf/1812.05736
