Abstract
In this supplementary material we tackle the problem of vehicle re-identification in a camera network utilizing triplet embeddings. Re-identification is the problem of matching appearances of objects across different cameras. With the proliferation of surveillance cameras enabling smart and safer cities, there is an ever-increasing need to re-identify vehicles across cameras. Typical challenges arising in smart city scenarios include variations of viewpoints, illumination and self-occlusions. Most successful approaches to re-identification involve (deep) learning an embedding space such that vehicles of the same identity are projected closer to one another than vehicles of different identities. Popular loss functions for learning such an embedding space are the contrastive and triplet losses. In this paper we provide an exhaustive evaluation of these losses applied to vehicle re-identification and demonstrate that following best practices for learning embeddings outperforms most of the previous approaches proposed in the literature. Compared to existing approaches, our approach is simpler in terms of both training and inference while maintaining comparable (and in most cases, better) accuracy and retrieval results.
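The triplet loss mentioned above penalizes an embedding whenever an anchor is not closer to a same-identity (positive) sample than to a different-identity (negative) sample by some margin. The following is a minimal NumPy sketch of that idea; the function name, margin value, and example embeddings are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge loss that pushes the anchor-positive distance below the
    anchor-negative distance by at least `margin` (illustrative values)."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to same identity
    d_neg = np.linalg.norm(anchor - negative)  # distance to different identity
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings: anchor and positive are close, negative is far away,
# so the margin constraint is satisfied and the loss is zero.
a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
n = np.array([-1.0, 0.0])
print(triplet_loss(a, p, n))  # 0.0, since d_neg - d_pos >= margin
```

In practice the loss is computed over batches of triplets and the margin is a tuned hyperparameter; this sketch only shows the per-triplet computation.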
URL
https://arxiv.org/abs/1901.01015