Abstract
The recognition ability of human beings is developed in a progressive way. Usually, children learn to discriminate various objects from coarse to fine-grained with limited supervision. Inspired by this learning process, we propose a simple yet effective model for the Few-Shot Fine-Grained (FSFG) recognition, which tries to tackle the challenging fine-grained recognition task using meta-learning. The proposed method, named Pairwise Alignment Bilinear Network (PABN), is an end-to-end deep neural network. Unlike traditional deep bilinear networks for fine-grained classification, which adopt the self-bilinear pooling to capture the subtle features of images, the proposed model uses a novel pairwise bilinear pooling to compare the nuanced differences between base images and query images for learning a deep distance metric. In order to match base image features with query image features, we design feature alignment losses before the proposed pairwise bilinear pooling. Experiment results on four fine-grained classification datasets and one generic few-shot dataset demonstrate that the proposed model outperforms both the state-ofthe-art few-shot fine-grained and general few-shot methods.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1904.03580