Abstract
In this paper we address the following question: given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalability of an automatic face recognition system. We cast the face capacity problem in terms of packing bounds on a low-dimensional manifold embedded within a deep representation space. By explicitly accounting for the manifold structure of the representation as well as two different sources of representational noise, epistemic (model) uncertainty and aleatoric (data) variability, our approach is able to estimate the capacity of a given face representation. To demonstrate the efficacy of our approach, we estimate the capacity of two deep neural network based face representations, namely 128-dimensional FaceNet and 512-dimensional SphereFace. Numerical experiments on unconstrained faces (IJB-C) provide a capacity upper bound of $2.7\times10^4$ for the FaceNet representation and $8.4\times10^4$ for the SphereFace representation at a false acceptance rate (FAR) of 1%. As expected, capacity reduces drastically at lower FARs. The capacity at FARs of 0.1% and 0.001% is $2.2\times10^3$ and $1.6\times10^{1}$, respectively, for FaceNet and $3.6\times10^3$ and $6.0\times10^0$, respectively, for SphereFace.
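To give a feel for what a packing bound looks like, the following is a minimal sketch and not the estimator used in the paper. It assumes, purely for illustration, that the population of identities occupies a Euclidean ball of radius `support_radius` in a `dim`-dimensional space and that each identity occupies a ball of radius `class_radius`. The function names, the radii, and the dimension are all hypothetical. Under these assumptions, the number of non-overlapping identity balls is at most the ratio of the two volumes.

```python
import math


def log10_ball_volume(dim: int, radius: float) -> float:
    """log10 of the volume of a dim-dimensional Euclidean ball."""
    # V = pi^(d/2) / Gamma(d/2 + 1) * r^d, evaluated in log space to avoid overflow.
    log_volume = (
        (dim / 2) * math.log(math.pi)
        - math.lgamma(dim / 2 + 1)
        + dim * math.log(radius)
    )
    return log_volume / math.log(10)


def log10_packing_bound(dim: int, support_radius: float, class_radius: float) -> float:
    """log10 of a volume-ratio upper bound on the number of disjoint class balls.

    Hypothetical setting: every class ball lies inside the support ball,
    so their count cannot exceed V(support) / V(class).
    """
    return log10_ball_volume(dim, support_radius) - log10_ball_volume(dim, class_radius)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    bound = log10_packing_bound(dim=13, support_radius=2.0, class_radius=1.0)
    print(f"log10(capacity upper bound) = {bound:.3f}")
```

For balls the bound reduces to `(support_radius / class_radius) ** dim`, so a larger class radius (a wider per-identity region) shrinks the bound quickly. This gives some intuition, under the stated assumptions, for why the reported capacity falls so sharply at lower FARs.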
URL
http://arxiv.org/abs/1709.10433