Abstract
In this paper, we investigate the reliability of online recognition platforms, Amazon Rekognition and Microsoft Azure, with respect to changes in background, acquisition device, and object orientation. We focus on platforms that are commonly used by the public to better understand their real-world performances. To assess the variation in recognition performance, we perform a controlled experiment by changing the acquisition conditions one at a time. We use three smartphones, one DSLR, and one webcam to capture side views and overhead views of objects in a living room, an office, and photo studio setups. Moreover, we introduce a framework to estimate the recognition performance with respect to backgrounds and orientations. In this framework, we utilize both handcrafted features based on color, texture, and shape characteristics and data-driven features obtained from deep neural networks. Experimental results show that deep learning-based image representations can estimate the recognition performance variation with a Spearman’s rank-order correlation of 0.94 under multifarious acquisition conditions.
Abstract (translated by Google)
URL
http://arxiv.org/abs/1902.06585