Abstract
We propose model with larger spatial size of feature maps and evaluate it on object detection task. With the goal to choose the best feature extraction network for our model we compare several popular lightweight networks. After that we conduct a set of experiments with channels reduction algorithms in order to accelerate execution. Our vehicle detection models are accurate, fast and therefore suit for embedded visual applications. With only 1.5 GFLOPs our best model gives 93.39 AP on validation subset of challenging DETRAC dataset. The smallest of our models is the first to achieve real-time inference speed on CPU with reasonable accuracy drop to 91.43 AP.
Abstract (translated by Google)
URL
https://arxiv.org/abs/1707.01395