
Soft Conditional Computation

2019-04-10
Brandon Yang, Gabriel Bender, Quoc V. Le, Jiquan Ngiam

Abstract

Conditional computation aims to increase the size and accuracy of a network, at a small increase in inference cost. Previous hard-routing models explicitly route the input to a subset of experts. We propose soft conditional computation, which, in contrast, utilizes all experts while still permitting efficient inference through parameter routing. Concretely, for a given convolutional layer, we wish to compute a linear combination of $n$ experts $\alpha_1 (W_1 x) + \cdots + \alpha_n (W_n x)$, where $\alpha_1, \ldots, \alpha_n$ are functions of the input learned through gradient descent. A straightforward evaluation requires $n$ convolutions. We propose an equivalent form of the above computation, $(\alpha_1 W_1 + \cdots + \alpha_n W_n) x$, which requires only a single convolution. We demonstrate the efficacy of our method, named CondConv, by scaling up the MobileNetV1, MobileNetV2, and ResNet-50 model architectures to achieve higher accuracy while retaining efficient inference. On the ImageNet classification dataset, CondConv improves the top-1 validation accuracy of the MobileNetV1(0.5x) model from 63.8% to 71.6% while only increasing inference cost by 27%. On COCO object detection, CondConv improves the minival mAP of a MobileNetV1(1.0x) SSD model from 20.3 to 22.4 with just a 4% increase in inference cost.
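
The equivalence between the two forms follows from the linearity of convolution in its kernel. The sketch below checks it numerically on a toy 1-D case in plain Python. It is an illustration only, not the authors' implementation: the helper names (conv1d, routing_weights, mix_outputs, mix_kernels), the random kernels, and in particular the routing function are assumptions made for this example, since the abstract does not specify how the α values are computed.

```python
import math
import random


def conv1d(kernel, signal):
    """Valid-mode 1-D cross-correlation of `signal` with `kernel`."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def routing_weights(signal, router):
    """Hypothetical routing function (an assumption for this sketch):
    one sigmoid per expert, driven by the mean of the input."""
    mean = sum(signal) / len(signal)
    return [1.0 / (1.0 + math.exp(-r * mean)) for r in router]


def mix_outputs(kernels, alphas, signal):
    """alpha_1 (W_1 x) + ... + alpha_n (W_n x): costs n convolutions."""
    outs = [conv1d(w, signal) for w in kernels]
    return [
        sum(a * out[i] for a, out in zip(alphas, outs))
        for i in range(len(outs[0]))
    ]


def mix_kernels(kernels, alphas, signal):
    """(alpha_1 W_1 + ... + alpha_n W_n) x: costs one convolution."""
    k = len(kernels[0])
    mixed = [sum(a * w[j] for a, w in zip(alphas, kernels)) for j in range(k)]
    return conv1d(mixed, signal)


# Toy sizes and random values, chosen arbitrarily for the demonstration.
rng = random.Random(0)
n_experts, kernel_size, length = 4, 3, 16
kernels = [
    [rng.uniform(-1, 1) for _ in range(kernel_size)] for _ in range(n_experts)
]
router = [rng.uniform(-1, 1) for _ in range(n_experts)]
x = [rng.uniform(-1, 1) for _ in range(length)]

alphas = routing_weights(x, router)   # input-dependent mixing weights
slow = mix_outputs(kernels, alphas, x)
fast = mix_kernels(kernels, alphas, x)

# The two forms should agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(slow, fast))
print(max_diff < 1e-9)
```

In this sketch the kernel mixing costs n_experts × kernel_size multiply-adds per example, independent of the input length, while the output mixing repeats the full convolution once per expert. That gap is the source of the inference savings the abstract describes.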

URL

http://arxiv.org/abs/1904.04971

PDF

http://arxiv.org/pdf/1904.04971

