Showing papers by "Andrew Howard" published in 2017


Posted Content
TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
Abstract: We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build lightweight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right-sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
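The core building block here is the depthwise separable convolution: a depthwise 3x3 convolution that filters each input channel independently, followed by a pointwise 1x1 convolution that mixes channels. A minimal sketch, assuming PyTorch (the abstract names no framework; the class and argument names are illustrative), with `width_mult` standing in for the paper's width-multiplier hyper-parameter that thins every layer uniformly:

```python
# Minimal sketch of a depthwise separable convolution block, assuming
# PyTorch. Class and argument names are illustrative, not the paper's
# reference code.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1, width_mult=1.0):
        super().__init__()
        in_ch = max(1, int(in_ch * width_mult))
        out_ch = max(1, int(out_ch * width_mult))
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: a 1x1 convolution that mixes channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.depthwise(x)))
        return self.relu(self.bn2(self.pointwise(x)))

# With width_mult=0.5, a nominal 32->64 block becomes 16->32:
block = DepthwiseSeparableConv(32, 64, width_mult=0.5)
y = block(torch.randn(1, 16, 112, 112))  # -> shape (1, 32, 112, 112)
```

Factoring a standard convolution into these two steps is what cuts the multiply-accumulate count, roughly by a factor of the kernel area; the second hyper-parameter, a resolution multiplier, simply shrinks the input image.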

14,406 citations


Posted Content
TL;DR: In this article, the authors propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware.
Abstract: The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating-point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post-quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
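A scheme of this kind represents each real value r by an 8-bit code q through an affine map r = S * (q - Z), where the scale S is real and the zero-point Z is an integer, so that matrix products can be accumulated entirely in int32. A hedged NumPy sketch of the idea (function names are illustrative, and the float rescaling step stands in for what hardware would do in fixed point; this is not the paper's reference implementation):

```python
# Hedged NumPy sketch of affine quantization, r = S * (q - Z), with a
# matrix multiply whose accumulation is pure integer arithmetic.
import numpy as np

def quantize(r, scale, zero_point):
    """Map real values to uint8 codes: q = round(r / S) + Z."""
    q = np.round(r / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    """Recover approximate reals: r = S * (q - Z)."""
    return scale * (q.astype(np.int32) - zero_point)

def quantized_matmul(qa, pa, qb, pb, pout):
    """Multiply two quantized matrices; accumulate in int32, emit uint8.

    pa, pb, pout are (scale, zero_point) pairs. On integer-only hardware
    the real multiplier m = Sa * Sb / Sout would be realized as a
    fixed-point multiply plus a bit shift; a float stands in for it here.
    """
    (sa, za), (sb, zb), (so, zo) = pa, pb, pout
    acc = (qa.astype(np.int32) - za) @ (qb.astype(np.int32) - zb)
    m = sa * sb / so
    return np.clip(np.round(m * acc) + zo, 0, 255).astype(np.uint8)

# Round trip recovers values to within half a scale step:
w = np.array([-0.50, 0.00, 0.70])
print(dequantize(quantize(w, 0.01, 128), 0.01, 128))  # [-0.5  0.  0.7]
```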

42 citations


Patent
06 Oct 2017
TL;DR: In this patent, a neural network system is configured to receive an input image and to generate a classification output for the input image using a separable convolution subnetwork comprising a plurality of separable CNN layers arranged in a stack one after the other.
Abstract: A neural network system is configured to receive an input image and to generate a classification output for the input image. The neural network system includes a separable convolution subnetwork comprising a plurality of separable convolutional neural network layers arranged in a stack one after the other, in which each separable convolutional neural network layer is configured to separately apply both a depthwise convolution and a pointwise convolution during processing of an input to generate a layer output.
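Read concretely, the claim describes something like the following: a stack of separable layers (each a depthwise 3x3 followed by a pointwise 1x1) feeding a classification head. A hedged PyTorch sketch, with layer count, channel widths, and class count chosen for illustration since the abstract fixes none of them:

```python
# Hedged PyTorch sketch of the claimed system: stacked separable
# convolutional layers followed by a classification output. All sizes
# below are illustrative assumptions.
import torch
import torch.nn as nn

def separable_layer(in_ch, out_ch, stride=1):
    """One separable layer: depthwise conv, then pointwise conv."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                  groups=in_ch, bias=False),      # depthwise
        nn.Conv2d(in_ch, out_ch, 1, bias=False),  # pointwise
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SeparableClassifier(nn.Module):
    """Receives an input image; generates a classification output."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.stem = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        # Separable layers arranged in a stack, one after the other.
        self.stack = nn.Sequential(
            separable_layer(32, 64),
            separable_layer(64, 128, stride=2),
            separable_layer(128, 128),
        )
        self.head = nn.Linear(128, num_classes)

    def forward(self, image):
        x = self.stack(self.stem(image))
        x = x.mean(dim=(2, 3))  # global average pool
        return self.head(x)     # classification output

logits = SeparableClassifier()(torch.randn(1, 3, 224, 224))  # (1, 1000)
```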

7 citations