Top 6 papers published by Andrew Howard from Google in 2021

Proceedings Article

BasisNet: Two-stage Model Synthesis for Efficient Inference

Mingda Zhang¹, Chun-Te Chu², Andrey Zhmoginov², Andrew Howard², Brendan Jou², Yukun Zhu², Li Zhang², Rebecca Hwa¹, Adriana Kovashka¹

University of Pittsburgh¹, Google²

04 May 2021

TL;DR: BasisNet uses a lightweight model to preview the input and generate input-dependent combination coefficients, which then control the synthesis of a more accurate specialist model that makes the final prediction.

Abstract: In this work, we present BasisNet which combines recent advancements in efficient neural network architectures, conditional computation, and early termination in a simple new form. Our approach incorporates a lightweight model to preview the input and generate input-dependent combination coefficients, which later controls the synthesis of a more accurate specialist model to make final prediction. The two-stage model synthesis strategy can be applied to any network architectures and both stages are jointly trained. We also show that proper training recipes are critical for increasing generalizability for such high capacity neural networks. On ImageNet classification benchmark, our BasisNet with MobileNets as backbone demonstrated clear advantage on accuracy-efficiency trade-off over several strong baselines. Specifically, BasisNet-MobileNetV3 obtained 80.3% top-1 accuracy with only 290M Multiply-Add operations, halving the computational cost of previous state-of-the-art without sacrificing accuracy. With early termination, the average cost can be further reduced to 198M MAdds while maintaining accuracy of 80.0% on ImageNet.
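
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the single linear layers, the softmax used to normalize the coefficients, and all names (`lightweight_preview`, `synthesize_specialist`, `w_preview`, `bases`) are assumptions chosen only to show how input-dependent coefficients could combine a set of basis weights into one specialist layer.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def lightweight_preview(x, w_preview):
    """Stage 1 (hypothetical): map the input to one coefficient per basis."""
    # Softmax normalization is an assumption made for this sketch.
    return softmax(w_preview @ x)


def synthesize_specialist(coeffs, bases):
    """Combine basis weights, shape (N, out_dim, in_dim), into one (out_dim, in_dim) matrix."""
    return np.tensordot(coeffs, bases, axes=1)


def predict(x, w_preview, bases):
    """Stage 2: run the input through the synthesized, input-specific layer."""
    coeffs = lightweight_preview(x, w_preview)
    w_specialist = synthesize_specialist(coeffs, bases)
    return w_specialist @ x, coeffs


# Toy dimensions, chosen arbitrarily for illustration.
num_bases, in_dim, out_dim = 4, 8, 3
w_preview = rng.normal(size=(num_bases, in_dim))
bases = rng.normal(size=(num_bases, out_dim, in_dim))
x = rng.normal(size=in_dim)

logits, coeffs = predict(x, w_preview, bases)
```

In this toy version each input gets its own weight matrix, while the only extra cost over a fixed layer is the small preview step and the weighted sum of the bases.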

4 citations

Proceedings Article

Discovering Multi-Hardware Mobile Models via Architecture Search

Grace Chu¹, Okan Arikan¹, Gabriel Bender¹, Weijun Wang¹, Achille Brighton¹, Pieter-Jan Kindermans¹, Hanxiao Liu¹, Berkin Akin¹, Suyog Gupta¹, Andrew Howard¹

Google¹

01 Jun 2021

TL;DR: For applications deployed on multiple hardware platforms, this paper proposes multi-hardware models, in which a single architecture is developed for all target hardware, avoiding the inconsistent outputs and duplicated debugging work that come with maintaining separate single-hardware models.

Abstract: Hardware-aware neural architecture designs have been predominantly focusing on optimizing model performance on single hardware and model development complexity, where another important factor, model deployment complexity, has been largely ignored. In this paper, we argue that, for applications that may be deployed on multiple hardware, having different single-hardware models across the deployed hardware makes it hard to guarantee consistent outputs across hardware and duplicates engineering work for debugging and fixing. To minimize such deployment cost, we propose an alternative solution, multi-hardware models, where a single architecture is developed for multiple hardware. With thoughtful search space design and incorporating the proposed multi-hardware metrics in neural architecture search, we discover multi-hardware models that give state-of-the-art (SoTA) performance across multiple hardware in both average and worst case scenarios. For performance on individual hardware, the single multi-hardware model yields similar or better results than SoTA performance on accelerators like GPU, DSP and EdgeTPU which was achieved by different models, while having similar performance with MobilenetV3 Large Minimalistic model on mobile CPU.
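
The abstract does not define its multi-hardware metrics, so the following is only a rough sketch of what scoring one architecture across several devices in the average and worst case might look like. The function name, the candidate names, and every number below are invented placeholders for illustration, not results from the paper.

```python
from statistics import mean


def aggregate_across_hardware(per_hardware_cost):
    """Summarize one architecture's cost (e.g. latency) over all target devices."""
    costs = list(per_hardware_cost.values())
    return {"average": mean(costs), "worst_case": max(costs)}


# Hypothetical per-device costs for two candidate architectures (made-up values).
candidates = {
    "arch_a": {"cpu": 10.0, "gpu": 4.0, "dsp": 6.0},
    "arch_b": {"cpu": 8.0, "gpu": 7.0, "dsp": 7.5},
}

scores = {name: aggregate_across_hardware(cost) for name, cost in candidates.items()}

# The two criteria can disagree: one candidate wins on average, the other in the worst case.
best_by_average = min(scores, key=lambda name: scores[name]["average"])
best_by_worst_case = min(scores, key=lambda name: scores[name]["worst_case"])
```

A search that optimizes only the average could pick a model that is slow on one device, which is one reason to track the worst case as well.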

3 citations

Proceedings Article

Multi-path Neural Networks for On-device Multi-domain Visual Classification

Qifei Wang¹, Junjie Ke¹, Joshua Greaves¹, Grace Chu¹, Gabriel Bender¹, Luciano Sbaiz¹, Alec Go¹, Andrew Howard¹, Ming-Hsuan Yang¹, Jeff Gilbert¹, Peyman Milanfar¹, Feng Yang¹

Google¹

01 Jan 2021

TL;DR: In this paper, a multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.

Abstract: Learning multiple domains/tasks with a single model is important for improving data efficiency and lowering inference cost for numerous vision tasks, especially on resource-constrained mobile devices. However, hand-crafting a multi-domain/task model can be both tedious and challenging. This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices. The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space. An adaptive balanced domain prioritization algorithm is proposed to balance optimizing the joint model on multiple domains simultaneously. The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths. This approach effectively reduces the total number of parameters and FLOPS, encouraging positive knowledge transfer while mitigating negative interference across domains. Extensive evaluations on the Visual Decathlon dataset demonstrate that the proposed multi-path model achieves state-of-the-art performance in terms of accuracy, model size, and FLOPS against other approaches using MobileNetV3-like architectures. Furthermore, the proposed method improves average accuracy over learning single-domain models individually, and reduces the total number of parameters and FLOPS by 78% and 32% respectively, compared to the approach that simply bundles single-domain models for multi-domain learning.

2 citations

Posted Content

BasisNet: Two-stage Model Synthesis for Efficient Inference

Mingda Zhang¹, Chun-Te Chu², Andrey Zhmoginov², Andrew Howard², Brendan Jou², Yukun Zhu², Li Zhang², Rebecca Hwa¹, Adriana Kovashka¹

University of Pittsburgh¹, Google²

07 May 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: BasisNet uses a lightweight model to preview the input and generate input-dependent combination coefficients, which then control the synthesis of a more accurate specialist model that makes the final prediction.

Abstract: In this work, we present BasisNet which combines recent advancements in efficient neural network architectures, conditional computation, and early termination in a simple new form. Our approach incorporates a lightweight model to preview the input and generate input-dependent combination coefficients, which later controls the synthesis of a more accurate specialist model to make final prediction. The two-stage model synthesis strategy can be applied to any network architectures and both stages are jointly trained. We also show that proper training recipes are critical for increasing generalizability for such high capacity neural networks. On ImageNet classification benchmark, our BasisNet with MobileNets as backbone demonstrated clear advantage on accuracy-efficiency trade-off over several strong baselines. Specifically, BasisNet-MobileNetV3 obtained 80.3% top-1 accuracy with only 290M Multiply-Add operations, halving the computational cost of previous state-of-the-art without sacrificing accuracy. With early termination, the average cost can be further reduced to 198M MAdds while maintaining accuracy of 80.0% on ImageNet.

1 citation

Posted Content

Bridging the Gap Between Object Detection and User Intent via Query-Modulation

Marco Fornoni¹, Chaochao Yan², Liangchen Luo¹, Kimberly Wilber¹, Alex Stark¹, Yin Cui¹, Boqing Gong¹, Andrew Howard¹

Google¹, University of Texas at Arlington²

18 Jun 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the authors investigate techniques to modulate standard object detectors to explicitly account for user intent, expressed as an embedding of a simple query, and show superior performance at detecting objects for a given label of interest.

Abstract: When interacting with objects through cameras, or pictures, users often have a specific intent. For example, they may want to perform a visual search. However, most object detection models ignore the user intent, relying on image pixels as their only input. This often leads to incorrect results, such as lack of a high-confidence detection on the object of interest, or detection with a wrong class label. In this paper we investigate techniques to modulate standard object detectors to explicitly account for the user intent, expressed as an embedding of a simple query. Compared to standard object detectors, query-modulated detectors show superior performance at detecting objects for a given label of interest. Thanks to large-scale training data synthesized from standard object detection annotations, query-modulated detectors can also outperform specialized referring expression recognition systems. Furthermore, they can be simultaneously trained to solve for both query-modulated detection and standard object detection.

Posted Content

SpotPatch: Parameter-Efficient Transfer Learning for Mobile Object Detection

Keren Ye¹, Adriana Kovashka¹, Mark Sandler², Menglong Zhu, Andrew Howard², Marco Fornoni²

University of Pittsburgh¹, Google²

04 Jan 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors address the question: can task-specific detectors be trained and represented as a shared set of weights, plus a very small set of additional weights for each task?

Abstract: Deep learning based object detectors are commonly deployed on mobile devices to solve a variety of tasks. For maximum accuracy, each detector is usually trained to solve one single specific task, and comes with a completely independent set of parameters. While this guarantees high performance, it is also highly inefficient, as each model has to be separately downloaded and stored. In this paper we address the question: can task-specific detectors be trained and represented as a shared set of weights, plus a very small set of additional weights for each task? The main contributions of this paper are the following: 1) we perform the first systematic study of parameter-efficient transfer learning techniques for object detection problems; 2) we propose a technique to learn a model patch with a size that is dependent on the difficulty of the task to be learned, and validate our approach on 10 different object detection tasks. Our approach achieves similar accuracy as previously proposed approaches, while being significantly more compact.
