Proceedings Article
Machine Learning at Facebook: Understanding Inference at the Edge
Carole-Jean Wu,David Brooks,Kevin Chen,Douglas Chen,Sy Choudhury,Marat Dukhan,Kim Hazelwood,Eldad Isaac,Yangqing Jia,Bill Jia,Tommer Leyvand,Hao Lu,Yang Lu,Lin Qiao,Brandon Reagen,Joe Spisak,Fei Sun,Andrew Tulloch,Peter Vajda,Xiaodong Wang,Yanghan Wang,Bram Wasti,Yiming Wu,Ran Xian,Sungjoo Yoo,Peizhao Zhang +26 more
- pp 331-344
TL;DR: This paper takes a data-driven approach to presenting the opportunities and design challenges Facebook faces in enabling machine learning inference locally on smartphones and other edge platforms.
Abstract:
At Facebook, machine learning provides a wide range of capabilities that drive many aspects of the user experience, including ranking posts, content understanding, object detection and tracking for augmented and virtual reality, and speech and text translation. While machine learning models are currently trained on customized datacenter infrastructure, Facebook is working to bring machine learning inference to the edge. Doing so improves the user experience by reducing latency (inference time) and makes it less dependent on network connectivity. It also enables many more applications of deep learning, with important features available only at the edge. This paper takes a data-driven approach to presenting the opportunities and design challenges Facebook faces in enabling machine learning inference locally on smartphones and other edge platforms.
Citations
Journal Article
Edge Intelligence: Paving the Last Mile of Artificial Intelligence With Edge Computing
TL;DR: This paper presents a comprehensive survey of recent research on edge intelligence. It reviews the background and motivation for running AI at the network edge and gives an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training and inference at the edge.
Posted Content
Edge Intelligence: Paving the Last Mile of Artificial Intelligence with Edge Computing
TL;DR: This paper conducts a comprehensive survey of recent research on edge intelligence (EI), providing an overview of the overarching architectures, frameworks, and emerging key technologies for deep learning model training and inference at the network edge.
Proceedings Article
The Architectural Implications of Facebook's DNN-Based Personalized Recommendation
Udit Gupta,Carole-Jean Wu,Xiaodong Wang,Maxim Naumov,Brandon Reagen,David Brooks,Bradford Cottel,Kim Hazelwood,Mark Hempstead,Bill Jia,Hsien-Hsin S. Lee,Andrey Malevich,Dheevatsa Mudigere,Mikhail Smelyanskiy,Liang Xiong,Xuan Zhang +15 more
TL;DR: This work presents a set of real-world, production-scale DNNs for personalized recommendation, together with relevant performance metrics for evaluation, and conducts an in-depth analysis that underpins future system design and optimization for at-scale recommendation.
Proceedings Article
SPINN: synergistic progressive inference of neural networks over device and cloud
TL;DR: SPINN is a distributed inference system that combines synergistic device-cloud computation with a progressive inference method to deliver fast and robust CNN inference across diverse settings. It operates robustly under uncertain connectivity conditions and achieves significant energy savings compared to cloud-centric execution.
Proceedings Article
Sparse Tensor Core: Algorithm and Hardware Co-Design for Vector-wise Sparse Neural Networks on Modern GPUs
TL;DR: A novel pruning algorithm is devised to improve the workload balance and reduce the decoding overhead of sparse neural networks, and new instructions and micro-architecture optimizations are proposed for Tensor Core to adapt it to structurally sparse neural networks.
References
Book Chapter
U-Net: Convolutional Networks for Biomedical Image Segmentation
TL;DR: Ronneberger et al. propose a network and training strategy that relies heavily on data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Proceedings Article
Going deeper with convolutions
Christian Szegedy,Wei Liu,Yangqing Jia,Pierre Sermanet,Scott Reed,Dragomir Anguelov,Dumitru Erhan,Vincent Vanhoucke,Andrew Rabinovich +8 more
TL;DR: Inception, the deep convolutional neural network architecture proposed in this paper, set a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Posted Content
U-Net: Convolutional Networks for Biomedical Image Segmentation
TL;DR: It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Proceedings Article
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Proceedings Article
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.