Proceedings ArticleDOI

A Quick Survey on Large Scale Distributed Deep Learning Systems

TLDR
A simple and quick survey of distributed deep learning systems from the algorithm, distributed-system, and applications perspectives, with an analysis of the restrictions and prospects of distributed methods.
Abstract
Deep learning has been widely used across many fields and plays a major role in them. With this gradual penetration, the data volume of each application is growing tremendously, and so are the computational complexity and the number of model parameters. As an obvious result, training and inference are time-consuming: for example, training a classic ResNet-50 classification model on the ImageNet data set takes 14 days on an NVIDIA M40 GPU. Distributed acceleration is therefore a very useful way to dispatch the computation of training, and even inference, across many nodes in parallel and speed up the whole process. Facebook's work and UC Berkeley's work can train the ResNet-50 model within an hour and within minutes, respectively, using distributed deep learning algorithms and systems. Like other forms of distributed acceleration, this makes it possible to shrink the training of large models on large data sets from weeks to minutes, giving researchers and developers more room to explore and search. However, beyond acceleration, what other issues does a distributed deep learning system face? Where is the upper limit of acceleration? Which applications will acceleration be used for? What is the price and cost of acceleration? In this paper, we take a simple and quick survey of distributed deep learning systems from the algorithm perspective, the distributed-system perspective, and the applications perspective. We present several recent notable works and analyze the restrictions and prospects of distributed methods.
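As a rough illustration of the synchronous data-parallel scheme that such accelerations build on, the sketch below trains one ResNet-50 replica per worker and averages gradients every step. It assumes PyTorch with torchvision, the gloo backend (nccl on GPUs), a launcher such as torchrun that sets RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT, and random tensors standing in for ImageNet batches; it is not the implementation of the systems surveyed here.

# Minimal sketch of synchronous data-parallel training (illustrative only).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torchvision.models import resnet50

def main():
    dist.init_process_group(backend="gloo")       # "nccl" on GPU clusters
    model = DDP(resnet50())                       # gradients are averaged across workers each step
    # Linear learning-rate scaling with the number of workers (illustrative choice).
    opt = torch.optim.SGD(model.parameters(), lr=0.1 * dist.get_world_size(), momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(10):                           # toy loop; a real run iterates over an ImageNet loader
        x = torch.randn(32, 3, 224, 224)
        y = torch.randint(0, 1000, (32,))
        opt.zero_grad()
        loss_fn(model(x), y).backward()           # backward() triggers the all-reduce of gradients
        opt.step()                                # every worker applies the same averaged update
    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with, for example, torchrun --nproc_per_node=4 train.py, each process consumes its own shard of the data while the gradient all-reduce keeps the model replicas identical, which is how adding workers shortens wall-clock training time.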


Citations
Posted Content

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey.

TL;DR: A comprehensive survey of communication-efficient distributed training algorithms covering both system-level and algorithmic-level optimizations, helping readers understand which algorithms are more efficient under specific distributed environments and extrapolate potential directions for further optimization.
Journal ArticleDOI

Blockchain-Enabled Cross-Domain Object Detection for Autonomous Driving: A Model Sharing Approach

TL;DR: A novel blockchain-enabled model sharing approach is proposed to improve the performance of object detection with cross-domain adaptation for autonomous driving systems using a domain-adaptive you-only-look-once (YOLOv2) model.
Journal ArticleDOI

Distributed Machine Learning for Wireless Communication Networks: Techniques, Architectures, and Applications

TL;DR: The latest applications of DML in power control, spectrum management, user association, and edge cloud computing, and the potential adversarial attacks faced by DML applications are reviewed, and state-of-the-art countermeasures to preserve privacy and security are described.
Journal ArticleDOI

A Systematic Literature Review on Distributed Machine Learning in Edge Computing

TL;DR: The challenges of running ML/DL on edge devices in a distributed way are investigated, paying special attention to how techniques are adapted or designed to execute on these restricted devices.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting models won 1st place in the ILSVRC 2015 classification task.
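A minimal sketch of the residual block idea described above, assuming PyTorch; it is an illustration, not the authors' exact implementation.

# One basic residual block: the stack learns a residual F(x) and the
# identity shortcut makes very deep networks easier to optimize.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)                # add the shortcut, then activate

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)    # torch.Size([1, 64, 56, 56])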
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: State-of-the-art image classification performance is achieved by a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
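A rough sketch of the architecture described above (five convolutional layers, max-pooling after some of them, and three fully-connected layers ending in a 1000-way classifier), assuming PyTorch; the layer sizes are illustrative, not a faithful reproduction of the original model.

# Five conv layers (some followed by max-pooling), then three FC layers.
import torch
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),                        # logits; the softmax lives in the loss
)
print(alexnet_like(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 1000])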
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
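A minimal sketch of the per-mini-batch normalization that Batch Normalization applies to layer inputs, assuming PyTorch; it shows only the core operation, not the paper's full training recipe (e.g., the running statistics used at inference time).

# Normalize each feature over the mini-batch, then apply a learned
# scale (gamma) and shift (beta).
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, unbiased=False, keepdim=True)
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta

x = torch.randn(32, 128)                          # a mini-batch of 32 activation vectors
y = batch_norm(x, torch.ones(128), torch.zeros(128))
print(y.mean().item(), y.std().item())            # approximately 0 and 1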
Proceedings ArticleDOI

Learning Transferable Architectures for Scalable Image Recognition

TL;DR: NASNet searches for an architectural building block on a small dataset and then transfers the block to a larger dataset, enabling transferability and achieving state-of-the-art performance.