Open Access · Proceedings Article
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
Urs Köster, Tristan J. Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William Howard Constable, Oguz H. Elibol, Scott Gray, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey K. Kloss, Ruby J. Pai, Naveen G. Rao, +13 more
Vol. 30, pp. 1742–1752
TLDR
The 16-bit Flexpoint data format as discussed by the authors is a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications.
Abstract:
Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the \emph{neon} deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
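The core idea of the abstract, storing a tensor as integer mantissas that share a single exponent chosen from the tensor's dynamic range, can be sketched in a few lines. This is a minimal illustration of shared-exponent quantization, not the paper's actual Autoflex exponent-management algorithm; the function names and the exponent-selection policy (fit the largest magnitude into the signed mantissa range) are assumptions for illustration.

```python
import numpy as np

def flexpoint_quantize(x, mantissa_bits=16):
    """Quantize a tensor to a Flexpoint-like representation:
    integer mantissas sharing one power-of-two exponent, chosen so
    the largest magnitude in the tensor just fits without overflow.
    (Illustrative sketch, not the paper's exact scheme.)"""
    max_mag = np.max(np.abs(x))
    if max_mag == 0:
        return np.zeros(x.shape, dtype=np.int32), 0
    max_int = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa
    # Smallest exponent such that max_mag / 2**exp still fits in max_int.
    exp = int(np.ceil(np.log2(max_mag / max_int)))
    scale = 2.0 ** exp
    mantissas = np.clip(np.round(x / scale), -max_int - 1, max_int)
    return mantissas.astype(np.int32), exp

def flexpoint_dequantize(mantissas, exp):
    """Recover floating-point values from mantissas and shared exponent."""
    return mantissas.astype(np.float64) * (2.0 ** exp)

x = np.array([0.5, -1.25, 3.0, 0.001])
m, e = flexpoint_quantize(x)
x_hat = flexpoint_dequantize(m, e)
```

Because all elements share one exponent, arithmetic on the mantissas reduces to fixed-point integer operations, which is where the hardware efficiency claimed in the abstract comes from; the cost is that values much smaller than the tensor's maximum lose precision (here, 0.001 is rounded to the nearest multiple of 2^-13).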
Citations
Proceedings Article
Zero-Shot Text-to-Image Generation
Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever, +7 more
TL;DR: This work describes a simple approach based on a transformer that autoregressively models the text and image tokens as a single stream of data that is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
Journal ArticleDOI
Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey
TL;DR: This article reviews the mainstream compression approaches such as compact model, tensor decomposition, data quantization, and network sparsification, and answers the question of how to leverage these methods in the design of neural network accelerators and present the state-of-the-art hardware architectures.
Proceedings ArticleDOI
A configurable cloud-scale DNN processor for real-time AI
Jeremy Fowers, Kalin Ovtcharov, Michael K. Papamichael, Todd Massengill, Ming Liu, Lo Daniel, Shlomi Alkalay, Michael Haselman, Logan Adams, Mahdi Ghandi, Stephen F. Heil, Prerak Patel, Adam Sapek, Gabriel Weisz, Lisa Woods, Sitaram Lanka, Steven K. Reinhardt, Adrian M. Caulfield, Eric S. Chung, Doug Burger, +19 more
TL;DR: This paper describes the NPU architecture for Project Brainwave, a production-scale system for real-time AI, and achieves more than an order of magnitude improvement in latency and throughput over state-of-the-art GPUs on large RNNs at a batch size of 1.
Journal ArticleDOI
Demystifying Parallel and Distributed Deep Learning: An In-depth Concurrency Analysis
Tal Ben-Nun, Torsten Hoefler, +1 more
TL;DR: This survey describes the parallelization of DNN training from a theoretical perspective, reviews approaches to achieving it in practice, and extrapolates potential directions for parallelism in deep learning.
Proceedings ArticleDOI
Data-Free Quantization Through Weight Equalization and Bias Correction
TL;DR: This work introduces a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection, and achieves near-original model performance on common computer vision architectures and tasks.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification.
Posted Content
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek G. Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul A. Tucker, Vincent Vanhoucke, Vijay K. Vasudevan, Fernanda B. Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng, +39 more
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Book ChapterDOI
Identity Mappings in Deep Residual Networks
TL;DR: In this paper, it is shown that forward and backward signals can be directly propagated from one block to any other block when identity mappings are used as the skip connections and after-addition activation.
Posted Content
GANs Trained by a Two Time-Scale Update Rule Converge to a Nash Equilibrium
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Günter Klambauer, Sepp Hochreiter, +5 more
TL;DR: In this article, a two time-scale update rule (TTUR) was proposed for training GANs with stochastic gradient descent on arbitrary GAN loss functions, using separate learning rates for the discriminator and the generator.