Open Access Posted Content

Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error

TLDR
A new theoretical framework for model compression is developed and a new pruning method called spectral pruning is proposed based on this framework, which defines the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes.
Abstract
Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for analyzing the generalization error of deep learning, known as the compression-based error bound. However, there is still a huge gap between practically effective compression methods and their rigorous grounding in statistical learning theory. To resolve this issue, we develop a new theoretical framework for model compression and propose a new pruning method called "spectral pruning" based on this framework. We define the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes, and show that the compression ability is essentially controlled by this quantity. Moreover, we present a sharp generalization error bound for the compressed model and characterize the bias--variance tradeoff induced by the compression procedure. We apply our method to several datasets to justify our theoretical analyses and show the superiority of the proposed method.
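To make the "degrees of freedom" idea concrete, the following is a minimal NumPy sketch (not the authors' code), assuming the ridge-style definition df(lambda) = sum_j s_j / (s_j + lambda) over the eigenvalues s_j of the empirical covariance of a layer's activations; the paper's exact definition and pruning objective may differ in details.

```python
# Minimal sketch: estimating the "degrees of freedom" of a hidden layer
# from the eigenvalue spectrum of the empirical covariance of its
# activations. The formula df(lam) = sum_j s_j / (s_j + lam) is an
# assumption based on the standard ridge-regression quantity.
import numpy as np

def degrees_of_freedom(activations: np.ndarray, lam: float) -> float:
    """activations: (n_samples, n_nodes) matrix of internal-node outputs."""
    centered = activations - activations.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / activations.shape[0]  # empirical covariance
    eigvals = np.linalg.eigvalsh(cov)                   # nonnegative spectrum
    return float(np.sum(eigvals / (eigvals + lam)))

# Toy example: a layer with a fast-decaying covariance spectrum has small
# degrees of freedom, suggesting it can be pruned to fewer nodes.
rng = np.random.default_rng(0)
acts = rng.standard_normal((1000, 64)) @ np.diag(np.linspace(1.0, 0.01, 64))
print(degrees_of_freedom(acts, lam=0.1))
```

Roughly speaking, a layer whose covariance spectrum decays quickly has small degrees of freedom, and under this framework it can be compressed to correspondingly few nodes with a controlled bias--variance tradeoff.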


Citations
Posted Content

Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network

TL;DR: A unified framework is given that can convert compression-based bounds into bounds for non-compressed original networks, and a data-dependent generalization error bound is derived that yields a tighter evaluation than data-independent ones.
Proceedings ArticleDOI

Rate-Distortion Theoretic Generalization Bounds for Stochastic Learning Algorithms

TL;DR: This study proves novel generalization bounds through the lens of rate-distortion theory, and explicitly relates the concepts of mutual information, compressibility, and fractal dimensions in a single mathematical framework.
Journal ArticleDOI

Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility

TL;DR: The infinite-width limit of deep feedforward neural networks whose weights are dependent and modelled via a mixture of Gaussian distributions is studied, and it is shown that, in this regime, the weights are compressible and feature learning is possible.
Posted Content

Understanding the Effects of Pre-Training for Object Detectors via Eigenspectrum

TL;DR: In this article, the eigenspectrum dynamics of the covariance matrix of each feature map in object detectors are analyzed in order to understand the effect of pre-training on detector performance.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings ArticleDOI

Densely Connected Convolutional Networks

TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
Journal ArticleDOI

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

TL;DR: Quantitative assessments show that SegNet provides good performance, with competitive inference time and the most efficient inference memory usage compared with other architectures, including FCN and DeconvNet.