Open Access · Posted Content
Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error
Taiji Suzuki, Hiroshi Abe, Tomoya Murata, Shingo Horiuchi, Kotaro Ito, Tokuma Wachi, So Hirai, Masatoshi Yukishima, Tomoaki Nishimura
TL;DR: A new theoretical framework for model compression is developed, and a new pruning method called spectral pruning is proposed based on this framework; it defines the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes.

Abstract: Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for analyzing the generalization error of deep learning, known as the compression-based error bound. However, there is still a huge gap between practically effective compression methods and their rigorous grounding in statistical learning theory. To resolve this issue, we develop a new theoretical framework for model compression and propose a new pruning method called spectral pruning based on this framework. We define the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes, and show that the compression ability is essentially controlled by this quantity. Moreover, we present a sharp generalization error bound for the compressed model and characterize the bias-variance tradeoff induced by the compression procedure. We apply our method to several datasets to justify our theoretical analyses and show the superiority of the proposed method.
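To make the abstract's central quantity concrete, below is a minimal sketch (not the authors' implementation) of the degrees of freedom N(lam) = sum_j mu_j / (mu_j + lam), computed from the eigenvalues mu_j of the empirical covariance matrix of a layer's activations. The regularization parameter `lam` and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def degrees_of_freedom(activations, lam=1e-3):
    """Estimate the intrinsic dimensionality of a hidden layer.

    activations: (n_samples, n_nodes) matrix of hidden-layer outputs
    lam: regularization parameter (illustrative choice; the paper ties
         it to the bias-variance tradeoff of the compression)
    """
    # Empirical covariance across the internal nodes.
    cov = np.cov(activations, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)        # real, ascending
    eigvals = np.clip(eigvals, 0.0, None)    # guard against tiny negatives
    # Degrees of freedom: sum_j mu_j / (mu_j + lam).
    # Fast eigenvalue decay => small value => the layer compresses well.
    return float(np.sum(eigvals / (eigvals + lam)))

# Toy usage: activations whose covariance spectrum decays geometrically.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 64)) * (0.9 ** np.arange(64))
print(degrees_of_freedom(acts))  # well below the nominal width of 64
```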
Citations
Posted Content
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
TL;DR: A unified framework is given that can convert compression-based bounds into bounds for the non-compressed original networks, and it yields a data-dependent generalization error bound that is tighter than data-independent ones.
Proceedings Article
Rate-Distortion Theoretic Generalization Bounds for Stochastic Learning Algorithms
TL;DR: This study proves novel generalization bounds through the lens of rate-distortion theory and explicitly relates the concepts of mutual information, compressibility, and fractal dimensions within a single mathematical framework.
Journal Article
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
TL;DR: This work studies the infinite-width limit of deep feedforward neural networks whose weights are dependent and modelled via a mixture of Gaussian distributions, and shows that, in this regime, the weights are compressible and feature learning is possible.
Posted Content
Understanding the Effects of Pre-Training for Object Detectors via Eigenspectrum
TL;DR: This article analyzes the eigenspectrum dynamics of the covariance matrix of each feature map in object detectors in order to explain the effect of pre-training on detector performance.
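As a rough sketch of this style of analysis (the flattening convention below is an assumption made for illustration, not the paper's exact procedure), one can compute the eigenspectrum of the channel covariance of a feature map, treating channels as variables and every spatial position in every image as a sample:

```python
import torch

def feature_map_eigenspectrum(fmap):
    """Eigenspectrum of the channel covariance of one feature map.

    fmap: (batch, channels, height, width) activations from a detector
          backbone. Channels are treated as variables and spatial
          positions as samples (an illustrative convention).
    """
    b, c, h, w = fmap.shape
    x = fmap.permute(0, 2, 3, 1).reshape(-1, c)   # (b*h*w, c)
    x = x - x.mean(dim=0, keepdim=True)
    cov = x.T @ x / (x.shape[0] - 1)
    return torch.linalg.eigvalsh(cov).flip(0)      # descending order

# Comparing this spectrum with and without pre-training indicates how
# pre-training reshapes the feature covariance.
spectrum = feature_map_eigenspectrum(torch.randn(8, 256, 14, 14))
print(spectrum[:5])
```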
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting residual networks won first place in the ILSVRC 2015 classification task.
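The core idea fits in a few lines; the following is a minimal PyTorch sketch of a residual block (the published blocks also use batch normalization and projection shortcuts, omitted here for brevity):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Minimal residual block: learn F(x) and output F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        # The identity shortcut turns the layers into a residual
        # function, which eases optimization of very deep stacks.
        return F.relu(out + x)

y = ResidualBlock(16)(torch.randn(1, 16, 32, 32))
print(y.shape)  # (1, 16, 32, 32)
```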
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, showing that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
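A quick parameter count illustrates why very small filters help: two stacked 3x3 convolutions cover the same 5x5 receptive field as one 5x5 convolution, with fewer weights and an extra nonlinearity in between (the channel count below is illustrative):

```python
# Weights in a conv layer: k * k * C_in * C_out (biases ignored).
C = 64
stacked_3x3 = 2 * (3 * 3 * C * C)   # two 3x3 layers: 73,728 weights
single_5x5 = 5 * 5 * C * C          # one 5x5 layer: 102,400 weights
print(stacked_3x3, single_5x5)      # same receptive field, ~28% fewer weights
```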
Proceedings Article
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced: a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than existing image datasets.
Proceedings Article
Densely Connected Convolutional Networks
TL;DR: This paper proposes DenseNet, which connects each layer to every other layer in a feed-forward fashion, alleviating the vanishing-gradient problem, strengthening feature propagation, encouraging feature reuse, and substantially reducing the number of parameters.
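A minimal PyTorch sketch of this connectivity pattern (the published blocks add batch normalization, bottleneck layers, and transition layers, omitted here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBlock(nn.Module):
    """Minimal dense block: each layer sees all previous feature maps."""
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                      kernel_size=3, padding=1)
            for i in range(num_layers)
        )

    def forward(self, x):
        features = [x]
        for conv in self.layers:
            # Concatenate everything produced so far, then add only
            # growth_rate new channels; this feature reuse is what
            # keeps the parameter count small.
            out = F.relu(conv(torch.cat(features, dim=1)))
            features.append(out)
        return torch.cat(features, dim=1)

y = DenseBlock(16, growth_rate=12, num_layers=4)(torch.randn(1, 16, 32, 32))
print(y.shape)  # (1, 16 + 4*12, 32, 32)
```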
Journal Article
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
TL;DR: Quantitative assessments show that SegNet provides good performance, with competitive inference time and the most memory-efficient inference compared to other architectures, including FCN and DeconvNet.