Open Access · Posted Content

Complexity of Linear Regions in Deep Networks

TLDR
In this article, it is shown that for networks at initialization, the average number of linear regions along any one-dimensional subspace grows linearly in the total number of neurons, far below the exponential upper bound, and that the average distance to the nearest region boundary scales like the inverse of the number of neurons.
Abstract
It is well-known that the expressivity of a neural network depends on its architecture, with deeper networks expressing more complex functions. In the case of networks that compute piecewise linear functions, such as those with ReLU activation, the number of distinct linear regions is a natural measure of expressivity. It is possible to construct networks with merely a single region, or for which the number of linear regions grows exponentially with depth; it is not clear where within this range most networks fall in practice, either before or after training. In this paper, we provide a mathematical framework to count the number of linear regions of a piecewise linear network and measure the volume of the boundaries between these regions. In particular, we prove that for networks at initialization, the average number of regions along any one-dimensional subspace grows linearly in the total number of neurons, far below the exponential upper bound. We also find that the average distance to the nearest region boundary at initialization scales like the inverse of the number of neurons. Our theory suggests that, even after training, the number of linear regions is far below exponential, an intuition that matches our empirical observations. We conclude that the practical expressivity of neural networks is likely far below that of the theoretical maximum, and that this gap can be quantified.
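As a quick empirical illustration of the quantity the abstract describes, the Python sketch below initializes a random ReLU network and counts the distinct activation patterns, i.e. linear regions, encountered along a random line through input space. This is a minimal sketch and not the authors' framework: the widths, sampling density, and helpers such as `regions_along_line` are our own illustrative choices. Per the paper's result, the count should grow roughly linearly with the total number of neurons rather than exponentially with depth.

```python
# Minimal sketch (not the authors' code): count the linear regions a randomly
# initialized ReLU network induces along a random 1-D line in input space,
# by tracking changes in the network's activation pattern.
import numpy as np

rng = np.random.default_rng(0)

def init_network(widths):
    """He-style initialization for a fully connected ReLU network."""
    params = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_out, n_in))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def activation_pattern(params, x):
    """Binary on/off pattern of every ReLU for input x."""
    pattern = []
    h = x
    for W, b in params[:-1]:            # final layer is linear, no ReLU
        h = W @ h + b
        pattern.append(h > 0)
        h = np.maximum(h, 0.0)
    return np.concatenate(pattern)

def regions_along_line(params, x0, direction, t_max=10.0, n_samples=20000):
    """Count distinct linear regions crossed along x0 + t * direction.

    Dense sampling can miss regions narrower than the sampling step,
    so this is a lower bound on the true count.
    """
    ts = np.linspace(-t_max, t_max, n_samples)
    count = 1
    prev = activation_pattern(params, x0 + ts[0] * direction)
    for t in ts[1:]:
        cur = activation_pattern(params, x0 + t * direction)
        if not np.array_equal(cur, prev):
            count += 1
            prev = cur
    return count

widths = [16, 32, 32, 32, 1]            # input dim 16, three hidden layers
params = init_network(widths)
x0 = rng.normal(size=widths[0])
d = rng.normal(size=widths[0])
d /= np.linalg.norm(d)

print("hidden neurons:", sum(widths[1:-1]))
print("regions along a random line:", regions_along_line(params, x0, d))
```

Rerunning with deeper or wider networks of the same total neuron count should, under the paper's theory, yield region counts of a similar order along the line, far below the exponential-in-depth worst case.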


Citations
Posted Content

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

TL;DR: The success of GNNs in extrapolating algorithmic tasks to new data relies on encoding task-specific non-linearities in the architecture or features; the authors formulate this as a hypothesis and provide theoretical and empirical evidence for it.
Journal ArticleDOI

Model complexity of deep learning: a survey

TL;DR: In this article, the authors provide a systematic overview of recent studies on model complexity in deep learning and propose several future directions, including model generalization, model optimization, and model selection and design.
Posted Content

Liquid Time-constant Networks

TL;DR: This work introduces a new class of time-continuous recurrent neural network models that construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates, and demonstrates the approximation capability of Liquid Time-Constant Networks (LTCs) compared to modern RNNs.
Proceedings Article

Gradient Dynamics of Shallow Univariate ReLU Networks

TL;DR: A theoretical and empirical study of the gradient dynamics of overparameterized shallow ReLU networks with one-dimensional input solving least-squares interpolation; it shows that learning in the kernel regime yields smooth, curvature-minimizing interpolants that reduce to cubic splines for uniform initializations.
Proceedings ArticleDOI

Interpreting Deep Learning-Based Networking Systems

TL;DR: Metis, as discussed by the authors, is a framework that provides interpretability for two general categories of networking problems spanning local and global control; it introduces two interpretation methods, based on decision trees and hypergraphs, that convert DNN policies into interpretable rule-based controllers and highlight critical components through hypergraph analysis.
References
Posted Content

Understanding deep learning requires rethinking generalization

TL;DR: The authors showed that deep neural networks can fit a random labeling of the training data, and that this phenomenon is qualitatively unaffected by explicit regularization and occurs even if the true images are replaced by completely unstructured random noise.
Proceedings Article

Do Deep Nets Really Need to be Deep?

TL;DR: This paper empirically demonstrates that shallow feed-forward nets can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models, in some cases using a total number of parameters similar to the original deep model.
Proceedings Article

A closer look at memorization in deep networks

TL;DR: The analysis suggests that dataset-independent notions of effective capacity are unlikely to explain the generalization performance of deep networks trained with gradient-based methods, because the training data itself plays an important role in determining the degree of memorization.