Random matrices and complexity of spin glasses
TLDR
This study enables detailed information about the bottom of the energy landscape, including the absolute minimum and the other local minima, and describes an interesting layered structure of the low critical values for the Hamiltonians of these models.
Abstract:
We give an asymptotic evaluation of the complexity of spherical p-spin spin-glass models via random matrix theory. This study enables us to obtain detailed information about the bottom of the energy landscape, including the absolute minimum (the ground state) and the other local minima, and to describe an interesting layered structure of the low critical values for the Hamiltonians of these models. We also show that our approach allows us to compute the related TAP-complexity and extend the results known in the physics literature. As an independent tool, we prove a LDP for the k-th largest eigenvalue of the GOE, extending the results of (BDG01).
How many critical values of given index and below a given level does a typical random Morse function have on a high dimensional manifold? Our work addresses this question in a very special case. We look at certain natural random Gaussian functions on the N-dimensional sphere known as p-spin spherical spin glass models. We cannot yet answer the question above about the typical number, but we can study thoroughly the mean number, which we show is exponentially large in N. We introduce a new identity, based on the classical Kac-Rice formula, relating random matrix theory and the problem of counting these critical values. Using this identity and tools from random matrix theory, we give an asymptotic evaluation of the complexity of these spherical spin-glass models. The complexity mentioned here is defined as the mean number of critical points of given index whose value is below (or above) a given level. This includes the important question of counting the mean number of local minima below a given level, and in particular the question of finding the ground state energy (the minimal value of the Hamiltonian). We show that this question is directly related to the study of the edge of the spectrum of the Gaussian Orthogonal Ensemble (GOE).
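As a rough numerical illustration of the "edge of the spectrum" of the GOE mentioned above, the following sketch samples one GOE matrix and returns its k largest eigenvalues. The normalization is an assumption made here for convenience (off-diagonal variance 1/N, diagonal variance 2/N, so the limiting semicircle law is supported on [-2, 2]); the paper may use a different scaling. The function names `sample_goe` and `top_eigenvalues` are hypothetical helpers, not taken from the source.

```python
import numpy as np


def sample_goe(n, rng):
    """Sample an n x n GOE matrix under the assumed normalization:
    off-diagonal variance 1/n, diagonal variance 2/n."""
    a = rng.standard_normal((n, n))
    # Symmetrizing a Gaussian matrix yields independent entries up to symmetry.
    return (a + a.T) / np.sqrt(2 * n)


def top_eigenvalues(n, k, seed=0):
    """Return the k largest eigenvalues of one GOE sample, in decreasing order."""
    rng = np.random.default_rng(seed)
    eigs = np.linalg.eigvalsh(sample_goe(n, rng))  # ascending order
    return eigs[::-1][:k]


if __name__ == "__main__":
    # With this scaling the largest eigenvalues should sit close to the edge at 2.
    print(top_eigenvalues(n=400, k=3))
```

For moderately large N, the printed values cluster just around 2 under this scaling. The large deviation principle for the k-th largest eigenvalue quantifies how unlikely it is for them to sit far from that edge.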
The question of computing the complexity of mean-field spin glass models has recently been thoroughly studied in the physics literature (see for example (CLR03) and the references therein), mainly for a different measure of the complexity, i.e. the mean number of solutions to the Thouless-Anderson-Palmer equations, or TAP-complexity. Our approach to the complexity enables us to recover known results in the physics literature about TAP-complexity, to compute the ground state energy (when p is even), and to describe an interesting layered structure of the low energy levels of the Hamiltonians of these models, which might prove useful for the study of the metastability of Langevin dynamics for these models (in longer time scales than those studied in (BDG01)).
The paper is organised as follows. In Section 2, we give our main results. In Section 3, we prove two main formulas (Theorems 2.1 and 2.2), relating random matrix theory (specifically the GOE) and spherical spin glasses. These formulas are direct consequences of the Kac-Rice formula (we learned the version needed here in the book (AT07); for another modern account see (AW09)). The main new ingredient is the fact that, for spherical spin-glass models, the Hessian of the Hamiltonian at a critical point, conditioned on the value of the Hamiltonian, is a symmetric Gaussian random matrix with independent entries (up to symmetry) plus a diagonal matrix. This implies, in particular, that it is possible to…
Citations
Proceedings Article
The Loss Surfaces of Multilayer Networks
TL;DR: In this paper, the authors study the connection between the loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of variable independence, redundancy in network parametrization, and uniformity.
Proceedings Article
Gradient Descent Only Converges to Minimizers
TL;DR: The authors showed that gradient descent converges to a local minimizer almost surely with random initialization, by applying the Stable Manifold Theorem from dynamical systems theory.
Posted Content
Identity Matters in Deep Learning
Moritz Hardt, Tengyu Ma
TL;DR: This work gives a strikingly simple proof that arbitrarily deep linear residual networks have no spurious local optima and shows that residual networks with ReLU activations have universal finite-sample expressivity, in the sense that the network can represent any function of its sample provided that the model has more parameters than the sample size.
Posted Content
Gradient Descent Converges to Minimizers.
TL;DR: It is shown that gradient descent converges to a local minimizer, almost surely with random initialization, by applying the Stable Manifold Theorem from dynamical systems theory.
Posted Content
Neural networks as Interacting Particle Systems: Asymptotic convexity of the Loss Landscape and Universal Scaling of the Approximation Error
TL;DR: A Law of Large Numbers and a Central Limit Theorem for the empirical distribution are established, which together show that the approximation error of the network universally scales as $O(n^{-1})$, and the scale and nature of the noise introduced by stochastic gradient descent are quantified.
References
Book
Large Deviations Techniques and Applications
Amir Dembo, Ofer Zeitouni
TL;DR: The LDP for abstract empirical measures, applications in the finite-dimensional case, and applications of the empirical measures LDP are presented.
Book
Random Fields and Geometry
Robert J. Adler, Jonathan Taylor
TL;DR: Random Fields and Geometry as discussed by the authors is a comprehensive survey of the general theory of Gaussian random fields with a focus on geometric problems arising in the study of random fields, including continuity and boundedness, entropy and majorizing measures, Borell and Slepian inequalities.
Journal ArticleDOI
Strong asymptotics of orthogonal polynomials with respect to exponential weights
TL;DR: In this paper, asymptotics of orthogonal polynomials with respect to weights $w(x)\,dx = e^{-Q(x)}\,dx$ on the real line were considered.
BookDOI
Level sets and extrema of random processes and fields
Jean-Marc Azaïs, Mario Wschebor
TL;DR: In this article, the authors present a generalization of the Rice series for Gaussian processes with continuous paths and show that it is invariant under orthogonal transformations and translations.
Journal ArticleDOI
Large deviations for Wigner's law and Voiculescu's non-commutative entropy
G. Ben Arous, Alice Guionnet
TL;DR: In this paper, the authors studied the spectral measure of Gaussian Wigner matrices, proved that it satisfies a large deviation principle, and showed that the good rate function which governs this principle achieves its minimum value at Wigner's semicircular law.