Book Chapter
Generalization Error of Linear Neural Networks in Unidentifiable Cases
Kenji Fukumizu
pp. 51–62
TLDR
It is shown that the expectation of the generalization error in the unidentifiable cases is larger than what is given by the usual asymptotic theory, and depends on the rank of the target function.
Abstract
The statistical asymptotic theory is often used in theoretical results in computational and statistical learning theory. It describes the limiting distribution of the maximum likelihood estimator (MLE) as a normal distribution. However, in layered models such as neural networks, the regularity condition of the asymptotic theory is not necessarily satisfied. The true parameter is not identifiable if the target function can be realized by a network of smaller size than the model. Little has been known about the behavior of the MLE in these cases for neural networks. In this paper, we analyze the expectation of the generalization error of three-layer linear neural networks and elucidate a strange behavior in unidentifiable cases. We show that the expectation of the generalization error in the unidentifiable cases is larger than what is given by the usual asymptotic theory, and depends on the rank of the target function.
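As a minimal illustration of the setting in the abstract (my own sketch, not the paper's experiment): a three-layer linear network computes a rank-constrained linear map f(x) = BAx, and with Gaussian noise its MLE can be obtained by least squares followed by SVD truncation (reduced-rank regression). When the true map has rank smaller than the number of hidden units, the model is unidentifiable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: input L, hidden H, output M.  A three-layer linear network
# computes f(x) = B A x, i.e. a linear map of rank at most H.
L, H, M = 10, 5, 10
n_train, n_test = 200, 10_000
sigma = 0.1  # observation noise standard deviation

def fit_rank_h(X, Y, h):
    """MLE of a rank-h linear map Y ~ X C: unconstrained least squares
    followed by SVD truncation to rank h (reduced-rank regression with
    isotropic Gaussian noise)."""
    C_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
    U, s, Vt = np.linalg.svd(C_ls, full_matrices=False)
    s[h:] = 0.0
    return U @ np.diag(s) @ Vt

def gen_error(true_rank):
    """Excess mean squared prediction error of the rank-H MLE when the
    target map has the given rank (realizable by the model)."""
    C0 = (rng.standard_normal((L, true_rank)) @
          rng.standard_normal((true_rank, M)))
    X = rng.standard_normal((n_train, L))
    Y = X @ C0 + sigma * rng.standard_normal((n_train, M))
    C_hat = fit_rank_h(X, Y, H)
    Xt = rng.standard_normal((n_test, L))
    return np.mean((Xt @ C_hat - Xt @ C0) ** 2)

# true_rank < H makes the true parameter unidentifiable; the paper shows
# that the expected generalization error then differs from the value the
# regular asymptotic theory would predict.
errs = {r: np.mean([gen_error(r) for _ in range(20)]) for r in (2, 5)}
```

The constants (dimensions, noise level, number of trials) are arbitrary choices for the sketch; the paper's analysis is of the expectation itself, not of any particular simulation.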
Citations
Journal Article
Algebraic Analysis for Nonidentifiable Learning Machines
TL;DR: It is rigorously proved that the Bayesian stochastic complexity, or the free energy, is asymptotically equal to λ1 log n − (m1 − 1) log log n + constant, where n is the number of training samples and λ1 and m1 are the rational number and the natural number which are determined as the birational invariant values of the singularities in the parameter space.
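Written out, the asymptotic expansion of the stochastic complexity summarized above is (in Watanabe's standard notation; the regular-case values quoted afterward are a standard fact, not stated in this summary):

```latex
F(n) = \lambda_1 \log n - (m_1 - 1) \log\log n + O(1)
```

Here λ1 is a rational number and m1 a natural number, both birational invariants of the singularities of the parameter set. For a regular model with d parameters, λ1 = d/2 and m1 = 1, recovering the usual (d/2) log n term of BIC; singular models typically have λ1 < d/2, i.e. a smaller effective complexity.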
Proceedings Article
Statistical Performance of Convex Tensor Decomposition
TL;DR: It is shown that, under some conditions, the mean squared error of the convex method scales linearly with a quantity the authors call the normalized rank of the true tensor, which naturally extends the analysis of convex low-rank matrix estimation to tensors.
Journal Article
Singularities Affect Dynamics of Learning in Neuromanifolds
TL;DR: An overview is given of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and Gaussian mixtures, and the natural gradient method is shown to perform well because it takes the singular geometrical structure into account.
Book
Machine Learning: ECML 2000
TL;DR: This talk describes how information about the search process can be taken into account when evaluating hypotheses, and how the expected generalization error of a hypothesis is computed as a function of the search steps leading to it.
Journal Article
Likelihood ratio of unidentifiable models and multilayer neural networks
TL;DR: The behavior of the maximum likelihood estimator (MLE) in the case that the true parameter cannot be identified uniquely is discussed, and the likelihood ratio is proved to have a larger order if the true function is given by a smaller model.
References
Book
Perturbation theory for linear operators
TL;DR: The monograph by T. Kato is an excellent reference work on the theory of linear operators in Banach and Hilbert spaces, and is thoroughly worthwhile both for graduate students in functional analysis and for researchers in perturbation, spectral, and scattering theory.
Journal Article
The Strong Limits of Random Matrix Spectra for Sample Matrices of Independent Elements
TL;DR: In this paper, the authors prove almost-sure convergence of the empirical measure of the normalized singular values of increasing rectangular submatrices of an infinite random matrix of independent elements, where the matrix elements are required to have uniformly bounded central (2 + ε)-th moments, and the same means and variances within a row.
Journal Article
Learning in linear neural networks: a survey
Pierre Baldi, Kurt Hornik, et al.
TL;DR: Most of the known results on linear networks, including backpropagation learning and the structure of the error function landscape, the temporal evolution of generalization, and unsupervised learning algorithms and their properties are surveyed.
Book
Multivariate reduced-rank regression
Journal Article
A regularity condition of the information matrix of a multilayer perceptron network
TL;DR: This paper proves rigorously that the Fisher information matrix of a three-layer perceptron network is positive definite if and only if the network is irreducible.
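A small numerical sketch of this irreducibility criterion (my own illustration; the one-input, one-output tanh network and all parameter values below are assumptions, not taken from the paper): duplicating a hidden unit makes the network reducible, and the empirical Fisher information matrix then becomes singular, while distinct hidden units with nonzero output weights give a positive definite matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

def fisher(u, b, v, xs):
    """Empirical Fisher information of f(x) = sum_j v_j * tanh(u_j x + b_j)
    under unit-variance Gaussian noise: (1/n) J^T J, where J is the Jacobian
    of f with respect to the parameters (u, b, v)."""
    X = np.outer(xs, u) + b          # (n, H) pre-activations
    T = np.tanh(X)
    S = 1.0 - T ** 2                 # derivative of tanh
    J = np.hstack([S * v * xs[:, None],  # d f / d u_j
                   S * v,                # d f / d b_j
                   T])                   # d f / d v_j
    return J.T @ J / len(xs)

xs = rng.standard_normal(500)

# Irreducible network: distinct hidden units, nonzero output weights.
F_irr = fisher(np.array([1.5, -0.6]), np.array([0.5, -1.0]),
               np.array([0.8, 1.1]), xs)

# Reducible network: two identical hidden units -> duplicated Jacobian
# columns, hence a singular Fisher matrix.
F_red = fisher(np.array([1.0, 1.0]), np.array([0.3, 0.3]),
               np.array([0.8, 1.1]), xs)

eig_irr = np.linalg.eigvalsh(F_irr).min()
eig_red = np.linalg.eigvalsh(F_red).min()
```

In the reducible case the columns of J corresponding to the two duplicated units coincide (up to the output-weight factor), so the smallest eigenvalue is zero up to rounding; in the irreducible case it stays bounded away from zero, matching the "positive definite iff irreducible" statement.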