
Showing papers by "Shun-ichi Amari published in 1991"


Journal ArticleDOI
TL;DR: An information geometrical method, which can be applied to more general neural network manifolds, is proposed and the accuracy of statistical estimation is shown in terms of the dimensionality of a model and the number of examples.

66 citations


Journal ArticleDOI
TL;DR: In this paper, a higher-order asymptotic theory of sequential estimation is given in the framework of geometry of multidimensional curved exponential families, and a design principle of the second-order efficient sequential estimation procedure is also given.
Abstract: Sequential estimation continues observations until the observed sample satisfies a prescribed criterion. Its properties are superior on the average to those of nonsequential estimation in which the number of observations is fixed a priori. A higher-order asymptotic theory of sequential estimation is given in the framework of geometry of multidimensional curved exponential families. This gives a design principle of the second-order efficient sequential estimation procedure. It is also shown that a sequential estimation can be designed to have a covariance stabilizing effect at the same time.
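The stopping-rule idea can be illustrated with a toy sketch (a hypothetical example, not the paper's information-geometric construction): observations continue until a prescribed precision criterion is met, so the sample size becomes data-dependent rather than fixed a priori.

```python
import random

def sequential_mean_estimate(draw, target_se=0.05, max_n=10_000):
    """Keep observing until the estimated standard error of the sample
    mean falls below target_se -- the prescribed stopping criterion."""
    xs = [draw(), draw()]                    # two points needed for a variance
    while len(xs) < max_n:
        n = len(xs)
        mean = sum(xs) / n
        var = sum((x - mean) ** 2 for x in xs) / (n - 1)
        if (var / n) ** 0.5 < target_se:     # criterion met: stop observing
            break
        xs.append(draw())
    return sum(xs) / len(xs), len(xs)

random.seed(0)
est, n = sequential_mean_estimate(lambda: random.gauss(1.0, 1.0))
print(est, n)   # stops after roughly (sd / target_se)^2 ≈ 400 observations
```

The sample size adapts to the observed variability, which is the essential difference from nonsequential estimation.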

37 citations


Journal ArticleDOI
TL;DR: The study of the retrieval dynamics shows that, as soon as the number of patterns stored exceeds some critical value, retrieval becomes limited to the states with the same activity as the prototype patterns.
Abstract: We consider very sparsely coded associative memories of binary neurons, for both Hebbian and covariant learning rules. We calculate explicitly their maximal capacity both in terms of patterns, and in terms of information content, taking into account the correlation of local fields, and we investigate its dependence on the degree of sparsity. The sparseness of the coding enhances both the memory capacity and the information capacity, whatever the chosen scheme. The study of the retrieval dynamics shows that, as soon as the number of patterns stored exceeds some critical value, retrieval becomes limited to the states with the same activity as the prototype patterns.
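A minimal sketch of the setting (an illustrative covariance-rule memory with hypothetical parameters, not the paper's capacity analysis): sparse binary patterns are stored in a symmetric coupling matrix and one pattern is recalled from a degraded cue.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, a = 200, 5, 0.1            # neurons, stored patterns, coding level

# sparse binary patterns: each unit is active with probability a
xi = (rng.random((P, N)) < a).astype(float)

# covariance-style learning rule, zero self-coupling
J = (xi - a).T @ (xi - a)
np.fill_diagonal(J, 0.0)

def retrieve(s, theta=5.0, steps=20):
    """Synchronous threshold dynamics until a fixed point (or step limit).
    theta sits between the signal (~(1-a)^2 aN) and the crosstalk noise."""
    for _ in range(steps):
        s_new = (J @ s > theta).astype(float)
        if np.array_equal(s_new, s):
            break
        s = s_new
    return s

# cue: pattern 0 with a few active bits deleted
cue = xi[0].copy()
cue[np.flatnonzero(cue)[:3]] = 0.0
recalled = retrieve(cue)
print(np.array_equal(recalled, xi[0]))   # expected to recover pattern 0
```

With only a fraction `a` of units active per pattern, far more patterns can be stored than with dense coding, which is the effect quantified in the paper.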

35 citations


Journal ArticleDOI
01 Feb 1991
TL;DR: A mathematical theory of learning, applicable to various network architectures, is presented in a unified manner; it is based on parameter modification driven by a time series of input signals generated from a stochastic information source.
Abstract: A mathematical theory of learning is presented in a unified manner, applicable to various network architectures. The theory is based on parameter modification driven by a time series of input signals generated from a stochastic information source. A network modifies its behavior so that it adapts to the information structure of its environment; the theory thus describes the self-organization of a neural system. A typical discrete structure is formed automatically through continuous parameter modification.
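As a small illustration of the general scheme (a hypothetical example, not the paper's formalism): continuous parameter updates driven by a stochastic input stream can form a discrete structure. Here two competitive units specialize to the two clusters of the input source.

```python
import random

random.seed(1)

# stochastic information source: two clusters on the line, at -2 and +2
def sample():
    return random.gauss(-2.0 if random.random() < 0.5 else 2.0, 0.3)

# two adaptive units; the winner moves toward each input (competitive rule)
w = [-0.1, 0.1]
eta = 0.05
for _ in range(5000):
    x = sample()
    k = min((0, 1), key=lambda i: abs(w[i] - x))  # winner-take-all
    w[k] += eta * (x - w[k])                      # continuous modification

print(sorted(w))   # the units settle near the cluster centres -2 and +2
```

The weights change continuously, yet the outcome is a discrete assignment of units to clusters, formed by self-organization alone.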

30 citations


Journal ArticleDOI
TL;DR: A sufficient condition for two hidden Markov processes based on different Markov chains to be equivalent as the stochastic process is given.
Abstract: A hidden Markov information source (process) is a stochastic process driven by a finite-state Markov chain whose state cannot be observed directly; only a function of the state is observed, as a stream of symbols. This kind of process is important in both theory and applications, but its theoretical structure has not been clarified. This paper gives a sufficient condition for two hidden Markov processes based on different Markov chains to be equivalent as stochastic processes. A condition under which this sufficient condition is also necessary is shown as well. A condition for a hidden Markov process to be equivalent to a Markov chain is also presented.
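A toy instance of such equivalence (a deliberately extreme illustrative case, not the paper's condition): when every state has the same emission law, the output is i.i.d. and the underlying transition matrix becomes irrelevant, so two different chains generate the same process.

```python
import numpy as np
from itertools import product

def string_prob(pi, A, B, s):
    """P(output string s) for a hidden Markov process with initial
    distribution pi, transition matrix A, emissions B[state, symbol]."""
    v = pi.copy()
    for sym in s:
        v = (v * B[:, sym]) @ A   # absorb the emission, then transition
    return v.sum()

pi = np.array([0.5, 0.5])
B = np.array([[0.7, 0.3],
              [0.7, 0.3]])        # both states emit with the same law
A1 = np.array([[0.9, 0.1],
               [0.2, 0.8]])
A2 = np.array([[0.5, 0.5],
               [0.6, 0.4]])       # a different chain behind the outputs

# brute-force check: all output strings up to length 4 get equal probability
equal = all(
    np.isclose(string_prob(pi, A1, B, s), string_prob(pi, A2, B, s))
    for L in range(1, 5) for s in product([0, 1], repeat=L)
)
print(equal)   # True: different Markov chains, equivalent hidden processes
```

The paper's contribution is a structural condition for such equivalence in general, not just in this degenerate case.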

4 citations


Journal ArticleDOI
TL;DR: In this article, a Δ-subspace is constructed which gives the effective minimum degree of the hidden Markov process, and it is shown that two stochastic processes are equivalent if and only if there exists an isomorphism between their Δ-subspaces.
Abstract: There exists a class of stochastic processes based on a finite-state discrete-time Markov chain in which the states are not directly observable and only the output symbols generated by the states can be observed. If different states produce the same output symbol, the process is called a hidden Markov process. Such processes are important in both theory and applications. It is known that two Markov chains with different transition matrices can yield equivalent hidden Markov processes, and it has long been an interesting question what structures allow this situation to arise. This paper gives a complete solution to the identification problem of the hidden Markov process, which has long been open. At the same time, a previously unknown structure of this kind of process is revealed. An algebraic technique is used in the discussion, in which new concepts called the Δ-subspace and the Δ-cyclic subspace are introduced into the framework of subspaces and cyclic subspaces in ordinary linear algebra. In this paper, a Δ-subspace is actually constructed which gives the effective minimum degree of the hidden Markov process. It is also shown that two stochastic processes are equivalent if and only if there exists an isomorphism between their Δ-subspaces.
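The notion of an "effective minimum degree" has a brute-force linear-algebraic counterpart (a hedged sketch, not the paper's Δ-subspace construction): the rank of a Hankel-style matrix of output-string probabilities bounds the minimal state dimension. Here a 3-state chain with two indistinguishable states has effective degree 2.

```python
import numpy as np
from itertools import product

def string_prob(pi, A, B, s):
    """P(output string s): initial dist pi, transitions A, emissions B."""
    v = pi.copy()
    for sym in s:
        v = (v * B[:, sym]) @ A
    return v.sum()

# a 3-state chain whose last two states share identical transition and
# emission rows, so the process admits a 2-state description
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.5, 0.25, 0.25],
              [0.3, 0.35, 0.35],
              [0.3, 0.35, 0.35]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.2, 0.8]])

# Hankel-style matrix H[u, v] = P(prefix u followed by suffix v)
strings = [()] + [s for L in (1, 2) for s in product([0, 1], repeat=L)]
H = np.array([[string_prob(pi, A, B, u + v) for v in strings]
              for u in strings])
print(np.linalg.matrix_rank(H, tol=1e-10))   # 2: effective degree 2, not 3
```

The rank collapses because the two redundant states contribute no independent direction, which is the kind of structural degeneracy the Δ-subspace makes precise.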

4 citations


Journal ArticleDOI
25 Jun 1991