scispace - formally typeset
Author

Shun-ichi Amari

Bio: Shun-ichi Amari is an academic researcher at the RIKEN Brain Science Institute. The author has contributed to research on artificial neural networks and information geometry. The author has an h-index of 90 and has co-authored 495 publications that have received 40,383 citations. Previous affiliations of Shun-ichi Amari include the University of the West of Scotland and Shimadzu Corp.


Papers
Journal Article
TL;DR: This paper uses information geometry to calculate the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution), and proves that natural gradient online learning is Fisher efficient, meaning it has asymptotically the same performance as the optimal batch estimation of parameters.
Abstract: When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.

2,504 citations
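
The central idea of the abstract above, preconditioning the ordinary gradient with the inverse Fisher information matrix, can be illustrated on a toy problem. What follows is a minimal batch sketch under stated assumptions, not the paper's online algorithm: it fits the mean and standard deviation of a univariate Gaussian, for which the Fisher matrix in (mu, sigma) coordinates has the closed form diag(1/sigma^2, 2/sigma^2). All function names are hypothetical.

```python
import numpy as np


def gaussian_nll_grad(theta, x):
    """Ordinary gradient of the average negative log-likelihood w.r.t. (mu, sigma)."""
    mu, sigma = theta
    d_mu = -(x - mu).mean() / sigma**2
    d_sigma = 1.0 / sigma - ((x - mu) ** 2).mean() / sigma**3
    return np.array([d_mu, d_sigma])


def gaussian_fisher(theta):
    """Fisher information of a univariate Gaussian in (mu, sigma) coordinates."""
    _, sigma = theta
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])


def natural_gradient_step(theta, grad, fisher, lr):
    """One update theta <- theta - lr * F^{-1} grad (solve instead of inverting F)."""
    return theta - lr * np.linalg.solve(fisher, grad)


def fit_gaussian(x, theta0=(0.0, 1.0), lr=0.5, steps=200):
    """Batch natural gradient descent; lr and steps are illustrative choices."""
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        grad = gaussian_nll_grad(theta, x)
        theta = natural_gradient_step(theta, grad, gaussian_fisher(theta), lr)
    return theta


if __name__ == "__main__":
    data = np.random.default_rng(0).normal(3.0, 2.0, size=5000)
    print(fit_gaussian(data))  # estimated (mu, sigma)
```

In this toy case the Fisher preconditioning makes the step size for mu independent of the current sigma, which is one way to see why the natural gradient is less sensitive to the choice of parameterization than the ordinary gradient.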

Proceedings Article
27 Nov 1995
TL;DR: A new on-line learning algorithm that minimizes the statistical dependency among outputs is derived for blind separation of mixed signals; it has an equivariant property and is easily implemented on a neural-network-like model.
Abstract: A new on-line learning algorithm which minimizes a statistical dependency among outputs is derived for blind separation of mixed signals. The dependency is measured by the average mutual information (MI) of the outputs. The source signals and the mixing matrix are unknown except for the number of the sources. The Gram-Charlier expansion instead of the Edgeworth expansion is used in evaluating the MI. The natural gradient approach is used to minimize the MI. A novel activation function is proposed for the on-line learning algorithm which has an equivariant property and is easily implemented on a neural-network-like model. The validity of the new learning algorithm is verified by computer simulations.

2,145 citations
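
A natural gradient separation rule of the kind the abstract above describes is commonly written as W <- W + lr * (I - phi(y) y^T) W, with y = W x the current source estimates. The sketch below is a batch version under stated assumptions: it averages over all samples instead of updating on-line, and it uses tanh as the nonlinearity phi, a common choice for heavy-tailed sources that is not the activation function proposed in the paper. The function name and default parameters are hypothetical.

```python
import numpy as np


def natural_gradient_ica(x, w0=None, lr=0.1, steps=500):
    """Batch sketch of a natural gradient blind separation rule.

    x  : array of shape (n_sources, n_samples) holding the observed mixtures
    w0 : optional initial demixing matrix (identity if omitted)
    """
    n, t = x.shape
    w = np.eye(n) if w0 is None else np.array(w0, dtype=float)
    for _ in range(steps):
        y = w @ x                             # current source estimates
        corr = np.tanh(y) @ y.T / t           # sample average of phi(y) y^T
        w = w + lr * (np.eye(n) - corr) @ w   # natural gradient step
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sources = rng.laplace(size=(2, 5000))        # heavy-tailed toy sources
    mixing = np.array([[1.0, 0.6], [0.4, 1.0]])  # hypothetical mixing matrix
    demix = natural_gradient_ica(mixing @ sources)
    print(demix @ mixing)  # global matrix; ideally close to a scaled permutation
```

Because the update multiplies by W on the right, the global matrix W A follows the same trajectory for any invertible mixing matrix A with the same starting value of W A, which is the equivariant property mentioned in the abstract.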

Book
12 Oct 2009
TL;DR: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF), including NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD).
Abstract: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF). This includes NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD). NMF/NTF and their extensions are increasingly used as tools in signal and image processing and in data analysis, having garnered interest due to their capability to provide new insights and relevant information about the complex latent relationships in experimental data sets. It is suggested that NMF can provide meaningful components with physical interpretations; for example, in bioinformatics, NMF and its extensions have been successfully applied to gene expression, sequence analysis, the functional characterization of genes, clustering and text mining. As such, the authors focus on the algorithms that are most useful in practice, looking at the fastest, most robust, and most suitable for large-scale models. Key features: Acts as a single-source reference guide to NMF, collating information that is widely dispersed in current literature, including the authors' own recently developed techniques in the subject area. Uses generalized cost functions such as Bregman, Alpha and Beta divergences to present practical implementations of several types of robust algorithms, in particular Multiplicative, Alternating Least Squares, Projected Gradient and Quasi-Newton algorithms. Provides a comparative analysis of the different methods in order to identify approximation error and complexity. Includes pseudocode and optimized MATLAB source code for almost all algorithms presented in the book. The increasing interest in nonnegative matrix and tensor factorizations, as well as decompositions and sparse representation of data, will ensure that this book is essential reading for engineers, scientists, researchers, industry practitioners and graduate students across signal and image processing; neuroscience; data mining and data analysis; computer science; bioinformatics; speech processing; biomedical engineering; and multimedia.

2,136 citations
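
Of the algorithm families named in the abstract above, the multiplicative one is the simplest to sketch. The example below is a minimal illustration using the widely known multiplicative updates for the squared Frobenius error ||V - WH||^2; it is written in Python rather than the book's MATLAB, it is not taken from the book, and the function name, the small constant eps, and the defaults are assumptions.

```python
import numpy as np


def nmf_multiplicative(v, rank, steps=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix v (n x m) as w @ h with w, h >= 0.

    Uses multiplicative updates for the squared Frobenius error.
    eps guards against division by zero and is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    n, m = v.shape
    w = rng.random((n, rank)) + eps   # nonnegative random initialization
    h = rng.random((rank, m)) + eps
    for _ in range(steps):
        # Each factor is rescaled elementwise by a nonnegative ratio,
        # so nonnegativity is preserved without any projection step.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    v = rng.random((20, 2)) @ rng.random((2, 30))   # exactly rank-2 toy data
    w, h = nmf_multiplicative(v, rank=2)
    print(np.linalg.norm(v - w @ h) / np.linalg.norm(v))  # relative error
```

The design choice worth noting is that the updates never subtract anything, which is why the factors stay nonnegative; the trade-off is that convergence can be slow compared with the alternating least squares or projected gradient families the abstract also mentions.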

Journal Article
TL;DR: The dynamics of pattern formation is studied for lateral-inhibition type homogeneous neural fields with general connections and it is proved that there are five types of pattern dynamics.
Abstract: The dynamics of pattern formation is studied for lateral-inhibition type homogeneous neural fields with general connections. Neural fields consisting of a single layer are first treated, and it is proved that there are five types of pattern dynamics. The type of the dynamics of a field depends not only on the mutual connections within the field but also on the level of homogeneous stimulus given to the field. An example of the dynamics is as follows: A fixed size of localized excitation, once evoked by stimulation, can be retained in the field persistently even after the stimulation vanishes. It moves until it finds the position of the maximum of the input stimulus. Fields consisting of an excitatory and an inhibitory layer are next analyzed. In addition to stationary localized excitation, fields have such pattern dynamics as production of oscillatory waves, travelling waves, active and dual active transients, etc.

1,996 citations


Cited by
Journal Article
01 Jan 1998
TL;DR: This paper reviews methods for handwritten character recognition, shows that convolutional neural networks outperform the other techniques on a standard handwritten digit recognition task, and proposes graph transformer networks (GTN), which allow multimodule document recognition systems to be trained globally with gradient-based methods.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

Journal Article


08 Dec 2001-BMJ
TL;DR: The author reflects on i, the square root of minus one, which seemed an odd beast when first encountered at school (an intruder hovering on the edge of reality) and whose surreal nature only intensified with familiarity.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

28 Jul 2005
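TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation makes it easier for many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members, and switching on the transcription of different var gene variants provides the molecular basis for antigenic variation.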

18,940 citations

Journal Article
TL;DR: EEGLAB is a toolbox and graphical user interface for processing collections of single-trial and/or averaged EEG data of any number of channels. Its functions include importing EEG data and channel and event information; data visualization (scrolling, scalp map and dipole model plotting, plus multi-trial ERP-image plots); preprocessing (including artifact rejection, filtering, epoch selection, and averaging); Independent Component Analysis (ICA); and time/frequency decomposition, including channel and component cross-coherence supported by bootstrap statistical methods based on data resampling.

17,362 citations