scispace - formally typeset
Search or ask a question
Book

Self Organization And Associative Memory

01 Jan 1984-
TL;DR: The purpose and nature of Biological Memory, as well as some of the aspects of Memory Aspects, are explained.
Abstract: 1. Various Aspects of Memory.- 1.1 On the Purpose and Nature of Biological Memory.- 1.1.1 Some Fundamental Concepts.- 1.1.2 The Classical Laws of Association.- 1.1.3 On Different Levels of Modelling.- 1.2 Questions Concerning the Fundamental Mechanisms of Memory.- 1.2.1 Where Do the Signals Relating to Memory Act Upon?.- 1.2.2 What Kind of Encoding is Used for Neural Signals?.- 1.2.3 What are the Variable Memory Elements?.- 1.2.4 How are Neural Signals Addressed in Memory?.- 1.3 Elementary Operations Implemented by Associative Memory.- 1.3.1 Associative Recall.- 1.3.2 Production of Sequences from the Associative Memory.- 1.3.3 On the Meaning of Background and Context.- 1.4 More Abstract Aspects of Memory.- 1.4.1 The Problem of Infinite-State Memory.- 1.4.2 Invariant Representations.- 1.4.3 Symbolic Representations.- 1.4.4 Virtual Images.- 1.4.5 The Logic of Stored Knowledge.- 2. Pattern Mathematics.- 2.1 Mathematical Notations and Methods.- 2.1.1 Vector Space Concepts.- 2.1.2 Matrix Notations.- 2.1.3 Further Properties of Matrices.- 2.1.4 Matrix Equations.- 2.1.5 Projection Operators.- 2.1.6 On Matrix Differential Calculus.- 2.2 Distance Measures for Patterns.- 2.2.1 Measures of Similarity and Distance in Vector Spaces.- 2.2.2 Measures of Similarity and Distance Between Symbol Strings.- 2.2.3 More Accurate Distance Measures for Text.- 3. Classical Learning Systems.- 3.1 The Adaptive Linear Element (Adaline).- 3.1.1 Description of Adaptation by the Stochastic Approximation.- 3.2 The Perceptron.- 3.3 The Learning Matrix.- 3.4 Physical Realization of Adaptive Weights.- 3.4.1 Perceptron and Adaline.- 3.4.2 Classical Conditioning.- 3.4.3 Conjunction Learning Switches.- 3.4.4 Digital Representation of Adaptive Circuits.- 3.4.5 Biological Components.- 4. A New Approach to Adaptive Filters.- 4.1 Survey of Some Necessary Functions.- 4.2 On the "Transfer Function" of the Neuron.- 4.3 Models for Basic Adaptive Units.- 4.3.1 On the Linearization of the Basic Unit.- 4.3.2 Various Cases of Adaptation Laws.- 4.3.3 Two Limit Theorems.- 4.3.4 The Novelty Detector.- 4.4 Adaptive Feedback Networks.- 4.4.1 The Autocorrelation Matrix Memory.- 4.4.2 The Novelty Filter.- 5. Self-Organizing Feature Maps.- 5.1 On the Feature Maps of the Brain.- 5.2 Formation of Localized Responses by Lateral Feedback.- 5.3 Computational Simplification of the Process.- 5.3.1 Definition of the Topology-Preserving Mapping.- 5.3.2 A Simple Two-Dimensional Self-Organizing System.- 5.4 Demonstrations of Simple Topology-Preserving Mappings.- 5.4.1 Images of Various Distributions of Input Vectors.- 5.4.2 "The Magic TV".- 5.4.3 Mapping by a Feeler Mechanism.- 5.5 Tonotopic Map.- 5.6 Formation of Hierarchical Representations.- 5.6.1 Taxonomy Example.- 5.6.2 Phoneme Map.- 5.7 Mathematical Treatment of Self-Organization.- 5.7.1 Ordering of Weights.- 5.7.2 Convergence Phase.- 5.8 Automatic Selection of Feature Dimensions.- 6. Optimal Associative Mappings.- 6.1 Transfer Function of an Associative Network.- 6.2 Autoassociative Recall as an Orthogonal Projection.- 6.2.1 Orthogonal Projections.- 6.2.2 Error-Correcting Properties of Projections.- 6.3 The Novelty Filter.- 6.3.1 Two Examples of Novelty Filter.- 6.3.2 Novelty Filter as an Autoassociative Memory.- 6.4 Autoassociative Encoding.- 6.4.1 An Example of Autoassociative Encoding.- 6.5 Optimal Associative Mappings.- 6.5.1 The Optimal Linear Associative Mapping.- 6.5.2 Optimal Nonlinear Associative Mappings.- 6.6 Relationship Between Associative Mapping, Linear Regression, and Linear Estimation.- 6.6.1 Relationship of the Associative Mapping to Linear Regression.- 6.6.2 Relationship of the Regression Solution to the Linear Estimator.- 6.7 Recursive Computation of the Optimal Associative Mapping.- 6.7.1 Linear Corrective Algorithms.- 6.7.2 Best Exact Solution (Gradient Projection).- 6.7.3 Best Approximate Solution (Regression).- 6.7.4 Recursive Solution in the General Case.- 6.8 Special Cases.- 6.8.1 The Correlation Matrix Memory.- 6.8.2 Relationship Between Conditional Averages and Optimal Estimator.- 7. Pattern Recognition.- 7.1 Discriminant Functions.- 7.2 Statistical Formulation of Pattern Classification.- 7.3 Comparison Methods.- 7.4 The Subspace Methods of Classification.- 7.4.1 The Basic Subspace Method.- 7.4.2 The Learning Subspace Method (LSM).- 7.5 Learning Vector Quantization.- 7.6 Feature Extraction.- 7.7 Clustering.- 7.7.1 Simple Clustering (Optimization Approach).- 7.7.2 Hierarchical Clustering (Taxonomy Approach).- 7.8 Structural Pattern Recognition Methods.- 8. More About Biological Memory.- 8.1 Physiological Foundations of Memory.- 8.1.1 On the Mechanisms of Memory in Biological Systems.- 8.1.2 Structural Features of Some Neural Networks.- 8.1.3 Functional Features of Neurons.- 8.1.4 Modelling of the Synaptic Plasticity.- 8.1.5 Can the Memory Capacity Ensue from Synaptic Changes?.- 8.2 The Unified Cortical Memory Model.- 8.2.1 The Laminar Network Organization.- 8.2.2 On the Roles of Interneurons.- 8.2.3 Representation of Knowledge Over Memory Fields.- 8.2.4 Self-Controlled Operation of Memory.- 8.3 Collateral Reading.- 8.3.1 Physiological Results Relevant to Modelling.- 8.3.2 Related Modelling.- 9. Notes on Neural Computing.- 9.1 First Theoretical Views of Neural Networks.- 9.2 Motives for the Neural Computing Research.- 9.3 What Could the Purpose of the Neural Networks be?.- 9.4 Definitions of Artificial "Neural Computing" and General Notes on Neural Modelling.- 9.5 Are the Biological Neural Functions Localized or Distributed?.- 9.6 Is Nonlinearity Essential to Neural Computing?.- 9.7 Characteristic Differences Between Neural and Digital Computers.- 9.7.1 The Degree of Parallelism of the Neural Networks is Still Higher than that of any "Massively Parallel" Digital Computer.- 9.7.2 Why the Neural Signals Cannot be Approximated by Boolean Variables.- 9.7.3 The Neural Circuits do not Implement Finite Automata.- 9.7.4 Undue Views of the Logic Equivalence of the Brain and Computers on a High Level.- 9.8 "Connectionist Models".- 9.9 How can the Neural Computers be Programmed?.- 10. Optical Associative Memories.- 10.1 Nonholographic Methods.- 10.2 General Aspects of Holographic Memories.- 10.3 A Simple Principle of Holographic Associative Memory.- 10.4 Addressing in Holographic Memories.- 10.5 Recent Advances of Optical Associative Memories.- Bibliography on Pattern Recognition.- References.
Citations
More filters
Journal ArticleDOI
22 Dec 2000-Science
TL;DR: Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs that learns the global structure of nonlinear manifolds.
Abstract: Many areas of science depend on exploratory data analysis and visualization. The need to analyze large amounts of multivariate data raises the fundamental problem of dimensionality reduction: how to discover compact representations of high-dimensional data. Here, we introduce locally linear embedding (LLE), an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs. Unlike clustering methods for local dimensionality reduction, LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations do not involve local minima. By exploiting the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds, such as those generated by images of faces or documents of text.

15,106 citations


Cites methods from "Self Organization And Associative M..."

  • ...Iterative hill-climbing methods for autoencoder neural networks (13, 14), self-organizing maps ( 15 ), and latent variable models (16 )d o not have the same guarantees of global optimality or convergence; they also tend to involve many more free parameters, such as learning rates, convergence criteria, and architectural specifications....

    [...]

Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations


Additional excerpts

  • ...…& Mitchison, 1989; Barrow, 1987; Deco & Parra, 1997; Field, 1987; Földiák, 1990; Földiák & Young, 1995; Grossberg, 1976a, 1976b; Hebb, 1949; Kohonen, 1972, 1982, 1988; Kosko, 1990; Martinetz, Ritter, & Schulten, 1990; Miller, 1994; Mozer, 1991; Oja, 1989; Palm, 1992; Pearlmutter &Hinton,…...

    [...]

Journal ArticleDOI
TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
Abstract: Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities has made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.

14,054 citations

Journal ArticleDOI
22 Dec 2000-Science
TL;DR: An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.
Abstract: Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs-30,000 auditory nerve fibers or 10(6) optic nerve fibers-a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.

13,652 citations

Journal ArticleDOI
TL;DR: This work presents a simple and efficient implementation of Lloyd's k-means clustering algorithm, which it calls the filtering algorithm, and establishes the practical efficiency of the algorithm's running time.
Abstract: In k-means clustering, we are given a set of n data points in d-dimensional space R/sup d/ and an integer k and the problem is to determine a set of k points in Rd, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's (1982) algorithm. We present a simple and efficient implementation of Lloyd's k-means clustering algorithm, which we call the filtering algorithm. This algorithm is easy to implement, requiring a kd-tree as the only major data structure. We establish the practical efficiency of the filtering algorithm in two ways. First, we present a data-sensitive analysis of the algorithm's running time, which shows that the algorithm runs faster as the separation between clusters increases. Second, we present a number of empirical studies both on synthetically generated data and on real data sets from applications in color quantization, data compression, and image segmentation.

5,288 citations


Cites methods from "Self Organization And Associative M..."

  • ...These include approaches based on splitting and merging such as ISODATA [6], [28], randomized approaches such as CLARA [34], CLARANS [44], methods based on neural nets [35], and methods designed to scale to large databases, including DBSCAN [17], BIRCH [50], and ScaleKM [10]....

    [...]