Book Chapter, 05 Mar 2006, pp. 446-453
Cogito componentiter ergo sum
Lars Kai Hansen and Ling Feng
Informatics and Mathematical Modelling,
Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
lkh,lf@imm.dtu.dk, www.imm.dtu.dk
Abstract. Cognitive component analysis (COCA) is defined as the pro-
cess of unsupervised grouping of data such that the ensuing group struc-
ture is well-aligned with that resulting from human cognitive activity.
We present evidence that independent component analysis of abstract
data such as text, social interactions, music, and speech leads to low
level cognitive components.
1 Introduction
During evolution human and animal visual, auditory, and other primary sensory
systems have adapted to a broad ecological ensemble of natural stimuli. This
long-running, ongoing adaptation process has resulted in representations in human
and animal perceptual systems which closely resemble the information theo-
retically optimal representations obtained by independent component analysis
(ICA), see e.g., [1] on visual contrast representation, [2] on visual features in-
volved in color and stereo processing, and [3] on representations of sound fea-
tures. For a general discussion, consult also the textbook [4]. The human per-
ceptual system can model complex multi-agent scenery. Human cognition uses
a broad spectrum of cues for analyzing perceptual input and for separating indi-
vidual signal-producing agents, such as speakers, gestures, affective expressions, etc. Humans
seem to be able to readily adapt strategies from one perceptual domain to an-
other and, furthermore, to apply these information-processing strategies, such as
object grouping, to environments both more abstract and more complex than those
present during evolution. Given our present, and rather detailed, un-
derstanding of the ICA-like representations in primary sensory systems, it seems
natural to pose the question: are such information-theoretically optimal repre-
sentations, rooted in independence, also relevant for modeling higher cognitive
functions? We are currently pursuing a research programme aimed at understanding the limitations
of the ecological hypothesis for higher-level cognitive processes, such as grouping
abstract objects, navigating social networks, understanding multi-speaker envi-
ronments, and understanding the representational differences between self and
environment.
Wagensberg has pointed to the importance of independence for successful
‘life forms’ [5]:

A living individual is part of the world with some identity that tends to
become independent of the uncertainty of the rest of the world.

Thus natural selection favors innovations that increase independence of the agent
in the face of environmental uncertainty, while maximizing the gain from the
predictable aspects of the niche. This view refines the classical
Darwinian formulation that natural selection simply favors adaptation to given
conditions. Wagensberg points out that recent biological innovations, such as
nervous systems and brains, are means to decrease sensitivity to unpredictable
fluctuations. An important aspect of environmental analysis is the ability to rec-
ognize events induced by the self and by other agents. Wagensberg also points out
that by creating alliances agents can give up independence for the benefit of
a group, which in turn may increase independence for the group as an entity.
Both in its simple one-agent form and in the more tentative analysis of the group
model, Wagensberg’s theory emphasizes the crucial importance of statistical in-
dependence for evolution of perception, semantics and indeed cognition. While
cognition may be hard to quantify, its direct consequence, human behavior, has a
rich phenomenology which is becoming increasingly accessible to modeling. The
digitalization of everyday life as reflected, say, in telecommunication, commerce,
and media usage allows quantification and modeling of human patterns of activ-
ity, often at the level of individuals.
[Figure 1 plot area: left panel, scatter of x1 vs. x2; right panel, scatter of latent component 4 vs. latent component 2.]
Fig. 1. Generic feature distribution produced by a linear mixture of sparse sources
(left) and a typical ‘latent semantic analysis’ scatter plot of principal component pro-
jections of a text database (right). The characteristic of a sparse signal is that it
consists of relatively few large-magnitude samples on a background of small signals.
Latent semantic analysis of the so-called MED text database reveals that the semantic
components are indeed very sparse and do follow the latent directions (principal com-
ponents). Topics are indicated by the different markers. In [6] an ICA analysis of this
data set, post-processed with a simple heuristic classifier, showed that manually defined
topics were very well aligned with the independent components, hence constituting
an example of cognitive component analysis: unsupervised learning leads to a label
structure corresponding to that of human cognitive activity.
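The ray structure in the left panel is straightforward to reproduce: a few independent sparse sources pushed through a fixed linear map concentrate the scatter along the mixing directions. A minimal Python sketch, with toy sources and a mixing matrix of our own choosing (not taken from the paper):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)

    # Two independent sparse sources: Laplacian samples, cubed to exaggerate
    # sparsity (few large-magnitude samples on a background of small ones).
    S = rng.laplace(size=(2, 5000)) ** 3

    # Linear mixture x = A s; the columns of A define the 'ray' directions.
    A = np.array([[1.0, 0.6],
                  [0.2, 1.0]])
    X = A @ S

    # The scatter concentrates along the two column directions of A,
    # reproducing the ray structure of Fig. 1 (left).
    plt.scatter(X[0], X[1], s=2)
    plt.xlabel('x_1')
    plt.ylabel('x_2')
    plt.show()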
Grouping of events or objects into categories is fundamental to human
cognition. In machine learning, classification is a rather

well-understood task when based on labelled examples [7]. In this case classifica-
tion belongs to the class of supervised learning problems. Clustering is a closely
related unsupervised learning problem, in which we use general statistical rules
to group objects, without a priori providing a set of labelled examples. It is a
fascinating finding in many real-world data sets that the label structure discov-
ered by unsupervised learning closely coincides with labels obtained by letting a
human or a group of humans perform classification, labels derived from human
cognition. We thus define cognitive component analysis (COCA) as unsupervised
grouping of data such that the ensuing group structure is well-aligned with that
resulting from human cognitive activity [8]. This presentation is based on our
earlier results using ICA for abstract data such as text, dynamic text (chat),
web pages including text and images, see e.g., [9–13].
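Operationally, the COCA claim is testable: run an unsupervised grouping and measure its agreement with human labels. A hedged sketch of such a check on synthetic data (the clustering algorithm, data, and agreement score are stand-ins of our own choosing; the paper's grouping is ICA-based):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Stand-in for feature vectors that come with human-assigned labels.
    X, human_labels = make_blobs(n_samples=300, centers=5, random_state=0)

    # Any unsupervised grouping; k-means is only a placeholder here.
    predicted = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

    # Alignment between discovered structure and human labels;
    # 1.0 means perfect agreement up to a permutation of label names.
    print('alignment (ARI):', adjusted_rand_score(human_labels, predicted))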
2 Where have we found cognitive components?
Text analysis. Symbol manipulation as in text is a hallmark of human cog-
nition. Salton proposed the so-called vector space representation for statistical
modeling of text data, for a review see [14]. A term set is chosen and a doc-
ument is represented by the vector of term frequencies. A document database
then forms a so-called term-document matrix. The vector space representation
can be used for classification and retrieval by noting that similar documents
are somehow expected to be ‘close’ in the vector space. A metric can be based
on the simple Euclidean distance if document vectors are properly normalized;
otherwise, angular distance may be useful. This approach is principled, fast, and
language independent. Deerwester and co-workers developed the concept of la-
tent semantics based on principal component analysis of the term-document
matrix [15]. The fundamental observation behind the latent semantic indexing
(LSI) approach is that similar documents use similar vocabularies; hence,
the vectors of a given topic could appear as produced by a stochastic process
with highly correlated term-entries. By projecting the term-frequency vectors onto
a relatively low-dimensional subspace, say, one determined by the maximal amount
of variance, one is able to filter out the inevitable ‘noise’. Noise should
here be thought of as individual document differences in term usage within a
specific context. For well-defined topics, one could simply hope that a given
context would have a stable core term set that would come out as an eigen ‘di-
rection’ in the term vector space. The orthogonality constraint of co-variance
matrix eigenvectors, however, often limits the interpretability of the LSI rep-
resentation, and LSI is therefore more often used as a dimensionality-reduction
tool. The representation can be post-processed to reveal cognitive components,
e.g., by interactive visualization schemes [16]. In Figure 1 (right) we indicate
the scatter plot of a small text database. The database consists of documents
with overlapping vocabularies but five different (high-level cognitive) labels. The
‘ray’ structure signaling a sparse linear mixture is evident.
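The pipeline just described (term-document matrix, LSI projection, ICA rotation, heuristic topic assignment) can be prototyped with standard tools. A sketch on a toy two-topic corpus; the tf-idf weighting, dimensions, and corpus are our assumptions, not the MED setup of [6]:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD, FastICA

    # Tiny stand-in corpus with two topics.
    docs = [
        'blood pressure heart treatment patient',
        'patient heart blood pressure drug',
        'treatment drug patient blood heart',
        'telescope orbit stars galaxy planet',
        'galaxy stars planet telescope light',
        'orbit planet light stars galaxy',
    ]

    # Vector space model: documents as normalized term-frequency vectors.
    X = TfidfVectorizer().fit_transform(docs)

    # LSI: project onto the leading-variance subspace.
    Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

    # ICA rotates the LSI axes onto the sparse 'rays'; each independent
    # component is then read off as one cognitive component (topic).
    C = FastICA(n_components=2, random_state=0).fit_transform(Z)

    # Heuristic classifier in the spirit of [6]: assign each document to
    # its maximally activated component.
    print(abs(C).argmax(axis=1))  # ideally two groups matching the two topics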

Social networks. The ability to understand social networks is critical to hu-
mans. Is it possible that the simple unsupervised scheme for identification of
independent components could play a role in this human capacity? To investi-
gate this issue we have initiated an analysis of a well-known social network of
some practical importance.
[Figure 2 plot area: scatter of eigencast 3 vs. eigencast 5.]
Fig. 2. The so-called actor network quantifies the collaborative pattern of 382,000
actors participating in almost 128,000 movies. For visualization we have projected
the data onto principal components (LSI) of the actor-actor co-variance matrix. The
eigenvectors of this matrix are called ‘eigencasts’ and they represent characteristic
communities of actors that tend to co-appear in movies. The network is extremely
sparse, so the most prominent variance components are related to near-disjoint sub-
communities of actors with many common movies. However, a close-up of the coupling
between two latent semantic components (the region around (0, 0)) reveals the ubiqui-
tous signature of a sparse linear mixture: a pronounced ‘ray’ structure emanating from
(0, 0). The ICA components are color coded. We speculate that the cognitive machinery
developed for handling independent events can also be used to locate independent
sub-communities and, hence, to navigate complex social networks.
The so-called actor network is a quantitative representation of the co-participation
of actors in movies; for a discussion of this
network, see e.g., [17]. The observation model for the network is not too different
from that of text. Each movie is represented by the cast, i.e., the list of actors.
We have converted the table of the approximately T = 128,000 movies, with a total
of J = 382,000 individual actors, into a sparse J × T matrix. For visualization
we have projected the data onto principal components (LSI) of the actor-actor
co-variance matrix. The eigenvectors of this matrix are called ‘eigencasts’ and
represent characteristic communities of actors that tend to co-appear in movies.
The sparsity and magnitude of the network mean that the components are dom-
inated by communities with very small intersections. However, a closer look at
such scatter plots reveals detail suggesting that a simple linear mixture model in-
deed provides a reasonable representation of the (small) coupling between these
relatively trivial disjoint subsets, see Figure 2. Such insight may be used for com-

puter assisted navigation of collaborative, peer-to-peer networks, for example in
the context of search and retrieval.
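Under our reading, the eigencast computation amounts to building the sparse actor-by-movie incidence matrix and taking its leading singular directions. A sketch with hypothetical toy cast lists (all names and sizes are illustrative):

    import numpy as np
    from scipy.sparse import csr_matrix
    from sklearn.decomposition import TruncatedSVD

    # Toy cast lists; the real network has J ~ 382,000 actors in T ~ 128,000 movies.
    casts = [
        ['actor_a', 'actor_b'],
        ['actor_a', 'actor_b', 'actor_c'],
        ['actor_d', 'actor_e'],
        ['actor_d', 'actor_e', 'actor_f'],
    ]

    actors = sorted({a for cast in casts for a in cast})
    index = {a: j for j, a in enumerate(actors)}

    # Sparse J x T incidence matrix: entry (j, t) = 1 if actor j is in movie t.
    rows, cols = [], []
    for t, cast in enumerate(casts):
        for a in cast:
            rows.append(index[a])
            cols.append(t)
    M = csr_matrix((np.ones(len(rows)), (rows, cols)),
                   shape=(len(actors), len(casts)))

    # 'Eigencasts': leading left singular vectors of M, i.e. eigenvectors of
    # the (uncentered) actor-actor co-occurrence matrix M @ M.T; each weights
    # a community of actors that tend to co-appear in movies.
    svd = TruncatedSVD(n_components=2, random_state=0)
    coords = svd.fit_transform(M)  # actors projected onto the eigencasts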
Musical genre. The growing market for digital music and intelligent music
services creates an increasing interest in modeling of music data. It is now feasible
to estimate consensus musical genre by supervised learning from rather short
music segments, say 5-10 seconds, see e.g., [18], thus enabling computerized
handling of music requests at a high level of cognitive complexity. To understand
the possibilities and limitations for unsupervised modeling of music data we here
visualize a small music sample using the latent semantic analysis framework.
[Figure 3 plot area: pairwise scatter plots of latent dimensions PC 1 through PC 5; panels below the diagonal are zoomed.]
Fig. 3. We represent three music tunes (genre labels: heavy metal, jazz, classical)
by their spectral content in overlapping small time frames (w = 30 msec, with an
overlap of 10 msec, see [18] for details). To make the visualization relatively indepen-
dent of ‘pitch’, we use the so-called mel-cepstral representation (MFCC, K = 13
coefficients per frame). To reduce noise in the visualization we have ‘sparsified’ the
amplitudes; this was achieved simply by keeping coefficients that belonged to the
upper 5% magnitude percentile. The total number of frames in the analysis was
F = 10^5. Latent semantic analysis provided unsupervised subspaces with maximal
variance for a given dimension. We show the scatter plots of the data for the first
1-5 latent dimensions. The scatter plots below the diagonal have been ‘zoomed’ to
reveal more details of the ICA ‘ray’ structure. For interpretation we have coded the
data points with signatures of the three genres involved: classical (), heavy metal
(diamond), jazz (+). The ICA ray structure is striking; note, however, that the sit-
uation is not one-to-one (ray to genre) as in the small text databases. A component
(ray) quantifies a characteristic musical ‘theme’ at the temporal level of a frame
(30 msec), i.e., an entity similar to the ‘phoneme’ in speech.
The intended use is for a music search engine function; hence, we envision that
a largely text-based query has resulted in a few music entries, and the algorithm
is then to find the group structure inherent in the retrieval for the user.
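The feature pipeline in the Fig. 3 caption (30 msec MFCC frames with 10 msec overlap, K = 13 coefficients, sparsification to the upper 5% magnitude percentile, then a maximal-variance projection) might be approximated as follows. This is a sketch under our own assumptions: librosa is our choice of feature extractor, the function name is ours, and the parameters are read off the caption rather than taken from the authors' code:

    import numpy as np
    import librosa
    from sklearn.decomposition import PCA

    def latent_music_components(path, n_mfcc=13, n_components=5):
        # 30 msec frames with 10 msec overlap -> 20 msec hop.
        y, sr = librosa.load(path, sr=22050)
        frame, hop = int(0.030 * sr), int(0.020 * sr)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                    n_fft=frame, hop_length=hop).T  # frames x K

        # Energy-based sparsification: keep only top-5%-magnitude entries.
        threshold = np.percentile(np.abs(mfcc), 95)
        sparse = np.where(np.abs(mfcc) >= threshold, mfcc, 0.0)

        # 'Latent semantic analysis' step: maximal-variance subspace of frames.
        return PCA(n_components=n_components).fit_transform(sparse)

    # Pairwise scatter plots of the returned columns (for several tunes) are
    # then inspected for the ICA 'ray' structure of Fig. 3.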

Citations

Book ChapterDOI
07 Jun 2009
Abstract: Outlining a high-level cognitive approach to how we select media based on affective user preferences, we model the latent semantics of lyrics as patterns of emotional components. Using a selection of affective last.fm tags as top-down emotional buoys, we apply LSA (latent semantic analysis) to represent, bottom-up, the correlation of terms and song lyrics in a vector space that reflects the emotional context. Analyzing the resulting patterns of affective components, by comparing them against last.fm tag clouds describing the corresponding songs, we propose that it might be feasible to automatically generate affective user preferences based on song lyrics.
6 citations

Additional excerpts

  • ...In such a model the bottom-up part would resemble cognitive component analysis [18]....

01 Jan 2008
Abstract: Cognitive component analysis (COCA), defined as the process of unsupervised grouping of data such that the ensuing group structure is well-aligned with that resulting from human cognitive activity, has been explored on phoneme data. Statistical regularities have been revealed at multiple time scales. The basic features are 25-dimensional short-time (20 ms) mel-frequency weighted cepstral coefficients. Features are integrated by means of stacking to obtain features at longer time scales. Energy-based sparsification is carried out to achieve sparse representations. Our hypothesis is ecological: we assume that features that are essentially independent in a context-defined ensemble can be efficiently coded using a sparse independent component representation. This means that supervised and unsupervised learning should result in similar representations. We indeed find that supervised and unsupervised learning seem to identify similar representations, here measured by the classification similarity.
5 citations


Ling Feng, Lars Kai Hansen
01 Jan 2007
Cognitive Components of Speech at Different Time Scales
Abstract: Cognitive component analysis (COCA) is defined as unsupervised grouping of data leading to a group structure well-aligned with that resulting from human cognitive activity. We focus here on speech at different time scales, looking for possible hidden ‘cognitive structure’. Statistical regularities have earlier been revealed at multiple time scales corresponding to phoneme, gender, height, and speaker identity. We here show that the same simple unsupervised learning algorithm can detect these cues. Our basic features are 25-dimensional short-time mel-frequency weighted cepstral coefficients, assumed to model the basic representation of the human auditory system. The basic features are aggregated in time to obtain features at longer time scales. Simple energy-based filtering is used to achieve a sparse representation. Our hypothesis is basically ecological: we hypothesize that features that are essentially independent in a reasonable ensemble can be efficiently coded using a sparse independent component representation. The representations are indeed shown to be very similar between supervised learning (invoking cognitive activity) and unsupervised learning (statistical regularities), hence lending additional support to our cognitive component hypothesis.
2 citations


Cites methods from "Cogito componentiter ergo sum"

  • ...Thus, we used ICA to model the ray structure and represent semantic structure in text, social networks, and other abstract data such as music (Hansen et al., 2005; Hansen & Feng, 2006)....



Proceedings ArticleDOI
Lars Kai Hansen
28 May 2012
Abstract: We review a statistical machine learning model of top-down task-driven attention based on the notion of ‘gist’. In this framework we consider the task to be represented as a classification problem with two sets of features: a gist of coarse-grained global features and a larger set of low-level local features. Attention is modeled as the choice process over the low-level features given the gist. The model takes its departure in a classical information-theoretic framework for experimental design. This approach requires evaluation over marginalized and conditional distributions. By implementing the classifier within a Gaussian discrete mixture it is straightforward to marginalize and condition; hence, we obtain a relatively simple expression for the feature-dependent information gain, the top-down saliency. As the top-down attention mechanism is modeled as a simple classification problem, we can evaluate the strategy simply by estimating error rates on a test data set. We illustrate the attention mechanism on a simple simulated visual domain in which the choice is over nine patches in which a binary pattern has to be classified. The performance of the classifier equipped with the attention mechanism is almost as good as one that has access to all low-level features, and clearly improves over a simple ‘random attention’ alternative.
2 citations


01 Sep 2010
Abstract: Though one might think of media as an audiovisual stream of consciousness, we frequently encode frames of video sequences and waves of sound into strings of text. Language allows us both to share the internal representations of what we perceive as mental concepts and to categorize them as distinct states in the continuous ebb and flow of emotions underlying consciousness. Whether it be a soundscape of structured peaks or tiny black characters lined up across a page, we rely on syntax for parsing sequences of symbols, which, based on hierarchically nested structures, allow us to express and share the meaning contained within a sentence or a melodic phrase. As both the low-level semantic structure of texts and our affective responses can be encoded in words, a simplified cognitive model can be constructed which uses LSA (latent semantic analysis) to emulate how we perceive the emotional context of media based on lyrics, synopses, subtitles, blogs or web pages associated with the content. In the proposed model, the bottom-up generated sensory input is a matrix of tens of thousands of words co-occurring within multiple contexts, which are in turn represented as vectors in a semantic space of reduced dimensionality, while, top-down, patterns of emotional categorization emerge by defining term-vector distances to affective adjectives that constrain the latent semantic structures according to the neurophysiological dimensions of valence and arousal. The thesis thus combines elements of machine learning with aspects of cognitive semantics that could potentially be utilized in applications ranging from media information retrieval and business-related sentiment analysis to cognitive neuroscience.
1 citation


Cites result from "Cogito componentiter ergo sum"

  • ...This indicates that core elements of lyrical music appear to be treated in a fashion similar to those of language [18], which is in turn supported by EEG ‘electroencephalography’ studies showing that language and music compete for the same neural resources when processing syntax and semantics [19]....



References

Journal ArticleDOI
15 Oct 1999, Science
Abstract: Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
30,921 citations


Book
Christopher M. Bishop
01 Jan 1995
Abstract (from the publisher): This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.
19,046 citations


Book ChapterDOI
Suresh Kothari, Heekuck Oh
Abstract: This chapter provides an account of different neural network architectures for pattern recognition. A neural network consists of several simple processing elements called neurons. Each neuron is connected to some other neurons and possibly to the input nodes. Neural networks provide a simple computing paradigm to perform complex recognition tasks in real time. The chapter categorizes neural networks into three types: single-layer networks, multilayer feedforward networks, and feedback networks. It discusses gradient descent and the relaxation method as the two underlying mathematical themes for deriving learning algorithms. Much research activity is centered on learning algorithms because of their fundamental importance in neural networks. The chapter discusses two important directions of research to improve learning algorithms: dynamic node generation, which is used by the cascade correlation algorithm, and the design of learning algorithms where the choice of parameters is not an issue. It closes with a discussion of performance and implementation issues.
12,585 citations


"Cogito componentiter ergo sum" refers background in this paper

  • ...well-understood task when based on labelled examples [7]....



Journal ArticleDOI
Abstract: A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (‘semantic structure’) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by vectors of ca. 100 factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
12,005 citations


"Cogito componentiter ergo sum" refers background in this paper

  • ...Deerwester and co-workers developed the concept of latent semantics based on principal component analysis of the term-document matrix [15]....



Book
Aapo Hyvärinen, Juha Karhunen, Erkki Oja
18 May 2001
Abstract: In this chapter, we discuss a statistical generative model called independent component analysis. It is basically a proper probabilistic formulation of the ideas underpinning sparse coding. It shows how sparse coding can be interpreted as providing a Bayesian prior, and answers some questions which were not properly answered in the sparse coding framework.
8,330 citations


Performance metrics: number of citations received by the paper in previous years.

Year  Citations
2012  1
2010  1
2009  1
2008  3
2007  1
2006  1