
Showing papers on "Probabilistic latent semantic analysis published in 2001"


Proceedings Article
03 Jan 2001
TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, the mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations
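As a hedged illustration of the generative story in the abstract (per-document Dirichlet topic proportions, per-word topic draws), here is a minimal numpy sketch; the topic count, vocabulary size, and hyperparameter values are arbitrary choices, not the paper's.

```python
# Illustrative sketch of the generative process described above (not the
# paper's code): each document draws topic proportions from a Dirichlet,
# then each word draws a topic and then a term.
import numpy as np

rng = np.random.default_rng(0)
K, V = 5, 1000                      # number of topics, vocabulary size (illustrative)
alpha = np.full(K, 0.1)             # Dirichlet prior over topic proportions
beta = rng.dirichlet(np.full(V, 0.01), size=K)  # per-topic word distributions

def generate_document(n_words):
    theta = rng.dirichlet(alpha)                        # latent topic mixture for this document
    topics = rng.choice(K, size=n_words, p=theta)       # topic assignment per word
    words = [rng.choice(V, p=beta[z]) for z in topics]  # word draw per assigned topic
    return words

doc = generate_document(50)
```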


Journal ArticleDOI
Thomas Hofmann
TL;DR: This paper proposes a temperature-controlled version of the Expectation Maximization algorithm for model fitting, which has shown excellent performance in practice and results in a more principled approach with a solid foundation in statistical inference.
Abstract: This paper presents a novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed technique uses a generative latent class model to perform a probabilistic mixture decomposition. This results in a more principled approach with a solid foundation in statistical inference. More precisely, we propose to make use of a temperature-controlled version of the Expectation Maximization algorithm for model fitting, which has shown excellent performance in practice. Probabilistic Latent Semantic Analysis has many applications, most prominently in information retrieval, natural language processing, machine learning from text, and in related areas. The paper presents perplexity results for different types of text and linguistic data collections and discusses an application in automated document indexing. The experiments indicate substantial and consistent improvements of the probabilistic method over standard Latent Semantic Analysis.

2,574 citations
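A minimal sketch of what a tempered EM fit for the aspect model might look like, assuming a documents-by-words count matrix `n` and a temperature `beta_temp` that dampens the E-step posteriors; initialisation and stopping rules are simplified and not taken from the paper.

```python
# Hedged sketch of tempered EM for the aspect model (pLSA-style), following
# the abstract's description; all settings are illustrative.
import numpy as np

def plsa_tempered_em(n, K, beta_temp=0.9, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    D, W = n.shape
    p_z_d = rng.dirichlet(np.ones(K), size=D)      # P(z|d), one row per document
    p_w_z = rng.dirichlet(np.ones(W), size=K)      # P(w|z), one row per aspect
    for _ in range(iters):
        # E-step: tempered posterior P(z|d,w) proportional to (P(z|d) P(w|z))^beta_temp
        post = (p_z_d[:, :, None] * p_w_z[None, :, :]) ** beta_temp   # D x K x W
        post /= post.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate parameters from expected counts
        exp_counts = n[:, None, :] * post                              # D x K x W
        p_w_z = exp_counts.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = exp_counts.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z
```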


Patent
26 Jul 2001
TL;DR: In this paper, a probabilistic Latent Semantic Analysis (PLSA) model is used to integrate textual and other content descriptions of items to be searched, user profiles, demographic information, query logs of previous searches, and explicit user ratings of items.
Abstract: The disclosed system implements a novel method for personalized filtering of information and automated generation of user-specific recommendations. The system uses a statistical latent class model, also known as Probabilistic Latent Semantic Analysis, to integrate data including textual and other content descriptions of items to be searched, user profiles, demographic information, query logs of previous searches, and explicit user ratings of items. The disclosed system learns one or more statistical models based on available data. The learning may be repeated as additional data becomes available. The statistical model, once learned, is utilized in various ways: to make predictions about item relevance and user preferences on un-rated items, to generate recommendation lists of items, to generate personalized search result lists, to disambiguate a user's query, to refine a search, to compute similarities between items or users, and for data mining purposes such as identifying user communities.

645 citations
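A schematic of the prediction step the patent describes, assuming latent class parameters `p_z_u` (class given user) and `p_i_z` (item given class) have already been fitted, e.g. by a pLSA-style EM; all names are illustrative.

```python
# Illustrative scoring of un-rated items for a user by summing over latent
# classes; not the patent's actual implementation.
import numpy as np

def recommend(p_z_u, p_i_z, user, top_n=10):
    # P(item|user) = sum_z P(z|user) P(item|z)
    scores = p_z_u[user] @ p_i_z
    return np.argsort(scores)[::-1][:top_n]   # highest-scoring items first
```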


Proceedings ArticleDOI
06 Jul 2001
TL;DR: A model for framing data mining tasks and a unified approach to solving the resulting data mining problems using spectral analysis are presented, giving strong justification for the use of spectral techniques for latent semantic indexing, collaborative filtering, and web site ranking.
Abstract: Experimental evidence suggests that spectral techniques are valuable for a wide range of applications. A partial list of such applications includes (i) semantic analysis of documents used to cluster documents into areas of interest, (ii) collaborative filtering --- the reconstruction of missing data items, and (iii) determining the relative importance of documents based on citation/link structure. Intuitive arguments can explain some of the phenomena that have been observed, but little theoretical study has been done. In this paper we present a model for framing data mining tasks and a unified approach to solving the resulting data mining problems using spectral analysis. These results give strong justification for the use of spectral techniques for latent semantic indexing, collaborative filtering, and web site ranking.

322 citations
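The spectral core shared by these applications is a truncated SVD of the data matrix; a minimal numpy sketch, with the rank `k` as an illustrative parameter:

```python
# Rank-k spectral approximation of a term-document (or user-item) matrix A.
import numpy as np

def truncated_svd(A, k):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

# U_k @ np.diag(s_k) @ Vt_k reconstructs the best rank-k approximation of A.
```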


Journal ArticleDOI
28 Jun 2001
TL;DR: This paper describes how the LSI approach can be implemented in a kernel-defined feature space; experimental results demonstrate that the approach can significantly improve performance and does not impair it.
Abstract: Kernel methods like support vector machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representations of two documents, in analogy with classical information retrieval (IR) approaches. Latent semantic indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost. In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space. We provide experimental results demonstrating that the approach can significantly improve performance, and that it does not impair it.

303 citations
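One way to realise LSI in a kernel-defined feature space, sketched under the assumption that the eigenvectors of the documents' Gram matrix stand in for the right singular vectors of the term-document matrix (kernel-PCA-style); centering and kernel choice are omitted:

```python
# Hedged sketch: project documents into a k-dimensional latent semantic
# space using only their kernel (Gram) matrix.
import numpy as np

def latent_semantic_projection(K_gram, k):
    # K_gram[i, j] = kernel between documents i and j (e.g., inner product)
    w, V = np.linalg.eigh(K_gram)               # eigenvalues in ascending order
    w, V = w[::-1][:k], V[:, ::-1][:, :k]       # keep the top-k eigenpairs
    return V * np.sqrt(np.maximum(w, 0))        # document coordinates in k dims
```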


Journal ArticleDOI
TL;DR: Analyses over several data sets suggest that LC factor models typically fit data better and provide results that are easier to interpret than the corresponding LC cluster models.
Abstract: We propose an alternative method of conducting exploratory latent class analysis that utilizes latent class factor models, and compare it to the more traditional approach based on latent class cluster models. We show that when formulated in terms of R mutually independent, dichotomous latent factors, the LC factor model has the same number of distinct parameters as an LC cluster model with R+1 clusters. Analyses over several data sets suggest that LC factor models typically fit data better and provide results that are easier to interpret than the corresponding LC cluster models. We also introduce a new graphical bi-plot display for LC factor models and compare it to similar plots used in correspondence analysis and to a barycentric coordinate display for LC cluster models. New results on identification of LC models are also presented. We conclude by describing various model extensions and an approach for eliminating boundary solutions in identified and unidentified LC models, which we have implemented in a new computer program.

290 citations
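As a hedged rendering of the structure being compared, the LC factor model with R mutually independent dichotomous factors can be written as follows (our notation, assuming the usual conditional independence of items given the factors):

```latex
% Schematic LC factor model: R mutually independent dichotomous latent
% factors X_1,...,X_R; observed items Y_1,...,Y_J conditionally independent
% given the factor pattern x.
P(\mathbf{Y}=\mathbf{y}) \;=\;
  \sum_{\mathbf{x}\in\{0,1\}^{R}}
  \left[\prod_{r=1}^{R} P(X_r = x_r)\right]
  \prod_{j=1}^{J} P\!\left(Y_j = y_j \mid \mathbf{X}=\mathbf{x}\right)
```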



Patent
08 May 2001
TL;DR: A computer-based information search and retrieval system and method for retrieving textual digital objects is presented; it makes full use of the projections of documents onto both the reduced document space characterized by the singular value decomposition-based latent semantic structure and its orthogonal space.
Abstract: A computer-based information search and retrieval system and method for retrieving textual digital objects that makes full use of the projections of documents onto both the reduced document space characterized by the singular value decomposition-based latent semantic structure and its orthogonal space. The resulting system and method has increased robustness, mitigating the instability of traditional keyword search engines caused by the synonymy and/or polysemy of natural language, and is therefore particularly suitable for web document searching over a distributed computer network such as the Internet.

218 citations
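A minimal sketch of the dual representation the patent claims, assuming `U_k` holds the top-k left singular vectors of the term-document matrix; the document is described both by its coordinates in the latent subspace and by the size of its residual in the orthogonal space:

```python
# Illustrative dual representation: latent-subspace coordinates plus the
# orthogonal-complement residual norm. Names are ours, not the patent's.
import numpy as np

def dual_representation(U_k, doc_vec):
    coords = U_k.T @ doc_vec               # coordinates in the latent subspace
    residual = doc_vec - U_k @ coords      # component in the orthogonal space
    return coords, np.linalg.norm(residual)
```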


01 Jan 2001
TL;DR: Experimental results on the use of LSA for the analysis of English literature texts are presented; several preliminary transformations of the frequency text-document matrix with different weight functions are tested on the basis of control subsets.
Abstract: This paper presents experimental results on the use of LSA for the analysis of English literature texts. Several preliminary transformations of the frequency text-document matrix with different weight functions are tested on the basis of control subsets. Additional clustering based on the correlation matrix is applied in order to reveal the latent structure. The algorithm creates a shaded form matrix via singular values and vectors. The results are interpreted as a measure of the quality of the transformations and compared to the control set tests.

129 citations
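The specific weight functions tested in the paper are not given here; as one plausible example of such a preliminary transformation, a log-entropy weighting of the frequency text-document matrix might look like this (a hedged sketch, our choice of weighting):

```python
# Local log weighting combined with a global entropy weight, applied before
# the SVD. Purely illustrative of the family of transformations described.
import numpy as np

def log_entropy_weight(F):
    # F[i, j] = frequency of term i in document j
    n_docs = F.shape[1]
    p = F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
    with np.errstate(divide="ignore", invalid="ignore"):
        ent = np.nansum(p * np.log(p), axis=1) / np.log(n_docs)
    g = 1.0 + ent                          # global entropy weight per term, in [0, 1]
    return g[:, None] * np.log1p(F)        # weighted matrix, ready for the SVD
```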


Journal ArticleDOI
TL;DR: A novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis is presented.
Abstract: This paper presents a novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed technique uses a generative latent class model to perform a probabilistic mixture decomposition.

103 citations


11 Oct 2001
TL;DR: A Bayesian mixture model for probabilistic latent semantic analysis of documents with images and text is presented; it enables a priori knowledge, such as word and image preferences, to be encoded.
Abstract: We present a Bayesian mixture model for probabilistic latent semantic analysis of documents with images and text. The Bayesian perspective allows us to perform automatic regularisation to obtain sparser and more coherent clustering models. It also enables us to encode a priori knowledge, such as word and image preferences. The learnt model can be used for browsing digital databases, information retrieval with image and/or text queries, image annotation (adding words to an image) and text illustration (adding images to a text).
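One common way such a priori preferences can enter a Bayesian mixture model is as Dirichlet pseudo-counts added in the M-step; this is a hedged sketch of that general idea, not the paper's actual inference scheme:

```python
# Schematic MAP-style update: prior pseudo-counts regularise the estimates
# and can bias them toward preferred words/images. Illustrative only.
import numpy as np

def map_update(expected_counts, prior_counts):
    # expected_counts: K x V expected word counts per cluster (from an E-step)
    # prior_counts:    K x V Dirichlet pseudo-counts encoding preferences
    posterior = expected_counts + prior_counts
    return posterior / posterior.sum(axis=1, keepdims=True)
```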

01 Jan 2001
TL;DR: Latent semantic indexing in conjunction with two different ordination techniques is employed to construct a semantic Reuters news wire space; topological information helps to identify the appropriate levels of granularity at which the information space can be visually explored.
Abstract: The geographic concepts of region and scale can be preserved in semantic information spaces and depicted cartographically. Region and scale are fundamental to geographical analysis, and are also associated with cognitive and experiential properties of the real world. Scale is important when graphically representing a spatialization, as it affects the amount of detail that can be shown. Latent semantic indexing in conjunction with two different ordination techniques is employed to construct a semantic Reuters news wire space. Intramax, a hierarchical clustering algorithm, is applied to delineate semantic regions in the Reuters database based on a functional distance measure. This topological information helps to identify the appropriate levels of granularity at which the information space can be visually explored. Amplification of ordination techniques with the Intramax procedure is a useful strategy for creating scale-dependent information spaces that facilitate the exploration of abstract, complex data archives.

04 Aug 2001
TL;DR: Experimental results show that accounting for semantic information in fact decreases performance compared to standalone LSI; the main weaknesses of the current hybrid scheme are discussed and several directions for improvement are sketched.
Abstract: A new approach for constructing pseudo-keywords, referred to as Sense Units, is proposed. Sense Units are obtained by a word clustering process, where the underlying similarity reflects both statistical and semantic properties, respectively detected through Latent Semantic Analysis and WordNet. Sense Units are used to recode documents and are evaluated by the performance increase they permit in classification tasks. Experimental results show that accounting for semantic information in fact decreases performance compared to standalone LSI. The main weaknesses of the current hybrid scheme are discussed and several directions for improvement are sketched.
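A hedged sketch of the hybrid similarity idea: blend an LSA-based similarity with a WordNet-derived one and cluster words on the result. The blend weight, the clustering method, and all names are our illustrative choices, not the paper's:

```python
# Combine statistical and semantic word similarities, then cluster words
# into pseudo-keyword groups with agglomerative clustering.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def sense_units(lsa_sim, wordnet_sim, weight=0.5, n_clusters=100):
    # lsa_sim, wordnet_sim: symmetric words x words similarity matrices in [0, 1]
    sim = weight * lsa_sim + (1 - weight) * wordnet_sim
    dist = 1.0 - sim
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")  # cluster id per word
```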

Journal Article
TL;DR: An object-oriented model in which the semantic features of the UMLS are made available through four major classes for representing Metathesaurus concepts, semantic types, inter-concept relationships and Semantic Network relationships is proposed.
Abstract: Several information models have been developed for the Unified Medical Language System (UMLS). While some models are term-oriented, a knowledge-oriented model is needed for representing semantic locality, i.e. the various semantic links among concepts. We propose an object-oriented model in which the semantic features of the UMLS are made available through four major classes for representing Metathesaurus concepts, semantic types, inter-concept relationships and Semantic Network relationships. Additional semantic methods for reducing the complexity of the hierarchical relationships represented in the UMLS are proposed. Implementation details are presented, as well as examples of use. The interest of this approach is discussed.
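A hypothetical Python rendering of the four major classes the abstract names; every class and attribute name here is our own illustration, not the authors' implementation:

```python
# Hypothetical sketch of the four major classes described in the abstract.
from dataclasses import dataclass, field

@dataclass
class SemanticType:
    name: str                                  # a Semantic Network type

@dataclass
class Concept:                                 # a Metathesaurus concept
    cui: str
    terms: list[str] = field(default_factory=list)
    semantic_types: list[SemanticType] = field(default_factory=list)

@dataclass
class InterConceptRelationship:                # link between two concepts
    source: Concept
    target: Concept
    label: str

@dataclass
class SemanticNetworkRelationship:             # link between semantic types
    source: SemanticType
    target: SemanticType
    label: str
```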


01 Jan 2001
TL;DR: A signal processing algorithm which discovers the hierarchical organization of a document or media presentation is described, using latent semantic indexing to describe the semantic content of the signal, and scale-space segmentation to describe its features at many different scales.
Abstract: This paper describes a signal processing algorithm which discovers the hierarchical organization of a document or media presentation. We use latent semantic indexing to describe the semantic content of the signal, and scale-space segmentation to describe its features at many different scales. We represent the semantic content of the document as a signal that varies through the document. We lowpass filter this signal to compute the document's semantic path at many different time scales and then look for changes. The changes are sorted by their strength to form a hierarchical segmentation. We present results from a text document and a video transcript.

1. THE PROBLEM

As prices decline and storage and computational horsepower increase, we will soon be swamped in multimedia data. Unfortunately, given an audio or a video signal, there is little information readily available that can help us find our way around such a time-based signal. Technical papers are structured into major and minor headings, imposing a hierarchical structure. Often professional or high-quality audio–visual (AV) presentations are also structured; however, this information is hidden in the signal. Our goal is to use the semantic information in the AV signal to create a hierarchical table of contents that describes the associated signal. Towards this end we combine two powerful concepts: scale-space (SS) filtering and Latent Semantic Indexing (LSI). We use LSI to provide a continuously valued feature that describes the semantic content of an AV signal. By doing this we reduce the dimensionality of the problem and, more importantly, we address synonymy and polysemy as LSI does. The combined approach remains language independent. We use scale-space techniques to represent the semantic signal over many different time scales. We are looking for changes in the signal, and scale space allows us to talk about features of the document that span from a single sentence to entire chapters. The scale parameter specifies the level of detail for our analysis. Intuitively, at small scales we are looking at the individual trees, and at large scales we are seeing the entire forest. We look at a wide range of scales to determine when the content of the signal has changed. In Section 5 we use Latent Semantic Indexing (LSI) as a means to describe the semantic content of a signal.
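A minimal sketch of the two-stage idea, assuming sentence-level LSI vectors are available; the Gaussian filter widths and the change score are our simplifications of the scale-space analysis:

```python
# Smooth the "semantic signal" at several scales and accumulate the size of
# the change at each candidate boundary; stronger boundaries segment coarser
# units. Illustrative, not the paper's exact algorithm.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def boundary_strength(lsi_vectors, scales=(2, 4, 8, 16)):
    # lsi_vectors: sentences x k matrix of LSI coordinates
    strength = np.zeros(len(lsi_vectors) - 1)
    for s in scales:
        smooth = gaussian_filter1d(lsi_vectors, sigma=s, axis=0)
        step = np.linalg.norm(np.diff(smooth, axis=0), axis=1)
        strength += step                  # accumulate change across scales
    return strength   # sort descending to obtain a hierarchical segmentation
```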





Proceedings ArticleDOI
01 Sep 2001
TL;DR: This paper shows how LSI is based on a unitary transformation, for which there are computationally more attractive alternatives, exemplified by the Haar transform, which is memory efficient and can be computed in linear to sublinear time.
Abstract: Latent Semantic Indexing (LSI) dramatically reduces the dimension of the document space by mapping it into a space spanned by conceptual indices. Empirically, the number of concepts that can represent the documents is far smaller than the great variety of words in the textual representation. Although this almost obviates the problem of lexical matching, the mapping incurs a high computational cost compared to document parsing, indexing, query matching, and updating. This paper shows how LSI is based on a unitary transformation, for which there are computationally more attractive alternatives. This is exemplified by the Haar transform, which is memory efficient and can be computed in linear to sublinear time. The principal advantages of LSI are thus preserved while the computational costs are drastically reduced.
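A single-level illustration of the alternative the paper motivates: an orthogonal Haar averaging/differencing step along the term axis, keeping only the coarse half (assumes an even number of rows; the full transform recurses on the coarse part):

```python
# One Haar averaging/differencing step over a terms x documents matrix,
# computable in linear time, as a sketch of the SVD alternative.
import numpy as np

def haar_reduce(A):
    # A: terms x documents, with an even number of term rows (assumed)
    avg = (A[0::2] + A[1::2]) / np.sqrt(2)        # coarse (low-frequency) part
    # detail = (A[0::2] - A[1::2]) / np.sqrt(2)   # discarded high-frequency part
    return avg                                    # dimension halved in linear time
```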

Book ChapterDOI
01 Jan 2001
TL;DR: All variables, latent and observable, are treated as random variables whose joint distribution constitutes the model; this approach can readily be extended to the study of relationships between latent variables, as exemplified by Joreskog's LISREL and similar models.
Abstract: Many of the quantities which appear in social science are latent, meaning that they are not directly observable. General intelligence, or ‘g,’ was an early example but latent variables (or factors) are now used to represent many social and psychological characteristics including attitudes and abilities. During the twentieth century a large body of methods was introduced to identify latent variables and to provide measurement scales for them. Factor analysis, introduced by Spearman in 1904, is used where the observable and latent variables are continuous. Latent structure analysis was introduced by Lazarsfeld in 1950 and covers the cases where either or both of the observable and latent variables are categorical. Latent trait models, used in educational testing, and latent class models have been particularly prominent in the literature. The application of all such models has been extended greatly by the wide availability of powerful desktop computers. Until recently these methods have been developed separately, yet all have a common basis and purpose. This article sets all of the models within a common conceptual framework from which they emerge as special cases. The key idea is that all variables, latent and observable, are treated as random variables whose joint distribution constitutes the model. This approach can readily be extended to the study of relationships between latent variables as exemplified by Joreskog's LISREL and similar models. It provides a basis for their further generalization and for their critical evaluation.
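The common framework the article describes can be written schematically as follows (our notation): observables are modelled as conditionally independent given the latent variables, and the special cases arise from the continuous or categorical status of each side:

```latex
% Common latent variable framework: observables x_1,...,x_p and latent
% variables z are jointly random; observables are typically assumed
% conditionally independent given z. Factor analysis, latent trait, latent
% class, and latent profile models are special cases.
f(\mathbf{x}) \;=\; \int h(\mathbf{z}) \, \prod_{i=1}^{p} g_i\!\left(x_i \mid \mathbf{z}\right) d\mathbf{z}
```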


01 Jan 2001
TL;DR: The problem has been addressed by the use of latent semantic analysis for the comparison and assessment of scientific texts and of the knowledge expressed by students in the form of free verbal statements.
Abstract: Research on the effects of study is hindered by the limitations of the techniques and methods for registering, measuring and assessing actually formed knowledge, i.e., information represented in memory with the appropriate correlations among its units. The problem has been addressed by using latent semantic analysis for the comparison and assessment of scientific texts and of the knowledge expressed by students in the form of free verbal statements.

Dissertation
01 Jan 2001
TL;DR: The main emphasis of the work is on the derivation and construction of computationally efficient algorithms that perform well on both synthetic tasks and real-life problems, and that can be used as alternatives to other existing methods wherever appropriate.
Abstract: What is a latent variable? Simply defined, a latent variable is a variable that cannot be directly measured or observed. A latent variable model or latent structure model is a model whose structure contains one or many latent variables. The subject of this thesis is the study of various topics that arise during the analysis and/or use of latent structure models. Two classical models, namely the factor analysis (FA) model and the finite mixture (FM) model, are first considered and examined extensively, after which the mixture of factor analysers (MFA) model, constructed using ingredients from both FA and FM is introduced and studied at length. Several extensions of the MFA model are also presented, one of which consists of the incorporation of fixed observed covariates into the model. Common to all the models considered are such topics as: (a) model selection which consists of the determination or estimation of the dimensionality of the latent space; (b) parameter estimation which consists of estimating the parameters of the postulated model in order to interpret and characterise the mechanism that produced the observed data; (c) prediction which consists of estimating responses for future unseen observations. Other important topics such as identifiability (for unique solution, interpretability and parameter meaningfulness), density estimation, and to a certain extent aspects of unsupervised learning and exploration of group structure (through clustering, data visualisation in 2D) are also covered. We approach such topics as parameter estimation and model selection from both the likelihood-based and Bayesian perspectives, with a concentration on Maximum Likelihood Estimation via the EM algorithm, and Bayesian Analysis via Stochastic Simulation (derivation of efficient Markov Chain Monte Carlo algorithms). The main emphasis of our work is on the derivation and construction of computationally efficient algorithms that perform well on both synthetic tasks and real-life problems, and that can be used as alternatives to other existing methods wherever appropriate.
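For reference, the MFA density studied in the thesis has the following well-known schematic form (our notation; component-specific noise covariances are also used in some formulations):

```latex
% Mixture of factor analysers: K components, each a factor analysis model
% with loading matrix \Lambda_k, mean \mu_k, and diagonal noise covariance
% \Psi (component-specific \Psi_k variants also exist).
p(\mathbf{x}) \;=\; \sum_{k=1}^{K} \pi_k\,
  \mathcal{N}\!\left(\mathbf{x};\ \boldsymbol{\mu}_k,\ \Lambda_k \Lambda_k^{\top} + \Psi\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```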

Proceedings ArticleDOI
09 Dec 2001
TL;DR: A unified probabilistic framework for statistical language modeling, the latent maximum entropy principle, is described; it shows that the hidden causal hierarchical dependency structure can be encoded into the statistical model in a principled way by mixtures of exponential families with rich expressive power.
Abstract: We describe a unified probabilistic framework for statistical language modeling, the latent maximum entropy principle. The salient feature of this approach is that the hidden causal hierarchical dependency structure can be encoded into the statistical model in a principled way by mixtures of exponential families with rich expressive power. We first present the problem formulation, solution, and certain convergence properties. We then describe how to use this machine learning technique to model various aspects of natural language, such as the syntactic structure of sentences and semantic information in documents. Finally, we draw conclusions and point out future research directions.
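A hedged rendering of the latent maximum entropy principle as we read the abstract: maximise the entropy of a joint model over observed data x and hidden structure y, subject to feature constraints whose hidden side is filled in by the model itself:

```latex
% Latent maximum entropy, schematically (our notation): \tilde{p} is the
% empirical distribution over observed data, f_i are feature functions over
% observed and hidden parts jointly.
\max_{p}\; H(p) = -\sum_{x,y} p(x,y)\,\log p(x,y)
\quad \text{s.t.} \quad
\sum_{x,y} p(x,y)\, f_i(x,y) \;=\;
\sum_{x} \tilde{p}(x) \sum_{y} p(y \mid x)\, f_i(x,y), \quad \forall i
```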

Proceedings ArticleDOI
25 Jul 2001
TL;DR: A probabilistic DEDICOM model is proposed for mobility tables that captures asymmetry in observed mobility tables by asymmetric latent mobility tables and a maximum penalized likelihood (MPL) method is developed for parameter estimation.
Abstract: A probabilistic DEDICOM model is proposed for mobility tables. The model attempts to explain observed transition probabilities by a latent mobility table and a set of transition probabilities from latent classes to observed classes. The model captures asymmetry in observed mobility tables by asymmetric latent mobility tables. It may be viewed as a special case of both the latent class model and DEDICOM with special constraints. A maximum penalized likelihood (MPL) method is developed for parameter estimation. The EM algorithm is adapted for the MPL estimation. An example is given to illustrate the proposed method.
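A schematic, approximate reading of the model in DEDICOM form (our notation; a_{ik} relates observed class i to latent class k, and M is the asymmetric latent mobility table):

```latex
% Schematic probabilistic DEDICOM decomposition of an observed mobility
% table: latent mobility table M = (m_{kl}), with the asymmetry of the
% observed table captured by the asymmetry of M.
p_{ij} \;\approx\; \sum_{k}\sum_{l} a_{ik}\, m_{kl}\, a_{jl},
\qquad m_{kl} \neq m_{lk} \text{ in general}
```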

Book ChapterDOI
01 Jan 2001
TL;DR: The range of different approaches to capturing semantic similarity is discussed; multidimensional scaling models and featural models (e.g., Tversky's contrast model) are introduced alongside newer approaches such as structural alignment accounts, context vectors, connectionist models, and generative probabilistic models.
Abstract: This article discusses the range of different approaches to capturing semantic similarity. Specifically, multidimensional scaling models and featural models (e.g., Tversky's contrast model) are introduced alongside newer approaches such as structural alignment accounts, context vectors, connectionist models, and generative probabilistic models. In addition, references are made to several cognitive abilities in which semantic similarity plays a role, including categorization, inductive reasoning, and memory.