scispace - formally typeset

Showing papers on "Probabilistic latent semantic analysis published in 1994"


Proceedings ArticleDOI
01 Aug 1994
TL;DR: This paper applies LSI to the routing task, which operates under the assumption that a sample of relevant and non-relevant documents is available for constructing the query, and finds that when LSI is used in conjunction with statistical classification there is a dramatic improvement in performance.
Abstract: Latent Semantic Indexing (LSI) is a novel approach to information retrieval that attempts to model the underlying structure of term associations by transforming the traditional representation of documents as vectors of weighted term frequencies to a new coordinate space where both documents and terms are represented as linear combinations of underlying semantic factors. In previous research, LSI has produced a small improvement in retrieval performance. In this paper, we apply LSI to the routing task, which operates under the assumption that a sample of relevant and non-relevant documents is available to use in constructing the query. Once again, LSI slightly improves performance. However, when LSI is used in conjunction with statistical classification, there is a dramatic improvement in performance.
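The factor decomposition the abstract describes can be sketched with a truncated SVD; the toy term-document matrix, the query fold-in, and the helper names below are illustrative assumptions, not the paper's actual data or setup:

```python
import numpy as np

# Toy term-document matrix (rows = terms, columns = documents),
# holding weighted term frequencies. The values are made up.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

# LSI: a truncated SVD re-expresses terms and documents as linear
# combinations of k underlying semantic factors.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Document coordinates in the k-dimensional latent space.
docs_k = (sk[:, None] * Vtk).T          # shape (n_docs, k)

# Fold a query (a bag-of-terms vector) into the same space and rank
# documents by cosine similarity.
q = np.array([1.0, 1.0, 0.0, 0.0])
q_k = q @ Uk

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

ranking = sorted(range(docs_k.shape[0]),
                 key=lambda j: cosine(q_k, docs_k[j]), reverse=True)
```

Because similar terms load on the same factors, a query can match a document that shares no literal terms with it, which is the effect LSI exploits.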

197 citations


01 Oct 1994
TL;DR: Using the proposed merge strategies, LSI is shown to be able to retrieve relevant documents from either language (Greek or English) without requiring any translation of a user's query.
Abstract: In this thesis, a method for indexing cross-language databases for conceptual query matching is presented. Two languages (Greek and English) are combined by appending a small portion of documents from one language to the identical documents in the other language. The proposed merging strategy duplicates less than 7% of the entire database (made up of different translations of the Gospels). Previous strategies duplicated up to 34% of the initial database in order to perform the merger. The proposed method retrieves a larger number of relevant documents for both languages with higher cosine rankings when Latent Semantic Indexing (LSI) is employed. Using the proposed merge strategies, LSI is shown to be effective in retrieving documents from either language (Greek or English) without requiring any translation of a user's query. An effective Bible search product needs to allow the use of natural language for searching (queries). LSI enables the user to form queries using natural expressions in the user's own native language. The merging strategy proposed in this study enables LSI to retrieve relevant documents effectively while duplicating only a minimum of the database.
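The low-duplication merge the abstract reports might look like the following sketch; the tiny corpus, the exact 7% fraction, and the function name are illustrative assumptions, and real indexing would then run LSI over the merged corpus:

```python
# Hypothetical parallel corpus: each Greek document is paired with its
# English translation. The texts here are illustrative stand-ins.
greek_docs = ["kyrios poimen mou", "en arche en ho logos", "agape makrothymei"]
english_docs = ["the lord is my shepherd", "in the beginning was the word",
                "love is patient"]

def merge_corpus(docs_a, docs_b, fraction=0.07):
    """Append the parallel translation to a small fraction of documents,
    leaving the rest monolingual. The dual-language documents act as
    anchors that let LSI align the two vocabularies, while only that
    small fraction of text is duplicated."""
    n_dual = max(1, round(fraction * len(docs_a)))
    merged = []
    for i, (a, b) in enumerate(zip(docs_a, docs_b)):
        if i < n_dual:
            merged.append(a + " " + b)   # dual-language anchor document
        else:
            merged.append(a)             # stays Greek-only
            merged.append(b)             # stays English-only
    return merged

corpus = merge_corpus(greek_docs, english_docs)
```

After LSI is trained on such a corpus, Greek and English terms that co-occur in the anchor documents land near each other in the latent space, so a query in either language can retrieve documents in both.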

27 citations




Proceedings ArticleDOI
19 Apr 1994
TL;DR: A new model of speech understanding is presented, based on cooperation between the speech recognizer and the language analyzer; it interacts with the knowledge sources while keeping its modularity, thereby realizing robust understanding.
Abstract: We present a new model of speech understanding, based on the cooperation of the speech recognizer and language analyzer, which interacts with the knowledge sources while keeping its modularity. The semantic analyzer is realized with a semantic network that represents the possible concepts in a task. The speech recognizer, based on an LR parser, interacts with the semantic analyzer to eliminate invalid hypotheses at an early stage. The coupling of a loose grammar and interactive semantic analysis accepts ill-formed sentences while filtering out nonsense ones, thus realizing robust understanding. Dialog-level knowledge is also incorporated to constrain both the syntactic and the semantic knowledge sources. The key to guiding the search efficiently is powerful heuristics. The relationship between heuristic power and search efficiency is examined experimentally. The stochastic word bigram is derived from the probabilistic LR grammar as an A*-admissible heuristic.
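One way to read the last sentence: a word bigram gives an optimistic per-word cost that A* can use without overestimating the true remaining cost. A minimal sketch, with invented probabilities standing in for those the paper derives from the probabilistic LR grammar:

```python
import math

# Hypothetical bigram probabilities P(next | current); the numbers
# are made up for illustration.
bigram = {
    "show": {"me": 0.6, "flights": 0.4},
    "me": {"flights": 0.9, "fares": 0.1},
    "flights": {"</s>": 1.0},
    "fares": {"</s>": 1.0},
}

def heuristic(word):
    """Optimistic one-step cost after `word`: the negative log of the
    most probable continuation. No actual continuation can cost less,
    so the estimate never overestimates, which is the A*-admissibility
    property the abstract refers to."""
    successors = bigram.get(word)
    if not successors:
        return 0.0
    return -math.log(max(successors.values()))
```

An admissible heuristic lets A* prune hypotheses aggressively while still guaranteeing that the best-scoring sentence hypothesis is found.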

11 citations


Proceedings ArticleDOI
01 Jan 1994
TL;DR: In this paper, a text representation and searching technique labeled as "Semantic Vector Space Model" (SVSM) is described, which combines Salton's VSM (1991) with distributed representation of semantic case structures of natural language text.
Abstract: This paper describes a text representation and searching technique labeled the "Semantic Vector Space Model" (SVSM). The proposed technique combines Salton's VSM (1991) with a distributed representation of the semantic case structures of natural language text. It promises a way of abstracting and encoding richer semantic information from natural language text, and therefore better precision in IR, without involving sophisticated semantic processing.

8 citations