Patent

Methods and apparatus for information indexing and retrieval as well as query expansion using morpho-syntactic analysis

TLDR
This patent presents an index generator and query expander for information retrieval in a corpus. A disambiguated corpus is provided as input to a transformational analyzer, which uses a grammar and a metagrammar to analyze syntactic and morphosyntactic variations in order to conflate and generate variants.
Abstract
An index generator and query expander for use in information retrieval in a corpus. A corpus is provided as an input to an inflectional analyzer, which produces a lemmatized corpus having base forms and associated inflections for each word in the original corpus. The lemmatized corpus is provided as an input to a disambiguator, which performs part of speech tagging and morpho-syntactic disambiguation to produce a disambiguated corpus. The disambiguated corpus is provided as an input to a derivational generator, which produces an expanded corpus having all possible valid derivatives of each word of the disambiguated corpus. The disambiguated corpus is provided as an input to a transformational analyzer, using a grammar and a metagrammar for analyzing syntactic and morphosyntactic variations to conflate and generate variants, producing an index to the corpus having a minimum of variants. Alternatively, a query expander is provided utilizing similar techniques.
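
The staged pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the derivational-root table, the determiner rule used for disambiguation, and all function names are hypothetical and do not come from the patent. In particular, the patent's grammar and metagrammar are replaced here by a much cruder conflation key (the sorted set of derivational roots of the content words), so this shows only the general idea of grouping variants under one index entry.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> candidate (lemma, part-of-speech) analyses.
LEXICON = {
    "cell": [("cell", "N")],
    "cells": [("cell", "N")],
    "production": [("production", "N")],
    "producing": [("produce", "V")],
    "of": [("of", "P")],
    "the": [("the", "D")],
    "plant": [("plant", "V"), ("plant", "N")],  # ambiguous form
}

# Hypothetical derivational families: lemma -> shared root.
DERIVATIONAL_ROOT = {"production": "produce", "producer": "produce"}

# Parts of speech treated as content words (assumption).
CONTENT_POS = {"N", "V", "A"}


def inflectional_analysis(tokens):
    """Return the candidate (lemma, pos) analyses for each token."""
    return [LEXICON.get(t.lower(), [(t.lower(), "N")]) for t in tokens]


def disambiguate(analyses):
    """Pick one analysis per token; prefer a noun reading after a determiner."""
    chosen = []
    for candidates in analyses:
        pick = candidates[0]
        if chosen and chosen[-1][1] == "D":
            pick = next((c for c in candidates if c[1] == "N"), pick)
        chosen.append(pick)
    return chosen


def conflation_key(tagged):
    """Reduce content words to derivational roots, ignoring order and function words."""
    roots = {
        DERIVATIONAL_ROOT.get(lemma, lemma)
        for lemma, pos in tagged
        if pos in CONTENT_POS
    }
    return tuple(sorted(roots))


def build_index(phrases):
    """Group phrase variants under a single canonical key."""
    index = defaultdict(list)
    for phrase in phrases:
        tagged = disambiguate(inflectional_analysis(phrase.split()))
        index[conflation_key(tagged)].append(phrase)
    return dict(index)


index = build_index(["cell production", "production of cells", "cells producing"])
```

In this sketch the three phrase variants end up under the single key `("cell", "produce")`. A query expander built the same way could compute the key of a query and return every variant stored under it.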

Citations
Patent

Method and system for optimally searching a document database using a representative semantic space

TL;DR: A term-by-document matrix, representing the frequency of occurrence of each term per document, is compiled from a corpus of documents representative of a particular subject matter.
Patent

Adaptive and scalable method for resolving natural language ambiguities

TL;DR: A method for resolving ambiguities in natural language by organizing the task into multiple iterations of analysis done in successive levels of depth is presented. The method is adaptive to the users' need for accuracy and efficiency.
Patent

Concept-based method and system for dynamically analyzing unstructured information

TL;DR: A method, operating model, system, data structure, computer program and computer program product for analyzing and categorizing unstructured information is provided such that conventional structured data access techniques can be utilized over unstructured objects.
Journal ArticleDOI

Supporting web query expansion efficiently using multi-granularity indexing and query processing

TL;DR: The notion of a multi-granularity information and processing structure is used to support efficient query expansion, which involves an indexing phase, a query processing phase, and a ranking phase.
Patent

System and method for searching for a query

TL;DR: A search system for searching for electronic documents and providing a search result in response to a search query is provided. It includes a search engine that executes a search based on the search query term and the equivalent terms.
References
Patent

Computer method for automatic extraction of commonly specified information from business correspondence

TL;DR: A Parametric Information Extraction (PIE) system has been developed to automatically identify commonly specified information such as author, date, recipient, address, subject statement, etc.
Patent

Knowledge information processing system

TL;DR: A system is described that can automatically generate an answer in response to a query or an instruction, such as a letter-writing instruction, input to the system in the form of natural language.
PatentDOI

Knowledge based information retrieval system

TL;DR: An information retrieval system with human-interface methods that make the system easy to use. It has two distinctive features: a visual interface and natural language interpretation.
Patent

Computer system and computer-implemented process for phonology-based automatic speech recognition

TL;DR: A speech signal containing an utterance is received and linguistic cues in the speech signal are detected. From these detected linguistic cues, a symbolic representation of the contents of the speech signal is generated.
Patent

Information processing system for completing or resolving ambiguity of input information and method therefor

TL;DR: Incomplete and fuzzy portions of input information are detected by referring to world knowledge and knowledge of a specific field in a knowledge base, prior to understanding the input information indicating a concept.