
Showing papers on "Noisy text analytics published in 1990"


Journal ArticleDOI
Udo Hahn
TL;DR: This paper introduces a parser which is based on the conceptual knowledge of its domain and is organized as a collection of distributed lexicalized grammar modules (word experts) which communicate through message-passing.
Abstract: The rapid proliferation of full-text databases poses serious problems to the natural language processing components of information retrieval systems. Not taking text-level phenomena of written natural language discourse into account causes a marked decrease in performance for many text information system applications. Consequently, appropriate text parsing facilities must be capable of recognizing the rich internal structure of full texts on lower levels of text connectivity as well as on the global organizational level of text coherence. This paper introduces such a parser, which is based on the conceptual knowledge of its domain and is organized as a collection of distributed lexicalized grammar modules (word experts) that communicate through message passing. Emphasis is put on text grammatical specifications, which state formal conditions for recognizing higher-order text constituents and their coherent configuration on the global level of textual macro organization.

71 citations
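The word-expert idea in the paper above — each lexical item carrying its own local grammar knowledge and negotiating structure with its neighbours by message passing — can be illustrated with a very loose sketch. Everything here (the class names, the single "attach?" message, the toy category table) is an invented illustration, not Hahn's actual design.

```python
# Hypothetical sketch of a word-expert parser: each word is an agent that
# decides locally, via messages, whether to accept a neighbour as a dependent.
class WordExpert:
    def __init__(self, word, category):
        self.word = word
        self.category = category
        self.dependents = []

    def accepts(self, category):
        # Toy compatibility table (illustrative): nouns take determiners
        # and adjectives, verbs take nouns.
        table = {"noun": {"det", "adj"}, "verb": {"noun"}}
        return category in table.get(self.category, set())

    def receive(self, message, sender):
        # Accept an attachment offer only if the categories are compatible.
        if message == "attach?" and self.accepts(sender.category):
            self.dependents.append(sender)
            return True
        return False

def parse(experts):
    # Adjacent experts exchange attachment requests; whichever side
    # accepts first wins. Experts nobody attached remain as roots.
    for left, right in zip(experts, experts[1:]):
        right.receive("attach?", left) or left.receive("attach?", right)
    return [e for e in experts if not any(e in x.dependents for x in experts)]
```

For "the parser runs", the determiner attaches to the noun and the noun to the verb, leaving the verb as the single root.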


Proceedings ArticleDOI
02 Dec 1990
TL;DR: An experiment that was conducted to assess the feasibility of using automatic text-to-speech translation in a message relay service for the deaf indicates that speech synthesis can be a viable alternative to relay operator translation.
Abstract: An experiment conducted to assess the feasibility of using automatic text-to-speech translation in a message relay service for the deaf is described. Five conditions manipulated telecommunication device for the deaf (TDD) text to create improvements that could be implemented automatically. The manipulations included preprocessing changes such as expanding abbreviations, use of an automatic spelling corrector, and providing rules to users about how to generate more comprehensible text. The results show that, overall, subjects understood the synthetic speech messages very well and that the text manipulations resulted in improved performance. A combined condition, which used the preprocessing changes, the spelling corrections, and the rules to users, produced the best performance. These results indicate that speech synthesis can be a viable alternative to relay operator translation.

14 citations
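The automatic preprocessing steps the study describes — abbreviation expansion plus spelling correction before synthesis — can be sketched as follows. The abbreviation table, the tiny vocabulary, and the one-edit corrector are hypothetical stand-ins for illustration, not the study's actual components.

```python
# Illustrative TDD-text preprocessing: expand abbreviations, then apply
# a naive one-edit spelling corrector against a known vocabulary.
ABBREVIATIONS = {"GA": "go ahead", "SK": "stop keying", "PLS": "please"}
VOCABULARY = {"please", "call", "tomorrow", "morning", "thanks"}

def edits1(word):
    """All strings one delete, replace, or insert away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + replaces + inserts)

def correct(word):
    """Return word if known, else an in-vocabulary word one edit away."""
    if word in VOCABULARY:
        return word
    candidates = edits1(word) & VOCABULARY
    return min(candidates) if candidates else word

def preprocess(tdd_text):
    out = []
    for token in tdd_text.split():
        token = ABBREVIATIONS.get(token.upper(), token)  # expand abbreviations
        out.append(correct(token.lower()))               # fix spelling
    return " ".join(out)
```

A message like "PLS call tomorow GA" would come out as "please call tomorrow go ahead" before being passed to the synthesizer.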


01 Oct 1990
TL;DR: In the present note, some details are given concerning the usefulness of term weighting systems for the content analysis of natural-language texts, and of text matching strategies designed to identify relevant text items in answer to available search requests.
Abstract: In information retrieval, it is not uncommon to be faced with large collections of unrestricted natural-language text. In such circumstances, the text analysis and retrieval operations must be based mainly on a study of the text collections actually under consideration. Two main operations are of interest: a text analysis operation designed to assign content identifiers to the stored texts, and a text comparison system designed to identify texts covering particular subject areas. In the present note, some details are given concerning the usefulness of term weighting systems for the content analysis of natural-language texts, and of text matching strategies designed to identify relevant text items in answer to available search requests. A sample collection of electronic mail messages is used for experimental purposes.

3 citations
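The two operations the note pairs — term weighting for content analysis and text matching against search requests — can be sketched minimally with tf.idf weights and cosine similarity. The toy "mail message" collection and the smoothed idf formula are illustrative choices, not the note's own data or exact weighting scheme.

```python
# Minimal tf.idf weighting and cosine matching over a toy message collection.
import math
from collections import Counter

docs = [
    "meeting moved to friday morning",
    "budget report due friday",
    "lunch menu for the week",
]

def tokenize(text):
    return text.lower().split()

N = len(docs)
# Document frequency: in how many messages each term occurs.
df = Counter(t for d in docs for t in set(tokenize(d)))

def idf(term):
    return math.log((N + 1) / (df[term] + 1))  # smoothed inverse document frequency

def weights(text):
    tf = Counter(tokenize(text))
    return {t: f * idf(t) for t, f in tf.items()}

def cosine(w1, w2):
    dot = sum(w1[t] * w2.get(t, 0.0) for t in w1)
    n1 = math.sqrt(sum(v * v for v in w1.values()))
    n2 = math.sqrt(sum(v * v for v in w2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

Ranking the collection against the request "friday meeting" puts the first message ahead of the second (which shares only the common term "friday") and leaves the third unmatched.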


01 Jun 1990
TL;DR: Evaluation of text matching operations for text excerpts of varying scope shows that when the global text similarity between distinct text paragraphs is high, while at the same time local similarities also exist for particular text sentences included in these paragraphs, the presumption is that the paragraphs cover related subject matter.
Abstract: When large text collections must be processed, it is not possible to limit the scope of the subject matter of interest. In such a situation the standard content analysis methods that are based on the use of knowledge bases to represent the relevant subject areas are not applicable. Necessarily, the texts themselves must then serve as the main basis for the content analysis operations. Experiments described in this note are designed to evaluate text matching operations for text excerpts of varying scope, including in particular text paragraphs and text sentences extracted from book-size materials. The evaluation shows that when the global text similarity between distinct text paragraphs is high, while at the same time local similarities also exist for particular text sentences included in these paragraphs, the presumption is that the paragraphs cover related subject matter. One concludes that text matching systems may prove useful for text linking and information retrieval.

2 citations
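The two-level decision rule in the abstract above — require high global paragraph similarity and at least one locally similar sentence pair — can be sketched as follows. The word-overlap similarity measure and the threshold values are illustrative assumptions, not the paper's actual settings.

```python
# Accept a paragraph pair as topically related only when global similarity
# is high AND some sentence pair inside them is also similar.
def overlap_sim(a, b):
    """Jaccard overlap of the word sets of two text spans."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def related(par_a, par_b, global_t=0.25, local_t=0.5):
    if overlap_sim(par_a, par_b) < global_t:      # global paragraph check
        return False
    sents_a = [s.strip() for s in par_a.split(".") if s.strip()]
    sents_b = [s.strip() for s in par_b.split(".") if s.strip()]
    return any(overlap_sim(sa, sb) >= local_t     # local sentence check
               for sa in sents_a for sb in sents_b)
```

The local check guards against paragraph pairs whose vocabulary overlaps diffusely without any sentence actually discussing the same point.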