
Showing papers on "Natural language understanding published in 1996"


Proceedings ArticleDOI
M. Epstein1, Kishore Papineni1, Salim Roukos1, T. Ward1, S. Della Pietra1 
07 May 1996
TL;DR: A new approach to natural language understanding (NLU) based on the source-channel paradigm is presented, and it is applied to ARPA's Air Travel Information Service (ATIS) domain.
Abstract: We present a new approach to natural language understanding (NLU) based on the source-channel paradigm, and apply it to ARPA's Air Travel Information Service (ATIS) domain. The model uses techniques similar to those used by IBM in statistical machine translation. The parameters are trained using the exact match algorithm; a hierarchy of models is used to facilitate the bootstrapping of more complex models from simpler models.
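
The source-channel formulation can be sketched as a Bayes decision rule: decode the observed English sentence E to the meaning M maximizing P(M)P(E|M). The toy decoder below is a minimal illustration of that rule only; the meaning strings and all probabilities are invented, not the paper's actual ATIS models.

```python
import math

# Toy source-channel decoder: choose the meaning M that maximizes
# P(M) * P(E | M).  Every probability below is invented for
# illustration; a real system estimates them from training data.

prior = {  # source model: prior over candidate meanings
    "LIST_FLIGHTS(from=BOS, to=SFO)": 0.6,
    "LIST_FARES(from=BOS, to=SFO)": 0.4,
}

channel = {  # channel model: P(English sentence | meaning)
    ("show flights from boston to san francisco",
     "LIST_FLIGHTS(from=BOS, to=SFO)"): 0.05,
    ("show flights from boston to san francisco",
     "LIST_FARES(from=BOS, to=SFO)"): 0.001,
}

def decode(english):
    """Return the meaning maximizing log P(M) + log P(E | M)."""
    best, best_score = None, float("-inf")
    for meaning, p_m in prior.items():
        p_e = channel.get((english, meaning), 1e-9)  # unseen pairs get a floor
        score = math.log(p_m) + math.log(p_e)
        if score > best_score:
            best, best_score = meaning, score
    return best

print(decode("show flights from boston to san francisco"))
```

In a real system the prior and channel would be parametric models trained from data rather than lookup tables; the decision rule itself is unchanged.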

316 citations


Journal Article
TL;DR: The architecture of spoken language systems and the components of which they are made are discussed, and both a variety of possible approaches and the particular design decisions made in some systems developed at BT Laboratories are described.
Abstract: Spoken language systems allow users to interact with computers by speaking to them. This paper focuses on the most advanced systems, which seek to allow as natural a style of interaction as possible. Specifically this means the use of continuous speech recognition, natural language understanding to interpret the utterance, and an intelligent dialogue manager which allows a flexible style of 'conversation' between computer and user. This paper discusses the architecture of spoken language systems and the components of which they are made, and describes both a variety of possible approaches and the particular design decisions made in some systems developed at BT Laboratories. Three spoken language systems in the course of development are described - a multimodal interface to the BT Business Catalogue, an e-mail secretary which can be consulted over the telephone network, and a multimodal system to allow selection of films in the interactive TV environment.

89 citations


Book
27 Jun 1996
TL;DR: This book discusses machine learning, natural language understanding, and more.
Abstract: Contents:
KNOWLEDGE IN AI: Overview; Introduction; Representing Knowledge; Metrics for Assessing Knowledge Representation Schemes; Logic Representations; Procedural Representation; Network Representations; Structured Representations; General Knowledge; The Frame Problem; Knowledge Elicitation; Summary; Exercises; Recommended Further Reading
REASONING: Overview; What is Reasoning?; Forward and Backward Reasoning; Reasoning with Uncertainty; Summary; Exercises; Recommended Further Reading
SEARCH: Introduction; Exhaustive Search and Simple Pruning; Heuristic Search; Knowledge-Rich Search; Summary; Exercises; Recommended Further Reading
MACHINE LEARNING: Overview; Why Do We Want Machine Learning?; How Machines Learn; Deductive Learning; Inductive Learning; Explanation-Based Learning; Example: Query-by-Browsing; Summary; Recommended Further Reading
GAME PLAYING: Overview; Introduction; Characteristics of Game Playing; Standard Games; Non-Zero-Sum Games and Simultaneous Play; The Adversary is Life!; Probability; Summary; Exercises; Recommended Further Reading
EXPERT SYSTEMS: Overview; What Are Expert Systems?; Uses of Expert Systems; Architecture of an Expert System; Examples of Four Expert Systems; Building an Expert System; Limitations of Expert Systems; Summary; Exercises; Recommended Further Reading
NATURAL LANGUAGE UNDERSTANDING: Overview; What is Natural Language Understanding?; Why Do We Need Natural Language Understanding?; Why Is Natural Language Understanding Difficult?; An Early Attempt at Natural Language Understanding: SHRDLU; How Does Natural Language Understanding Work?; Syntactic Analysis; Semantic Analysis; Pragmatic Analysis; Summary; Exercises; Recommended Further Reading; Solution to SHRDLU Problem
COMPUTER VISION: Overview; Introduction; Digitization and Signal Processing; Edge Detection; Region Detection; Reconstructing Objects; Identifying Objects; Multiple Images; Summary; Exercises; Recommended Further Reading
PLANNING AND ROBOTICS: Overview; Introduction; Global Planning; Local Planning; Limbs, Legs, and Eyes; Practical Robotics; Summary; Exercises; Recommended Further Reading
AGENTS: Overview; Software Agents; Co-operating Agents and Distributed AI; Summary; Exercises; Recommended Further Reading
MODELS OF THE MIND: Overview; Introduction; What is the Human Mind?; Production System Models; Connectionist Models of Cognition; Summary; Exercises; Recommended Further Reading; Notes
EPILOGUE: PHILOSOPHICAL AND SOCIOLOGICAL ISSUES: Overview; Intelligent Machines or Engineering Tools?; What Is Intelligence?; Computational Argument vs. Searle's Chinese Room; Who Is Responsible?; Morals and Emotions; Social Implications; Summary; Recommended Further Reading

65 citations


Proceedings ArticleDOI
03 Oct 1996
TL;DR: An evaluation methodology is used that assesses performance at different semantic levels, including the database response comparison used in the ARPA ATIS paradigm, and replaces the system of rules for the semantic analysis with a relatively simple first-order hidden Markov model.
Abstract: A stochastically based approach for the semantic analysis component of a natural spoken language system for the ARPA Air Travel Information Services (ATIS) task has been developed. The semantic analyzer of the spoken language system already in use at LIMSI makes use of a rule-based case grammar. In this work, the system of rules for the semantic analysis is replaced with a relatively simple first-order hidden Markov model. The performances of the two approaches can be compared because they use identical semantic representations, despite their rather different methods for meaning extraction. We use an evaluation methodology that assesses performance at different semantic levels, including the database response comparison used in the ARPA ATIS paradigm.
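
The core of such a stochastic semantic analyzer is Viterbi decoding over a first-order HMM whose hidden states are semantic concepts and whose observations are words. The following sketch shows only that mechanism; the states, vocabulary, and all probabilities are invented for illustration and are not LIMSI's actual models.

```python
# Minimal Viterbi decoder for a first-order HMM whose states are
# semantic concepts.  States, words, and probabilities are invented.

states = ["FROM", "TO", "OTHER"]
start = {"FROM": 0.2, "TO": 0.2, "OTHER": 0.6}
trans = {
    "FROM":  {"FROM": 0.5, "TO": 0.3, "OTHER": 0.2},
    "TO":    {"FROM": 0.1, "TO": 0.5, "OTHER": 0.4},
    "OTHER": {"FROM": 0.3, "TO": 0.3, "OTHER": 0.4},
}
emit = {
    "FROM":  {"from": 0.5, "boston": 0.4},
    "TO":    {"to": 0.5, "denver": 0.4},
    "OTHER": {"flights": 0.3, "show": 0.3},
}

def viterbi(words):
    """Return the most likely semantic-state sequence for the words."""
    # Each chart cell holds (best probability, best path so far).
    V = [{s: (start[s] * emit[s].get(words[0], 1e-6), [s])
          for s in states}]
    for w in words[1:]:
        col = {}
        for s in states:
            p, path = max(
                (V[-1][prev][0] * trans[prev][s], V[-1][prev][1])
                for prev in states)
            col[s] = (p * emit[s].get(w, 1e-6), path + [s])
        V.append(col)
    return max(V[-1].values())[1]

print(viterbi(["show", "flights", "from", "boston", "to", "denver"]))
```

The unseen-word floor of 1e-6 stands in for the smoothing a real analyzer would need.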

47 citations


01 Jan 1996
TL;DR: In this article, a semiotic approach based on an ecological understanding of informational systems is proposed to simulate the constitution of meanings and interpretation of signs prior to any predicative and propositional representations in syntax and semantics.
Abstract: In contrast to the clear-cut realist division between information-processing systems and their surrounding environments employed so far in models of natural language understanding by machine, it is argued here that a semiotic approach based on an ecological understanding of informational systems is feasible and more adequate. Characterizing such systems' performance in general, and the pragmatics of communicative interaction by real language users in particular, a critical evaluation of cognitive approaches in knowledge-based computational linguistics is combined with the seminal notions of situation and language game to allow for a procedural modelling and numerical reconstruction of processes that simulate the constitution of meanings and the interpretation of signs prior to any predicative and propositional representations, which dominate traditional formats in syntax and semantics. This is achieved by analysing the linear (syntagmatic) and selective (paradigmatic) constraints which natural language structure imposes on the formation of (strings of) linguistic entities. A formalism, related algorithms, and test results of their implementation are presented to substantiate the claim that a semiotic cognitive information processing system (SCIPS) operating in a language environment can serve as a meaning acquisition and understanding device.

18 citations


Book ChapterDOI
Claire Cardie1
01 Jan 1996
TL;DR: Kenmore, a general framework for knowledge acquisition for natural language processing (NLP) systems, is presented; it argues that learning and knowledge acquisition should be embedded components of the NLP system, with learning taking place within the larger natural language understanding system as it processes text.
Abstract: This paper presents Kenmore, a general framework for knowledge acquisition for natural language processing (NLP) systems. To ease the acquisition of knowledge in new domains, Kenmore exploits an online corpus using robust sentence analysis and embedded symbolic machine learning techniques while requiring only minimal human intervention. By treating all problems in ambiguity resolution as classification tasks, the framework uniformly addresses a range of subproblems in sentence analysis, each of which traditionally had required a separate computational mechanism. In a series of experiments, we demonstrate the successful use of Kenmore for learning solutions to several problems in lexical and structural ambiguity resolution. We argue that the learning and knowledge acquisition components should be embedded components of the NLP system in that (1) learning should take place within the larger natural language understanding system as it processes text, and (2) the learning components should be evaluated in the context of practical language-processing tasks.
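
Treating ambiguity resolution as classification can be illustrated with a tiny nearest-case lookup: a new ambiguous instance is labeled like the most similar stored case. The prepositional-phrase-attachment features and cases below are invented for illustration and are a drastic simplification of Kenmore's actual case representation.

```python
# Sketch of ambiguity resolution as classification: prepositional-
# phrase attachment decided by a 1-nearest-neighbour lookup over
# stored cases.  Cases and features are invented for illustration.

cases = [
    # (verb, noun, preposition) -> attachment decision
    (("ate", "pizza", "with"), "verb"),    # "ate pizza with a fork"
    (("saw", "man", "with"), "noun"),      # "saw the man with a telescope"
    (("bought", "shirt", "for"), "verb"),  # "bought a shirt for $10"
]

def overlap(a, b):
    """Number of matching feature positions between two cases."""
    return sum(x == y for x, y in zip(a, b))

def classify(features):
    """Return the label of the most similar stored case."""
    return max(cases, key=lambda c: overlap(c[0], features))[1]

print(classify(("ate", "soup", "with")))   # nearest to the first case
```

The same loop serves any ambiguity type once it is cast as features plus a label, which is the uniformity the abstract describes.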

15 citations


Book
01 Jan 1996
TL;DR: MESNET is presented in its double function as a cognitive model and as the target language for the semantic interpretation processes in NLU systems, with emphasis on the ontological aspect of knowledge representation.
Abstract: Semantic networks (SN) have been used in many applications, especially in the field of natural language understanding (NLU). The multilayered extended semantic network MESNET presented in this paper on the one hand follows the tradition of semantic networks starting with the work of Quillian (13). On the other hand, MESNET for the first time consistently and explicitly makes use of a multilayered structuring of an SN built upon an orthogonal system of dimensions, and especially upon the distinction between an intensional and a preextensional layer. Furthermore, MESNET is based on a comprehensive system of classificatory means (sorts and features) as well as on semantically primitive relations and functions. It uses a relatively large but fixed inventory of representational means, encapsulation of concepts, and a distinction between immanent and situative knowledge. The whole complex of representational means is independent of special application domains. With regard to the representation of taxonomic knowledge, MESNET is characterized by the use of a multidimensional ontology. A first prototype of MESNET has been successfully applied for the meaning representation of natural language expressions in the system LINAS. In this paper, MESNET is presented in its double function as a cognitive model and as the target language for the semantic interpretation processes in NLU systems, with emphasis on the ontological aspect of knowledge representation.

13 citations


01 Jan 1996
TL;DR: Borrowing from the field of communication theory, an information theoretic approach to natural language understanding is applied based on the source-channel model of communication, and several mathematical models of the noisy channel are developed.
Abstract: The problem of Natural Language Understanding (NLU) has intrigued researchers since the 1960s. Most researchers working in computational linguistics focus on linguistic solutions to their problems. They develop grammars and parsers to process the input natural language into a meaning representation. In this thesis, a new approach is utilized. Borrowing from the field of communication theory, an information theoretic approach to natural language understanding is applied. This is based on the source-channel model of communication. The source-channel model of NLU assumes that the user has a meaning in the domain of the application that he wishes to convey. This meaning is sent through a noisy channel. The observer receives the English sentence as output from the noisy channel. The observer then submits the English sentence to a decoder, which determines the meaning most likely to have generated the English. The decoder uses mathematical models of the channel and the meanings to process the English sentence. Thus, the following problems must be addressed in a source-channel model for NLU: (1) A mathematical model of the noisy channel must be developed. (2) The parameters of the model must be set, either manually or by an automatic training procedure. (3) A decoder must be built to search through the meaning space for the most likely meaning to have generated the observed English. This dissertation focuses on the first two of these problems. Several mathematical models of the noisy channel are developed. They are trained from a corpus of context-independent sentence pairs consisting of both English and the corresponding meaning. The parameters of the models are trained to maximize the likelihood of the model's prediction of the observed training data using the Expectation-Maximization algorithm. Results are presented for the Air Travel Information Service (ATIS) domain.
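
The Expectation-Maximization training the abstract refers to can be shown in miniature, in the style of IBM Model 1: each iteration collects fractional co-occurrence counts between English words and meaning tokens, then renormalizes them into translation probabilities. The two training pairs and all token names below are invented; this is a sketch of the training loop only, not the dissertation's actual models.

```python
from collections import defaultdict

# Toy EM training of a channel model P(English word | meaning token),
# IBM Model 1 style.  Training pairs and tokens are invented.

pairs = [
    (["flights", "boston"], ["LIST_FLIGHTS", "BOS"]),
    (["fares", "boston"], ["LIST_FARES", "BOS"]),
]

# Uniform initialisation of translation probabilities t(word | token).
meanings = {m for _, ms in pairs for m in ms}
words = {w for ws, _ in pairs for w in ws}
t = {(w, m): 1.0 / len(words) for w in words for m in meanings}

for _ in range(20):                      # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for ws, ms in pairs:                 # E-step: fractional counts
        for w in ws:
            z = sum(t[(w, m)] for m in ms)
            for m in ms:
                c = t[(w, m)] / z
                count[(w, m)] += c
                total[m] += c
    for (w, m), c in count.items():      # M-step: renormalise
        t[(w, m)] = c / total[m]

# "boston" co-occurs with BOS in both pairs, so EM concentrates
# probability mass on that pairing.
print(t[("boston", "BOS")], t[("flights", "BOS")])
```

Each iteration provably increases the training-data likelihood, which is the guarantee the EM algorithm provides.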

12 citations


Posted Content
TL;DR: This work argues for a performance-based design of natural language grammars and their associated parsers in order to meet the constraints posed by real-world natural language understanding.
Abstract: We argue for a performance-based design of natural language grammars and their associated parsers in order to meet the constraints posed by real-world natural language understanding. This approach incorporates declarative and procedural knowledge about language and language use within an object-oriented specification framework. We discuss several message passing protocols for real-world text parsing and provide reasons for sacrificing completeness of the parse in favor of efficiency.

11 citations


Journal ArticleDOI
TL;DR: This paper describes the current state of a medical language processor for Dutch and proposes a language specific front-end compatible with some existing applications that aim at the intelligent extraction and processing of information from patient discharge summaries.
Abstract: This paper describes the current state of a medical language processor for Dutch. The goal is to implement a language specific front-end compatible with some existing applications that aim at the intelligent extraction and processing of information from patient discharge summaries. A complete chain for processing and understanding Dutch medical documents will be the ultimate result. The text focuses mainly on the language specific aspects of the language processing chain. Evaluation results of the already functioning components are given as well as an outline for future developments and enhancements. A short theoretical background is provided (cf. also [1-3]: Rossi Mori et al., Proc. SCAMC 90, 1990, pp. 185-189; Wingert, in: Informatics and Medicine, an advanced course, Springer-Verlag. 1977, pp. 579-646; Wingert, Proc. MEDINFO 80, 1980, pp. 1321-1331) before the description of each component in order to familiarise the non-experienced reader with the basic notions of computational linguistics.

8 citations


Journal ArticleDOI
01 Aug 1996
TL;DR: NALIG is described, a system able to "understand" and "reason about" high level descriptions of spatial scenes in CAD systems for interior design by using a natural language interface which is expressive enough to allow the description of complex configurations of objects.
Abstract: We are mainly interested in the development of CAD systems for interior design. An effective use of such systems relies to a large extent on the characteristics of their user interface. This paper describes NALIG, a system able to "understand" and "reason about" high level descriptions of spatial scenes. The user interacts with the system by using a natural language interface which, though very simple, is expressive enough to allow the description of complex configurations of objects. NALIG replies by drawing on the screen an image mirroring its own "understanding" of the scene described. The comprehension process has required the integration of different AI-techniques (e.g., natural language understanding, spatial reasoning, default and common sense reasoning).

Journal ArticleDOI
TL;DR: The paper argues that the recognition abilities underlying the application of language to the world are indeed a prerequisite of semantic competence, and that an integrated system could not be considered as essentially on a par with a purely inferential system of the traditional kind unless one were prepared to regard even the human understanding system as “purely syntactic” (and therefore incapable of genuine understanding).
Abstract: The main reason why systems of natural language understanding are often said not to “really” understand natural language is their lack of referential competence. A traditional system, even an ideal one, cannot relate language to the perceived world, whereas — obviously — a human speaker can. The paper argues that the recognition abilities underlying the application of language to the world are indeed a prerequisite of semantic competence.

Proceedings ArticleDOI
14 Oct 1996
TL;DR: This article shows how linguistic resolutions can be achieved by using both rules and associations in a neurosymbolic framework and how context effects can be modeled and carried over into the sentence analysis.
Abstract: Like many natural cognitive processes, natural language processing involves the simultaneous consideration of a large number of different sources of information. It is unrealistic to assume that a single, simple recipe can solve the general problem of ambiguity. In this article, we show how linguistic resolutions can be achieved by using both rules and associations in a neurosymbolic framework, and how context effects can be modeled and carried over into the sentence analysis. Three applications attest to the validity of our framework. This exploration contributes to our understanding of linguistic resolution as well as simulating the dynamic and complex processes that take place in text comprehension.

Proceedings ArticleDOI
05 Aug 1996
TL;DR: This work is an initial step toward the ultimate goal of text and speech translation for enhanced multilingual and multinational operations, and has adopted an interlingua approach with natural language understanding and generation modules at the core.
Abstract: This paper describes our work-in-progress in automatic English-to-Korean text translation. This work is an initial step toward the ultimate goal of text and speech translation for enhanced multilingual and multinational operations. For this purpose, we have adopted an interlingua approach with natural language understanding (TINA) and generation (GENESIS) modules at the core. We tackle the ambiguity problem by incorporating syntactic and semantic categories in the analysis grammar. Our system is capable of producing accurate translation of complex sentences (38 words) and sentence fragments as well as average length (12 words) grammatical sentences. Two types of system evaluation have been carried out: one for grammar coverage and the other for overall performance. For system robustness, integration of two subsystems is under way: (i) a rule-based part-of-speech tagger to handle unknown words/constructions, and (ii) a word-for-word translator to handle other system failures.


Proceedings ArticleDOI
07 May 1996
TL;DR: A heuristic recognizer for stochastic attributed context-free grammars is proposed, extending a previously developed method for stochastic context-free grammars that was used both for grammar inference and for the classification of sleep macrostructure; the generality of the method allows its application to other areas, namely natural language and speech understanding.
Abstract: This paper addresses the problem of language recognition according to stochastic attributed context-free grammars. Starting from the main motivating application, automatic sleep analysis, a heuristic recognizer for attributed grammars is proposed. The algorithm, an extension of a previously developed method for stochastic context-free grammars, is used there both in the process of grammar inference and in the classification of sleep macrostructure. The generality of the method allows its application to other areas, namely natural language and speech understanding.
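
Recognition under a stochastic context-free grammar can be sketched with a probabilistic CKY pass that computes the best-parse probability of a sentence. This shows only the stochastic CFG part; the attribute evaluation and the paper's heuristics are omitted, and the grammar below is invented for illustration.

```python
# Probabilistic CKY recognition for a toy stochastic CFG in Chomsky
# normal form.  Grammar and probabilities are invented.

lexicon = {                      # preterminal -> word probabilities
    "Det": {"the": 1.0},
    "N": {"dog": 0.5, "cat": 0.5},
    "V": {"chases": 1.0},
}
binary = [                       # (A, B, C, p) for rules A -> B C
    ("S", "NP", "VP", 1.0),
    ("NP", "Det", "N", 1.0),
    ("VP", "V", "NP", 1.0),
]

def best_parse_prob(words, goal="S"):
    """Probability of the best parse of `words` rooted in `goal`."""
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):            # fill lexical cells
        for cat, dist in lexicon.items():
            if w in dist:
                chart[i][i + 1][cat] = dist[w]
    for span in range(2, n + 1):             # combine smaller spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for a, b, c, p in binary:
                    if b in chart[i][k] and c in chart[k][j]:
                        cand = p * chart[i][k][b] * chart[k][j][c]
                        if cand > chart[i][j].get(a, 0.0):
                            chart[i][j][a] = cand
    return chart[0][n].get(goal, 0.0)

print(best_parse_prob("the dog chases the cat".split()))   # 0.25
```

A probability of 0.0 signals rejection, so the same routine doubles as a recognizer.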

Journal ArticleDOI
TL;DR: The modular design of ECA permits its use within NAUS in addition to other applications such as sentence generation, speech analysis and synthesis, translation systems, teaching Arabic, and checking and correcting grammatical errors.
Abstract: Natural Language Understanding (NLU) has been a growing area of research in computer science. Although morphology and syntax play an essential role in NLU, end-case analysis in some languages, like Arabic, is an important component in the correct interpretation of a sentence. In this research, an end-case analyzer (ECA) for Arabic sentences has been designed, implemented, and integrated within a Natural Arabic Understanding System (NAUS). ECA consists of two main components: the end-case generator, and the invoker. The end-case generator determines the end-case analysis for the given sentence and its constituents according to Arabic end-case grammar rules, which have been encoded in ECA as IF-THEN Prolog rules and predicates. The invoker was implemented as calls to the end-case generator inserted within the syntactic analyzer of NAUS. ECA was implemented in Prolog on a personal computer with Arabic support. The modular design of ECA permits its use within NAUS in addition to other applications such as sentence generation, speech analysis and synthesis, translation systems, teaching Arabic, and checking and correcting grammatical errors.

Journal ArticleDOI
TL;DR: The framework of a system that can understand a textbook on machine operation is described, based on a model of an object world that can retain information not only as symbolic representations but also as images, and the imagery object world model is proposed.
Abstract: This paper describes the framework of a system that can understand a textbook on machine operation, based on a model of an object world that can retain information not only as symbolic representations but also as images. In understanding descriptions of machine operation, spatial information is important, and there must be a model that retains this information while naturally reflecting continuity and relative location, independently of any particular point of observation. Such a representation is difficult using only the symbolic knowledge representations of conventional language understanding systems. From this viewpoint, the paper notes the usefulness of image-based representation and proposes the imagery object world model, which can partially utilize the properties of images using mathematical expressions in a two-dimensional real coordinate system. Mechanisms both to execute simulations on the proposed model and to read text using the simulation mechanism are investigated. A system based on the proposed idea is constructed experimentally; it builds the model for an input sentence while executing a simulation and generating symbolic recognition expressions. With these functions, the identity or similarity of language expressions derived from the same object or event but from different viewpoints can be adequately decided.


Book Chapter
01 Jan 1996
TL;DR: The components which comprise a Natural Language Understanding system to deal with continuous dialogue are described and it is suggested that the approach taken allows the system to provide a cooperative response to assist the user in attaining the information seeking goal.
Abstract: The benefits of a Natural Language Understanding (NLU) system for information seeking can only be realised if the system allows for effective communication. The system should be able to deal with the interpretation of referring expressions in dialogue, such as anaphors and ellipsis. In this paper, the components which comprise a NLU system to deal with continuous dialogue are described. Given that the syntactic and semantic information can produce a suitable representation of each utterance, pragmatic information may be used to determine how this contextual information determines the interpretation of subsequent utterances. It is suggested that the approach taken allows the system to provide a cooperative response to assist the user in attaining the information seeking goal.
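
One simple instance of the pragmatic processing described above is resolving a pronoun against the dialogue context by recency plus agreement. The discourse representation and agreement check below are invented for illustration and are far simpler than a full NLU system's treatment of anaphora and ellipsis.

```python
# Minimal sketch of anaphora resolution over dialogue context:
# pick the most recent candidate referent that agrees in number.
# The context entities are invented for illustration.

history = [            # candidate referents, oldest first
    {"text": "the flights", "number": "plural"},
    {"text": "a ticket", "number": "singular"},
]

def resolve(pronoun):
    """Return the most recent context entity agreeing in number."""
    number = "plural" if pronoun in ("they", "them") else "singular"
    for entity in reversed(history):   # most recent first
        if entity["number"] == number:
            return entity["text"]
    return None                        # unresolved: no agreeing referent

print(resolve("them"))   # "the flights"
print(resolve("it"))     # "a ticket"
```

A cooperative system would use the resolved referent to interpret the follow-up utterance in context, as the abstract suggests.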

Proceedings Article
01 Dec 1996
TL;DR: A neural-network-based classification tool is presented and applied to part-of-speech tagging and semantic-category tagging of the Chinese lexicon with the help of a thesaurus and a large training corpus, and experimental results are analysed and compared.
Abstract: Lexical attributes, such as syntactic (part-of-speech) and semantic (semantic category) attributes, are in most cases ambiguous in every language. Automatic resolution of the ambiguity of these attributes can be achieved using different techniques: rule-based, statistical, NN-based, and their hybrids. Moreover, one linguistic feature also influences the resolution of ambiguity of another; e.g., knowledge of the syntactic category can assist disambiguation of the semantic category and vice versa. Properly disambiguated syntactic and semantic properties of the lexicon can significantly help in word sense disambiguation, text analysis, information retrieval, natural language understanding, speech processing, etc. In this paper, we present our neural-network-based classification tool and use it for part-of-speech tagging and semantic-category tagging of the Chinese lexicon with the help of a thesaurus and a large training corpus. Experimental results are analysed and compared.
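
A much-reduced sketch of the NN-based classification idea is a single-layer perceptron that disambiguates the part of speech of an ambiguous word from context features. The features, training examples, and tags below are invented for illustration; the paper's actual network and Chinese data are considerably richer.

```python
# Toy single-layer perceptron for part-of-speech disambiguation from
# context features.  Data, features, and tags are invented.

data = [  # context features of an ambiguous token -> its correct tag
    ({"prev=the": 1, "next=is": 1}, "noun"),
    ({"prev=will": 1, "next=the": 1}, "verb"),
    ({"prev=a": 1, "next=was": 1}, "noun"),
    ({"prev=to": 1, "next=them": 1}, "verb"),
]
tags = ["noun", "verb"]
w = {t: {} for t in tags}            # one sparse weight vector per tag

def score(tag, feats):
    return sum(w[tag].get(f, 0.0) * v for f, v in feats.items())

def predict(feats):
    return max(tags, key=lambda t: score(t, feats))

for _ in range(10):                  # training epochs
    for feats, gold in data:
        guess = predict(feats)
        if guess != gold:            # update weights only on mistakes
            for f, v in feats.items():
                w[gold][f] = w[gold].get(f, 0.0) + v
                w[guess][f] = w[guess].get(f, 0.0) - v

print(predict({"prev=will": 1, "next=it": 1}))   # verb
```

Swapping the feature set (e.g., thesaurus categories of neighbouring words) turns the same loop into a semantic-category tagger, mirroring the paper's use of one tool for both tasks.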