
Showing papers on "Natural language understanding published in 2005"


Proceedings ArticleDOI
29 Jun 2005
TL;DR: This article investigates computational linguistics (CL) techniques for marking short free text responses automatically, arguing that although successful automatic marking of free text answers would seem to presuppose advanced automated natural language understanding, recent CL advances make such marking possible without fully understanding the answers.
Abstract: Our aim is to investigate computational linguistics (CL) techniques in marking short free text responses automatically. Successful automatic marking of free text answers would seem to presuppose an advanced level of performance in automated natural language understanding. However, recent advances in CL techniques have opened up the possibility of being able to automate the marking of free text responses typed into a computer without having to create systems that fully understand the answers. This paper describes some of the techniques we have tried so far vis-a-vis this problem with results, discussion and description of the main issues encountered.

114 citations


Patent
14 Apr 2005
TL;DR: In this paper, a method of integrating conversational speech into a multimodal, Web-based processing model can include speech recognizing a user spoken utterance directed to a voice-enabled field of a multimodal markup language document presented within a browser.
Abstract: A method of integrating conversational speech into a multimodal, Web-based processing model can include speech recognizing a user spoken utterance directed to a voice-enabled field of a multimodal markup language document presented within a browser. A statistical grammar can be used to determine a recognition result. The method further can include providing the recognition result to the browser, receiving, within a natural language understanding (NLU) system, the recognition result from the browser, and semantically processing the recognition result to determine a meaning. Accordingly, a next programmatic action to be performed can be selected according to the meaning.

78 citations
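The patent's pipeline (recognition result → semantic processing → meaning → next programmatic action) can be sketched roughly as follows. This is an invented illustration, not the patented implementation: the `interpret` rules, meaning labels, and `ACTIONS` table are all hypothetical stand-ins for the statistical grammar and NLU system the patent describes.

```python
# Hypothetical sketch of the claimed flow: a recognition result from the
# browser is semantically processed into a meaning, and the meaning selects
# the next programmatic action. All names here are illustrative.

def interpret(recognition_result: str) -> str:
    """Toy semantic processing: map an utterance to a meaning label."""
    text = recognition_result.lower()
    if "balance" in text:
        return "query_balance"
    if "transfer" in text:
        return "start_transfer"
    return "unknown"

# Meaning -> next programmatic action
ACTIONS = {
    "query_balance": lambda: "showing account balance",
    "start_transfer": lambda: "opening transfer form",
    "unknown": lambda: "asking user to rephrase",
}

def next_action(recognition_result: str) -> str:
    meaning = interpret(recognition_result)
    return ACTIONS[meaning]()

print(next_action("What is my balance?"))  # showing account balance
```

In the patent's setting the meaning would come from a full NLU system rather than keyword rules, but the dispatch structure is the same.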


Book ChapterDOI
09 Jul 2005
TL;DR: This article presents a principled approach to semantic entailment that builds on inducing representations of text snippets into a hierarchical knowledge representation along with a sound optimization-based inferential mechanism that makes use of it to decide semantic entailment.
Abstract: Semantic entailment is the problem of determining if the meaning of a given sentence entails that of another. This is a fundamental problem in natural language understanding that provides a broad framework for studying language variability and has a large number of applications. This paper presents a principled approach to this problem that builds on inducing representations of text snippets into a hierarchical knowledge representation along with a sound optimization-based inferential mechanism that makes use of it to decide semantic entailment. A preliminary evaluation on the PASCAL text collection is presented.

68 citations
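The subsumption flavor of this hierarchical approach can be illustrated with a toy example. The paper's actual system uses an optimization-based inferential mechanism over induced representations; everything below, including the tiny hypernym table and triple format, is invented for illustration only.

```python
# Toy entailment check: a hypothesis is entailed if each of its triples is
# subsumed by some text triple under a (hypothetical) hypernym hierarchy.

HYPERNYMS = {"car": "vehicle", "vehicle": "thing", "buy": "acquire"}

def subsumes(general, specific):
    """True if `specific` equals `general` or is a descendant of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = HYPERNYMS.get(specific)
    return False

def entails(text_triples, hypothesis_triples):
    """Every hypothesis triple must be subsumed by some text triple."""
    return all(
        any(all(subsumes(h, t) for h, t in zip(ht, tt))
            for tt in text_triples)
        for ht in hypothesis_triples)

text = {("john", "buy", "car")}
hypothesis = {("john", "acquire", "vehicle")}
print(entails(text, hypothesis))  # True: buying a car entails acquiring a vehicle
```

Real systems must also handle argument reordering, negation, and quantifiers, which is where the paper's optimization-based inference comes in.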


Patent
30 Dec 2005
TL;DR: In this paper, the authors present a system for building a language model representation of an NLU application, where a developer assigns sentences of an NLU application to categories, sub-categories or end targets across one or more features for associating each sentence with desired interpretations.
Abstract: The invention disclosed herein concerns a system (100) and method (600) for building a language model representation of an NLU application. The method (600) can include categorizing an NLU application domain (602), classifying a corpus in view of the categorization (604), and training at least one language model in view of the classification (606). The categorization produces a hierarchical tree of categories, sub-categories and end targets across one or more features for interpreting one or more natural language input requests. During development of an NLU application, a developer assigns sentences of the NLU application to categories, sub-categories or end targets across one or more features for associating each sentence with desired interpretations. A language model builder (140) iteratively builds multiple language models for this sentence data, evaluates them against a test corpus, partitions the data based on the categorization and rebuilds models, so as to produce an optimal configuration of language models to interpret and respond to language input requests for the NLU application.

62 citations
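The core idea of training category-specific language models from developer-assigned sentences and evaluating them against held-out data can be sketched in miniature. This is not the patented method; the categories, sentences, and the use of a smoothed unigram model are all illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical sketch: sentences assigned to categories train per-category
# unigram language models, which are then scored on new sentences by
# log-probability. Data is invented for illustration.

training = {
    "billing": ["i want to pay my bill", "question about my bill"],
    "support": ["my phone is broken", "the line is not working"],
}

def train_unigram(sentences):
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts)
    # Laplace smoothing so unseen words get nonzero probability
    return lambda w: (counts[w] + 1) / (total + vocab + 1)

models = {cat: train_unigram(sents) for cat, sents in training.items()}

def score(sentence, model):
    return sum(math.log(model(w)) for w in sentence.split())

def best_category(sentence):
    return max(models, key=lambda c: score(sentence, models[c]))

print(best_category("pay my bill"))  # billing
```

The patent's iterative loop would re-partition the categorized data and rebuild such models until evaluation on the test corpus stops improving.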


Proceedings Article
06 May 2005
TL;DR: The use of information extraction and machine learning techniques in the marking of short, free text responses of up to around five lines is described.
Abstract: Traditionally, automatic marking has been restricted to item types such as multiple choice that narrowly constrain how students may respond. More open-ended items have generally been considered unsuitable for machine marking because of the difficulty of coping with the myriad ways in which credit-worthy answers may be expressed. Successful automatic marking of free text answers would seem to presuppose an advanced level of performance in automated natural language understanding. However, recent advances in computational linguistics techniques have opened up the possibility of being able to automate the marking of free text responses typed into a computer without having to create systems that fully understand the answers. This paper describes the use of information extraction and machine learning techniques in the marking of short, free text responses of up to around five lines.

45 citations
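A minimal sketch of the information-extraction style of marking described above: a mark scheme is written as patterns that capture credit-worthy phrasings, and an answer earns points per pattern matched. The question, patterns, and point values below are invented for illustration, not taken from the paper.

```python
import re

# Hypothetical mark scheme for "Explain photosynthesis" (illustrative only):
# each regex captures one credit-worthy idea worth one mark.
MARK_SCHEME = [
    (r"\b(light|sunlight)\b.*\benergy\b", 1),  # light as the energy source
    (r"\bcarbon dioxide\b", 1),                # carbon dioxide as an input
    (r"\boxygen\b", 1),                        # oxygen as a product
]

def mark(answer: str) -> int:
    """Total marks: one per credit-worthy pattern found in the answer."""
    answer = answer.lower()
    return sum(points for pattern, points in MARK_SCHEME
               if re.search(pattern, answer))

print(mark("Plants use light energy and carbon dioxide, releasing oxygen."))  # 3
```

Real systems of this kind learn or hand-craft far richer extraction patterns, precisely to cope with the "myriad ways" credit-worthy answers are expressed.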


Patent
Yuqing Gao1, Hong-Kwang Kuo1, Roberto Pieraccini1, Jerome L. Quinn1, Cheng Wu1 
12 Jul 2005
TL;DR: In this paper, sentences obtained from a natural language understanding (NLU) system and categorized into several categories are considered; sentences within a given category are clustered into sub-clusters, and the sub-clusters are analyzed to identify data anomalies.
Abstract: Techniques for detecting data anomalies in a natural language understanding (NLU) system are provided. A number of sentences, categorized into a number of categories, are obtained. Sentences within a given one of the categories are clustered into a number of sub-clusters, and the sub-clusters are analyzed to identify data anomalies. The clustering can be based on surface forms of the sentences. The anomalies can be, for example, ambiguities or inconsistencies. The clustering can be performed, for example, with a K-means clustering algorithm.

25 citations
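The surface-form clustering idea can be sketched with a tiny K-means over bag-of-words vectors: a cluster whose members carry different category labels flags a possible annotation inconsistency. The sentences, labels, and the deterministic initialization are all invented for illustration; the patent does not prescribe these details.

```python
from collections import Counter

# Illustrative anomaly detection: cluster sentences on surface form, then
# flag clusters whose members were labeled with different categories.

sentences = [
    ("check my balance", "BALANCE"),
    ("what is my balance", "BALANCE"),
    ("show my balance", "TRANSFER"),   # suspicious label
    ("send money to john", "TRANSFER"),
    ("transfer funds to mary", "TRANSFER"),
]

vocab = sorted({w for s, _ in sentences for w in s.split()})

def vec(sentence):
    counts = Counter(sentence.split())
    return [counts[w] for w in vocab]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    # Deterministic init for reproducibility: first and last points (k=2).
    centers = [list(points[0]), list(points[-1])]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            clusters[min(range(k), key=lambda c: dist(p, centers[c]))].append(i)
        for c, members in enumerate(clusters):
            if members:
                centers[c] = [sum(points[m][d] for m in members) / len(members)
                              for d in range(len(vocab))]
    return clusters

points = [vec(s) for s, _ in sentences]
anomalies = []
for members in kmeans(points, 2):
    if len({sentences[m][1] for m in members}) > 1:  # mixed labels
        anomalies.append([sentences[m][0] for m in members])

print(anomalies)  # the "balance" cluster mixes BALANCE and TRANSFER labels
```

Here the "show my balance" sentence surfaces as an anomaly because its surface form clusters with BALANCE sentences while its label says TRANSFER.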


Book ChapterDOI
01 Jan 2005
TL;DR: An architecture designed to accommodate multiple aspects of human mental functioning, comprising perception, associative memory, emotions, action-selection, deliberation, language generation, behavioral and perceptual learning, self-preservation and metacognition modules.
Abstract: Here we describe an architecture designed to accommodate multiple aspects of human mental functioning. In a roughly star-shaped configuration centered on a “consciousness” module, the architecture accommodates perception, associative memory, emotions, action-selection, deliberation, language generation, behavioral and perceptual learning, self-preservation and metacognition modules. The various modules (partially) implement several different theories of these various aspects of cognition. The mechanisms used in implementing the several modules have been inspired by a number of different “new AI” techniques. One software agent embodying much of the architecture is in the debugging stage (Bogner et al. in press). A second, intended to include all of the modules of the architecture, is well along in the design stage (Franklin et al. 1998). The architecture, together with the underlying mechanisms, comprises a fairly comprehensive model of cognition (Franklin & Graesser 1999). The most significant gap is the lack of such human-like senses as vision and hearing, and the lack of real-world physical motor output. The agents interact with their environments mostly through email in natural language. The “consciousness” module is based on global workspace theory (Baars 1988, 1997). The central role of this module is due to its ability to select relevant resources with which to deal with incoming perceptions and with current internal states. Its underlying mechanism was inspired by pandemonium theory (Jackson 1987). The perception module employs analysis of surface features for natural language understanding (Allen 1995). It partially implements perceptual symbol system theory (Barsalou 1999), while its underlying mechanism constitutes a portion of the copycat architecture (Hofstadter & Mitchell 1994). Within this architecture the emotions play something of the role of the temperature in the copycat architecture and of the gain control in pandemonium theory.
They give quick indication of how well things are going, and influence both action-selection and memory. The theory behind this module was influenced by several sources (Picard 1997, Johnson 1999, Rolls 1999). The implementation is via pandemonium theory enhanced with an activation-passing network. The action-selection mechanism of this architecture is implemented by a major enhancement of the behavior net (Maes 1989). Behaviors in this model correspond to goal contexts in global workspace theory. The net is fed at one end by environmental and/or internal state influences, and at the other by fundamental drives. Activation passes in both directions. The behaviors compete for execution, that is, to become the dominant goal context. The deliberation and language generation modules are implemented via pandemonium theory. The construction of scenarios and of outgoing messages is accomplished by repeated appeal to the “consciousness” mechanism. Relevant events for the scenarios and paragraphs for the messages offer themselves in response to “conscious” broadcasts. The learning modules employ case-based reasoning (Kolodner 1993) using information gleaned from human correspondents. Metacognition is based on fuzzy classifier systems (Valenzuela-Rendon 1991). As in the copycat architecture, almost all of the actions taken by the agents, both internal and external, are performed by codelets. These are small pieces of code typically doing one small job with little communication between them. Our architecture can be thought of as a multi-agent system overlaid with a few, more abstract mechanisms. Altogether, it offers one possible architecture for a relatively fully functioning mind. One could consider these agents as early attempts at the exploration of design space and niche space (Sloman 1998).

23 citations


Patent
15 Dec 2005
TL;DR: In this article, the authors present a method to add the creation of examples at the developer level in the generation of NLU models, tying the examples into an NLU sentence database, automatically validating a correct outcome of using the examples, and automatically resolving problems the user has using the examples.
Abstract: A method (300) and system (100) are provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into an NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user has using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring an NLU model correctly interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination. The method can also include presenting a failure dialogue for displaying at least one example that failed to be properly interpreted, to ensure that ambiguous or incorrect examples are not presented in a help message.

22 citations



Proceedings ArticleDOI
09 Oct 2005
TL;DR: Where general-purpose resources such as PropBank and FrameNet provide invaluable training data for the general case, it tends to be a problem to obtain enough training data in a specific dialogue-oriented domain.
Abstract: Natural language understanding is an essential module in any dialogue system. To obtain satisfactory performance levels, a dialogue system needs a semantic parser/natural language understanding (NLU) system that produces accurate and detailed dialogue-oriented semantic output. Recently, a number of semantic parsers trained using either FrameNet (Baker et al., 1998) or PropBank (Kingsbury et al., 2002) have been reported. Despite their reasonable performance on general tasks, these parsers do not work so well in specific domains. Also, where these general-purpose parsers tend to provide case-frame structures that include the standard core case roles (Agent, Patient, Instrument, etc.), dialogue-oriented domains tend to require additional information about addressees, modality, speech acts, etc. Where general-purpose resources such as PropBank and FrameNet provide invaluable training data for the general case, it tends to be a problem to obtain enough training data in a specific dialogue-oriented domain.

17 citations


Proceedings ArticleDOI
02 Oct 2005
TL;DR: This paper presents a "matcher" that uses semantic transformations to overcome structural differences between the two representations and evaluates this matcher in a MUC-like template-filling task and compares its performance to that of two similar systems.
Abstract: An ultimate goal of AI is to build end-to-end systems that interpret natural language, reason over the resulting logical forms, and perform actions based on that reasoning. This requires systems from separate fields be brought together, but often this exposes representational gaps between them. The logical forms from a language interpreter may mirror the surface forms of utterances too closely to be usable as-is, given a reasoner's requirements for knowledge representations. What is needed is a system that can match logical forms to background knowledge flexibly to acquire a rich semantic model of the speaker's goal. In this paper, we present such a "matcher" that uses semantic transformations to overcome structural differences between the two representations. We evaluate this matcher in a MUC-like template-filling task and compare its performance to that of two similar systems.

Patent
25 Jul 2005
TL;DR: The systems and methods disclosed in this article embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data.
Abstract: Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data.

Proceedings ArticleDOI
06 Jul 2005
TL;DR: A system that performs vision-based isolated Arabic sign language recognition using hidden Markov models together with the EM algorithm for parameter estimation, and a proposed approach to track hands in subsequent frames using a fuzzy object similarity measure based on a number of geometrical features of the hands.
Abstract: As a part of natural language understanding, sign language recognition is considered an important area of research. The applications of such a system range from human-computer interaction in virtual reality systems to auxiliary tools for the deaf to communicate with ordinary people through a computer. A great deal of research has been done so far, but few researchers have extended it to Arabic sign language recognition. In this paper, we present a system that performs vision-based isolated Arabic sign language recognition using hidden Markov models together with the EM algorithm for parameter estimation. An approach to track hands in subsequent frames is proposed using a fuzzy object similarity measure based on a number of geometrical features of the hands. Moreover, we use the centroid of the signer's face to centralize the body coordinates instead of fixing the signer's position or using a position tracker device. The overall accuracy of the recognition task is 98% over a dataset of 50 signs including single-hand and two-handed signs.
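The HMM-based recognition step can be sketched with the standard forward algorithm: each sign gets its own small HMM, and an unknown observation sequence is assigned to the sign whose model gives it the highest likelihood. The two toy models, their parameters, and the quantized observation symbols below are invented; the paper's system uses EM-trained models over real hand features.

```python
# Hypothetical isolated-sign recognition: score a sequence of quantized
# hand-feature symbols under each sign's discrete HMM via the forward
# algorithm and pick the most likely sign. All parameters are illustrative.

def forward(obs, pi, A, B):
    """Likelihood of obs under an HMM (pi: initial, A: transition, B: emission)."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * A[p][s] for p in range(n)) * B[s][o]
                 for s in range(n)]
    return sum(alpha)

# Two toy 2-state HMMs over 2 observation symbols (0 and 1):
# "sign_hello" tends to start in a state that emits 0, "sign_thanks" in one
# that emits 1.
models = {
    "sign_hello": ([0.9, 0.1],
                   [[0.8, 0.2], [0.2, 0.8]],
                   [[0.9, 0.1], [0.1, 0.9]]),
    "sign_thanks": ([0.1, 0.9],
                    [[0.8, 0.2], [0.2, 0.8]],
                    [[0.9, 0.1], [0.1, 0.9]]),
}

def recognize(obs):
    return max(models, key=lambda m: forward(obs, *models[m]))

print(recognize([0, 0, 1]))  # sign_hello
```

In the real system the observations would be quantized geometrical features of the tracked hands, and the per-sign parameters would come from EM training rather than being hand-set.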

Proceedings ArticleDOI
31 Jul 2005
TL;DR: The hardware architecture design is based on a few ideas taken from the anatomy of mammalian neo-cortex, and uses an approximation to cortical computation called the network of networks which holds that the basic computing unit in the cortex is not a single neuron but small groups of them working together in attractor networks.
Abstract: We want to design a suitable computer for the efficient execution of the software now being developed that displays human-like cognitive abilities. Examples of these potential software applications include natural language understanding, text processing, conceptually based Internet search, natural human-computer interfaces, cognitively based data mining, sensor fusion, and image understanding. Requirements of the proposed software are primary in shaping our hardware design. The hardware architecture design is based on a few ideas taken from the anatomy of mammalian neo-cortex. In common with other such attempts it is a massively parallel, two-dimensional array of CPUs and their associated memory. However, the design used in this project: (1) uses an approximation to cortical computation called the network of networks which holds that the basic computing unit in the cortex is not a single neuron but small groups of them working together in attractor networks; and (2) assumes connections in cortex are very sparse. The resulting architecture depends largely on local data movement.

Proceedings ArticleDOI
14 Nov 2005
TL;DR: The concept of a generalized constraint may serve an important function as a bridge from natural languages to mathematics, and in that role it may find many applications, ranging from formalization of legal reasoning to enhancement of Web intelligence and natural language understanding.
Abstract: The concept of a generalized constraint was introduced close to two decades ago. For a number of years, it lay dormant and unused. But then, in the mid-nineties, it found an important application as a basis for the methodology of computing with words (CW). The expectation is that as we move further into the age of machine intelligence and mechanized reasoning, the problem of natural language understanding looms larger and larger in importance and visibility. To deal with this problem, what is needed is fuzzy logic - a logic in which everything is, or is allowed to be, a matter of degree. In this paper, the author discusses the concept of a generalized constraint. With the concept of a generalized constraint as its centerpiece, precisiated natural language (PNL) opens the door to a wide-ranging enlargement of the role of natural language in scientific theories, particularly theories in which human perception, judgment and emotion play important roles.

Proceedings Article
01 Jan 2005
TL;DR: This thesis presents a Natural Language Understanding system developed in the context of the Geometry Cognitive Tutor that combines unification-based syntactic processing with Description Logic based semantics to achieve the necessary accuracy level.
Abstract: High-precision Natural Language Understanding is needed in Geometry Tutoring to accurately determine the semantic content of students' explanations. The thesis presents a Natural Language Understanding system developed in the context of the Geometry Cognitive Tutor. The system combines unification-based syntactic processing with Description Logic based semantics to achieve the necessary accuracy level. The thesis examines in detail the main problem faced by a natural language understanding system in the geometry tutor, that of accurately determining the semantic content of students' input. It then reviews alternative approaches used in Intelligent Tutoring Systems, and presents some difficulties these approaches have in addressing the main problem. The thesis proceeds to describe the system architecture of our approach, as well as the compositional process of building the syntactic structure and the semantic interpretation of students' explanations. The syntactic and semantic processing of natural language are described in detail, as well as solutions for specific natural language understanding problems, like metonymy resolution and reference resolution. The thesis also discusses a number of problems occurring in determining semantic equivalence of natural language input and shows how our approach deals with them. The classification performance of the adopted solution is evaluated on data collected during a recent classroom study and is compared to a Naive Bayes approach. The generality of our solution is demonstrated in a practical experiment of porting the approach to a new semantic domain, Algebra. The thesis discusses the changes needed in the new implementation, the time effort required, and presents the classification performance in the new domain. Finally, the thesis provides a high level Description Logic view of the presented approach to semantic representation and inference, and talks about the possibility to implement it in other logic systems.

Proceedings ArticleDOI
05 Jul 2005
TL;DR: This study proposes an ontology approach from the teachers' perspective to fully exploit the implicit metadata in manifest.xml, implementing content-package searching and recommending for teachers while they intend to deliver a specified course unit.
Abstract: The purpose of this study is to implement content-package searching and recommending for teachers while they intend to deliver a specified course unit. Many efforts have been made on how to technically develop authoring tools instead of how to apply existing content packages in teaching practices. To overcome these barriers, this study proposes an ontology approach from the teachers' perspective to fully exploit the implicit metadata in manifest.xml. The implementation tools consist of Protege 2000, the Jena API, and JavaScript. Firstly, a natural language understanding parser accepts Mandarin in the GUI and transforms it into an index of competence (IC) from the domain ontology. Then, a task ontology can be attained with a CP. Secondly, the application ontology is defined as learning sequences of different CPs. Thus, it guides scaffolding for extensible or remedial CPs for teachers' instructional design. Finally, the implications of this study and future research are also included.

Proceedings ArticleDOI
Yu Sun1, F. Karray1, Otman A. Basir1, Jiping Sun1, Mohamed S. Kamel1 
25 May 2005
TL;DR: This study proposes a fuzzy approach that provides a methodology for logically reorganizing context in order to tackle the issue of boundary limitation, to create more reasonable and understandable word associations, and to make the process of imparting human knowledge easier and less domain dependent.
Abstract: One of the many issues that confront traditional statistical approaches to natural language understanding (NLU) is how to overcome the insufficient co-occurrence information caused by the limited boundary of statistical approaches. Researchers have long tried imparting human knowledge into statistical approaches, including definitions of rules and collections of hierarchies of concepts. However, these are difficult to define even for a domain expert, and they are very much people- and domain-dependent. This study proposes a fuzzy approach that tackles these issues by providing a methodology for logically reorganizing context in order to address the issue of boundary limitation, creating more reasonable and understandable word associations, which are referenced as membership degrees at a later stage, and making the process of imparting human knowledge easier and less domain dependent. The accomplishment of these tasks can be achieved through the concept of precisiated natural language (PNL).
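The idea of turning sparse co-occurrence statistics into graded word associations can be sketched as a simple membership function. This is a rough illustration of the "membership degree" notion only, not the paper's PNL-based method; the counts and the normalization scheme are invented.

```python
# Hypothetical fuzzy word association: normalize co-occurrence counts into
# membership degrees in [0, 1], so associations are soft rather than hard.

cooccur = {
    ("coffee", "cup"): 40,
    ("coffee", "sugar"): 25,
    ("coffee", "car"): 1,
}
max_count = max(cooccur.values())

def membership(w1, w2):
    """Degree in [0, 1] to which w2 is associated with w1."""
    return cooccur.get((w1, w2), 0) / max_count

print(membership("coffee", "cup"))    # 1.0
print(membership("coffee", "sugar"))  # 0.625
```

A fuzzy system can then reason over these degrees (e.g. via min/max combination) instead of thresholding raw counts, which is what lets limited co-occurrence data still yield usable associations.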

01 Jan 2005
TL;DR: In this article, an architecture and its implementation for natural language dialog (NLD) applications in the data exploration and presentation domain are presented, which can be integrated as part of a corporate information delivery web portal to bring new modalities to user interfaces.
Abstract: In this paper we present an architecture and its implementation for natural language dialog (NLD) applications in the data exploration and presentation domain. The presented architecture can be integrated as part of a corporate information delivery web portal to bring new modalities to user interfaces. The architecture is based on the software-agent paradigm and supports mobile as well as stationary agents. On this NLD architecture we implemented an open source project for data exploration in which the system user explores corporate data through machine-human dialog. The following well-known toolboxes have been integrated in this project: GATE, a general architecture for natural language processing; the IBM natural language understanding (NLU) toolbox; the JOONE neural network toolbox; and the Aglets mobile agents framework.

Proceedings Article
30 Jul 2005
TL;DR: This article presents a principled approach to semantic entailment that builds on inducing re-representations of text snippets into a hierarchical knowledge representation along with a sound inferential mechanism that makes use of it to prove entailment.
Abstract: Semantic entailment is the problem of determining if the meaning of a given sentence entails that of another. This is a fundamental problem in natural language understanding that provides a broad framework for studying language variability and has a large number of applications. We present a principled approach to this problem that builds on inducing re-representations of text snippets into a hierarchical knowledge representation along with a sound inferential mechanism that makes use of it to prove semantic entailment.

Proceedings ArticleDOI
07 Oct 2005
TL;DR: This presentation will demonstrate the latest development of an in-car dialog system for an MP3 player designed under a joint research effort from Bosch RTC, VW ERL, Stanford CSLI, and SRI STAR Lab funded by NIST ATP.
Abstract: In recent years, an increasing number of new devices have found their way into the cars we drive. Speech-operated devices in particular provide a great service to drivers by minimizing distraction, so that they can keep their hands on the wheel and their eyes on the road. This presentation will demonstrate our latest development of an in-car dialog system for an MP3 player designed under a joint research effort from Bosch RTC, VW ERL, Stanford CSLI, and SRI STAR Lab funded by NIST ATP [Weng et al 2004] with this goal in mind. This project has developed a number of new technologies, some of which are already incorporated in the system. These include: end-pointing with prosodic cues, error identification and recovering strategies, flexible multi-threaded, multi-device dialog management, and content optimization and organization strategies. A number of important language phenomena are also covered in the system's functionality. For instance, one may use words relying on context, such as 'this,' 'that,' 'it,' and 'them,' to reference items mentioned in particular use contexts. Different types of verbal revision are also permitted by the system, providing a great convenience to its users. The system supports multi-threaded dialogs so that users can diverge to a different topic before the current one is finished and still come back to the first after the second topic is done. To lower the cognitive load on the drivers, the content optimization component organizes any information given to users based on ontological structures, and may also refine users' queries via various strategies. Domain knowledge is represented using OWL, a web ontology language recommended by W3C, which should greatly facilitate its portability to new domains.The spoken dialog system consists of a number of components (see Fig. 1 for details). 
Instead of the hub architecture employed by Communicator projects [Seneff et al., 1998], it is developed in Java and uses a flexible event-based, message-oriented middleware. This allows for dynamic registration of new components. Among the component modules in Figure 1, we use the Nuance speech recognition engine with class-based n-grams and dynamic grammars, and the Nuance Vocalizer as the TTS engine. The Speech Enhancer removes noise and echo. The Prosody module will provide additional features to the Natural Language Understanding (NLU) and Dialogue Manager (DM) modules to improve their performance. The NLU module takes a sequence of recognized words and tags, performs a deep linguistic analysis with probabilistic models, and produces an XML-based semantic feature structure representation. Parallel to the deep analysis, a topic classifier assigns the top n topics to the utterance, which are used in cases where the dialog manager cannot make any sense of the parsed structure. The NLU module also supports dynamic updates of the knowledge base. The CSLI DM module mediates and manages interaction. It uses the dialogue-move approach to maintain dialogue context, which is then used to interpret incoming utterances (including fragments and revisions), resolve NPs, construct salient responses, track issues, etc. Dialogue states can also be used to bias SR expectation and improve SR performance, as has been done in previous applications of the DM. Detailed descriptions of the DM can be found in [Lemon et al. 2002; Mirkovic & Cavedon 2005]. The Knowledge Manager (KM) controls access to knowledge base sources (such as domain knowledge and device information) and their updates. Domain knowledge is structured according to domain-dependent ontologies. The current KM makes use of OWL, a W3C standard, to represent the ontological relationships between domain entities.
Protege (http://protege.stanford.edu), a domain-independent ontology tool, is used to maintain the ontology offline. In a typical interaction, the DM converts a user's query into a semantic frame (i.e. a set of semantic constraints) and sends this to the KM via the content optimizer. The Content Optimization module acts as an intermediary between the dialogue management module and the knowledge management module during the query process. It receives semantic frames from the DM, resolves possible ambiguities, and queries the KM. Depending on the items in the query result as well as the configurable properties, the module selects and performs an appropriate optimization strategy. Early evaluation shows that the system has a task completion rate of 80% on 11 tasks in the MP3 player domain, ranging from playing requests to music database queries. Porting to a restaurant selection domain is currently under way.

01 Jan 2005
TL;DR: This thesis presents OntoNat, a prototypical system for answering Yes/No-questions on natural language sentences, which uses background knowledge from the Suggested Upper Merged Ontology (SUMO) so that it can perform some kind of common sense reasoning to answer a question.
Abstract: This thesis presents OntoNat, a prototypical system for answering Yes/No-questions on natural language sentences. Different from existing systems, OntoNat uses background knowledge from the Suggested Upper Model Ontology (SUMO)[NP01], so that it can perform some kind of common sense reasoning to answer a question. SUMO is translated to a disjunctive logic program (DLP). The input sentence and the Yes/Noquestion are also translated to DLPs, in cooperation with the Computational Linguistics Department of Saarland University. These DLPs are given to a first-order theorem-prover (KRHyper[Wer03]), which tries to answer the question. Acknowledgement This work would not have been possible without the persisting support of my supervisor Dr. habil. Peter Baumgartner. I would like to thank him for his patience and his fruitful comments.
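The pipeline described above, in caricature: background knowledge and the input sentence become rules and facts, and a prover decides the Yes/No question. The forward-chaining ground Horn reasoner below is far simpler than KRHyper's disjunctive logic programs, and all predicates are invented for illustration.

```python
def forward_chain(facts, rules):
    """Derive all facts closed under rules of the form (body_atoms, head_atom)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived

# SUMO-style background knowledge, grossly simplified:
rules = [
    (("bird(tweety)",), "animal(tweety)"),                    # birds are animals
    (("animal(tweety)", "alive(tweety)"), "breathes(tweety)"),
]
# Facts extracted from the input sentence "Tweety is a living bird":
facts = {"bird(tweety)", "alive(tweety)"}

# Question: "Does Tweety breathe?"
answer = "yes" if "breathes(tweety)" in forward_chain(facts, rules) else "unknown"
```

Answering "unknown" rather than "no" on failure mirrors the open-world flavor of ontology-backed reasoning: absence of a proof is not a refutation.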

Proceedings Article
01 Jul 2005
TL;DR: This paper introduces a network-based answer discovery scheme coupled with some advanced reasoning features that is part of NaLURI (Natural Language Understanding and Reasoning for Intelligence), a knowledge-based question answering system.
Abstract: This paper introduces a network-based answer discovery scheme coupled with some advanced reasoning features that is part of NaLURI (Natural Language Understanding and Reasoning for Intelligence), a knowledge-based question answering system. This move beyond classical logic-based reasoning is necessary in order to provide intelligent responses under suboptimal circumstances or failures and is especially important for question answering systems like NaLURI where the nature of the input varies greatly, causing immense uncertainties during response generation.

Book ChapterDOI
31 Aug 2005
TL;DR: This contribution looks back on the last years in the history of telephone-based speech dialog systems and discusses certain requirements necessary for the successful application of dialog systems.
Abstract: In this contribution we look back on the last years in the history of telephone-based speech dialog systems. We start in 1993, when the first natural language understanding dialog system worldwide to use a mixed-initiative approach was made accessible to the public: the well-known EVAR system from the Chair for Pattern Recognition of the University of Erlangen-Nuremberg. We then discuss certain requirements we consider necessary for the successful application of dialog systems. Finally, we present trends and developments in the area of telephone-based dialog systems.

01 Jan 2005
TL;DR: An NLU system developed in the context of the Geometry Cognitive Tutor is presented, which combines unification-based syntactic processing with description-logic-based semantics to achieve the necessary accuracy level.
Abstract: High-precision Natural Language Understanding is needed in Geometry Tutoring to accurately determine the semantic content of students' explanations. The paper presents an NLU system developed in the context of the Geometry Cognitive Tutor. The system combines unification-based syntactic processing with description-logic-based semantics to achieve the necessary accuracy level. The paper describes the compositional process of building the syntactic structure and the semantic interpretation of natural language explanations. It also discusses results of an evaluation of classification performance on data collected during a classroom study.
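The unification step at the core of such grammars can be sketched as recursive merging of feature structures, here modeled as nested dicts. Real systems use typed feature structures with structure sharing; the features and values below are invented for illustration.

```python
def unify(fs1, fs2):
    """Return the merged feature structure, or None if the two clash."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None    # atomic values must match
    merged = dict(fs1)
    for feat, val in fs2.items():
        if feat in merged:
            sub = unify(merged[feat], val)
            if sub is None:
                return None                   # feature clash: unification fails
            merged[feat] = sub
        else:
            merged[feat] = val                # new feature: just add it
    return merged

# A verb demanding a singular 3rd-person subject unifies with a singular NP,
# and the missing person feature is filled in by the merge:
verb_subj = {"cat": "NP", "agr": {"num": "sg", "per": "3"}}
np = {"cat": "NP", "agr": {"num": "sg"}, "head": "angle"}
result = unify(verb_subj, np)                                  # succeeds
clash = unify(verb_subj, {"cat": "NP", "agr": {"num": "pl"}})  # fails -> None
```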

Journal Article
TL;DR: Following a methodology of natural language understanding based on comprehensive information theory, a mail is analyzed through three levels of filtering: syntactic (keyword filtering), semantic (topic filtering), and pragmatic (tendency filtering), to avoid misjudgement of legal mails and leakage of illegal mails.
Abstract: Following a methodology of natural language understanding based on comprehensive information theory, a mail is analyzed through three levels of filtering: syntactic (keyword filtering), semantic (topic filtering), and pragmatic (tendency filtering). The system can avoid misjudgement of legal mails and leakage of illegal mails.
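The three-level cascade described above can be sketched as a short-circuiting pipeline. The word lists, topic labels, and threshold below are invented stand-ins; a real system would use trained classifiers at the semantic and pragmatic levels rather than toy heuristics.

```python
BLOCKED_KEYWORDS = {"lottery", "viagra"}          # syntactic level
BLOCKED_TOPICS = {"gambling"}                     # semantic level

def topic_of(words):
    # toy topic assignment; stands in for a real topic classifier
    return "gambling" if "casino" in words or "bets" in words else "other"

def tendency_score(words):
    # toy pragmatic score in [0, 1]; stands in for tendency analysis
    pushy = {"now", "free", "win"}
    return sum(w in pushy for w in words) / max(len(words), 1)

def classify_mail(text, tendency_threshold=0.5):
    words = text.lower().split()
    if BLOCKED_KEYWORDS & set(words):
        return "illegal"                          # level 1: keyword filtering
    if topic_of(words) in BLOCKED_TOPICS:
        return "illegal"                          # level 2: topic filtering
    if tendency_score(words) > tendency_threshold:
        return "illegal"                          # level 3: tendency filtering
    return "legal"

verdict = classify_mail("project meeting moved to Friday")
```

Ordering the levels from cheapest to most expensive is what lets the later, deeper analyses run only on the mails the shallow filters could not decide.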

01 Jan 2005
TL;DR: A model-driven approach to natural language understanding (NLU) in which the “meaning” of natural language queries is extracted based on a domain model composed of a set of concepts and relations specified in the system’s domain.
Abstract: This report describes a model-driven approach to natural language understanding (NLU) in which the "meaning" of natural language queries is extracted based on a domain model composed of a set of concepts and relations specified in the system's domain. The extracted meaning is represented as a set of semantic constraints that describe concept instances and their relations. This representation can easily be translated into a form usable for the system, e.g., an SQL query or a dialogue move. The method has been tested on a dialogue system for browsing and searching in a database of recorded multimodal meetings, and experimental results as well as a discussion on the robustification of the approach are provided.
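The translation step described above, from extracted semantic constraints to a system-usable form, can be sketched as building a parameterized SQL query. The table and column names are invented; the report's actual schema and constraint representation may differ.

```python
def constraints_to_sql(concept, constraints):
    """Build a parameterized SELECT from a concept plus attribute constraints."""
    table = concept.lower() + "s"                 # e.g. Meeting -> meetings
    where = " AND ".join(f"{attr} = ?" for attr in constraints)
    sql = f"SELECT * FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql, list(constraints.values())

# "Show me meetings about the budget in room 3" -> concept Meeting with
# constraints on topic and room:
sql, params = constraints_to_sql("Meeting", {"topic": "budget", "room": "3"})
# sql    == "SELECT * FROM meetings WHERE topic = ? AND room = ?"
# params == ["budget", "3"]
```

Keeping values out of the SQL string (as `?` placeholders) also gives the robustness the report aims for: malformed or adversarial constraint values cannot change the query's structure.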

01 Jan 2005
TL;DR: This paper explores an evolutionary language understanding approach to building a natural language understanding machine in a virtual human training project: the system is trained on automatically generated data first and improves as more and more real data come in.
Abstract: The lack of well-annotated data is one of the biggest problems for most training-based dialogue systems; without enough training data, it is almost impossible for a trainable system to work. In this paper, we explore an evolutionary language understanding approach to building a natural language understanding machine in a virtual human training project. We build the initial training data with a finite state machine. The language understanding system is first trained on the automatically generated data and improves as more and more real data come in, as the experimental results show.
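The bootstrapping idea above, sketched: a small finite state machine enumerates synthetic utterance/label pairs to train the first NLU model, to be gradually replaced by real logged data. The states, phrases, and labels below are all invented for illustration.

```python
# FSM as {state: [(emitted_phrase, next_state), ...]}; next_state None ends a path.
FSM = {
    "start": [("report", "what"), ("ask about", "what")],
    "what":  [("enemy activity", "where"), ("casualties", "where")],
    "where": [("near the bridge", None), ("in sector four", None)],
}

def generate(state="start"):
    """Enumerate every utterance the FSM can emit, depth-first."""
    for phrase, nxt in FSM[state]:
        if nxt is None:
            yield phrase
        else:
            for rest in generate(nxt):
                yield phrase + " " + rest

# Label each synthetic utterance with its dialogue act to form training pairs.
training_data = [(utt, "report" if utt.startswith("report") else "query")
                 for utt in generate()]
```

Because the FSM's paths multiply combinatorially (2 x 2 x 2 = 8 utterances here), even a small machine yields enough coverage to get an initial statistical model off the ground.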

Patent
28 Apr 2005
TL;DR: A natural language understanding system that combines a rules-based grammar for slots in a schema with a statistical model for preterminals, together with a training system.
Abstract: PROBLEM TO BE SOLVED: To provide a rules-based grammar for slots and a statistical model for preterminals in a natural language understanding system. SOLUTION: The NLU system includes a rules-based grammar for slots in a schema and a statistical model for preterminals. A training system is also provided.