
Showing papers by "Michael S. Phillips" published in 1991


Proceedings Article
14 Apr 1991
TL;DR: Recent attempts at improving the integration between the speech recognition and natural language components are described, using the generation capability of the natural language component to produce a word-pair language model to constrain the recognizer's search space, thus improving the coverage of the overall system.
Abstract: The MIT VOYAGER speech understanding system is an urban exploration and navigation system that interacts with the user through spoken dialogue, text, and graphics. The authors describe recent attempts at improving the integration between the speech recognition and natural language components. They used the generation capability of the natural language component to produce a word-pair language model to constrain the recognizer's search space, thus improving the coverage of the overall system. They also implemented a strategy in which the recognizer generates the top N word strings and passes them along to the natural language component for filtering. Results on performance evaluation are presented.

66 citations
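
The two integration strategies named in the abstract above (a word-pair language model derived from generated sentences, and top-N filtering by the natural language component) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the sample sentences, the boundary markers, the function names, and the toy stand-in for the parser are hypothetical and are not taken from the VOYAGER system.

```python
from typing import Callable, Iterable, List, Optional, Set, Tuple

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers


def build_word_pairs(sentences: Iterable[str]) -> Set[Tuple[str, str]]:
    """Collect every adjacent word pair that occurs in the given sentences."""
    pairs: Set[Tuple[str, str]] = set()
    for sentence in sentences:
        words = [START, *sentence.lower().split(), END]
        pairs.update(zip(words, words[1:]))
    return pairs


def allowed(word_string: str, pairs: Set[Tuple[str, str]]) -> bool:
    """True if every adjacent pair in the string is in the word-pair model."""
    words = [START, *word_string.lower().split(), END]
    return all(pair in pairs for pair in zip(words, words[1:]))


def first_parsable(n_best: List[str],
                   parses: Callable[[str], bool]) -> Optional[str]:
    """Return the highest-ranked hypothesis the parser accepts, if any."""
    for hypothesis in n_best:
        if parses(hypothesis):
            return hypothesis
    return None


# Hypothetical sentences standing in for the generator's output.
samples = [
    "where is the library",
    "where is the nearest bank",
    "show me the nearest library",
]
pairs = build_word_pairs(samples)

# Toy stand-in for the natural language component: accept only known sentences.
best = first_parsable(
    ["where is library the", "where is the library"],
    lambda s: s in samples,
)
```

Note that a word-pair model accepts strings that never appeared in the samples (here, "show me the library"), which is one way such a model can be looser than the grammar it was derived from.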



Proceedings Article
19 Feb 1991
TL;DR: The MIT ATIS system as discussed by the authors is based on the MIT SUMMIT system using context independent phone models, and includes a word-pair grammar with perplexity 92 (on the June-90 test set).
Abstract: This paper represents a status report on the MIT ATIS system. The most significant new achievement is that we now have a speech-input mode. It is based on the MIT SUMMIT system using context independent phone models, and includes a word-pair grammar with perplexity 92 (on the June-90 test set). In addition, we have completely redesigned the back-end component, in order to emphasize portability and extensibility. The parser now produces an intermediate semantic frame representation, which serves as the focal point for all back-end operations, such as history management, text generation, and SQL query generation. Most of those aspects of the system that are tied to a particular domain are now entered through a set of tables associated with a small artificial language for decoding them. We have also improved the display of the database table, making it considerably easier for a subject to comprehend the information given. We report here on the results of the official DARPA February-91 evaluation, as well as on results of an evaluation on data collected at MIT, for both speech input and text input.

19 citations
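
The abstract above quotes a test-set perplexity for a word-pair grammar. One common way to compute such a figure, sketched below, treats every successor allowed after a word as equally likely, so the perplexity is the geometric mean of the number of allowed successors along the test sentences. Whether the MIT ATIS figure of 92 was computed this way is an assumption; the function names and the toy grammar are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def successors(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each word to the set of words allowed to follow it."""
    succ: Dict[str, Set[str]] = defaultdict(set)
    for first, second in pairs:
        succ[first].add(second)
    return succ


def word_pair_perplexity(pairs: Iterable[Tuple[str, str]],
                         test_sentences: Iterable[str]) -> float:
    """Perplexity under a uniform distribution over allowed successors."""
    succ = successors(pairs)
    log_sum, n_transitions = 0.0, 0
    for sentence in test_sentences:
        words = ["<s>", *sentence.lower().split(), "</s>"]
        for prev, word in zip(words, words[1:]):
            if word not in succ[prev]:
                raise ValueError(f"pair ({prev!r}, {word!r}) not in grammar")
            # Each allowed successor has probability 1 / len(succ[prev]).
            log_sum += math.log(len(succ[prev]))
            n_transitions += 1
    return math.exp(log_sum / n_transitions)


# Toy grammar: two possible first words, each followed by end of sentence.
toy_pairs = {("<s>", "a"), ("<s>", "b"), ("a", "</s>"), ("b", "</s>")}
toy_perplexity = word_pair_perplexity(toy_pairs, ["a"])
```

In the toy grammar the first transition has two choices and the second has one, so the result is the geometric mean of 2 and 1.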



Proceedings Article
19 Feb 1991
TL;DR: These changes, along with an improved corrective training procedure for adapting pronunciation arc weights and a larger set of training data, have resulted in the reduction of error rate by almost a factor of two on the Resource Management task.
Abstract: In 1989, our group first reported on the development of SUMMIT, a segment-based speaker-independent continuous-speech recognition system [13]. The initial version of SUMMIT made use of fairly simple context-independent models for the lexical labels. Recently, we have begun to incorporate more complex models of lexical labels that take into account a variety of contextual factors. These changes, along with an improved corrective training procedure for adapting pronunciation arc weights and a larger set of training data, have resulted in the reduction of error rate by almost a factor of two on the Resource Management task.

9 citations


Proceedings Article
19 Feb 1991
TL;DR: Combining the acoustic score with a parse probability normalized for number of terminals, together with an explicit rejection criterion, improved the overall score of the MIT VOYAGER system; experiments on a fully integrated system which uses the parser to predict possible next words to the recognizer are now underway.
Abstract: This paper describes several experiments combining natural language and acoustic constraints to improve overall performance of the MIT VOYAGER spoken language system. This system couples the SUMMIT speech recognition system with the TINA language understanding system to answer spoken queries about navigational assistance in the Cambridge, MA, area. The overall goal of our research is to combine acoustic, syntactic and semantic knowledge sources. Our first experiment showed improvement by combining acoustic score and parse probability normalized for number of terminals. Results were further improved by the use of an explicit rejection criterion based on normalized parse probabilities. The use of the combined parse/acoustic score, together with the rejection criterion, gave an improvement in overall score of more than 33% on both training and test data, where score is defined as percent correct minus percent incorrect. Experiments on a fully integrated system which uses the parser to predict possible next words to the recognizer are now underway.

8 citations
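
The scoring ideas in the abstract above (parse probability normalized for number of terminals, a rejection criterion on that normalized value, and an overall score of percent correct minus percent incorrect) can be sketched as follows. The abstract does not say how the two scores are combined or where the rejection threshold lies, so the additive log-domain combination, the `weight` and `reject_below` parameters, and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    words: str
    acoustic_logprob: float  # log-domain acoustic score (assumed)
    parse_logprob: float     # log probability assigned by the parser (assumed)
    n_terminals: int         # number of terminals (words) in the parse


def normalized_parse(h: Hypothesis) -> float:
    """Parse log probability per terminal, so long strings are not penalized."""
    return h.parse_logprob / h.n_terminals


def choose(hyps: List[Hypothesis],
           weight: float = 1.0,
           reject_below: float = -3.0) -> Optional[Hypothesis]:
    """Pick the best combined score among hypotheses that pass rejection.

    Returns None when every hypothesis is rejected, meaning no answer is given.
    """
    kept = [h for h in hyps if normalized_parse(h) >= reject_below]
    if not kept:
        return None
    return max(kept,
               key=lambda h: h.acoustic_logprob + weight * normalized_parse(h))


def overall_score(n_correct: int, n_incorrect: int, n_total: int) -> float:
    """Percent correct minus percent incorrect, as defined in the abstract."""
    return 100.0 * (n_correct - n_incorrect) / n_total


good = Hypothesis("where is the library", -10.0, -4.0, 4)
poor = Hypothesis("wear is the library", -9.5, -20.0, 4)
picked = choose([good, poor])
```

Under this metric a rejected utterance counts as neither correct nor incorrect, so rejecting a likely-wrong answer raises the score even though no answer is produced.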


01 Jan 1991
TL;DR: A lug nut is held and guided by a plastic cage formed by a flanged bush, whose cylindrical part has longitudinal apertures in which the lugs slide axially and bosses at right angles to the apertures adjacent the flange.
Abstract: A lug nut is held and guided by a plastic cage formed by a flanged bush, whose cylindrical part has longitudinal apertures in which the lugs slide axially and bosses at right angles to the apertures adjacent the flange. This assembly can be inserted from one side of a sheet into a corresponding slotted hole in the sheet, and on turning through 90° is locked angularly by seating the bosses in the slots. The bush is of nylon and has an annular rib on a surface of the flange to seal the device against the sheet. The bush can have a radial protuberance and ramps on the bosses to prevent inadvertent extraction from the hole. The flange can overlap the cylindrical wall of the bush radially inwardly to form a guiding and sealing hole for the bolt to be engaged in the lug nut.

7 citations



01 Jan 1991
TL;DR: The purpose of this paper is to document the involvement in the development of the WSJ-CSR corpus, from recording and transcription to analyses and distribution, and to present the results of an experiment investigating the preprocessing of the prompt text.
Abstract: Recently, the DARPA community started a new data collection initiative in the Wall Street Journal (WSJ) domain to support research and development of very large vocabulary continuous speech recognition (CSR) systems. Since August 1991, our group has actively participated in the development of the WSJ-CSR corpus. The purpose of this paper is to document our involvement in this process, from recording and transcription to analyses and distribution. We will also present the results of an experiment investigating the preprocessing of the prompt text.

4 citations


Proceedings Article
31 Oct 1991
TL;DR: Spoken language interfaces offer significant benefits over conventional user interfaces for certain classes of applications, particularly hands-busy or eyes-busy applications, where typed input and/or visual displays may not be possible or convenient.
Abstract: This paper describes research on spoken language interfaces for interactive problem solving. A spoken language interface combines speech recognition technology with language understanding technology to provide an application-specific interface. The interface converts acoustic input (speech) into a series of words which are interpreted to produce the appropriate response and/or action. The system response may be spoken or it may be in the form of a display, as appropriate to the needs of the user. Spoken language interfaces offer significant benefits over conventional user interfaces for certain classes of applications, particularly hands-busy or eyes-busy applications, where typed input and/or visual displays may not be possible or convenient. To illustrate this, we present two examples of spoken language interfaces developed at MIT: an interactive system for urban navigation, VOYAGER; and an air travel planning system, ATIS. The VOYAGER system currently runs in a few times real time and is able to provide answers for more than 50% of user queries for untrained users.