Expanding the scope of the ATIS task: the ATIS-3 corpus

doi:10.3115/1075812.1075823

Open Access, Proceedings Article

Deborah A. Dahl, +8 more

- pp 43-48

TLDR

This paper describes the migration of the ATIS task to a richer relational database and development corpus (ATIS-3), and describes the ATIS-3 corpus itself, including breakdowns of data by type (e.g. context-independent, context-dependent, and unevaluable) and variations in the data collected at different sites.

Abstract:

The Air Travel Information System (ATIS) domain serves as the common evaluation task for ARPA spoken language system developers. To support this task, the Multi-Site ATIS Data COllection Working group (MADCOW) coordinates data collection activities. This paper describes recent MADCOW activities. In particular, this paper describes the migration of the ATIS task to a richer relational database and development corpus (ATIS-3) and describes the ATIS-3 corpus. The expanded database, which includes information on 46 US and Canadian cities and 23,457 flights, was released in the fall of 1992, and data collection for the ATIS-3 corpus began shortly thereafter. The ATIS-3 corpus now consists of a total of 8297 released training utterances and 3211 utterances reserved for testing, collected at BBN, CMU, MIT, NIST and SRI. 2906 of the training utterances have been annotated with the correct information from the database. This paper describes the ATIS-3 corpus in detail, including breakdowns of data by type (e.g. context-independent, context-dependent, and unevaluable) and variations in the data collected at different sites. This paper also includes a description of the ATIS-3 database. Finally, we discuss future data collection and evaluation plans.
