
1998 Broadcast News Benchmark Test Results: English and Non-English Word Error Rate Performance Measures

Abstract
This paper documents the use of Broadcast News test materials in DARPA-sponsored Automatic Speech Recognition (ASR) Benchmark Tests conducted late in 1998. As in last year’s tests [1], statistical selection procedures were used in selecting test materials. Two test epochs were used, each yielding (nominally) one and one-half hours of test material. One of the test sets was drawn from the same test epoch as was used for last year’s tests, and the other was drawn from a more recent period. Results are reported for two types of systems: one (the “Hub”, or “baseline” systems) for which there were no limits on computational resources, and another (the “less than 10X real-time spoke” systems) that ran in less than 10 times real-time. The lowest word error rate reported this year for the “Hub” systems was 13.5%, contrasting with last year’s lowest word error rate of 16.2%. For the “less than 10X real-time spoke” systems, the lowest reported word error rate was 16.1%. Results are also reported, for the second year, on non-English language Broadcast News materials in Spanish and Mandarin.
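
For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the system output, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the scoring software used in the benchmark tests; the function names `word_error_rate` and `relative_reduction` are hypothetical, and the sketch assumes whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly;
    no normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on old_wer."""
    return (old_wer - new_wer) / old_wer


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Relative change between the two Hub figures quoted in the abstract.
    print(relative_reduction(16.2, 13.5))
```

Applied to the two lowest “Hub” figures quoted in the abstract, `relative_reduction(16.2, 13.5)` gives roughly a one-sixth relative reduction. The two figures were measured on different test sets, so this is only an arithmetic illustration.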


Citations
Book

Application of Hidden Markov Models in Speech Recognition

TL;DR: The aim of this review is first to present the core architecture of a HMM-based LVCSR system and then to describe the various refinements which are needed to achieve state-of-the-art performance.
Proceedings Article

The TREC spoken document retrieval track: a success story

TL;DR: The SDR Track can be declared a success in that it has provided objective, demonstrable proof that this technology can be successfully applied to realistic audio collections using a combination of existing technologies and that it can be objectively evaluated.
Journal Article

Lightly supervised and unsupervised acoustic model training

TL;DR: Experiments providing supervision only via the language model training materials show that including texts which are contemporaneous with the audio data is not crucial for success of the approach, and that the acoustic models can be initialized with as little as 10 min of manually annotated data.
Journal Article

Speech and language processing for next-millennium communications services

TL;DR: Speech technologies have progressed to the point where they are now viable for a broad range of communications services, including: compression of speech for use over wired and wireless networks; speech synthesis, recognition, and understanding for dialogue access to information, people, and messaging; and speaker verification for secure access to information and services.
Proceedings Article

Suede: a Wizard of Oz prototyping tool for speech user interfaces

TL;DR: SUEDE, the speech interface prototyping tool, allows designers to rapidly create prompt/response speech interfaces and offers an electronically supported Wizard of Oz technique that captures test data, allowing designers to analyze the interface after testing.
References
Proceedings Article

A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)

TL;DR: The Recognizer Output Voting Error Reduction (ROVER) system was developed at NIST to produce a composite automatic speech recognition (ASR) system output when the outputs of multiple ASR systems are available, such that the composite output has a lower error rate than any of the individual systems.

Overview of MUC-7

TL;DR: The task of Coreference (CO) had its origins in Semeval, an attempt after MUC-5 to define semantic research tasks that needed to be solved to be successful at generating scenario templates.

1997 Broadcast News Benchmark Test Results: English and Non-English

TL;DR: This paper documents the use of Broadcast News test materials in DARPA-sponsored Automatic Speech Recognition (ASR) Benchmark Tests conducted late in 1997, reporting the lowest word error rate achieved that year and the completion of tests in two languages other than English: Mandarin and Spanish.
Proceedings Article

Named Entity Scoring for Speech Input

TL;DR: A new scoring algorithm is described that supports comparison of linguistically annotated data from noisy sources and scores for content (transcription correctness) of the tagged region, a useful distinction when dealing with noisy data that may differ from a reference transcription.

Data Selection for Broadcast News CSR Evaluations

TL;DR: This paper discusses both the principles involved and the specific algorithms used in selecting the 1997 Hub-4 broadcast news test set, a selection made concurrently with that of a statistically equivalent test set for a future evaluation.