Proceedings ArticleDOI
SWITCHBOARD: telephone speech corpus for research and development
J.J. Godfrey,E. Holliman,J. McDaniel +2 more
- Vol. 1, pp 517-520
TLDR
SWITCHBOARD is a large multispeaker corpus of conversational speech and text which should be of interest to researchers in speaker authentication and large vocabulary speech recognition.
Abstract:
SWITCHBOARD is a large multispeaker corpus of conversational speech and text which should be of interest to researchers in speaker authentication and large vocabulary speech recognition. About 2500 conversations by 500 speakers from around the US were collected automatically over T1 lines at Texas Instruments. Designed for training and testing of a variety of speech processing algorithms, especially in speaker verification, it has over 1 h of speech from each of 50 speakers, and several minutes each from hundreds of others. A time-aligned, word-for-word transcription accompanies each recording.
Citations
Proceedings ArticleDOI
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
TL;DR: This work presents SpecAugment, a simple data augmentation method for speech recognition that is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients) and achieves state-of-the-art performance on the LibriSpeech 960h and Switchboard 300h tasks, outperforming all prior work.
Journal ArticleDOI
An empirical study of smoothing techniques for language modeling
TL;DR: This work surveys the most widely-used algorithms for smoothing models for language n-gram modeling, and presents an extensive empirical comparison of several of these smoothing techniques, including those described by Jelinek and Mercer (1980), and introduces methodologies for analyzing smoothing algorithm efficacy in detail.
Book
Introduction to Semi-Supervised Learning
TL;DR: This introductory book presents some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi-supervised support vector machines, and discusses their basic mathematical formulation.
Journal ArticleDOI
Speaker identification and verification using Gaussian mixture speaker models
TL;DR: This paper presents high-performance speaker identification and verification systems based on Gaussian mixture speaker models, which are robust, statistically based representations of speaker identity; the systems are evaluated on four publicly available speech databases.
Journal ArticleDOI
Dialogue act modeling for automatic tagging and recognition of conversational speech
Andreas Stolcke,Noah Coccaro,Rebecca Bates,Paul Taylor,Carol Van Ess-Dykema,Klaus Ries,Elizabeth Shriberg,Dan Jurafsky,Rachel Martin,Marie Meteer +9 more
TL;DR: The authors proposed a statistical approach for modeling dialogue acts in conversational speech, i.e., speech-act-like units such as STATEMENT, QUESTION, BACKCHANNEL, AGREEMENT, DISAGREEMENT and APOLOGY.
References
Proceedings ArticleDOI
The ATIS spoken language systems pilot corpus
TL;DR: This pilot marks the first full-scale attempt to collect a corpus to measure progress in Spoken Language Systems that include both a speech and natural language component and provides guidelines for future efforts.
Proceedings ArticleDOI
The DARPA 1000-word resource management database for continuous speech recognition
TL;DR: A database of continuous read speech has been designed and recorded within the DARPA strategic computing speech recognition program for use in designing and evaluating algorithms for speaker-independent, speaker-adaptive and speaker-dependent speech recognition.
Proceedings ArticleDOI
Robust automatic time alignment of orthographic transcriptions with unconstrained speech
Barbara J. Wheatley,George R. Doddington,Charles T. Hemphill,J.J. Godfrey,E. Holliman,J. McDaniel,D. Fisher +6 more
TL;DR: This method successfully aligns transcriptions with speech in unconstrained 5 to 10 min conversations collected over long-distance telephone lines; it requires minimal manual processing and generally produces correct alignments despite the challenging nature of the data.