Michael Picheny
Researcher at IBM
Publications - 246
Citations - 12344
Michael Picheny is an academic researcher at IBM. The author has contributed to research on topics including language models and word error rate. The author has an h-index of 57 and has co-authored 244 publications that have received 11759 citations. Previous affiliations include the University of Iowa and Nuance Communications.
Papers
Proceedings Article
Speaker adaptation of neural network acoustic models using i-vectors
TL;DR: This work proposes adapting deep neural network acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the network in parallel with the regular acoustic features for ASR. The resulting models are comparable in performance to DNNs trained on speaker-adapted features, with the advantage that only one decoding pass is needed.
Journal Article
Speaking Clearly for the Hard of Hearing II: Acoustic Characteristics of Clear and Conversational Speech.
TL;DR: The authors report the results of acoustic analyses performed on conversational and clear speech and show that speaking clearly cannot be regarded as equivalent to the application of high-frequency emphasis.
Journal Article
Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech.
TL;DR: The authors found that the intelligibility difference between clear and conversational speech averaged 17 percentage points across talkers and was independent of the listener, level, and frequency-gain characteristic.
Patent
Automatic indexing and aligning of audio and text using speech recognition
Hamed Abdelfattah Ellozy, Dimitri Kanevsky, Michelle Y Kim, David Nahamoo, Michael Picheny, Wlodek W. Zadrozny
TL;DR: This patent presents a method of automatically aligning a written transcript with speech in video and audio clips.
Proceedings Article
English Conversational Telephone Speech Recognition by Humans and Machines
George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall
TL;DR: A set of acoustic and language modeling techniques was used to lower the word error rate of a conversational telephone LVCSR system to 5.5%/10.3% on the Switchboard/CallHome subsets of the Hub5 2000 evaluation.