Michael Picheny

Researcher at IBM

Publications -  246
Citations -  12344

Michael Picheny is an academic researcher from IBM. The author has contributed to research on topics including language models and word error rate. The author has an h-index of 57 and has co-authored 244 publications receiving 11759 citations. Previous affiliations of Michael Picheny include University of Iowa and Nuance Communications.

Papers
Proceedings Article

Speaker adaptation of neural network acoustic models using i-vectors

TL;DR: This work proposes to adapt deep neural network acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the network in parallel with the regular acoustic features for ASR. The resulting networks are comparable in performance to DNNs trained on speaker-adapted features, with the advantage that only one decoding pass is needed.
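As a rough illustration of the input construction this summary describes, the sketch below appends one speaker-level i-vector to every acoustic feature frame before the frames would be fed to a network. It assumes the i-vector is held constant across all frames of a speaker and is joined by simple concatenation; the function name `append_ivector` and the toy dimensions are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def append_ivector(
    frames: Sequence[Sequence[float]], ivector: Sequence[float]
) -> List[List[float]]:
    """Concatenate the same speaker i-vector onto every acoustic frame."""
    return [list(frame) + list(ivector) for frame in frames]


# Hypothetical toy sizes: 3 frames of 4-dim acoustic features, 2-dim i-vector.
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
ivector = [-1.0, 2.0]

# Each augmented frame has 4 + 2 = 6 dimensions: acoustic features, then i-vector.
augmented = append_ivector(frames, ivector)
```

Because the speaker information travels with the input features, a network trained on such frames would not need a separate adaptation pass at decoding time, which matches the single-pass advantage the summary mentions.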
Journal Article

Speaking Clearly for the Hard of Hearing II: Acoustic Characteristics of Clear and Conversational Speech.

TL;DR: In this article, the authors report the results of acoustic analyses performed on conversational and clear speech, and show that speaking clearly cannot be regarded as equivalent to the application of high-frequency emphasis.
Journal Article

Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech.

TL;DR: The authors found that the intelligibility difference between clear and conversational speech averaged 17 percentage points across talkers and was independent of the listener, level, and frequency-gain characteristic.
Patent

Automatic indexing and aligning of audio and text using speech recognition

TL;DR: This patent presents a method of automatically aligning a written transcript with speech in video and audio clips.
Proceedings Article

English Conversational Telephone Speech Recognition by Humans and Machines

TL;DR: In this article, a set of acoustic and language modeling techniques was used to lower the word error rate of a conversational telephone LVCSR system to 5.5% and 10.3% on the Switchboard and CallHome subsets, respectively, of the Hub5 2000 evaluation.