George Saon

Researcher at IBM

Publications: 161
Citations: 6882

George Saon is an academic researcher at IBM. The author has contributed to research on topics including word error rate and language models. The author has an h-index of 40 and has co-authored 149 publications receiving 6185 citations.

Papers
Proceedings Article

Speaker adaptation of neural network acoustic models using i-vectors

TL;DR: This work proposes to adapt deep neural network (DNN) acoustic models to a target speaker by supplying speaker identity vectors (i-vectors) as input features to the network in parallel with the regular acoustic features for ASR. The resulting models are comparable in performance to DNNs trained on speaker-adapted features, with the advantage that only one decoding pass is needed.
Journal Article

Deep Convolutional Neural Networks for Large-scale Speech Tasks

TL;DR: This paper determines the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks, and investigates how to incorporate speaker-adapted features, which cannot directly be modeled by CNNs as they do not obey locality in frequency, into the CNN framework.
Proceedings Article

Boosted MMI for model and feature-space discriminative training

TL;DR: A modified form of the maximum mutual information (MMI) objective function which gives improved results for discriminative training by boosting the likelihoods of paths in the denominator lattice that have a higher phone error relative to the correct transcript.
Proceedings Article

fMPE: discriminatively trained features for speech recognition

TL;DR: This paper trains a matrix that projects from posteriors of Gaussians to a normal-size feature space, then adds the projected features to normal features such as PLP.
Proceedings Article

English Conversational Telephone Speech Recognition by Humans and Machines

TL;DR: In this article, a set of acoustic and language modeling techniques is used to lower the word error rate of a conversational telephone LVCSR system to 5.5%/10.3% on the Switchboard/CallHome subsets of the Hub5 2000 evaluation.