Kartik Audhkhasi
Researcher at IBM
Publications - 92
Citations - 2916
Kartik Audhkhasi is an academic researcher from IBM. The author has contributed to research in topics: Language model & Word error rate. The author has an h-index of 27 and has co-authored 86 publications receiving 2310 citations. Previous affiliations of Kartik Audhkhasi include University of Southern California & Google.
Papers
Proceedings Article
English Conversational Telephone Speech Recognition by Humans and Machines
George Saon,Gakuto Kurata,Tom Sercu,Kartik Audhkhasi,Samuel Thomas,Dimitrios Dimitriadis,Xiaodong Cui,Bhuvana Ramabhadran,Michael Picheny,Lynn-Li Lim,Bergul Roomi,Phil Hall +11 more
TL;DR: In this article, a set of acoustic and language modeling techniques was used to lower the word error rate of a conversational telephone LVCSR system to 5.5%/10.3% on the Switchboard/CallHome subsets of the Hub5 2000 evaluation.
Journal Article
Applying Machine Learning to Facilitate Autism Diagnostics: Pitfalls and Promises
Daniel Bone,Matthew S. Goodwin,Matthew P. Black,Chi-Chun Lee,Kartik Audhkhasi,Shrikanth S. Narayanan +5 more
TL;DR: Best practices for using machine learning in autism research are proposed, and some especially promising areas for collaborative work at the intersection of computational and behavioral science are highlighted.
Proceedings Article
Direct Acoustics-to-Word Models for English Conversational Speech Recognition
TL;DR: This paper presents the first results employing direct acoustics-to-word CTC models on two well-known public benchmark tasks, Switchboard and CallHome. It also presents rescoring results on CTC word model lattices to quantify the performance benefits of an LM, and contrasts the performance of word and phone CTC models.
Journal Article
Noise-enhanced convolutional neural networks
TL;DR: The Noisy CNN algorithm speeds training on average because the backpropagation algorithm is a special case of the generalized expectation-maximization (EM) algorithm and because such carefully chosen noise always speeds up the EM algorithm on average.
Proceedings Article
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition
TL;DR: In this paper, a joint word-character A2W model was proposed that learns to first spell the word and then recognize it, achieving a word error rate of 8.8%/13.9% on the Hub5-2000 Switchboard/CallHome test sets without any decoder, pronunciation lexicon, or externally-trained language model.