Vijayaditya Peddinti
Researcher at Johns Hopkins University
Publications - 25
Citations - 5069
Vijayaditya Peddinti is an academic researcher from Johns Hopkins University. The author has contributed to research on topics including time delay neural networks and word error rate. The author has an h-index of 18 and has co-authored 25 publications receiving 3782 citations. Previous affiliations of Vijayaditya Peddinti include Google and IBM.
Papers
Proceedings Article
Audio augmentation for speech recognition.
TL;DR: This paper investigates audio-level speech augmentation methods which directly process the raw signal, and presents results on 4 different LVCSR tasks with training data ranging from 100 hours to 1000 hours, to examine the effectiveness of audio augmentation in a variety of data scenarios.
Proceedings Article
A time delay neural network architecture for efficient modeling of long temporal contexts.
TL;DR: This paper proposes a time delay neural network architecture which models long-term temporal dependencies with training times comparable to standard feed-forward DNNs and uses sub-sampling to reduce computation during training.
Proceedings Article
Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI.
Daniel Povey,Vijayaditya Peddinti,Daniel Galvez,Pegah Ghahremani,Vimal Manohar,Xingyu Na,Yiming Wang,Sanjeev Khudanpur +7 more
TL;DR: A method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training is described, using the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI.
Proceedings Article
A study on data augmentation of reverberant speech for robust speech recognition
TL;DR: It is found that the performance gap between using simulated and real RIRs can be eliminated when point-source noises are added, and the trained acoustic models not only perform well in the distant-talking scenario but also provide better results in the close-talking scenario.
Posted Content
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen,Patrick Nguyen,Yonghui Wu,Zhifeng Chen,Mia Xu Chen,Ye Jia,Anjuli Kannan,Tara N. Sainath,Yuan Cao,Chung-Cheng Chiu,Yanzhang He,Jan Chorowski,Smit Hinsu,Stella Marie Laurenzo,James Qin,Orhan Firat,Wolfgang Macherey,Suyog Gupta,Ankur Bapna,Shuyuan Zhang,Ruoming Pang,Ron Weiss,Rohit Prabhavalkar,Qiao Liang,Benoit Jacob,Bowen Liang,HyoukJoong Lee,Ciprian Chelba,Sébastien Jean,Bo Li,Melvin Johnson,Rohan Anil,Rajat Tibrewal,Xiaobing Liu,Akiko Eriguchi,Navdeep Jaitly,Naveen Ari,Colin Cherry,Parisa Haghani,Otavio Good,Youlong Cheng,Raziel Alvarez,Isaac Caswell,Wei-Ning Hsu,Zongheng Yang,Kuan-Chieh Wang,Ekaterina Gonina,Katrin Tomanek,Ben Vanik,Zelin Wu,Llion Jones,Mike Schuster,Yanping Huang,Dehao Chen,Kazuki Irie,George Foster,John Richardson,Klaus Macherey,Antoine Bruguier,Heiga Zen,Colin Raffel,Shankar Kumar,Kanishka Rao,David Rybach,Matthew Murray,Vijayaditya Peddinti,Maxim Krikun,Michiel Bacchiani,Thomas B. Jablin,Robert Suderman,Ian Williams,Benjamin N. Lee,Deepti Bhatia,Justin Carlson,Semih Yavuz,Yu Zhang,Ian McGraw,Max Galkin,Qi Ge,Golan Pundak,Chad Whipkey,Todd Wang,Uri Alon,Dmitry Lepikhin,Ye Tian,Sara Sabour,William Chan,Shubham Toshniwal,Baohua Liao,Michael Nirschl,Pat Rondon +90 more
TL;DR: This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.