
Jocelyn Huang

Researcher at Nvidia

Publications - 9
Citations - 419

Jocelyn Huang is an academic researcher from Nvidia. The author has contributed to research in topics: Computer science & Acoustic model. The author has an h-index of 4 and has co-authored 6 publications receiving 190 citations.

Papers
Proceedings Article

Quartznet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions

TL;DR: A new end-to-end neural acoustic model for automatic speech recognition achieves near state-of-the-art accuracy on LibriSpeech and Wall Street Journal while having fewer parameters than all competing models.
Posted Content

NeMo: a toolkit for building AI applications using Neural Modules.

TL;DR: NeMo (Neural Modules) is a Python, framework-agnostic toolkit for creating AI applications through reusability, abstraction, and composition; it provides built-in support for distributed training and mixed precision on the latest NVIDIA GPUs.
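
The composition idea in this TL;DR can be illustrated with a small sketch. The following is a hypothetical example in plain PyTorch, not NeMo's actual API: two reusable modules, an acoustic encoder and a CTC decoder, chained into an ASR pipeline. All class names, shapes, and sizes are invented for illustration.

```python
# Hypothetical sketch of the "neural module" idea: typed, reusable building
# blocks composed into an application. Plain PyTorch, NOT NeMo's real API.
import torch
import torch.nn as nn


class AudioEncoder(nn.Module):
    """Hypothetical reusable module: maps audio features to an encoding."""

    def __init__(self, feat_dim: int = 64, hidden_dim: int = 256):
        super().__init__()
        self.conv = nn.Conv1d(feat_dim, hidden_dim, kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, feat_dim, time) -> (batch, hidden_dim, time)
        return torch.relu(self.conv(feats))


class CTCDecoder(nn.Module):
    """Hypothetical reusable module: maps encodings to per-frame token logits."""

    def __init__(self, hidden_dim: int = 256, vocab_size: int = 29):
        super().__init__()
        self.proj = nn.Conv1d(hidden_dim, vocab_size, kernel_size=1)

    def forward(self, encoded: torch.Tensor) -> torch.Tensor:
        return self.proj(encoded)


# Composition: an ASR pipeline is modules chained together, so the encoder
# could be reused unchanged in a different pipeline (e.g. speech classification).
encoder, decoder = AudioEncoder(), CTCDecoder()
dummy_feats = torch.randn(8, 64, 100)        # (batch, features, time)
logits = decoder(encoder(dummy_feats))       # (batch, vocab, time)
print(logits.shape)
```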
Posted Content

QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions.

TL;DR: In this paper, an end-to-end neural acoustic model for automatic speech recognition is proposed, which is composed of multiple blocks with residual connections between them, each block consists of one or more modules with 1D time-channel separable convolutional layers.
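
A minimal sketch of the building block described in this TL;DR, assuming PyTorch: a depthwise convolution over time followed by a pointwise (1x1) convolution over channels, grouped into a block with a residual connection. Channel counts, kernel width, and the number of modules per block are illustrative, not the paper's exact configuration.

```python
# Sketch of a 1D time-channel separable convolution and a residual block of
# such modules. Sizes are illustrative, not QuartzNet's published configuration.
import torch
import torch.nn as nn


class TimeChannelSeparableConv1d(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 33):
        super().__init__()
        # Depthwise: each channel is convolved over time independently.
        self.depthwise = nn.Conv1d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels,
        )
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)
        self.bn = nn.BatchNorm1d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.bn(self.pointwise(self.depthwise(x))))


class SeparableConvBlock(nn.Module):
    """One or more separable-conv modules wrapped with a residual connection."""

    def __init__(self, channels: int, num_modules: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            *[TimeChannelSeparableConv1d(channels) for _ in range(num_modules)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x) + x  # residual connection around the block


block = SeparableConvBlock(channels=256)
out = block(torch.randn(4, 256, 200))   # (batch, channels, time)
print(out.shape)                        # torch.Size([4, 256, 200])
```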
Posted Content

Cross-Language Transfer Learning, Continuous Learning, and Domain Adaptation for End-to-End Automatic Speech Recognition

TL;DR: This paper demonstrates the efficacy of transfer learning and continuous learning for various automatic speech recognition (ASR) tasks and shows that, in all three settings (cross-language transfer, continuous learning, and domain adaptation), transfer learning from a good base model yields higher accuracy than training a model from scratch.
Proceedings Article

Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition

TL;DR: This paper demonstrates the efficacy of transfer learning and continuous learning for various automatic speech recognition (ASR) tasks using end-to-end models trained with CTC loss, and indicates that, for fine-tuning, larger pre-trained models are better than smaller pre-trained models, even if the dataset used for fine-tuning is small.
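
A minimal sketch, assuming PyTorch, of the fine-tuning recipe these transfer-learning papers describe: load a pretrained acoustic encoder, replace only the output layer to match the target language's alphabet, and fine-tune with CTC loss. The checkpoint path, vocabulary size, and encoder structure below are placeholders, not the models used in the papers.

```python
# Sketch of cross-language fine-tuning with CTC loss. The checkpoint path,
# vocabulary size, and encoder architecture are placeholders for illustration.
import torch
import torch.nn as nn

PRETRAINED_CKPT = "base_english_encoder.pt"   # placeholder checkpoint path
NEW_VOCAB_SIZE = 35                           # e.g. target-language characters + CTC blank

# Hypothetical encoder with the same structure as the pretrained base model.
encoder = nn.Sequential(
    nn.Conv1d(64, 256, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(),
)
state = torch.load(PRETRAINED_CKPT)           # start from the good base model
encoder.load_state_dict(state)

# New output layer sized for the target language, trained from scratch.
output_layer = nn.Conv1d(256, NEW_VOCAB_SIZE, kernel_size=1)

ctc_loss = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(output_layer.parameters()), lr=1e-4
)


def training_step(feats, feat_lens, targets, target_lens):
    # feats: (batch, 64, time); targets: concatenated label index sequences
    logits = output_layer(encoder(feats))                 # (batch, vocab, time)
    log_probs = logits.permute(2, 0, 1).log_softmax(-1)   # (time, batch, vocab) for CTC
    loss = ctc_loss(log_probs, targets, feat_lens, target_lens)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```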