Shibo Wang
Publications - 9
Citations - 2007
Shibo Wang is an academic researcher. The author has contributed to research in topics including Computer science & Language model. The author has an h-index of 7 and has co-authored 7 publications receiving 446 citations.
Papers
Posted Content
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati,James Qin,Chung-Cheng Chiu,Niki Parmar,Yu Zhang,Jiahui Yu,Wei Han,Shibo Wang,Zhengdong Zhang,Yonghui Wu,Ruoming Pang +10 more
TL;DR: This work proposes the convolution-augmented Transformer for speech recognition, named Conformer, which significantly outperforms previous Transformer- and CNN-based models, achieving state-of-the-art accuracy.
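As a rough illustration of the architecture this summary describes, below is a minimal PyTorch sketch of a single Conformer block: two half-step ("macaron") feed-forward modules sandwiching self-attention and a convolution module, with pre-norm residual connections. The layer sizes, kernel size, and the omission of relative positional encoding are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal, illustrative sketch of a Conformer block; hyperparameters are
# assumptions, and the paper's relative positional encoding is omitted.
import torch
import torch.nn as nn

class ConvModule(nn.Module):
    def __init__(self, dim, kernel_size=31):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.pointwise1 = nn.Conv1d(dim, 2 * dim, kernel_size=1)  # feeds GLU
        self.depthwise = nn.Conv1d(dim, dim, kernel_size,
                                   padding=kernel_size // 2, groups=dim)
        self.bn = nn.BatchNorm1d(dim)
        self.act = nn.SiLU()  # Swish
        self.pointwise2 = nn.Conv1d(dim, dim, kernel_size=1)

    def forward(self, x):                      # x: (batch, time, dim)
        y = self.norm(x).transpose(1, 2)       # -> (batch, dim, time)
        y = nn.functional.glu(self.pointwise1(y), dim=1)
        y = self.act(self.bn(self.depthwise(y)))
        return self.pointwise2(y).transpose(1, 2)

class ConformerBlock(nn.Module):
    def __init__(self, dim=256, heads=4, ff_mult=4):
        super().__init__()
        def ff():  # pre-norm feed-forward module
            return nn.Sequential(nn.LayerNorm(dim),
                                 nn.Linear(dim, ff_mult * dim), nn.SiLU(),
                                 nn.Linear(ff_mult * dim, dim))
        self.ff1, self.ff2 = ff(), ff()
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv = ConvModule(dim)
        self.final_norm = nn.LayerNorm(dim)

    def forward(self, x):
        x = x + 0.5 * self.ff1(x)              # half-step macaron FFN
        a = self.attn_norm(x)
        x = x + self.attn(a, a, a, need_weights=False)[0]  # global context
        x = x + self.conv(x)                   # local (convolutional) context
        x = x + 0.5 * self.ff2(x)
        return self.final_norm(x)

# Usage: a batch of 2 sequences, 100 frames, 256-dim features.
y = ConformerBlock()(torch.randn(2, 100, 256))
```

The sandwich structure is the point of the design: self-attention captures global dependencies while the depthwise convolution captures local ones, so neither component has to do both jobs.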
Proceedings ArticleDOI
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati,James Qin,Chung-Cheng Chiu,Niki Parmar,Yu Zhang,Jiahui Yu,Wei Han,Shibo Wang,Zhengdong Zhang,Yonghui Wu,Ruoming Pang +10 more
TL;DR: Conformer combines convolutional neural networks and Transformers to model both the local and global dependencies of an audio sequence in a parameter-efficient way, achieving state-of-the-art accuracy.
Posted Content
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang,Daniel S. Park,Wei Han,James Qin,Anmol Gulati,Joel Shor,Aren Jansen,Yuanzhong Xu,Yanping Huang,Shibo Wang,Zongwei Zhou,Bo Li,Min Ma,William Chan,Jiahui Yu,Yongqiang Wang,Liangliang Cao,Khe Chai Sim,Bhuvana Ramabhadran,Tara N. Sainath,Francoise Beaufays,Zhifeng Chen,Quoc V. Le,Chung-Cheng Chiu,Ruoming Pang,Yonghui Wu +25 more
TL;DR: In this article, the authors show that the combination of pre-training, self-training and scaling up model size greatly increases data efficiency, even for extremely large tasks with tens of thousands of hours of labeled data.
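To make the self-training half of this recipe concrete, here is a hedged toy sketch of the pseudo-labeling step: a teacher trained on a small labeled set labels a large unlabeled pool, and a student is then trained on the union. The nearest-centroid "models" and synthetic 2-D data are purely illustrative assumptions; the paper trains large Conformer ASR models on audio.

```python
# Toy illustration of self-training (pseudo-labeling); the models and data
# here are stand-in assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

def fit_centroids(x, y):
    """'Train' a nearest-centroid classifier: one mean per class."""
    return np.stack([x[y == c].mean(axis=0) for c in np.unique(y)])

def predict(centroids, x):
    return np.argmin(((x[:, None, :] - centroids) ** 2).sum(-1), axis=1)

# Small labeled set and a large unlabeled pool from two Gaussian classes.
means = np.array([[0.0, 0.0], [4.0, 4.0]])
labeled_x = rng.normal(size=(20, 2)) + np.repeat(means, 10, axis=0)
labeled_y = np.repeat([0, 1], 10)
unlabeled_x = rng.normal(size=(2000, 2)) + means[rng.integers(0, 2, size=2000)]

teacher = fit_centroids(labeled_x, labeled_y)   # trained on labels only
pseudo_y = predict(teacher, unlabeled_x)        # teacher pseudo-labels the pool
student = fit_centroids(np.vstack([labeled_x, unlabeled_x]),
                        np.concatenate([labeled_y, pseudo_y]))
```

The data-efficiency claim is that the student, trained on the much larger pseudo-labeled set, needs far fewer ground-truth labels to reach a given quality than a model trained on labels alone.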
Posted Content
Scale MLPerf-0.6 models on Google TPU-v3 Pods.
Sameer Kumar,Victor Bitorff,Dehao Chen,Chiachen Chou,Blake A. Hechtman,HyoukJoong Lee,Naveen Kumar,Peter Mattson,Shibo Wang,Tao Wang,Yuanzhong Xu,Zongwei Zhou +11 more
TL;DR: This work discusses the optimizations and techniques, including choice of optimizer, spatial partitioning, and weight-update sharding, necessary to scale to 1024 TPU chips, and identifies properties of models that make scaling them challenging, such as limited data parallelism and unscaled weights.
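To make "spatial partitioning" concrete, here is a toy numpy sketch of the idea under simplifying assumptions (a 1-D "valid" convolution, one input split across two devices): each shard borrows a halo of boundary elements from its neighbor so that local convolutions tile the global result. The real technique partitions the spatial dimensions of image models across TPU cores.

```python
# Toy sketch of spatial partitioning with halo exchange; the 1-D "valid"
# convolution and two-device split are simplifying assumptions.
import numpy as np

def conv1d_valid(x, k):
    """Plain 'valid' 1-D correlation of signal x with kernel k."""
    n = len(k)
    return np.array([np.dot(x[i:i + n], k) for i in range(len(x) - n + 1)])

signal = np.arange(16, dtype=float)
kernel = np.array([0.25, 0.5, 0.25])
halo = len(kernel) - 1   # extra inputs a shard needs from its neighbor

# Split one input across two "devices"; the left shard borrows a halo of
# boundary elements from the right shard so the local outputs tile the
# global output exactly.
left_in, right_in = np.split(signal, 2)
left_out = conv1d_valid(np.concatenate([left_in, right_in[:halo]]), kernel)
right_out = conv1d_valid(right_in, kernel)

assert np.allclose(np.concatenate([left_out, right_out]),
                   conv1d_valid(signal, kernel))
```

This is what lets a single large image be spread over many cores when the batch size (and hence data parallelism) is limited.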
Posted Content
Automatic Cross-Replica Sharding of Weight Update in Data-Parallel Training
TL;DR: This paper presents an approach to automatically shard the weight-update computation across replicas with efficient communication primitives and data formatting, using static analysis and transformations on the training computation graph. The approach achieves substantial speedups on typical image and language models on Cloud TPUs and requires no change to model code.
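The core idea can be demonstrated with a toy numpy simulation under some assumptions (four replicas, a flat weight vector, plain SGD): rather than all-reducing full gradients and having every replica apply the full optimizer update redundantly, gradients are reduce-scattered so each replica updates only its own weight shard, and the updated shards are then all-gathered. This is a sketch of the general pattern, not the paper's actual graph transformation.

```python
# Toy simulation of cross-replica weight-update sharding; replica count,
# shapes, and plain SGD are assumptions for illustration.
import numpy as np

num_replicas, dim, lr = 4, 8, 0.1
rng = np.random.default_rng(0)
weights = rng.normal(size=dim)                  # weights replicated everywhere
grads = [rng.normal(size=dim) for _ in range(num_replicas)]  # per-replica grads

# Baseline: all-reduce full gradients; every replica applies the full update.
baseline = weights - lr * np.mean(grads, axis=0)

# Sharded: reduce-scatter gives each replica one averaged gradient shard;
# each replica updates only its weight shard, then shards are all-gathered.
shard = dim // num_replicas
updated_shards = []
for r in range(num_replicas):
    g_shard = np.mean([g[r * shard:(r + 1) * shard] for g in grads], axis=0)
    w_shard = weights[r * shard:(r + 1) * shard] - lr * g_shard
    updated_shards.append(w_shard)

# The all-gather of shards reproduces the baseline result exactly.
assert np.allclose(np.concatenate(updated_shards), baseline)
```

The payoff grows with the optimizer: for optimizers with large auxiliary state, each replica only needs to store and update the state for its own shard.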