Michiel Bacchiani

Proceedings Article

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models

TL;DR: In this article, the authors explore a variety of structural and optimization improvements to the Listen, Attend, and Spell (LAS) encoder-decoder architecture that significantly improve performance.

Proceedings Article

Generation of large-scale simulated utterances in virtual rooms to train deep-neural networks for far-field speech recognition in Google Home

Chanwoo Kim, +6 more

TL;DR: This paper describes the structure and application of an acoustic room simulator that generates large-scale simulated data for training deep neural networks for far-field speech recognition. Performance is evaluated using a factored complex Fast Fourier Transform (CFFT) acoustic model introduced in earlier work.

Journal Article

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition

Tara N. Sainath, +11 more

- 01 May 2017 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper introduces a neural network architecture that performs multichannel filtering in the first layer of the network, and shows that this network learns to be robust to varying target speaker direction of arrival, performing as well as a model that is given oracle knowledge of the true target speaker direction.

Posted Content

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Jonathan Shen, +90 more

- 21 Feb 2019 -

arXiv: Learning

TL;DR: This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the framework's capabilities.

Proceedings Article

Acoustic Modeling for Google Home

Bo Li, +19 more

TL;DR: The technical and system-building advances made to the Google Home multichannel speech recognition system, which was launched in November 2016, reduce WER by 8-28% relative to the current production system.
