Proceedings ArticleDOI

Model Compression Applied to Small-Footprint Keyword Spotting.

TLDR
Two ways to improve deep neural network (DNN) acoustic models for keyword spotting without increasing CPU usage are investigated: using low-rank weight matrices throughout the DNN, and distilling knowledge from an ensemble of much larger DNNs used only during training.
Abstract
Several consumer speech devices feature voice interfaces that perform on-device keyword spotting to initiate user interactions. Accurate on-device keyword spotting within a tight CPU budget is crucial for such devices. Motivated by this, we investigated two ways to improve deep neural network (DNN) acoustic models for keyword spotting without increasing CPU usage. First, we used low-rank weight matrices throughout the DNN. This allowed us to increase representational power by increasing the number of hidden nodes per layer without changing the total number of multiplications. Second, we used knowledge distilled from an ensemble of much larger DNNs used only during training. We systematically evaluated these two approaches on a massive corpus of far-field utterances. Alone, each technique improves performance; together, they give significant reductions in false alarms and misses without increasing CPU or memory usage.
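
As a rough illustration of the low-rank idea (a minimal PyTorch sketch with hypothetical layer sizes, not the paper's actual configuration): a full weight matrix is replaced by two narrow factors, so the per-frame multiplication count becomes rank × (inputs + outputs) and the hidden width can grow while compute stays fixed.

```python
import torch.nn as nn

class LowRankLinear(nn.Module):
    """A dense layer W (out x in) replaced by two factors V (out x r) and U (r x in).
    Multiplications per frame: r * (in + out) instead of in * out."""
    def __init__(self, in_dim, out_dim, rank):
        super().__init__()
        self.bottleneck = nn.Linear(in_dim, rank, bias=False)  # U
        self.expand = nn.Linear(rank, out_dim)                  # V

    def forward(self, x):
        return self.expand(self.bottleneck(x))

# Hypothetical sizes: a 1000->1000 full layer costs 1000*1000 = 1e6 mults, while a
# wider 2000->2000 low-rank layer with rank 250 costs 250*(2000+2000) = 1e6 mults.
full = nn.Linear(1000, 1000)
wide_low_rank = LowRankLinear(2000, 2000, rank=250)
```
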


Citations
Proceedings ArticleDOI

Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks.

TL;DR: A factored form of TDNNs (TDNN-F) is introduced which is structurally the same as a TDNN whose layers have been compressed via SVD, but is trained from a random start with one of the two factors of each matrix constrained to be semi-orthogonal.
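
A minimal NumPy sketch of the kind of periodic re-orthogonalizing update used to keep one factor close to semi-orthogonal (the step size, schedule, and matrix sizes here are illustrative, not the exact recipe from the paper):

```python
import numpy as np

def semi_orthogonal_step(M, alpha=0.125):
    """Nudge M (r x d, r <= d) toward satisfying M @ M.T == I.
    Applied every few training iterations in factored-TDNN style training."""
    P = M @ M.T
    return M - alpha * (P - np.eye(M.shape[0])) @ M

# Toy example: a 64 x 256 bottleneck factor that has drifted from semi-orthogonality.
rng = np.random.default_rng(0)
M = rng.standard_normal((64, 256)) * 0.1
for _ in range(50):
    M = semi_orthogonal_step(M)
print(np.linalg.norm(M @ M.T - np.eye(64)))  # should be small after the updates
```
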
Posted Content

Hello Edge: Keyword Spotting on Microcontrollers

TL;DR: It is shown that these neural network architectures can be optimized to fit within the memory and compute constraints of microcontrollers without sacrificing accuracy, and the depthwise separable convolutional neural network (DS-CNN) is explored and compared against other neural network architectures.
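
A minimal PyTorch sketch of the depthwise separable building block that a DS-CNN stacks (channel counts and kernel size are illustrative):

```python
import torch.nn as nn

class DSConvBlock(nn.Module):
    """Depthwise separable conv: a per-channel spatial conv followed by a 1x1
    pointwise conv, using far fewer weights/multiplies than a full 3x3 conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.post = nn.Sequential(nn.BatchNorm2d(out_ch), nn.ReLU())

    def forward(self, x):  # x: (batch, in_ch, time, freq), e.g. MFCC feature maps
        return self.post(self.pointwise(self.depthwise(x)))
```
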
PatentDOI

Convolutional recurrent neural networks for small-footprint keyword spotting

TL;DR: Systems and methods for creating and using convolutional recurrent neural networks (CRNNs) for small-footprint keyword spotting (KWS) are described; a CRNN model embodiment demonstrated high accuracy and robust performance across a wide range of environments.
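
A minimal PyTorch sketch of a CRNN of the general kind described, assuming log-mel/MFCC input features and illustrative layer sizes (not the patent's exact architecture):

```python
import torch.nn as nn

class SmallCRNN(nn.Module):
    """Convolutional front end over (time, freq) features, a recurrent layer over
    time, and per-utterance keyword logits. Layer sizes are illustrative."""
    def __init__(self, n_mels=40, n_keywords=2):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=(5, 5), stride=(2, 2))
        self.gru = nn.GRU(input_size=16 * ((n_mels - 5) // 2 + 1),
                          hidden_size=64, batch_first=True)
        self.out = nn.Linear(64, n_keywords)

    def forward(self, feats):                    # feats: (batch, time, n_mels)
        x = self.conv(feats.unsqueeze(1))        # (batch, 16, time', freq')
        x = x.permute(0, 2, 1, 3).flatten(2)     # (batch, time', 16 * freq')
        _, h = self.gru(x)                       # h: (1, batch, 64)
        return self.out(h[-1])                   # keyword logits
```
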
Proceedings ArticleDOI

Multi-task learning and Weighted Cross-entropy for DNN-based Keyword Spotting

TL;DR: It is shown that the combination of three techniques (LVCSR initialization, multi-task training, and weighted cross-entropy) gives the best results, with a significantly lower false alarm rate than LVCSR initialization alone across a wide range of miss rates.
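
A minimal PyTorch sketch of a weighted cross-entropy loss of the general kind described, assuming frame-level targets in which class index 1 marks keyword frames; the weight value is illustrative, not taken from the paper:

```python
import torch.nn.functional as F

def weighted_cross_entropy(logits, targets, keyword_weight=5.0):
    """Frame-level cross-entropy where frames labelled as the keyword class
    (assumed here to be class 1) count more than background frames."""
    per_frame = F.cross_entropy(logits, targets, reduction="none")
    weights = 1.0 + (keyword_weight - 1.0) * (targets == 1).float()
    return (weights * per_frame).mean()
```
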
Proceedings ArticleDOI

Compressed Time Delay Neural Network for Small-Footprint Keyword Spotting.

TL;DR: This paper proposes to apply singular value decomposition (SVD) to further reduce TDNN complexity, and results show that the full-rank TDNN achieves a 19.7% DET AUC reduction compared to a similar-size deep neural network baseline.
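
A minimal NumPy sketch of SVD-based compression of a single trained weight matrix (sizes and rank are illustrative); in a TDNN each affine layer would be factored this way and then fine-tuned:

```python
import numpy as np

def svd_compress(W, rank):
    """Replace a trained weight matrix W (out x in) with two factors whose product
    is the best rank-r approximation of W; the two smaller matmuls are cheaper
    whenever rank * (out + in) < out * in."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]        # (out x rank)
    B = Vt[:rank, :]                  # (rank x in)
    return A, B

W = np.random.randn(512, 512)
A, B = svd_compress(W, rank=64)
print(A.shape, B.shape, np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```
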
References
Posted Content

Distilling the Knowledge in a Neural Network

TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
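
A minimal PyTorch sketch of the temperature-softened distillation loss (the temperature and mixing weight are illustrative hyperparameters):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Match the teacher's temperature-softened class probabilities (KL term,
    scaled by T^2 as in the paper) while also fitting the hard labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```
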
Posted Content

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler, 22 Dec 2012
TL;DR: A novel per-dimension learning rate method for gradient descent called ADADELTA that dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent is presented.
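
A minimal NumPy sketch of the ADADELTA per-dimension update (the decay rate and epsilon are typical values; the usage example is illustrative):

```python
import numpy as np

def adadelta_update(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA step: keep decaying averages of squared gradients and of
    squared updates, and scale the step by their ratio (no global learning rate)."""
    state["Eg2"] = rho * state["Eg2"] + (1 - rho) * grad ** 2
    step = -np.sqrt(state["Edx2"] + eps) / np.sqrt(state["Eg2"] + eps) * grad
    state["Edx2"] = rho * state["Edx2"] + (1 - rho) * step ** 2
    return step

# Usage on a single parameter vector.
theta = np.zeros(3)
state = {"Eg2": np.zeros(3), "Edx2": np.zeros(3)}
grad = np.array([0.1, -0.2, 0.05])
theta += adadelta_update(grad, state)
```
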
Proceedings Article

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

TL;DR: In this paper, the authors exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation, while keeping the accuracy within 1% of the original model.
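
A minimal NumPy sketch of the underlying idea, approximating a bank of convolutional filters by a small shared basis plus a 1x1 recombination via truncated SVD (a generic illustration of filter redundancy, not the paper's exact decomposition schemes):

```python
import numpy as np

# A bank of 128 filters of shape (in_ch=64, 3, 3), flattened to a 128 x 576 matrix.
filters = np.random.randn(128, 64, 3, 3)
F2d = filters.reshape(128, -1)

# Keep only the top-k directions shared by the filters (k is illustrative).
U, s, Vt = np.linalg.svd(F2d, full_matrices=False)
k = 32
basis = Vt[:k].reshape(k, 64, 3, 3)   # k "basis" filters convolved with the input
mix = U[:, :k] * s[:k]                # 128 x k: 1x1 recombination into 128 outputs
print(np.linalg.norm(F2d - mix @ Vt[:k]) / np.linalg.norm(F2d))  # relative error
```
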
Proceedings ArticleDOI

Speeding up Convolutional Neural Networks with Low Rank Expansions

TL;DR: Two simple schemes for drastically speeding up convolutional neural networks are presented, achieved by exploiting cross-channel or filter redundancy to construct a low rank basis of filters that are rank-1 in the spatial domain.
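
A minimal NumPy sketch of the rank-1 spatial idea: a k x k filter that is (approximately) an outer product can be replaced by a vertical and a horizontal 1-D filter, turning one 2-D convolution into two cheap 1-D ones (the paper learns such separable bases jointly across channels rather than factoring each filter independently):

```python
import numpy as np

def separable_approx(kernel):
    """Approximate a k x k spatial filter by the outer product of a vertical (k x 1)
    and a horizontal (1 x k) filter via its leading singular vectors."""
    U, s, Vt = np.linalg.svd(kernel)
    col = U[:, 0] * np.sqrt(s[0])     # vertical filter
    row = Vt[0] * np.sqrt(s[0])       # horizontal filter
    return col, row

k = np.outer([1, 2, 1], [-1, 0, 1]).astype(float)   # a Sobel filter is exactly rank 1
col, row = separable_approx(k)
print(np.allclose(np.outer(col, row), k))            # True: no approximation error here
```
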
Posted Content

Do Deep Nets Really Need to be Deep?

Lei Jimmy Ba, +1 more, 21 Dec 2013
TL;DR: This paper showed that shallow feed-forward networks can learn the complex functions previously learned by deep networks and achieve accuracies previously only achievable with deep models, and in some cases the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model.
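
A minimal PyTorch sketch of mimic training in this style, where a shallow student regresses onto the teacher's pre-softmax logits with an L2 loss (layer sizes and optimizer settings are illustrative; the paper additionally uses a linear bottleneck to speed up training of the wide shallow layer):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Shallow student: one wide hidden layer mapping acoustic features to class logits.
student = nn.Sequential(nn.Linear(440, 8000), nn.ReLU(), nn.Linear(8000, 1000))
opt = torch.optim.SGD(student.parameters(), lr=1e-3)

def mimic_step(features, teacher_logits):
    """One training step: fit the teacher's logits (not the hard 0/1 labels)."""
    opt.zero_grad()
    loss = F.mse_loss(student(features), teacher_logits)
    loss.backward()
    opt.step()
    return loss.item()
```
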