
Petr Motlicek

Researcher at Idiap Research Institute

Publications - 209
Citations - 8177

Petr Motlicek is an academic researcher at the Idiap Research Institute. He has contributed to research topics including Computer science and Speaker recognition, has an h-index of 25, and has co-authored 179 publications receiving 7093 citations. His previous affiliations include Oregon Health & Science University and DuPont.

Papers
Proceedings Article

The Kaldi Speech Recognition Toolkit

TL;DR: The design of Kaldi, a free, open-source toolkit for speech recognition research, is described; the toolkit provides a speech recognition system based on finite-state transducers, together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.
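
As a rough illustration of the finite-state idea behind such toolkits (not Kaldi's actual C++/OpenFst API), the sketch below runs a best-path search over a tiny hand-built weighted transducer in the tropical semiring; the states, labels, and weights are invented for the example.

```python
# Toy best-path decoding over a small weighted FST. Weights behave like
# negative log-probabilities (tropical semiring), so the best path is the
# one with the minimum total weight. This is only a conceptual sketch;
# Kaldi builds its decoding graphs with OpenFst and decodes in optimized C++.
import heapq

# Invented transducer: state -> list of (input_label, output_word, weight, next_state)
ARCS = {
    0: [("h", "", 0.5, 1), ("k", "", 1.0, 3)],
    1: [("i", "hi", 0.25, 2)],
    3: [("i", "ki", 0.75, 2)],
}
FINALS = {2: 0.0}   # final state -> final weight
START = 0

def best_path(arcs, finals, start):
    """Dijkstra-style search for the lowest-weight accepting path."""
    heap = [(0.0, start, [])]            # (accumulated weight, state, output words)
    visited = set()
    while heap:
        cost, state, words = heapq.heappop(heap)
        if state in finals:              # single final state here, so first pop wins
            return cost + finals[state], words
        if state in visited:
            continue
        visited.add(state)
        for _, out_word, weight, next_state in arcs.get(state, []):
            new_words = words + [out_word] if out_word else words
            heapq.heappush(heap, (cost + weight, next_state, new_words))
    return None

print(best_path(ARCS, FINALS, START))    # -> (0.75, ['hi'])
```

In Kaldi itself the decoding graph is the composition of HMM, context, lexicon, and grammar transducers (HCLG); the toy above only shows the best-path step over an already-built graph.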
Proceedings Article

Generating exact lattices in the WFST framework

TL;DR: A lattice generation method that is exact, i.e. it satisfies all the natural properties the authors would want from a lattice of alternative transcriptions of an utterance, and does not introduce substantial overhead above one-best decoding.
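
To make the terminology concrete (a toy only, not the lattice-generation algorithm of the paper), the sketch below treats a lattice as a small weighted DAG whose complete paths are the alternative transcriptions of one utterance; the words and costs are invented, and the lowest-cost path is what one-best decoding would return.

```python
# Invented word lattice: a DAG whose complete paths are alternative
# transcriptions of a single utterance, each with an accumulated cost
# (e.g. a negative log-probability). Node 3 is the only end node.
LATTICE = {
    0: [("the", 0.1, 1), ("a", 0.8, 1)],
    1: [("cat", 0.4, 2), ("cap", 0.9, 2)],
    2: [("sat", 0.2, 3)],
}
END = 3

def all_paths(node, words=(), cost=0.0):
    """Yield (total cost, transcription) for every complete path."""
    if node == END:
        yield cost, " ".join(words)
        return
    for word, weight, next_node in LATTICE[node]:
        yield from all_paths(next_node, words + (word,), cost + weight)

# Rank the alternatives; the cheapest one is the one-best transcription.
for rank, (cost, text) in enumerate(sorted(all_paths(0))):
    marker = "  <- one-best" if rank == 0 else ""
    print(f"{cost:.1f}  {text}{marker}")
```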
Proceedings Article

Multilingual deep neural network based acoustic modeling for rapid language adaptation

TL;DR: The studies reveal that cross-lingual acoustic model transfer through multilingual DNNs is superior to unsupervised RBM pre-training and greedy layer-wise supervised training, and that KL-HMM based decoding consistently outperforms conventional hybrid decoding, especially in low-resource scenarios.
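
A minimal PyTorch sketch of the shared-hidden-layer idea behind multilingual DNN acoustic models; the layer sizes, feature dimension, language names, and target counts are assumptions, and the cross-lingual transfer and KL-HMM decoding compared in the paper are not reproduced here.

```python
# Multilingual DNN sketch (PyTorch): hidden layers are shared across
# languages, while each language keeps its own output layer over its
# context-dependent targets. All sizes and language names are illustrative.
import torch
import torch.nn as nn

class MultilingualDNN(nn.Module):
    def __init__(self, feat_dim, hidden_dim, num_layers, targets_per_lang):
        super().__init__()
        layers, in_dim = [], feat_dim
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.Sigmoid()]
            in_dim = hidden_dim
        self.shared = nn.Sequential(*layers)        # shared across languages
        self.heads = nn.ModuleDict({                # language-specific outputs
            lang: nn.Linear(hidden_dim, n_targets)
            for lang, n_targets in targets_per_lang.items()
        })

    def forward(self, feats, lang):
        # Logits over the chosen language's acoustic targets.
        return self.heads[lang](self.shared(feats))

# Toy usage: 8 spliced feature vectors scored by the hypothetical "en" head.
model = MultilingualDNN(feat_dim=440, hidden_dim=1024, num_layers=5,
                        targets_per_lang={"en": 3000, "de": 2500, "vi": 1800})
print(model(torch.randn(8, 440), lang="en").shape)   # torch.Size([8, 3000])
```

Cross-lingual transfer to a new language then typically amounts to reusing the shared layers and training a fresh output layer on the limited target-language data.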
Proceedings Article

Deep Neural Networks for Multiple Speaker Detection and Localization

TL;DR: This paper proposes a likelihood-based encoding of the network output, which naturally allows the detection of an arbitrary number of sources, and investigates the use of sub-band cross-correlation information as features for better localization in sound mixtures.
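
The sketch below shows one way to realise a likelihood-style output encoding for an arbitrary number of sources (a hedged reading of the idea, not the paper's exact formulation): each candidate azimuth gets a target value shaped like a Gaussian centred on the true source directions, and sources are recovered by peak picking. The grid resolution, spread, and threshold are assumptions.

```python
# Likelihood-style output encoding on an azimuth grid: targets peak at the
# true source directions and decay smoothly around them, so any number of
# sources can be encoded; decoding picks local maxima above a threshold.
# Grid step, sigma, and threshold are illustrative assumptions.
import numpy as np

AZIMUTHS = np.arange(0, 360, 5)                      # candidate directions (degrees)

def angular_diff(a, b):
    """Smallest absolute angle difference, in degrees."""
    d = np.abs(a - b) % 360
    return np.minimum(d, 360 - d)

def encode(source_azimuths, sigma=8.0):
    """Target vector: max over Gaussian-shaped bumps, one per source."""
    if not source_azimuths:
        return np.zeros(len(AZIMUTHS))
    bumps = [np.exp(-angular_diff(AZIMUTHS, s) ** 2 / (2 * sigma ** 2))
             for s in source_azimuths]
    return np.max(bumps, axis=0)

def decode(output, threshold=0.5):
    """Recover source directions as local maxima above the threshold."""
    peaks = []
    for i, value in enumerate(output):
        left = output[i - 1]                         # wraps around at 0 degrees
        right = output[(i + 1) % len(output)]
        if value >= threshold and value >= left and value >= right:
            peaks.append(int(AZIMUTHS[i]))
    return peaks

print(decode(encode([40, 200])))                     # -> [40, 200]
```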
Proceedings Article

End-to-End Accented Speech Recognition

TL;DR: This work explores the use of multi-task training and accent embeddings in the context of end-to-end ASR trained with the connectionist temporal classification loss, and reports relative reductions in word error rate.
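
A minimal PyTorch sketch of the multi-task part of such a setup (the accent-embedding input variant is not shown): a shared encoder feeds a CTC branch over characters and an auxiliary accent classifier, and the two losses are combined with a weight. Model sizes, the vocabulary, the number of accents, and the 0.3 weight are assumptions for illustration.

```python
# Multi-task accented ASR sketch: shared encoder, CTC head over characters,
# and an auxiliary utterance-level accent classifier. All dimensions, the
# vocabulary size, the number of accents, and the loss weight are assumed.
import torch
import torch.nn as nn

class MultiTaskCTCModel(nn.Module):
    def __init__(self, feat_dim=80, hidden=256, vocab_size=30, num_accents=8):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=3,
                               batch_first=True, bidirectional=True)
        self.ctc_head = nn.Linear(2 * hidden, vocab_size)       # index 0 = blank
        self.accent_head = nn.Linear(2 * hidden, num_accents)   # auxiliary task

    def forward(self, feats):
        enc, _ = self.encoder(feats)                  # (batch, time, 2*hidden)
        log_probs = self.ctc_head(enc).log_softmax(dim=-1)
        accent_logits = self.accent_head(enc.mean(dim=1))        # utterance level
        return log_probs, accent_logits

model = MultiTaskCTCModel()
ctc_loss = nn.CTCLoss(blank=0)
ce_loss = nn.CrossEntropyLoss()

feats = torch.randn(4, 200, 80)                       # 4 utterances, 200 frames each
targets = torch.randint(1, 30, (4, 25))               # character ids (0 is blank)
input_lengths = torch.full((4,), 200, dtype=torch.long)
target_lengths = torch.full((4,), 25, dtype=torch.long)
accent_labels = torch.randint(0, 8, (4,))

log_probs, accent_logits = model(feats)
loss = ctc_loss(log_probs.transpose(0, 1),            # CTC expects (time, batch, vocab)
                targets, input_lengths, target_lengths)
loss = loss + 0.3 * ce_loss(accent_logits, accent_labels)
loss.backward()
```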