Gilles Boulianne

Researcher at École de technologie supérieure

Publications -  77
Citations -  8385

Gilles Boulianne is an academic researcher from École de technologie supérieure. The author has contributed to research in topics: Speaker recognition & Word error rate. The author has an h-index of 19 and has co-authored 73 publications receiving 7592 citations. Previous affiliations of Gilles Boulianne include Institut national de la recherche scientifique.

Papers
Proceedings Article

The Kaldi Speech Recognition Toolkit

TL;DR: Describes the design of Kaldi, a free, open-source toolkit for speech recognition research that provides a speech recognition system based on finite-state automata, together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.
Journal Article

Joint Factor Analysis Versus Eigenchannels in Speaker Recognition

TL;DR: Shows how the two approaches to the problem of session variability in Gaussian mixture model (GMM)-based speaker verification, eigenchannels and joint factor analysis, can be implemented using essentially the same software at all stages except for the enrollment of target speakers.
Journal Article

Eigenvoice modeling with sparse training data

TL;DR: This work derives an exact solution to the problem of maximum likelihood estimation of the supervector covariance matrix used in extended MAP (or EMAP) speaker adaptation and shows how it can be regarded as a new method of eigenvoice estimation.
Journal Article

Speaker and Session Variability in GMM-Based Speaker Verification

TL;DR: Presents a corpus-based approach to speaker verification in which maximum-likelihood II criteria are used to train a large-scale generative model of speaker and session variability, called joint factor analysis.
Proceedings Article

Generating exact lattices in the WFST framework

TL;DR: Describes a lattice generation method that is exact, i.e., it satisfies all the natural properties one would want from a lattice of alternative transcriptions of an utterance, and that does not introduce substantial overhead above one-best decoding.