Daniel Garcia-Romero

Researcher at Johns Hopkins University

Publications -  73
Citations -  8251

Daniel Garcia-Romero is an academic researcher at Johns Hopkins University. The author has contributed to research on the topics of speaker recognition and speaker diarisation. The author has an h-index of 30 and has co-authored 71 publications receiving 6150 citations. Previous affiliations of Daniel Garcia-Romero include the Technical University of Madrid and the Autonomous University of Madrid.

Papers
Proceedings Article

X-Vectors: Robust DNN Embeddings for Speaker Recognition

TL;DR: This paper uses data augmentation, consisting of added noise and reverberation, as an inexpensive method to multiply the amount of training data and improve robustness of deep neural network embeddings for speaker recognition.
Proceedings Article

Analysis of i-vector Length Normalization in Speaker Recognition Systems

TL;DR: The proposed approach deals with the non-Gaussian behavior of i-vectors by performing a simple length normalization, which allows the use of probabilistic models with Gaussian assumptions that perform on par with more complicated systems based on heavy-tailed assumptions.
Proceedings Article

Deep Neural Network Embeddings for Text-Independent Speaker Verification

TL;DR: It is found that the embeddings outperform i-vectors for short speech segments and are competitive on long duration test conditions, which are the best results reported for speaker-discriminative neural networks when trained and tested on publicly available corpora.
Proceedings Article

Deep neural network-based speaker embeddings for end-to-end speaker verification

TL;DR: It is shown that, given a large number of training speakers, the proposed system outperforms an i-vector baseline in equal error rate (EER) and at low miss rates.
Proceedings Article

Speaker Recognition for Multi-speaker Conversations Using X-vectors

TL;DR: It is found that diarization substantially reduces error rate when there are multiple speakers, while maintaining excellent performance on single-speaker recordings.