Author

Eduardo Lleida

Other affiliations: Bell Labs, Aalborg University, ETSI ...
Bio: Eduardo Lleida is an academic researcher at the University of Zaragoza. The author has contributed to research on speaker recognition and hidden Markov models. The author has an h-index of 23 and has co-authored 181 publications that have received 2172 citations. Previous affiliations of Eduardo Lleida include Bell Labs and Aalborg University.


Papers
Proceedings Article
01 Jan 1993
TL;DR: The phonetic content of Albayzin, a spoken database for Spanish designed for speech recognition purposes, and the phonetic and statistical criteria for the final constitution of the database are discussed.
Abstract: This paper describes the phonetic content of Albayzin, a spoken database for Spanish designed for speech recognition purposes. A statistical study of a large sample of spontaneous speech is presented, and the phonetic and statistical criteria for the final constitution of the database are discussed. Finally, the contents of the phonetic database are analyzed.

159 citations

Proceedings ArticleDOI
08 Dec 2011
TL;DR: Describes a system for detecting spoofing attacks on speaker verification systems, shows how speaker verification performance degrades in the presence of this kind of attack, and shows how spoofing detection can mitigate that degradation.
Abstract: In this paper, we describe a system for detecting spoofing attacks on speaker verification systems. By spoofing we mean impersonating a legitimate user. We focus on detecting two types of low-technology spoofs. On the one hand, we try to detect whether the test segment is a far-field microphone recording of the victim that has been replayed on a telephone handset using a loudspeaker. On the other hand, we want to determine whether the recording has been created by cutting and pasting short recordings to forge the sentence requested by a text-dependent system. These kinds of attack are of critical importance for security applications like access to bank accounts. To detect the first type of spoof, we extract several acoustic features from the speech signal. Spoof and non-spoof segments are classified using a support vector machine (SVM). Cut-and-paste forgeries are detected by comparing the pitch and MFCC contours of the enrollment and test segments using dynamic time warping (DTW). We performed experiments using two databases created for this purpose. They include signals from land line and GSM telephone channels of 20 different speakers. We present performance results separately for each spoofing detection system and for the fusion of both. We have achieved error rates under 10% for all the conditions evaluated. We show the degradation in speaker verification performance in the presence of this kind of attack and how to use the spoofing detection to mitigate that degradation.
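The abstract above names dynamic time warping (DTW) as the way pitch and MFCC contours are compared. As a minimal sketch of the general technique only, and not the authors' implementation, the following Python computes a textbook DTW distance between two one-dimensional contours. The function name `dtw_distance`, the absolute-difference local cost, and the toy pitch values are all assumptions made for illustration; the abstract does not say how the resulting distance is turned into a spoofing decision, so no threshold is shown.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (hypothetical helper).

    Local cost is the absolute difference between samples; this choice is an
    assumption for illustration, not taken from the paper.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Toy pitch contours in Hz (invented values, for illustration only).
enrollment_pitch = [100.0, 110.0, 120.0, 110.0, 100.0]
stretched_pitch = [100.0, 100.0, 110.0, 120.0, 120.0, 110.0, 100.0]
shifted_pitch = [130.0, 140.0, 150.0, 140.0, 130.0]

same_shape = dtw_distance(enrollment_pitch, stretched_pitch)
different_level = dtw_distance(enrollment_pitch, shifted_pitch)
```

In this toy case the time-stretched contour aligns with the enrollment contour at zero cost, while the contour shifted in pitch yields a strictly positive distance, which is the property that makes DTW suitable for comparing contours spoken at different rates.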

121 citations

Book ChapterDOI
08 Mar 2011
TL;DR: Describes a system for detecting spoofing attacks on speaker verification systems, shows how speaker verification performance degrades in the presence of this kind of attack, and shows how spoofing detection can mitigate that degradation.
Abstract: In this paper, we describe a system for detecting spoofing attacks on speaker verification systems. By spoofing we mean an attempt to impersonate a legitimate user. We focus on detecting if the test segment is a far-field microphone recording of the victim. This kind of attack is of critical importance in security applications like access to bank accounts. We present experiments on databases created for this purpose, including land line and GSM telephone channels. We present spoofing detection results with EER between 0% and 9% depending on the condition. We show the degradation in speaker verification performance in the presence of this kind of attack and how to use the spoofing detection to mitigate that degradation.

110 citations

Journal ArticleDOI
TL;DR: The results indicate that ASR and PV systems configured from speech utterances taken from the impaired speech domain can provide adequate performance, similar to the experts' agreement rate, for supporting the presented CASLT applications.

105 citations

Book ChapterDOI
01 Jan 2012
TL;DR: A set of experiments on pathological voice detection over the Saarbrucken Voice Database (SVD) is presented, using the MultiFocal toolkit for discriminative calibration and fusion; a comparison with another database shows that the SVD is much more challenging.
Abstract: The paper presents a set of experiments on pathological voice detection over the Saarbrucken Voice Database (SVD) using the MultiFocal toolkit for discriminative calibration and fusion. The SVD is freely available online and contains a collection of voice recordings of different pathologies, both functional and organic. A generative Gaussian mixture model trained with mel-frequency cepstral coefficients, harmonics-to-noise ratio, normalized noise energy and glottal-to-noise excitation ratio is used as the classifier. Scores are calibrated to increase performance at the desired operating point. Finally, the fusion of different recordings for each speaker, in which the vowels /a/, /i/ and /u/ are pronounced with normal, low, high, and low-high-low intonations, offers a great increase in performance. Results are compared with the Massachusetts Eye and Ear Infirmary (MEEI) database, which makes it possible to see that the SVD is much more challenging.

85 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Covers probability distributions, linear models for regression and classification, and combining models, among other machine learning topics.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
01 Oct 1980

1,565 citations

Journal ArticleDOI
TL;DR: This paper provides a comprehensive survey of the most important aspects of DL, including enhancements recently added to the field, and presents challenges and suggested solutions to help researchers understand the existing research gaps.
Abstract: In the last few years, the deep learning (DL) computing paradigm has been deemed the Gold Standard in the machine learning (ML) community. Moreover, it has gradually become the most widely used computational approach in the field of ML, achieving outstanding results on several complex cognitive tasks, matching or even beating human performance. One of the benefits of DL is the ability to learn from massive amounts of data. The DL field has grown fast in the last few years and has been used extensively to address a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, e.g., cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others. Although several works have reviewed the state of the art in DL, each of them tackled only one aspect of DL, which leads to an overall lack of knowledge about it. Therefore, in this contribution, we propose a more holistic approach in order to provide a more suitable starting point from which to develop a full understanding of DL. Specifically, this review attempts to provide a more comprehensive survey of the most important aspects of DL, including those enhancements recently added to the field. In particular, this paper outlines the importance of DL and presents the types of DL techniques and networks. It then presents convolutional neural networks (CNNs), which are the most utilized DL network type, and describes the development of CNN architectures together with their main features, e.g., starting with the AlexNet network and closing with the High-Resolution network (HR.Net). Finally, we present the challenges and suggested solutions to help researchers understand the existing research gaps. This is followed by a list of the major DL applications. Computational tools including FPGA, GPU, and CPU are summarized along with a description of their influence on DL. The paper ends with the evolution matrix, benchmark datasets, and summary and conclusion.

1,084 citations

Journal ArticleDOI

1,008 citations

Book
14 Jan 2010
TL;DR: In this article, the authors present a glossary for language analysis and understanding in the context of spoken language input and output technologies, and evaluate their work with a set of annotated corpora.
Abstract:
1. Spoken language input: Ronald Cole, Victor Zue, Wayne Ward, Melvyn J. Hunt, Richard M. Stern, Renato De Mori, Fabio Brugnara, Salim Roukos, Sadaoki Furui and Patti Price
2. Written language input: Joseph Mariani, Sargur N. Srihari, Rohini K. Srihari, Richard G. Casey, Abdel Belaid, Claudie Faure, Eric Lecolinet, Isabelle Guyo, Colin Warwick and Rejean Plamondon
3. Language analysis and understanding: Annie Zaenen, Hans Uszkoreit, Fred Karlsson, Lauri Karttunen, Antonio Sanfilippo, Stephen F. Pulman, Fernando Pereira and Ted Briscoe
4. Language generation: Hans Uszkoreit, Eduard Hovy, Gertjan van Noord, Gunter Neumann and John Bateman
5. Spoken output technologies: Ronald Cole, Yoshinori Sagisaka, Christophe d'Alessandro, Jean-Sylvain Lienard, Richard Sproat, Kathleen R. McKeown and Johanna D. Moore
6. Discourse and dialogue: Hans Uszkoreit, Barbara Grosz, Donia Scott, Hans Kamp, Phil Cohe and Egidio Giachin
7. Document processing: Annie Zaenen, Per-Kristian Halvorsen, Donna Harman, Peter Schauble, Alan Smeaton, Paul Jacobs, Karen Sparck Jones, Robert Dale, Richard H. Wojcik and James E. Hoard
8. Multilinguality: Annie Zaenen, Martin Kay, Christian Boitet, Christian Fluhr, Alexander Waibel, Yeshwant K. Muthusamy and A. Lawrence Spitz
9. Multimodality: Joseph Mariani, James L. Flanagan, Gerard Ligozat, Wolfgang Wahlster, Yacine Bellik, Alan J. Goldschen, Christian Benoit, Dominic W. Massaro and Michael M. Cohen
10. Transmission and storage: Victor Zue, Isabel Trancoso, Bishnu S. Atal, Nikil S. Jayant and Dirk Van Compernolle
11. Mathematical methods: Ronald Cole, Hans Uszkoreit, Steve Levinson, John Makhoul, Aravind Joshi, Herve Bourlard, Nelson Morgan, Ronald M. Kaplan and John Bridle
12. Language resources: Ronald Cole, Antonio Zampolli, Eva Ejerhed, Ken Church, Lori Lamel, Ralph Grishman, Nicoletta Calzolari, Christian Galinski and Gerhard Budin
13. Evaluation: Joseph Mariani, Lynette Hirschman, Henry S. Thompson, Beth Sundheim, John Hutchins, Ezra Black, Margaret King, David S. Pallett, Adrian Fourcin, Louis C. W. Pols, Sharon Oviatt, Herman J. M. Steeneken and Junichi Kanai
Glossary. Citation index. Index.

569 citations