Showing papers by "Pavel Korshunov" published in 2017

Open Access

Proceedings Article

Continuously Reproducing Toolchains in Pattern Recognition and Machine Learning Experiments

André Anjos¹, Manuel Günther², Tiago de Freitas Pereira¹, Pavel Korshunov¹, Amir H. Mohammadi³, Sébastien Marcel¹

Idiap Research Institute¹, University of Colorado Colorado Springs², Shahid Sadoughi University of Medical Sciences and Health Services³

17 Jun 2017

TL;DR: This paper focuses on a specific use case, face recognition, and describes in detail how to make recognition experiments reproducible in practice. It emphasizes that reproducible research work should be repeatable, shareable, extensible, and stable.

Abstract: Pattern recognition and machine learning research work often contains experimental results on real-world data, which corroborate hypotheses and provide a canvas for the development and comparison of new ideas. Results, in this context, are typically summarized as a set of tables and figures, allowing the comparison of various methods and highlighting the advantages of the proposed ideas. Unfortunately, result reproducibility is often an overlooked feature of original research publications, competitions, or benchmark evaluations. The main reason for such a gap is the complexity of developing the software associated with these reports. Software frameworks are difficult to install, maintain, and distribute, while scientific experiments often consist of many steps and parameters that are difficult to report. The rising complexity of research challenges makes it even more difficult to reproduce experiments and results. In this paper, we emphasize that a reproducible research work should be repeatable, shareable, extensible, and stable, and discuss important lessons we learned in creating, distributing, and maintaining software and data for reproducible research in pattern recognition and machine learning. We focus on a specific use case of face recognition and describe in detail how we can make the recognition experiments reproducible in practice.

51 citations

Journal Article

Long-Term Spectral Statistics for Voice Presentation Attack Detection

Hannah Muckenhirn¹, Pavel Korshunov², Mathew Magimai-Doss², Sébastien Marcel²

École Polytechnique Fédérale de Lausanne¹, Idiap Research Institute²

01 Nov 2017 - IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: Investigations on the ASVspoof 2015 challenge database and the AVspoof database show that the proposed approach with a linear discriminative classifier yields a better system, irrespective of whether the spoofed signal is replayed to the microphone or is directly injected into the system software process.

Abstract: Automatic speaker verification systems can be spoofed through recorded, synthetic, or voice-converted speech of target speakers. To make these systems practically viable, the detection of such attacks, referred to as presentation attacks, is of paramount interest. In that direction, this paper investigates two aspects: 1) a novel approach to detect presentation attacks where, unlike conventional approaches, no assumptions related to speech signal modeling are made; rather, the attacks are detected by computing first-order and second-order spectral statistics and feeding them to a classifier, and 2) generalization of the presentation attack detection systems across databases. Our investigations on the ASVspoof 2015 challenge database and the AVspoof database show that, when compared to the approaches based on conventional short-term spectral features, the proposed approach with a linear discriminative classifier yields a better system, irrespective of whether the spoofed signal is replayed to the microphone or is directly injected into the system software process. Cross-database investigations show that neither the short-term spectral processing-based approaches nor the proposed approach yield systems which are able to generalize across databases or methods of attack, thus revealing the difficulty of the problem and the need for further resources and research.
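
As a rough illustration of the first aspect, the sketch below computes per-frequency-bin statistics over all frames of a signal. It assumes the first-order and second-order statistics are the mean and standard deviation of the log-magnitude spectrum; the frame length, hop size, magnitude floor, and function names are hypothetical choices for this example and are not taken from the paper.

```python
import cmath
import math


def log_magnitude_spectrum(frame):
    """Naive DFT of one frame; returns log magnitudes for bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        # Floor the magnitude so that log() is defined for empty bins
        # (the floor value is an assumption of this sketch).
        spectrum.append(math.log(max(abs(coeff), 1e-10)))
    return spectrum


def long_term_spectral_statistics(signal, frame_len=64, hop=32):
    """Concatenate per-bin mean and standard deviation over all frames."""
    frames = [
        signal[i:i + frame_len]
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    spectra = [log_magnitude_spectrum(f) for f in frames]
    n_bins = len(spectra[0])
    means = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_bins)]
    stds = [
        math.sqrt(sum((s[k] - means[k]) ** 2 for s in spectra) / len(spectra))
        for k in range(n_bins)
    ]
    # One fixed-length vector per utterance, ready for a classifier.
    return means + stds


# Example: a stationary tone gives near-zero standard deviation in every bin.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(256)]
features = long_term_spectral_statistics(tone)
```

The resulting vector has a fixed length regardless of utterance duration, which is what allows it to be fed directly to a simple classifier such as a linear discriminative one.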

45 citations

Journal Article

Impact of Score Fusion on Voice Biometrics and Presentation Attack Detection in Cross-Database Evaluations

Pavel Korshunov¹, Sébastien Marcel¹

Idiap Research Institute¹

07 Apr 2017 - IEEE Journal of Selected Topics in Signal Processing

TL;DR: This paper conducts an extensive study of eight state-of-the-art audio-based presentation attack detection methods and evaluates their ability to detect known and unknown attacks using two major publicly available speaker databases with spoofing attacks: AVspoof and ASVspoof.

Abstract: Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks, limiting their wide deployment. Therefore, it is important to develop mechanisms that can detect such attacks, and it is equally important for these mechanisms to be seamlessly integrated into existing ASV systems for practical and attack-resistant solutions. To be practical, however, an attack detection mechanism should (i) have high accuracy, (ii) generalize well to different attacks, and (iii) be simple and efficient. Several audio-based presentation attack detection (PAD) methods have been proposed recently, but their evaluation was usually done on a single, often obscure, database with a limited number of attacks. Therefore, in this paper, we conduct an extensive study of eight state-of-the-art PAD methods and evaluate their ability to detect known and unknown attacks (e.g., in a cross-database scenario) using two major publicly available speaker databases with spoofing attacks: AVspoof and ASVspoof. We investigate whether combining several PAD systems via score fusion can improve attack detection accuracy. We also study the impact of fusing PAD systems (via parallel and cascading schemes) with two i-vector and inter-session variability based ASV systems on the overall performance in both bona fide (no attacks) and spoof scenarios. The evaluation results question the efficiency and practicality of the existing PAD systems, especially when comparing results for individual databases and cross-database data. Fusing several PAD systems can lead to a slightly improved performance; however, how to select which systems to fuse remains an open question. Joint ASV-PAD systems show a significantly increased resistance to the attacks at the expense of slightly degraded performance for bona fide scenarios.
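
The abstract does not specify how scores are fused, so the following is only a minimal sketch of what score fusion and the parallel and cascading schemes could look like. It assumes higher scores mean "more likely bona fide" (for PAD) or "more likely the claimed speaker" (for ASV); the mean fusion rule, the weighted-sum rule, the thresholds, and all function names are hypothetical.

```python
def fuse_mean(pad_scores):
    """Fuse several PAD system scores by simple averaging (assumed rule)."""
    if not pad_scores:
        raise ValueError("need at least one score")
    return sum(pad_scores) / len(pad_scores)


def cascade_decision(pad_score, asv_score, pad_threshold, asv_threshold):
    """Cascading scheme: PAD acts as a gate, ASV decides only if PAD passes."""
    if pad_score < pad_threshold:
        return False  # rejected as a presentation attack
    return asv_score >= asv_threshold


def parallel_decision(pad_score, asv_score, weight, threshold):
    """Parallel scheme: both scores are combined, then thresholded once."""
    fused = weight * pad_score + (1.0 - weight) * asv_score
    return fused >= threshold


# Example: a trial that looks like the target speaker but fails the PAD check.
pad = fuse_mean([0.2, 0.1, 0.3])
accepted_cascade = cascade_decision(pad, asv_score=0.9,
                                    pad_threshold=0.5, asv_threshold=0.5)
accepted_parallel = parallel_decision(pad, asv_score=0.9,
                                      weight=0.5, threshold=0.4)
```

In this toy trial the cascade rejects the sample outright, while the parallel rule can still accept it because a strong ASV score compensates for a weak PAD score; this is one way the two schemes can trade attack resistance against bona fide performance.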

29 citations

Proceedings Article

On the Generalization of Fused Systems in Voice Presentation Attack Detection

André R. Gonçalves¹, Ricardo Paranhos Velloso Violato, Pavel Korshunov², Sébastien Marcel², Flávio Olmos Simões

Lawrence Livermore National Laboratory¹, Idiap Research Institute²

01 Sep 2017

TL;DR: Extended evaluation results show that the fusion-based system, although successful within the scope of the evaluation, lacks the ability to accurately discriminate genuine data from attacks in unknown conditions, which raises the question of how to assess the generalization ability of attack detection systems in practical application scenarios.

Abstract: This paper describes presentation attack detection systems developed for the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017). The submitted systems, using calibration and score fusion techniques, combine different sub-systems (up to 18), which are based on eight state-of-the-art features and rely on Gaussian mixture models and feed-forward neural network classifiers. The systems achieved the top five performances in the competition. We present the proposed systems and analyze the calibration and fusion strategies employed. To assess the systems' generalization capacity, we evaluated them on an unrelated, larger database recorded in Portuguese, which differs from the English used in the competition. These extended evaluation results show that the fusion-based system, although successful within the scope of the evaluation, lacks the ability to accurately discriminate genuine data from attacks in unknown conditions, which raises the question of how to assess the generalization ability of attack detection systems in practical application scenarios.

15 citations

Book Chapter

Presentation attack detection in voice biometrics

Pavel Korshunov¹, Sébastien Marcel¹

Idiap Research Institute¹

30 Nov 2017

TL;DR: This chapter discusses vulnerabilities of automatic speaker verification (ASV) systems to presentation attacks (PAs), presents different state-of-the-art PAD systems, gives insights into their performance, and discusses the integration of PAD and ASV systems.

Abstract: In this chapter, however, we focus on PAD in voice biometrics, i.e., automatic speaker verification (ASV) systems. We discuss vulnerabilities of these systems to presentation attacks (PAs), present different state-of-the-art PAD systems, give insights into their performance, and discuss the integration of PAD and ASV systems.

3 citations