Proceedings Article
Factor analysis methods for joint speaker verification and spoof detection
Dhanush B K, Suparna S, Aarthy R, Likhita C, Shashank D, Harish H, Sriram Ganapathy
pp. 5385-5389
TL;DR: This paper develops a joint modelling approach that detects spoofing attacks while also performing speaker verification, and proposes a factor modelling approach in which the spoof variability subspace and the speaker variability subspace are jointly trained.
Abstract:
The performance of a speaker verification system is severely degraded by spoofing attacks generated from artificial speech synthesizers. Recently, several approaches have been proposed for classifying natural and synthetic speech (spoof detection) which can be used in conjunction with a speaker verification system. In this paper, we attempt to develop a joint modelling approach which can detect the presence of spoofing attacks while also performing the speaker verification task. We propose a factor modelling approach where the spoof variability subspace and the speaker variability subspace are jointly trained. The lower dimensional projections in these subspaces are used for speaker verification as well as spoof detection tasks. We also investigate the benefits of linear discriminant analysis (LDA), widely used in speaker recognition, for the spoof detection task. Several experiments are performed using the speaker and spoofing (SAS) database. For speaker verification, we compare the performance of the proposed method with a baseline method of fusing a conventional speaker verification system and a spoof detection system. In these experiments, the proposed approach provides substantial improvements for spoof detection (relative improvements of 20% in EER over the baseline) as well as speaker verification under spoofing conditions (relative improvements of 40% in EER over the baseline).
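The abstract reports its gains as relative improvements in equal error rate (EER). As a rough illustration of how such figures are usually computed, the sketch below estimates an EER from two score lists and then a relative improvement. It is a minimal sketch, not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement`, the simple threshold sweep, and all numeric values are assumptions made for illustration only.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) by sweeping a threshold over all observed scores.

    Hypothetical helper: higher scores are assumed to mean "more likely target".
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scoring below the threshold.
    fnr = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials scoring at or above the threshold.
    fpr = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(fnr - fpr))
    return (fnr[idx] + fpr[idx]) / 2.0

def relative_improvement(baseline_eer, proposed_eer):
    """Relative EER reduction of a proposed system over a baseline, as a fraction."""
    return (baseline_eer - proposed_eer) / baseline_eer

# Illustrative values only (not taken from the paper): a drop from 10% to 8% EER
# corresponds to a relative improvement of about 20%.
example_gain = relative_improvement(0.10, 0.08)
```

With this definition, a relative improvement of 20% means the proposed system's EER is 0.8 times the baseline EER, not that the EER fell by 20 percentage points.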
Citations
Proceedings Article
ResNet and Model Fusion for Automatic Spoofing Detection.
TL;DR: Inspired by the success of ResNet in image recognition, the effectiveness of using ResNet for automatic spoofing detection is investigated and it is found that if the same feature is used for different fused models, the resulting system can hardly be improved.
Journal Article
rVAD: An unsupervised segment-based robust voice activity detection method
TL;DR: A modified version of rVAD is presented in which computationally intensive pitch extraction is replaced by computationally efficient spectral flatness calculation. This significantly reduces computational complexity at the cost of moderately inferior VAD performance, an advantage when processing large amounts of data or running on low-resource devices.
Journal Article
Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection
TL;DR: It is shown that the end-to-end approach based on a raw waveform input can outperform common cepstral features, without the use of context-dependent frame extensions, and that the proposed model is capable of distinguishing device-invariant spoofing attempts.
Posted Content
rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method
TL;DR: In this article, an unsupervised segment-based method for robust voice activity detection (rVAD) is presented, which consists of two passes of denoising followed by a VAD stage, where high-energy segments in a speech signal are detected by using a posteriori signal-to-noise ratio (SNR) weighted energy difference.
Posted Content
Multi-task Learning Based Spoofing-Robust Automatic Speaker Verification System.
TL;DR: A spoofing-robust automatic speaker verification system for diverse attacks is proposed, based on a multi-task learning architecture that is jointly trained on time-frequency representations of utterances to provide recognition decisions for both tasks simultaneously.
References
Journal Article
LIBSVM: A library for support vector machines
Chih-Chung Chang, Chih-Jen Lin
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Journal Article
Front-End Factor Analysis for Speaker Verification
TL;DR: This extension of previous work proposes a new speaker representation for speaker verification: a new low-dimensional speaker- and channel-dependent space is defined using simple factor analysis and named the total variability space because it models both speaker and channel variabilities.
Journal Article
An overview of text-independent speaker recognition: From features to supervectors
Tomi Kinnunen, Haizhou Li
TL;DR: This paper starts with the fundamentals of automatic speaker recognition, concerning feature extraction and speaker modeling, and elaborates on advanced computational techniques to address robustness and session variability.
Proceedings Article
Probabilistic Linear Discriminant Analysis for Inferences About Identity
TL;DR: This paper describes face data as resulting from a generative model which incorporates both within-individual and between-individual variation, and calculates the likelihood that the differences between face images are entirely due to within-individual variability.
Proceedings Article
Analysis of i-vector Length Normalization in Speaker Recognition Systems.
TL;DR: The proposed approach deals with the non-Gaussian behavior of i-vectors by performing a simple length normalization, which allows the use of probabilistic models with Gaussian assumptions that yield performance equivalent to that of more complicated systems based on heavy-tailed assumptions.