Proceedings Article

Factor analysis methods for joint speaker verification and spoof detection

TLDR
This paper develops a joint modelling approach that detects spoofing attacks while also performing speaker verification. It proposes a factor modelling approach in which the spoof variability subspace and the speaker variability subspace are jointly trained.
Abstract
The performance of a speaker verification system is severely degraded by spoofing attacks generated from artificial speech synthesizers. Recently, several approaches have been proposed for classifying natural and synthetic speech (spoof detection) which can be used in conjunction with a speaker verification system. In this paper, we attempt to develop a joint modelling approach which can detect the presence of spoofing attacks while also performing the speaker verification task. We propose a factor modelling approach where the spoof variability subspace and the speaker variability subspace are jointly trained. The lower dimensional projections in these subspaces are used for speaker verification as well as spoof detection tasks. We also investigate the benefits of linear discriminant analysis (LDA), widely used in speaker recognition, for the spoof detection task. Several experiments are performed using the speaker and spoofing (SAS) database. For speaker verification, we compare the performance of the proposed method with a baseline method of fusing a conventional speaker verification system and a spoof detection system. In these experiments, the proposed approach provides substantial improvements for spoof detection (relative improvements of 20% in EER over the baseline) as well as speaker verification under spoofing conditions (relative improvements of 40% in EER over the baseline).
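The abstract reports its results as relative improvements in equal error rate (EER). As a reading aid, the following is a minimal sketch of how an EER and a relative improvement could be computed from trial scores. It is not the authors' evaluation code: the function names, the threshold sweep over observed scores, and the numbers in the usage note are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction), assuming higher scores mean 'accept'.

    Hypothetical helper: sweeps every observed score as a threshold and picks
    the point where false acceptance and false rejection rates are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection rate: genuine (target) trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: impostor/spoof trials scored at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_improvement(baseline_eer, proposed_eer):
    """Relative EER reduction of a proposed system over a baseline."""
    return (baseline_eer - proposed_eer) / baseline_eer
```

For example, with made-up numbers (not taken from the paper), a baseline EER of 0.10 and a proposed-system EER of 0.06 give `relative_improvement(0.10, 0.06)` of about 0.4, which is what a "40% relative improvement" means.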


Citations
Proceedings Article

ResNet and Model Fusion for Automatic Spoofing Detection.

TL;DR: Inspired by the success of ResNet in image recognition, the effectiveness of using ResNet for automatic spoofing detection is investigated and it is found that if the same feature is used for different fused models, the resulting system can hardly be improved.
Journal Article

rVAD: An unsupervised segment-based robust voice activity detection method

TL;DR: A modified version of rVAD is presented where computationally intensive pitch extraction is replaced by computationally efficient spectral flatness calculation, which significantly reduces the computational complexity at the cost of moderately inferior VAD performance, which is an advantage when processing a large amount of data and running on low resource devices.
Journal Article

Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection

TL;DR: It is shown that the end-to-end approach based on a raw waveform input can outperform common cepstral features, without the use of context-dependent frame extensions, and that the proposed model is capable of distinguishing device-invariant spoofing attempts.
Posted Content

rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method

TL;DR: In this article, an unsupervised segment-based method for robust voice activity detection (rVAD) is presented, which consists of two passes of denoising followed by a VAD stage, where high-energy segments in a speech signal are detected by using a posteriori signal-to-noise ratio (SNR) weighted energy difference.
Posted Content

Multi-task Learning Based Spoofing-Robust Automatic Speaker Verification System.

TL;DR: A spoofing-robust automatic speaker verification system for diverse attacks based on a multi-task learning architecture that is jointly trained with time-frequency representations from utterances to provide recognition decisions for both tasks simultaneously.
References
Journal Article

LIBSVM: A library for support vector machines

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Journal Article

Front-End Factor Analysis for Speaker Verification

TL;DR: This extension of previous work proposes a new speaker representation for speaker verification: a low-dimensional speaker- and channel-dependent space, defined using a simple factor analysis and named the total variability space because it models both speaker and channel variabilities.
Journal Article

An overview of text-independent speaker recognition: From features to supervectors

TL;DR: This paper starts with the fundamentals of automatic speaker recognition, covering feature extraction and speaker modeling, and then elaborates on advanced computational techniques to address robustness and session variability.
Proceedings Article

Probabilistic Linear Discriminant Analysis for Inferences About Identity

TL;DR: This paper describes face data as resulting from a generative model that incorporates both within-individual and between-individual variation, and calculates the likelihood that the differences between face images are entirely due to within-individual variability.
Proceedings Article

Analysis of i-vector Length Normalization in Speaker Recognition Systems.

TL;DR: The proposed approach deals with the non-Gaussian behavior of i-vectors by performing a simple length normalization, which allows the use of probabilistic models with Gaussian assumptions that yield performance equivalent to that of more complicated systems based on heavy-tailed assumptions.