Journal Article

Voice Conversion Based on Weighted Frequency Warping

Abstract
Any modification applied to speech signals has an impact on their perceptual quality. In particular, voice conversion, which modifies a source voice so that it is perceived as a specific target voice, involves prosodic and spectral transformations that produce significant quality degradation. Choosing among the current voice conversion methods means trading off the similarity of the converted voice to the target voice against the quality of the resulting converted speech, both rated by listeners. This paper presents a new voice conversion method, termed Weighted Frequency Warping, that achieves a good balance between similarity and quality. The method uses a time-varying piecewise-linear frequency warping function and an energy correction filter, and it combines typical probabilistic techniques with frequency warping transformations. Compared to standard probabilistic systems, Weighted Frequency Warping results in a significant increase in quality scores, whereas the conversion scores remain almost unaltered. The paper discusses the theoretical aspects of the method and the details of its implementation, and it includes the results of an international evaluation of the new system.
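
The idea of a time-varying piecewise-linear warping function can be illustrated with a minimal sketch. It assumes, as the abstract does not spell out, that each acoustic class of a probabilistic model has its own piecewise-linear warping function and that the frame-level function is their average weighted by per-frame class probabilities. The function names, knot positions, and weights below are hypothetical, and the energy correction filter is not modeled.

```python
import numpy as np

def piecewise_linear_warp(freqs, src_knots, tgt_knots):
    """Map frequencies through linear segments joining (src, tgt) knot pairs."""
    return np.interp(freqs, src_knots, tgt_knots)

def weighted_warp(freqs, weights, src_knots_per_class, tgt_knots_per_class):
    """Frame-level warp: weighted average of the per-class warping functions.

    `weights` stands in for per-frame class probabilities (an assumption);
    they are normalized here so that they sum to one.
    """
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    warped = np.zeros_like(freqs)
    for w, src, tgt in zip(weights, src_knots_per_class, tgt_knots_per_class):
        warped += w * piecewise_linear_warp(freqs, src, tgt)
    return warped

def warp_envelope(envelope, freqs, warped_freqs):
    """Move envelope values from freqs[k] to warped_freqs[k], then resample
    onto the original grid. Requires warped_freqs to be increasing."""
    return np.interp(freqs, warped_freqs, envelope)

# Hypothetical two-class example on a 0-8 kHz grid.
freqs = np.linspace(0.0, 8000.0, 9)
src_knots = [[0.0, 1000.0, 8000.0], [0.0, 2000.0, 8000.0]]
tgt_knots = [[0.0, 1200.0, 8000.0], [0.0, 1800.0, 8000.0]]
frame_weights = [0.75, 0.25]  # would change from frame to frame

warped = weighted_warp(freqs, frame_weights, src_knots, tgt_knots)
envelope = np.exp(-freqs / 4000.0)  # toy spectral envelope
converted = warp_envelope(envelope, freqs, warped)
```

Because the weights change from frame to frame, the resulting warping function is time-varying even though each per-class function is fixed. A convex combination of increasing functions is itself increasing, which is what makes the resampling step in `warp_envelope` well defined.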


Citations
Journal Article

Spoofing and countermeasures for speaker verification

TL;DR: A survey of past work and priority research directions for the future is provided, showing that future research should address the lack of standard datasets and the over-fitting of existing countermeasures to specific, known spoofing attacks.

Spoofing and countermeasures for speaker verification: a survey

TL;DR: In this paper, the authors provide a survey of spoofing countermeasures for automatic speaker verification, highlighting the need for more effort in the future to ensure adequate protection against spoofing attacks.
Journal Article

Voice conversion using deep neural networks with layer-wise generative training

TL;DR: A DNN is used to construct a global non-linear mapping relationship between the spectral envelopes of two speakers to significantly improve the performance in terms of both similarity and naturalness compared to conventional methods.
Proceedings Article

Voice conversion from non-parallel corpora using variational auto-encoder

TL;DR: In this article, a variational auto-encoder framework for spectral conversion with unaligned corpora is proposed; it requires neither parallel corpora nor phonetic alignments to train a spectral conversion system.
Journal Article

An overview of voice conversion systems

TL;DR: Provides an overview of real-world applications of VC systems, an extensive study of existing systems proposed in the literature, and a discussion of remaining challenges.
References
Journal Article

Statistical Parametric Speech Synthesis

TL;DR: This paper gives a general overview of techniques in statistical parametric speech synthesis, and contrasts these techniques with the more conventional unit selection technology that has dominated speech synthesis over the last ten years.
Journal Article

Continuous probabilistic transform for voice conversion

TL;DR: A new methodology is designed for representing the relationship between two sets of spectral envelopes; the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.
Journal Article

Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory

TL;DR: In this article, a Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers, and a conversion method based on the maximum-likelihood estimation of a spectral parameter trajectory is proposed.
Proceedings Article

Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.

TL;DR: Presented at the 9th Annual Conference of the International Speech Communication Association, September 22-26, 2008, Brisbane, Australia.
Proceedings Article

Voice conversion through vector quantization

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.