Voice Conversion Based on Weighted Frequency Warping

doi:10.1109/TASL.2009.2038663

Journal Article

Daniel Erro, +2 more

- 01 Jul 2010 -

IEEE Transactions on Audio, Speech, and ...

- Vol. 18, Iss: 5, pp 922-931

Abstract:

Any modification applied to speech signals has an impact on their perceptual quality. In particular, voice conversion to modify a source voice so that it is perceived as a specific target voice involves prosodic and spectral transformations that produce significant quality degradation. Choosing among the current voice conversion methods represents a trade-off between the similarity of the converted voice to the target voice and the quality of the resulting converted speech, both rated by listeners. This paper presents a new voice conversion method termed Weighted Frequency Warping that has a good balance between similarity and quality. This method uses a time-varying piecewise-linear frequency warping function and an energy correction filter, and it combines typical probabilistic techniques and frequency warping transformations. Compared to standard probabilistic systems, Weighted Frequency Warping results in a significant increase in quality scores, whereas the conversion scores remain almost unaltered. This paper carefully discusses the theoretical aspects of the method and the details of its implementation, and the results of an international evaluation of the new system are also included.
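The abstract names two ingredients: piecewise-linear frequency warping functions, and probabilistic techniques that make the warping time-varying. A minimal sketch of how these could fit together is given below. It assumes, without confirmation from the abstract, that each acoustic class has its own piecewise-linear warp defined by anchor frequencies and that the warp for a frame is the posterior-weighted blend of the per-class warps. The function names `blended_warp` and `apply_warp`, the anchor values, and the toy spectrum are hypothetical, and the energy correction filter is omitted.

```python
import numpy as np

def blended_warp(freqs, posteriors, src_anchors, tgt_anchors):
    """Blend per-class piecewise-linear warps with posterior weights (assumed scheme).

    freqs:       (F,) frequency grid in Hz
    posteriors:  (K,) class weights for one frame, summing to 1
    src_anchors: (K, A) increasing source anchor frequencies per class
    tgt_anchors: (K, A) increasing target anchor frequencies per class
    """
    # One piecewise-linear warp per class, evaluated on the frequency grid.
    per_class = np.stack(
        [np.interp(freqs, s, t) for s, t in zip(src_anchors, tgt_anchors)]
    )
    # A convex combination of increasing warps is itself increasing.
    return posteriors @ per_class

def apply_warp(spectrum, freqs, warped_freqs):
    """Resample a magnitude spectrum so that content at f moves to warped_freqs(f)."""
    return np.interp(freqs, warped_freqs, spectrum)

# Hypothetical two-class example on a 0-8 kHz grid.
freqs = np.linspace(0.0, 8000.0, 9)
src = np.array([[0.0, 1000.0, 8000.0], [0.0, 1000.0, 8000.0]])
tgt = np.array([[0.0, 1200.0, 8000.0], [0.0, 800.0, 8000.0]])

frame_posteriors = np.array([1.0, 0.0])  # this frame belongs fully to class 0
warp = blended_warp(freqs, frame_posteriors, src, tgt)

spectrum = np.exp(-((freqs - 1000.0) / 500.0) ** 2)  # toy peak at 1 kHz
converted = apply_warp(spectrum, freqs, warp)
```

Because the posteriors change from frame to frame, the blended warp changes too, which is one way to obtain a time-varying piecewise-linear function from a fixed set of per-class warps.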

Citations

Journal Article

Spoofing and countermeasures for speaker verification

Zhizheng Wu, +5 more

- 01 Feb 2015 -

Speech Communication

TL;DR: A survey of past work and priority research directions for the future is provided, showing that future research should address the lack of standard datasets and the over-fitting of existing countermeasures to specific, known spoofing attacks.

Spoofing and countermeasures for speaker verification: a survey

Zhizheng Wu, +5 more

TL;DR: In this paper, the authors provide a survey of spoofing countermeasures for automatic speaker verification, highlighting the need for more effort in the future to ensure adequate protection against spoofing attacks.

Journal Article

Voice conversion using deep neural networks with layer-wise generative training

Ling-Hui Chen, +3 more

- 01 Dec 2014 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: A DNN is used to construct a global non-linear mapping relationship between the spectral envelopes of two speakers to significantly improve the performance in terms of both similarity and naturalness compared to conventional methods.

Proceedings Article

Voice conversion from non-parallel corpora using variational auto-encoder

Chin-Cheng Hsu, +4 more

TL;DR: In this article, a variational auto-encoder framework is proposed for spectral conversion with unaligned corpora; it needs neither parallel corpora nor phonetic alignments to train the spectral conversion system.

Journal Article

An overview of voice conversion systems

Seyed Hamidreza Mohammadi, +1 more

- 01 Apr 2017 -

Speech Communication

TL;DR: This paper gives an overview of real-world applications of VC systems, extensively studies existing systems proposed in the literature, and discusses remaining challenges.

References

Journal Article

Statistical Parametric Speech Synthesis

Alan W. Black, +2 more

TL;DR: This paper gives a general overview of techniques in statistical parametric speech synthesis, and contrasts these techniques with the more conventional unit selection technology that has dominated speech synthesis over the last ten years.

Journal Article

Continuous probabilistic transform for voice conversion

Yannis Stylianou, +2 more

- 01 Mar 1998 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: A new methodology is designed for representing the relationship between two sets of spectral envelopes; the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.

Journal Article

Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory

Tomoki Toda, +2 more

- 01 Nov 2007 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: In this article, a Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers, and a conversion method based on the maximum-likelihood estimation of a spectral parameter trajectory is proposed.

Proceedings Article

Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.

Takashi Muramatsu, +4 more

TL;DR: Presented at the 9th Annual Conference of the International Speech Communication Association, September 22-26, 2008, Brisbane, Australia.

Proceedings Article

Voice conversion through vector quantization

Masanobu Abe, +3 more

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.
