Voice transformation using PSOLA technique

doi:10.1016/0167-6393(92)90012-V

Journal ArticleDOI


H. Valbret, +2 more

- Vol. 11, Iss: 2, pp 175-187


TLDR

A new system for voice conversion is described that combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation, which produces a satisfyingly natural “transformed” voice.

Abstract:

In this contribution, a new system for voice conversion is described. The proposed architecture combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation. The synthesizer, based on the classical source-filter decomposition, allows prosodic and spectral transformations to be performed independently. Prosodic modifications are applied to the excitation signal using the TD-PSOLA scheme; converted speech is then synthesized using the transformed spectral parameters. Two different approaches to deriving spectral transformations, borrowed from the speech-recognition domain, are compared: Linear Multivariate Regression (LMR) and Dynamic Frequency Warping (DFW). Vector quantization is carried out as a preliminary stage to make the spectral transformations dependent on the acoustical realization of sounds. A formal listening test shows that the synthesizer produces a satisfyingly natural “transformed” voice. LMR nevertheless proves to allow a slightly better conversion than DFW. Still, there is room for improvement in the spectral transformation stage.
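The class-dependent regression idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes time-aligned source and target spectral vectors are already available as NumPy arrays, that a vector-quantization codebook has been trained beforehand, and that each class gets a plain least-squares matrix with no bias term. The function names (fit_lmr, nearest_codeword, fit_class_lmr, convert) are hypothetical.

```python
import numpy as np

def fit_lmr(src, tgt):
    """Least-squares matrix W such that src @ W approximates tgt."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

def nearest_codeword(frames, codebook):
    """Index of the closest codebook entry (Euclidean) for each frame."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def fit_class_lmr(src, tgt, codebook):
    """Fit one regression matrix per VQ class of the source frames."""
    labels = nearest_codeword(src, codebook)
    return {int(k): fit_lmr(src[labels == k], tgt[labels == k])
            for k in np.unique(labels)}

def convert(src, codebook, transforms):
    """Apply the matrix of each frame's class; unseen classes pass through unchanged."""
    labels = nearest_codeword(src, codebook)
    out = src.copy()
    for k, W in transforms.items():
        mask = labels == k
        out[mask] = src[mask] @ W
    return out
```

In use, fit_class_lmr would be run once on aligned training frames and convert on new source frames. Leaving frames of unseen classes untransformed is an assumption of this sketch, not something stated in the abstract.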

Citations


Journal ArticleDOI

Continuous probabilistic transform for voice conversion

Yannis Stylianou, +2 more

- 01 Mar 1998 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: A new methodology is designed for representing the relationship between two sets of spectral envelopes; the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.


Journal ArticleDOI

Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory

Tomoki Toda, +2 more

- 01 Nov 2007 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: In this article, a Gaussian mixture model (GMM) of the joint probability density of source and target features is employed for performing spectral conversion between speakers, and a conversion method based on the maximum-likelihood estimation of a spectral parameter trajectory is proposed.


Proceedings ArticleDOI

Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.

Takashi Muramatsu, +4 more

TL;DR: Presented at the 9th Annual Conference of the International Speech Communication Association, September 22-26, 2008, Brisbane, Australia.


Proceedings ArticleDOI

Spectral voice conversion for text-to-speech synthesis

Alexander Kain, +1 more

TL;DR: A new voice conversion algorithm that modifies a source speaker's speech to sound as if produced by a target speaker is presented and is found to perform more reliably for small training sets than a previous approach.


Journal ArticleDOI

Non-parametric techniques for pitch-scale and time-scale modification of speech

Eric Moulines, +1 more

- 01 Feb 1995 -

Speech Communication

TL;DR: This contribution reviews frequency-domain algorithms (phase-vocoder) and time-domain algorithms (Time-Domain Pitch-Synchronous Overlap/Add and the like) in the same framework and presents more recent variations of these schemes.


References


Proceedings ArticleDOI

Speaker adaptation through vector quantization

Kiyohiro Shikano, +2 more

TL;DR: Vector quantization (VQ) is a technique that reduces the computation amount and memory size drastically and is proposed in order to improve speaker-independent recognition.


Journal ArticleDOI

Normalization of vowels by vocal-tract length and its application to vowel identification

H. Wakita

- 01 Apr 1977 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: In this paper, a new approach to speech parameter normalization is presented in which no prior knowledge about the input speakers is required. The vocal-tract length and area function are first estimated from the acoustic speech waveform, and the area function is then normalized to an acoustic tube of the same shape having a certain reference length.


Speaker recognition. An interpretive survey of the literature.

M H Hecker

Proceedings Article

An Improved Cepstral Method for Deconvolution of Source-Filter Systems with Discrete Spectra: Application to Musical Sound Signals

Thierry Galas, +1 more

Proceedings ArticleDOI

A segment-based approach to voice conversion

Masanobu Abe

TL;DR: The proposed voice conversion algorithm was used with two male speakers and, in terms of speaker identification accuracy, the speech converted by segment-sized units gave a score 20% higher than the speech converted frame-by-frame.
