Voice transformation using PSOLA technique

doi:10.1016/0167-6393(92)90012-V

Journal Article

H. Valbret, +2 more

- Vol. 11, Iss: 2, pp 175-187

TLDR

A new system for voice conversion is described that combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation, which produces a satisfyingly natural “transformed” voice.

Abstract:

In this contribution, a new system for voice conversion is described. The proposed architecture combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation. The synthesizer, based on the classical source-filter decomposition, allows prosodic and spectral transformations to be performed independently. Prosodic modifications are applied to the excitation signal using the TD-PSOLA scheme; converted speech is then synthesized using the transformed spectral parameters. Two different approaches to deriving spectral transformations, both borrowed from the speech-recognition domain, are compared: Linear Multivariate Regression (LMR) and Dynamic Frequency Warping (DFW). Vector quantization is carried out as a preliminary stage to make the spectral transformations dependent on the acoustical realization of sounds. A formal listening test shows that the synthesizer produces a satisfyingly natural “transformed” voice. LMR nevertheless allows slightly better conversion than DFW, and there is still room for improvement in the spectral transformation stage.
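
The class-dependent spectral mapping described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that source and target spectral frames are already time-aligned, that both have the same dimension, and that a VQ codebook is given. The function names (`nearest_codeword`, `fit_class_lmr`, `convert`) and the identity fallback for sparsely populated classes are hypothetical choices made for illustration.

```python
import numpy as np


def nearest_codeword(frames, codebook):
    """Return, for each frame (row), the index of the closest codeword."""
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def fit_class_lmr(src, tgt, codebook):
    """Fit one affine map per VQ class on time-aligned source/target frames.

    Each map is a (dim + 1, dim) matrix W such that [x, 1] @ W approximates y.
    """
    labels = nearest_codeword(src, codebook)
    dim = src.shape[1]
    maps = []
    for k in range(len(codebook)):
        in_class = labels == k
        if in_class.sum() <= dim:
            # Assumption: with too few frames, leave this class unconverted.
            maps.append(np.vstack([np.eye(dim), np.zeros((1, dim))]))
            continue
        # Append a constant column so the regression includes an offset term.
        x_aug = np.hstack([src[in_class], np.ones((in_class.sum(), 1))])
        w, *_ = np.linalg.lstsq(x_aug, tgt[in_class], rcond=None)
        maps.append(w)
    return maps


def convert(frames, codebook, maps):
    """Apply to each frame the affine map of its nearest codeword."""
    labels = nearest_codeword(frames, codebook)
    x_aug = np.hstack([frames, np.ones((len(frames), 1))])
    out = np.empty(frames.shape, dtype=float)
    for k, w in enumerate(maps):
        in_class = labels == k
        out[in_class] = x_aug[in_class] @ w
    return out
```

Fitting a separate regression per codeword is what makes the transformation depend on the acoustic class of each frame; a single global regression would apply the same map to all sounds.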

Citations

Proceedings Article

A DNN-based emotional speech synthesis by speaker adaptation

Hongwu Yang, +2 more

TL;DR: Subjective evaluations show that, compared with the traditional hidden Markov model (HMM)-based method, the proposed DNN-based emotional speech synthesis method achieves higher opinion scores and improves the emotional expressiveness and naturalness of synthesized emotional speech.

Journal Article

Voice conversion with SI-DNN and KL divergence based mapping without parallel training data

Feng-Long Xie, +3 more

- 01 Jan 2019 -

Speech Communication

TL;DR: Both objective and subjective measures used for evaluating voice conversion performance show that the new algorithm performs better than the sequential error minimization based DNN baseline trained with parallel training data.

Journal Article

Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars

Ryo Aihara, +2 more

- 11 May 2015 -

ACM Transactions on Accessible Computing

TL;DR: A phoneme-categorized subdictionary and a dictionary selection method using NMF are proposed to reduce phoneme-alignment mismatches in a voice conversion method for a person with an articulation disorder resulting from athetoid cerebral palsy.

Proceedings Article

Maximum a posteriori voice conversion using sequential Monte Carlo methods

Elina Helander, +3 more

TL;DR: This paper proposes optimizing the speech feature sequence after a frame-based conversion algorithm has been applied, selecting the sequence of speech features by minimizing a cost function that involves both the conversion error and the smoothness of the sequence.

Journal Article

Neural speech-rate conversion with multispeaker WaveNet vocoder

Takuma Okamoto, +4 more

- 01 Jan 2022 -

Speech Communication

TL;DR: This paper proposes a machine-learning-based approach that uses neural vocoders to perform speech-rate conversion, which expands or compresses speech waveforms while preserving the pitch of the sound and is traditionally realized by signal-processing-based approaches.

References

Journal Article

An Algorithm for Vector Quantizer Design

Y. Linde, +2 more

- 01 Jan 1980 -

IEEE Transactions on Communications

TL;DR: An efficient and intuitive algorithm is presented for the design of vector quantizers based either on a known probabilistic model or on a long training sequence of data.

Book

Linear Prediction of Speech

John E. Markel, +1 more

TL;DR: Speech Analysis and Synthesis Models: Basic Physical Principles, Speech Synthesis Structures, and Considerations in Choice of Analysis.

Journal Article

Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones

Eric Moulines, +1 more

- 01 Dec 1990 -

Speech Communication

TL;DR: Several recently proposed algorithms based on the pitch-synchronous overlap-add approach, which aim to improve the voice quality of text-to-speech synthesis by concatenation of acoustical units, are reviewed in a common framework.

Proceedings Article

Voice conversion through vector quantization

Masanobu Abe, +3 more

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.

Journal Article