Journal ArticleDOI

Voice transformation using PSOLA technique

H. Valbret, +2 more
Vol. 11, Iss. 2, pp. 175-187
TLDR
A new system for voice conversion is described that combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation, which produces a satisfyingly natural “transformed” voice.
Abstract
In this contribution, a new system for voice conversion is described. The proposed architecture combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation. The synthesizer, based on the classical source-filter decomposition, allows prosodic and spectral transformations to be performed independently. Prosodic modifications are applied to the excitation signal using the TD-PSOLA scheme; converted speech is then synthesized using the transformed spectral parameters. Two different approaches to deriving spectral transformations, both borrowed from the speech-recognition domain, are compared: Linear Multivariate Regression (LMR) and Dynamic Frequency Warping (DFW). Vector quantization is carried out as a preliminary stage so that the spectral transformations depend on the acoustical realization of sounds. A formal listening test shows that the synthesizer produces a satisfyingly natural “transformed” voice. LMR nevertheless yields a slightly better conversion than DFW. Still, there is room for improvement in the spectral transformation stage.
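As a rough illustration of the class-dependent LMR idea described in the abstract, the sketch below quantizes source spectral vectors with a small codebook and fits one affine least-squares mapping per codebook class. It is not the authors' implementation: the function names (`train_codebook`, `quantize`, `train_lmr`, `convert`) are hypothetical, plain k-means stands in for the vector-quantizer design stage, and the source and target frames are assumed to be already time-aligned and of equal dimension.

```python
import numpy as np

def quantize(x, codebook):
    """Return the index of the nearest codeword for each row of x."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def train_codebook(x, n_classes, n_iter=20, seed=0):
    """Plain k-means codebook (an assumed stand-in for the VQ stage)."""
    rng = np.random.default_rng(seed)
    codebook = x[rng.choice(len(x), n_classes, replace=False)].copy()
    for _ in range(n_iter):
        labels = quantize(x, codebook)
        for k in range(n_classes):
            members = x[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

def train_lmr(src, tgt, codebook):
    """Fit one affine map (matrix plus bias row) per codebook class."""
    labels = quantize(src, codebook)
    dim = src.shape[1]
    mats = []
    for k in range(len(codebook)):
        xs, ys = src[labels == k], tgt[labels == k]
        if len(xs) == 0:
            # Empty class: fall back to the identity map with zero bias.
            mats.append(np.vstack([np.eye(dim), np.zeros((1, dim))]))
            continue
        xa = np.hstack([xs, np.ones((len(xs), 1))])  # append bias column
        w, *_ = np.linalg.lstsq(xa, ys, rcond=None)
        mats.append(w)
    return np.stack(mats)  # shape: (n_classes, dim + 1, dim)

def convert(src, codebook, mats):
    """Apply to each frame the affine map of its nearest codeword."""
    labels = quantize(src, codebook)
    xa = np.hstack([src, np.ones((len(src), 1))])
    return np.einsum("nd,nde->ne", xa, mats[labels])
```

In a real system the rows of `src` and `tgt` would be spectral parameter vectors from the two speakers; the converted vectors would then drive the synthesis filter while the prosody is modified separately on the excitation signal.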


Citations
Proceedings ArticleDOI

New algorithm for spectral smoothing and envelope modification for LP-PSOLA synthesis

TL;DR: An algorithm capable of modifying the LPC envelope in a flexible way is presented, along with results concerning pitch marking, which is the heart of a spectral smoothing module for a diphone-based linear prediction pitch-synchronous overlap-add (LP-PSOLA) concatenation system.
Journal ArticleDOI

A Multi-level GMM-Based Cross-Lingual Voice Conversion Using Language-Specific Mixture Weights for Polyglot Synthesis

TL;DR: The current work focuses on alleviating the oversmoothing effect in GMM-based voice conversion technique, using (source) language-specific mixture weights in a multi-level GMM followed by selective pole focusing in the unvoiced speech segments.
Dissertation

Glottal source and vocal-tract separation: Estimation of glottal parameters, voice transformation and synthesis using a glottal model

TL;DR: In this article, the authors present an analysis/synthesis procedure that estimates the vocal-tract filter using an observed spectrum and its estimated source.
Journal ArticleDOI

Analysis of gender and identity issues in depression detection on de-identified speech

TL;DR: The results suggest that speaker-independent and gender-dependent de-identification is the most suitable option for depression level estimation, since its trade-off between de-identification and depression estimation performance was superior to that of the other alternatives.
Proceedings ArticleDOI

Spectral modification for concatenative speech synthesis

TL;DR: This work investigates two speech modification strategies, one based on inverse filtering and the other on sinusoidal modeling, explains their merits and shortcomings for changing the spectral envelope in speech, and proposes a method that uses sinusoidal modeling and represents the complex sinusoid amplitudes by an all-pole model.
References
Journal ArticleDOI

An Algorithm for Vector Quantizer Design

TL;DR: An efficient and intuitive algorithm is presented for the design of vector quantizers based either on a known probabilistic model or on a long training sequence of data.
Book

Linear Prediction of Speech

John E. Markel, +1 more
TL;DR: Speech Analysis and Synthesis Models: Basic Physical Principles, Speech Synthesis Structures, and Considerations in Choice of Analysis.
Journal ArticleDOI

Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones

TL;DR: Several recently proposed algorithms, all based on a pitch-synchronous overlap-add approach, for improving the voice quality of text-to-speech synthesis by concatenation of acoustical units are reviewed in a common framework.
Proceedings ArticleDOI

Voice conversion through vector quantization

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.