Journal ArticleDOI

Voice transformation using PSOLA technique

H. Valbret, +2 more
- Vol. 11, Iss: 2, pp 175-187
TLDR
A new system for voice conversion is described that combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation, which produces a satisfyingly natural “transformed” voice.
Abstract
In this contribution, a new system for voice conversion is described. The proposed architecture combines a PSOLA (Pitch Synchronous Overlap and Add)-derived synthesizer and a module for spectral transformation. The synthesizer, based on the classical source-filter decomposition, allows prosodic and spectral transformations to be performed independently. Prosodic modifications are applied to the excitation signal using the TD-PSOLA scheme; converted speech is then synthesized using the transformed spectral parameters. Two different approaches to deriving spectral transformations, borrowed from the speech-recognition domain, are compared: Linear Multivariate Regression (LMR) and Dynamic Frequency Warping (DFW). Vector quantization is carried out as a preliminary stage to make the spectral transformations dependent on the acoustical realization of sounds. A formal listening test shows that the synthesizer produces a satisfyingly natural “transformed” voice. LMR proves to allow a slightly better conversion than DFW. Still, there is room for improvement in the spectral transformation stage.
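
As a rough illustration of the class-dependent LMR idea mentioned in the abstract, the sketch below assigns each source spectral vector to its nearest codeword and fits one affine map per class by least squares. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`assign_classes`, `fit_lmr`, `apply_lmr`), the use of a bias term, and the assumption that source and target frames are already time-aligned are all hypothetical choices made here for illustration.

```python
import numpy as np


def assign_classes(vectors, codebook):
    """Return the index of the nearest codeword (Euclidean) for each vector."""
    diffs = vectors[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=-1).argmin(axis=1)


def fit_lmr(source, target, labels, n_classes):
    """Fit one affine map per class by least squares.

    Assumes source[i] and target[i] are time-aligned frames and that
    every class contains at least one frame (assumptions of this sketch).
    """
    maps = []
    for k in range(n_classes):
        idx = labels == k
        # Append a constant column so each map includes a bias term.
        X = np.hstack([source[idx], np.ones((idx.sum(), 1))])
        W, *_ = np.linalg.lstsq(X, target[idx], rcond=None)
        maps.append(W)  # shape: (dim + 1, dim)
    return maps


def apply_lmr(source, labels, maps):
    """Transform each source vector with the map of its class."""
    X = np.hstack([source, np.ones((len(source), 1))])
    out = np.empty((len(source), maps[0].shape[1]))
    for k, W in enumerate(maps):
        idx = labels == k
        out[idx] = X[idx] @ W
    return out
```

In use, `assign_classes` would be called with a codebook trained on the source speaker's spectral vectors, `fit_lmr` on aligned training pairs, and `apply_lmr` on new source frames before resynthesis. How the codebook is trained and how frames are aligned is left outside this sketch.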


Citations
Journal ArticleDOI

Feature Selection-based Voice Transformation

TL;DR: In this paper, a two-pass transformation method was proposed, in which the feature parameters were first selected from a database and the optimal sequence of the features was then constructed in the second pass.
DissertationDOI

Using Sonic Enhancement to Augment Non-Visual Tabular Navigation

TL;DR: Results demonstrated that stereo panning was an effective technique for audio-spatially orienting non-visual navigation in a five-row, six-column HTML table as compared to a centered, stationary synthesized voice.
Book ChapterDOI

Analysis of Durations of Sound Units

TL;DR: It is observed that durations of sound units depend on several factors at various levels, and it is very difficult to derive the precise rules for accurate estimation of durations.
Journal ArticleDOI

Mapping Articulatory-Features to Vocal-Tract Parameters for Voice Conversion

TL;DR: The proposed voice conversion, based on mapping articulatory features to vocal-tract parameters, is not only text-independent, in that it does not need parallel utterances between source and target speakers, but can also be used for an arbitrary source speaker.
DissertationDOI

Voice conversion with parallel/non-parallel data and synthetic speech detection

Xiaohai Tian
TL;DR: This thesis proposes two novel voice conversion methods to improve the system performance for both parallel and non-parallel data based voice conversion and investigates the use of different feature representations to discriminate between live and synthetic speech.
References
Journal ArticleDOI

An Algorithm for Vector Quantizer Design

TL;DR: An efficient and intuitive algorithm is presented for the design of vector quantizers based either on a known probabilistic model or on a long training sequence of data.
Book

Linear Prediction of Speech

John E. Markel, +1 more
TL;DR: Speech Analysis and Synthesis Models: Basic Physical Principles, Speech Synthesis Structures, and Considerations in Choice of Analysis.
Journal ArticleDOI

Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones

TL;DR: Several recently proposed algorithms for improving the voice quality of text-to-speech synthesis based on the concatenation of acoustical units are reviewed in a common framework; these algorithms are based on a pitch-synchronous overlap-add approach.
Proceedings ArticleDOI

Voice conversion through vector quantization

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.