Open Access, Journal Article

Transformation of formants for voice conversion using artificial neural networks

TLDR
A scheme for developing a voice conversion system that converts the speech signal uttered by a source speaker to a speech signal having the voice characteristics of the target speaker using formants and a formant vocoder is proposed.
About
This article was published in Speech Communication on 1995-02-01 and is currently open access. It has received 207 citations to date. The article focuses on the topics: Formant & Voice analysis.

Citations
Journal Article

A novel algorithm for voice conversion using Canonical Correlation Analysis

TL;DR: A novel voice conversion algorithm that applies Canonical Correlation Analysis (CCA) estimation to Gaussian mixture models and can achieve better performance than the previous method, which uses the MMSE estimation criterion.
Dissertation

Accent Conversion via Formant-based Spectral Mapping and Pitch Contour Modification

TL;DR: A thesis submitted in partial fulfilment of requirements of the University of Wolverhampton for the degree of Master of Philosophy.
Book Chapter

Voice Conversion Between Synthesized Bilingual Voices Using Line Spectral Frequencies

TL;DR: This paper applies line spectral frequencies (LSF) to voice conversion because LSF are not overly sensitive to quantization noise and can be interpolated. In the experiments, LSF-based voice conversion gives better results on ABX and MOS tests than the direct frequency warping approaches.
Book Chapter

A Study of Speech Phase in Dysarthria Voice Conversion System

TL;DR: The results of automatic speech recognition and spectrum analysis show that intelligibility is improved by replacing the dysarthric phase with the normal phase during the synthesis step, which implies that correct phase information must be considered in a dysarthria VC system.
Proceedings Article

Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016

TL;DR: Paper presented at Interspeech 2016, held September 8 to 12, 2016 in San Francisco, California.
References
Journal Article

Multilayer feedforward networks are universal approximators

TL;DR: It is rigorously established that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.
Book

Fundamentals of speech recognition

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the labor-intensive, and therefore slow and expensive, process of manually modeling speech.
Journal Article

Analysis, synthesis, and perception of voice quality variations among female and male talkers

TL;DR: Perceptual validation of the relative importance of acoustic cues for signaling a breathy voice quality has been accomplished using a new voicing source model for synthesis of more natural male and female voices.
Journal Article

Speech analysis and synthesis by linear prediction of the speech wave.

TL;DR: Application of this method to efficient transmission and storage of speech signals, as well as procedures for determining other speech characteristics, such as formant frequencies and bandwidths, the spectral envelope, and the autocorrelation function, are discussed.
Proceedings Article

Voice conversion through vector quantization

TL;DR: The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.
Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Transformation of formants for voice conversion using artificial neural networks"?

In this paper the authors propose a scheme for developing a voice conversion system that converts the speech signal uttered by a source speaker to a speech signal having the voice characteristics of the target speaker. The scheme consists of a formant analysis phase, followed by a learning phase in which the implicit formant transformation is captured by a neural network. 

In this paper the authors train a neural network to learn a transformation function that maps the speaker-dependent parameters extracted from the source speaker's speech to match those of the target speaker.

But in continuous speech, since the vocal tract changes its shape continuously, the extracted formants will have many transitions. 
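
One way to sidestep such transitions is to keep only frames whose formants barely move between neighbouring frames. The sketch below is a minimal illustration of that idea, not the paper's procedure: the function name `steady_frames`, the 50 Hz threshold, and the example track are all assumptions made for illustration.

```python
def steady_frames(tracks, max_delta_hz=50.0):
    """Return indices of frames whose formants stay within max_delta_hz
    of both the previous and the next frame.

    tracks: list of per-frame formant lists, e.g. [F1, F2, F3] in Hz.
    The first and last frames are never reported because they lack
    one of the two neighbours.
    """
    def close(a, b):
        return all(abs(x - y) <= max_delta_hz for x, y in zip(a, b))

    steady = []
    for i in range(1, len(tracks) - 1):
        if close(tracks[i], tracks[i - 1]) and close(tracks[i], tracks[i + 1]):
            steady.append(i)
    return steady


# Hypothetical formant track: a steady vowel, a transition, another steady vowel.
TRACK = [
    [700, 1100, 2400],
    [705, 1110, 2390],
    [710, 1105, 2400],
    [600, 1500, 2600],
    [400, 2000, 2900],
    [300, 2300, 3000],
    [305, 2290, 3010],
    [300, 2300, 3000],
]

steady = steady_frames(TRACK)
print(steady)
```

The threshold trades coverage against purity: a smaller value discards more frames but leaves fewer transitional ones in the result.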

Fant’s model (Fant, 1986) was used to excite the formant synthesizer for voiced frames, and random noise was used for unvoiced frames.
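
The voiced/unvoiced excitation switch can be sketched as follows. This is a simplified illustration under stated assumptions: a plain impulse train stands in for Fant's glottal model, the synthesizer is a cascade of standard second-order digital resonators, and the sampling rate, pitch, bandwidths, and function names are hypothetical choices rather than values from the paper.

```python
import math
import random


def resonator(signal, freq_hz, bw_hz, fs):
    """Filter a signal through one second-order digital resonator (one formant)."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain to 1 at 0 Hz
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def excitation(n_samples, voiced, f0_hz, fs, rng):
    """Impulse train for voiced frames (a stand-in for a glottal model),
    Gaussian noise for unvoiced frames."""
    if voiced:
        period = max(1, round(fs / f0_hz))
        return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    return [rng.gauss(0.0, 0.1) for _ in range(n_samples)]


def synthesize_frame(formants, bandwidths, voiced,
                     f0_hz=120.0, fs=8000, n_samples=160, seed=0):
    """Synthesize one frame. Filter state is not carried across frames,
    which a real synthesizer would need to do to avoid discontinuities."""
    rng = random.Random(seed)
    signal = excitation(n_samples, voiced, f0_hz, fs, rng)
    for f, bw in zip(formants, bandwidths):
        signal = resonator(signal, f, bw, fs)
    return signal


# Hypothetical formant frequencies and bandwidths in Hz.
voiced_frame = synthesize_frame([700, 1100, 2400], [60, 90, 120], voiced=True)
unvoiced_frame = synthesize_frame([700, 1100, 2400], [60, 90, 120], voiced=False)
```

Replacing the impulse train with a glottal pulse model would change only `excitation`; the resonator cascade stays the same.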

The first three formants from these two corresponding steady voiced regions are used as a pair of input and output formant vectors to a neural network. 
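
As a rough illustration of learning from such input/output formant pairs, the sketch below fits a one-hidden-layer network with plain stochastic gradient descent. It is not the paper's network: the layer size, learning rate, epoch count, scaling constant, and the three formant pairs are hypothetical values chosen only to make the example run.

```python
import math
import random


def train_formant_mapper(pairs, hidden=8, epochs=300, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network mapping source to target formants.

    pairs: list of (source, target) tuples of formant vectors scaled to
    roughly [0, 1]. Returns (predict, losses), where losses holds the mean
    squared error measured during each epoch.
    """
    rng = random.Random(seed)
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(w2, b2)]
        return h, y

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, t in pairs:
            h, y = forward(x)
            err = [yi - ti for yi, ti in zip(y, t)]
            total += sum(e * e for e in err) / n_out
            # Hidden-layer gradient, computed before w2 is updated.
            # Constant factors of the squared-error gradient are folded into lr.
            grad_h = [sum(err[k] * w2[k][j] for k in range(n_out)) * (1.0 - h[j] ** 2)
                      for j in range(hidden)]
            for k in range(n_out):
                for j in range(hidden):
                    w2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h[j] * x[i]
                b1[j] -= lr * grad_h[j]
        losses.append(total / len(pairs))

    return (lambda x: forward(x)[1]), losses


SCALE = 4000.0  # hypothetical normalisation constant in Hz

# Hypothetical (source, target) formant pairs in Hz, one per steady vowel region.
PAIRS_HZ = [
    ([700, 1100, 2400], [850, 1200, 2800]),
    ([300, 2300, 3000], [300, 2800, 3300]),
    ([300, 900, 2200], [350, 950, 2700]),
]
PAIRS = [([f / SCALE for f in s], [f / SCALE for f in t]) for s, t in PAIRS_HZ]

predict, losses = train_formant_mapper(PAIRS)
converted_hz = [v * SCALE for v in predict(PAIRS[0][0])]
```

With only three pairs the network can do little more than memorise them; a usable mapping would need many aligned steady regions covering the speakers' vowel space.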

Prosodic modifications were incorporated in the excitation signal using the PSOLA (Pitch Synchronous Overlap Add) technique, and speech was synthesized using the transformed spectral parameters.

In the present study suprasegmental features of the source speaker are retained, while using the transformed vocal tract parameters for synthesis. 

They are (1) identification of speaker characteristics, or acquisition of speaker-dependent knowledge, in the analysis phase and (2) incorporation of that speaker-specific knowledge during synthesis in the transformation phase.
