Open Access

Improved Average-Voice-based Speech Synthesis Using Gender-Mixed Modeling and a Parameter Generation Algorithm Considering GV

TLDR
This paper incorporates STRAIGHT, a high-quality speech vocoding method, and a parameter generation algorithm that considers global variance into an average-voice-based HMM speech synthesis system to improve the quality of synthetic speech, and introduces a feature-space speaker adaptive training algorithm and a gender-mixed modeling technique to further normalize the average voice model.
Abstract
To construct a speech synthesis system that can produce diverse voices, we have been developing a speaker-independent approach to HMM-based speech synthesis in which statistical average voice models are adapted to a target speaker using a small amount of speech data. In this paper, we incorporate STRAIGHT, a high-quality speech vocoding method, and a parameter generation algorithm that considers global variance (GV) into the system to improve the quality of synthetic speech. Furthermore, we introduce a feature-space speaker adaptive training algorithm and a gender-mixed modeling technique to further normalize the average voice model. We build an English text-to-speech system using these techniques and report its performance.
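
As a rough illustration of the global variance idea mentioned in the abstract, the sketch below computes the per-dimension variance of a parameter trajectory over one utterance and then rescales the trajectory so that its variance matches a target value. This is a simplified variance-scaling stand-in, not the likelihood-based GV parameter generation algorithm used in the paper. The function names, the toy trajectory, and the target values are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def global_variance(trajectory):
    """Per-dimension variance over all frames of one utterance.

    `trajectory` is a list of frames, each frame a list of parameter values.
    """
    # Transpose frames x dims into dims x frames, then take each variance.
    return [pvariance(dim) for dim in zip(*trajectory)]


def scale_to_target_gv(trajectory, target_gv):
    """Rescale each dimension around its mean so its variance equals target_gv.

    Simplified stand-in (assumption): the paper's algorithm instead adds a GV
    term to the likelihood that is maximized during parameter generation.
    """
    scaled_columns = []
    for column, target in zip(zip(*trajectory), target_gv):
        mean = fmean(column)
        var = pvariance(column)
        # Leave constant dimensions untouched to avoid dividing by zero.
        gain = (target / var) ** 0.5 if var > 0 else 1.0
        scaled_columns.append([mean + gain * (x - mean) for x in column])
    # Transpose back to frames x dims.
    return [list(frame) for frame in zip(*scaled_columns)]


# Toy over-smoothed trajectory (hypothetical values): 4 frames, 2 dimensions.
smoothed = [[0.0, 1.0], [1.0, 1.5], [2.0, 1.0], [1.0, 0.5]]
target_gv = [2.0, 0.5]  # hypothetical GV measured on natural speech
restored = scale_to_target_gv(smoothed, target_gv)
```

After the call, `global_variance(restored)` matches `target_gv` while each dimension keeps its original mean, which is the sense in which a GV constraint counteracts over-smoothing.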


Citations

Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm

TL;DR: A new adaptation algorithm is proposed called constrained structural maximum a posteriori linear regression (CSMAPLR) whose derivation is based on the knowledge obtained in this analysis and on the results of comparing several conventional adaptation algorithms.

Speaker-Independent HMM-based Speech Synthesis System: HTS-2007 System for the Blizzard Challenge 2007

TL;DR: This paper describes an HMM-based speech synthesis system developed by the HTS working group for the Blizzard Challenge 2007, and incorporates new features in the conventional system which underpin a speaker-independent approach: speaker adaptation techniques; adaptive training for HSMMs; and full covariance modeling using the CSMAPLR transforms.

Personalized Spectral and Prosody Conversion Using Frame-Based Codeword Distribution and Adaptive CRF

TL;DR: The experimental results showed that the proposed voice conversion method, based on distribution-based alignment and prosodic word boundary detection, can improve the speech quality and speaker similarity of the converted speech.

Performance evaluation of the speaker-independent HMM-based speech synthesis system “HTS 2007” for the Blizzard Challenge 2007

TL;DR: This paper describes a speaker-independent/adaptive HMM-based speech synthesis system developed for the Blizzard Challenge 2007, which employs speaker adaptation, feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in previous systems.
References

Maximum likelihood linear transformations for HMM-based speech recognition

TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.

Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds

TL;DR: A set of simple new procedures has been developed to enable the real-time manipulation of speech parameters by using pitch-adaptive spectral analysis combined with a surface reconstruction method in the time–frequency region.

A compact model for speaker-adaptive training

TL;DR: A novel approach to estimating the parameters of continuous density HMMs for speaker-independent (SI) continuous speech recognition that jointly annihilates the inter-speaker variation and estimates the HMM parameters of the SI acoustic models.

A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis

TL;DR: This article proposes a parameter generation algorithm for HMM-based speech synthesis that considers global variance. Trajectories generated by statistical processing are often excessively smoothed, and this over-smoothing usually causes muffled-sounding speech.

An HMM-based speech synthesis system applied to English

TL;DR: This paper describes an HMM-based speech synthesis system (HTS), in which speech waveform is generated from HMMs themselves, and applies it to English speech synthesis using the general speech synthesis architecture of Festival.