Proceedings Article

A hybrid text-to-speech based on sub-band approach

TLDR
This paper proposes a sub-band speech synthesis approach to develop high-quality Text-to-Speech (TTS) that combines the inherent benefits of both waveform-based speech synthesis and HMM-based speech synthesis.
Abstract
This paper proposes a sub-band speech synthesis approach for developing high-quality Text-to-Speech (TTS). Hidden Markov Model (HMM)-based speech synthesis is used for the low-frequency band, and waveform-based speech synthesis is used for the high-frequency band. Both methods are widely known to perform well, and each has its own benefits and shortcomings. One motivation is to apply the right speech synthesis method in the right frequency band. Experimental results show that the proposed approach is smoother than waveform-based speech synthesis and clearer than HMM-based speech synthesis. Consequently, the proposed approach combines the inherent benefits of both waveform-based and HMM-based speech synthesis.
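The band-wise combination described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two bands are split or recombined, so everything below is an assumption made for illustration: the complementary FIR filter pair, the Hamming-windowed sinc design, the 101-tap length, and the hypothetical function names (`lowpass_taps`, `highpass_taps`, `fir_filter`, `combine_subbands`). The sketch only shows the general idea of taking the low band from one synthesizer's output and the high band from another's; it is not the authors' implementation.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unit DC gain."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the delay is a whole sample count")
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def highpass_taps(lp):
    """Complementary high-pass: a delayed unit impulse minus the low-pass taps."""
    hp = [-t for t in lp]
    hp[(len(lp) - 1) // 2] += 1.0
    return hp


def fir_filter(signal, taps):
    """Causal FIR filtering; the output has the same length as the input."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j < 0:
                break
            acc += t * signal[i - j]
        out.append(acc)
    return out


def combine_subbands(low_source, high_source, cutoff_hz, sample_rate_hz, num_taps=101):
    """Take the band below cutoff_hz from low_source and the band above it
    from high_source, then sum the two bands sample by sample.

    In the setting of the abstract, low_source would be the HMM-based
    output and high_source the waveform-based output (assumed to be
    time-aligned and equally long).
    """
    if len(low_source) != len(high_source):
        raise ValueError("both sources must have the same number of samples")
    lp = lowpass_taps(cutoff_hz, sample_rate_hz, num_taps)
    hp = highpass_taps(lp)
    low_band = fir_filter(low_source, lp)
    high_band = fir_filter(high_source, hp)
    return [a + b for a, b in zip(low_band, high_band)]
```

Because the two filters are complementary, passing the same signal as both sources returns that signal delayed by `(num_taps - 1) // 2` samples, which is a convenient sanity check. The cutoff frequency is left as a parameter because the abstract does not state where the band boundary lies.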


Citations
Proceedings Article

Sub-band text-to-speech combining sample-based spectrum with statistically generated spectrum.

TL;DR: A sub-band speech synthesis approach to develop a high-quality Text-to-Speech (TTS) system with a sample-based spectrum, selected from a phoneme database so that it is the most similar to the spectrum generated by HMM-based speech synthesis.
Proceedings Article

Multi-stream spectral representation for statistical parametric speech synthesis

TL;DR: An approach in which the high frequency spectrum is modelled separately from the low frequency spectrum, which makes samples synthesised using the proposed approach sound less muffled and more natural.
Posted Content

A Fully Time-domain Neural Model for Subband-based Speech Synthesizer.

TL;DR: A fully time-domain neural model for a subband-based text-to-speech (TTS) synthesizer, which is nearly end-to-end and shows quality comparable to the fullband one with a slighter network architecture for each subband.
References
Journal Article

Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds

TL;DR: A set of simple new procedures has been developed to enable the real-time manipulation of speech parameters by using pitch-adaptive spectral analysis combined with a surface reconstruction method in the time–frequency region.
Journal Article

Statistical Parametric Speech Synthesis

TL;DR: This paper gives a general overview of techniques in statistical parametric speech synthesis, and contrasts these techniques with the more conventional unit selection technology that has dominated speech synthesis over the last ten years.
Proceedings Article

Simultaneous Modeling of Spectrum, Pitch and Duration in HMM-Based Speech Synthesis

TL;DR: An HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM is described.
Journal Article

A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis

TL;DR: In this article, the authors propose a parameter generation algorithm for HMM-based speech synthesis. The generated trajectory is often excessively smoothed by the statistical processing, and this over-smoothing effect usually causes muffled sounds.
Proceedings Article

Statistical Parametric Speech Synthesis

Black, +2 more