Journal ArticleDOI

Nonstationary spectral modeling of voiced speech

TLDR
A novel model for voiced speech that allows for local non-stationarities in terms of both pitch perturbations and vocal tract variations, and that supports new forms of spectral prediction, which can be put to advantage in speech coding applications.
Abstract
The main purpose of this paper is to present a novel model for voiced speech. The classical model, which is used in many applications, assumes local stationarity and consequently imposes a simple, well-known line structure on the short-time spectrum of voiced speech. The model derived in this paper allows for local non-stationarities not only in terms of pitch perturbations, but in terms of vocal tract variations as well. The resulting structure of the short-time spectrum becomes more complex, but can still be interpreted in terms of generalized lines. The proposed model supports new forms of spectral prediction, which can be put to advantage in speech coding applications. Experimental results are presented that support the validity of both the model itself and the prediction relationships. Finally, a new class of speech coders, denoted harmonic coders, based on the presented model, is proposed, and a specific implementation is presented.
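
The contrast between the classical line structure and a locally non-stationary frame can be illustrated numerically. The following is a minimal sketch, not the model derived in the paper: it assumes a synthetic sum of equal-amplitude harmonics, a Hann window, and a linear pitch drift across the frame, and it ignores vocal tract variations entirely. The function names `harmonic_signal` and `line_width_bins`, the sampling rate, the frame length, and the 6 dB width measure are all illustrative assumptions.

```python
import numpy as np


def harmonic_signal(f0_start, f0_end, fs=8000.0, n=512, n_harm=20):
    """Sum of unit-amplitude harmonics whose fundamental moves linearly
    from f0_start to f0_end (Hz) across one analysis frame.
    (Hypothetical helper; all parameter values are assumptions.)"""
    t = np.arange(n) / fs
    duration = n / fs
    slope = (f0_end - f0_start) / duration  # pitch drift in Hz per second
    # The fundamental's phase is the integral of its instantaneous frequency.
    phase = 2.0 * np.pi * (f0_start * t + 0.5 * slope * t ** 2)
    k = np.arange(1, n_harm + 1)[:, None]
    return np.cos(k * phase).sum(axis=0)


def line_width_bins(x, f0_mid, k, fs=8000.0, nfft=8192):
    """Count FFT bins within 6 dB of the peak in the band around the
    k-th harmonic of f0_mid (a crude measure of spectral line width)."""
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)), nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    band = (freqs > (k - 0.5) * f0_mid) & (freqs < (k + 0.5) * f0_mid)
    mag = spec[band]
    return int(np.sum(mag >= 0.5 * mag.max()))


if __name__ == "__main__":
    f0_mid = 102.5
    stationary = harmonic_signal(f0_mid, f0_mid)  # constant pitch
    drifting = harmonic_signal(100.0, 105.0)      # pitch rises within the frame
    for k in (1, 10, 20):
        print(k,
              line_width_bins(stationary, f0_mid, k),
              line_width_bins(drifting, f0_mid, k))
```

With constant pitch, every harmonic shows the same narrow line set by the window. With a drifting pitch, the k-th harmonic sweeps k times as far in frequency as the fundamental, so higher harmonics smear into wider lines, which is one simple way in which the stationary line structure breaks down.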


Citations
Journal ArticleDOI

Perceptual coding of digital audio

TL;DR: This paper reviews methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model.
Journal ArticleDOI

Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model

TL;DR: The proposed analysis-by-synthesis/overlap-add (ABS/OLA) system allows for both fixed and time-varying time-, frequency-, and pitch-scale modifications, and computational shortcuts using the FFT algorithm make its implementation feasible using currently available hardware.
Patent

Method and apparatus for hybrid coding of speech at 4kbps

TL;DR: A method and apparatus for encoding speech for communication to a decoder that reproduces the speech, in which the speech signal is classified into steady-state voiced (harmonic), stationary unvoiced, and "transitory" or "transition" speech.
Book

Adaptive Signal Models: Theory, Algorithms, and Audio Applications

TL;DR: This paper presents a meta-modelling framework for Fourier Series Representations of Signal Models and Analysis-Synthesis and concludes with a comparison of these models against known models for Pitch-Synchronous Modeling.
Journal ArticleDOI

Encoding speech using prototype waveforms

TL;DR: The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals, and excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.
References
Journal ArticleDOI

Quantizing for minimum distortion

TL;DR: This paper discusses the problem of the minimization of the distortion of a signal by a quantizer when the number of output levels of the quantizer is fixed and an algorithm is developed to simplify their numerical solution.
Journal ArticleDOI

Frequency domain coding of speech

TL;DR: In this article, the authors present a theoretical framework for the design of subband and transform coders for low bit-rate speech decoding, which is based on spectral estimation and models of speech production and perception.
Journal ArticleDOI

Real-time digital hardware pitch detector

TL;DR: Computation of the autocorrelation function of the clipped speech is easily implemented in digital hardware using simple combinatorial logic, i.e., an up-down counter can be used to compute each correlation point.
Journal ArticleDOI

Short-time Fourier analysis of sampled speech

TL;DR: In this article, the theoretical basis for the representation of a speech signal by its short-time Fourier transform is developed, and a time-frequency representation for linear time-varying systems is applied to the speech-production model to formulate a quasi-stationary representation for the speech waveform.
Proceedings ArticleDOI

A model for short-time phase prediction of speech

TL;DR: This paper discusses a form of non-linear prediction, namely, the prediction of the phase of speech signals, based upon a new treatment of the classical speech production model within a short-time analysis/synthesis framework.