Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation

Axel Roebel, +1 more
pp. 30-35
TLDR
This article proposes a cepstrum-based iterative true envelope estimator for pitch shifting with preservation of the spectral envelope in the phase vocoder. Controlled sub-sampling of the log amplitude spectrum and a simple step size control reduce the run time of the estimator by a factor of 2.5-11.
Abstract
This article addresses the estimation of the spectral envelope of sound signals. The intended application of the developed algorithm is pitch shifting with preservation of the spectral envelope in the phase vocoder. As a first step, the existing envelope estimation algorithms are investigated and their specific properties discussed. The cepstrum-based iterative true envelope estimator is selected as the most promising algorithm. Controlled sub-sampling of the log amplitude spectrum and a simple step size control for the iteration decrease the run time of the algorithm by a factor of 2.5-11. As a remedy for the ringing effects in the spectral envelope that are due to the rectangular filter used for spectral smoothing, we propose the use of a Hamming window as smoothing filter. The resulting implementation has slightly higher computational complexity than the standard LPC algorithm but offers significantly improved control over the envelope characteristics. The use of the true envelope estimator for pitch shifting is then investigated: the main problems of pitch shifting with envelope preservation in a phase vocoder are identified, and a simple yet efficient remedy is proposed.
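
The abstract names the cepstrum-based iterative true envelope estimator but does not spell out its update rule. The following is a minimal sketch of the iteration as it is commonly described: the log amplitude spectrum is cepstrally smoothed, the smoothing target is raised to the maximum of the original spectrum and the current envelope, and the loop stops once no spectral peak exceeds the envelope by more than a tolerance. The names `true_envelope`, `order`, `tol_db`, `max_iter` and `hamming_lifter` are illustrative assumptions and not taken from the paper. Applying a Hamming-shaped taper to the cepstral coefficients is only one possible reading of the Hamming smoothing filter mentioned above. The sub-sampling and step size control that give the reported speed-up are not implemented here.

```python
import numpy as np


def true_envelope(log_amp, order, tol_db=2.0, max_iter=200, hamming_lifter=False):
    """Sketch of an iterative true envelope estimate (not the paper's code).

    log_amp : log-amplitude spectrum in dB on the full, symmetric FFT grid.
    order   : highest cepstral (quefrency) bin kept by the smoothing lifter.
    Returns (envelope, iterations_used, converged).
    """
    log_amp = np.asarray(log_amp, dtype=float)
    n = log_amp.size
    if not 1 <= order < n // 2:
        raise ValueError("order must satisfy 1 <= order < n // 2")

    # Lifter weights for quefrency bins 0..order.
    if hamming_lifter:
        # Assumption: right half of a Hamming window, 1.0 at bin 0, 0.08 at bin `order`.
        weights = 0.54 + 0.46 * np.cos(np.pi * np.arange(order + 1) / order)
    else:
        # Rectangular lifter, i.e. plain cepstral truncation.
        weights = np.ones(order + 1)

    # Symmetric lifter covering positive and negative quefrencies.
    lifter = np.zeros(n)
    lifter[: order + 1] = weights
    lifter[n - order:] = weights[1:][::-1]

    envelope = np.full(n, -np.inf)
    converged = False
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Raise the target wherever the previous envelope lies above the spectrum.
        target = np.maximum(log_amp, envelope)
        # Cepstral smoothing: real cepstrum, lifter, back to the spectral domain.
        cepstrum = np.fft.ifft(target).real
        envelope = np.fft.fft(cepstrum * lifter).real
        # Stop when every bin of the spectrum is within tol_db of the envelope.
        if np.max(log_amp - envelope) <= tol_db:
            converged = True
            break
    return envelope, iterations, converged
```

A typical call would pass `20 * np.log10(np.abs(np.fft.fft(frame * window)) + eps)` for one analysis frame, with `order` chosen in relation to the fundamental period in samples; the exact choice is left open here because the abstract does not state it.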

