Open Access
Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation
Axel Roebel, Xavier Rodet
pp. 30-35
TL;DR: This article selects the cepstrum-based iterative true envelope estimator for pitch shifting with spectral envelope preservation in the phase vocoder, and reduces its run time by a factor of 2.5-11 through controlled sub-sampling of the log amplitude spectrum and a simple step size control.
Abstract:
This article addresses the estimation of the spectral envelope of sound signals. The intended application of the developed algorithm is pitch shifting with preservation of the spectral envelope in the phase vocoder. As a first step, the existing envelope estimation algorithms are investigated and their specific properties discussed. The cepstrum-based iterative true envelope estimator is selected as the most promising algorithm. By means of controlled sub-sampling of the log amplitude spectrum and a simple step size control for the iterative algorithm, the run time of the algorithm can be decreased by a factor of 2.5-11. As a remedy for the ringing effects in the spectral envelope that are due to the rectangular filter used for spectral smoothing, we propose the use of a Hamming window as smoothing filter. The resulting implementation has slightly increased computational complexity compared to the standard LPC algorithm but offers significantly improved control over the envelope characteristics. The application of the true envelope estimator to pitch shifting is investigated. The main problems for pitch shifting with envelope preservation in a phase vocoder are identified, and a simple yet efficient remedy is proposed.
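The cepstrum-based iterative true envelope idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it follows the commonly described form of the iteration (smooth the log amplitude spectrum by cepstral liftering, replace the smoothing target by the bin-wise maximum of the original spectrum and the current envelope, repeat until the peaks are covered), and it leaves out the sub-sampling and step size control that give the reported speed-up. The function name `true_envelope`, its parameters, the 2 dB stopping tolerance, and the exact half-Hamming lifter shape are all assumptions made for illustration.

```python
import numpy as np


def true_envelope(log_amp, order, tol_db=2.0, max_iter=500, hamming=True):
    """Sketch of an iterative cepstral envelope estimate (hypothetical helper).

    log_amp : natural-log amplitude spectrum over the full FFT length,
              assumed symmetric (spectrum of a real signal).
    order   : highest quefrency bin kept by the lifter.
    Returns (envelope, iterations_used, converged_flag).
    """
    log_amp = np.asarray(log_amp, dtype=float)
    n = log_amp.size
    if not 0 < order < n // 2:
        raise ValueError("order must satisfy 0 < order < n // 2")

    # Lifter weights for quefrencies 0..order. The rectangular lifter is the
    # plain truncation; the tapered variant is half of a Hamming window
    # (an assumed shape, chosen here to reduce ringing in the envelope).
    q = np.arange(order + 1)
    w = 0.54 + 0.46 * np.cos(np.pi * q / order) if hamming else np.ones(order + 1)
    lifter = np.zeros(n)
    lifter[: order + 1] = w
    lifter[n - order:] = w[:0:-1]  # mirror for the negative quefrencies

    tol = tol_db * np.log(10.0) / 20.0  # dB -> natural-log amplitude
    target = log_amp.copy()
    converged = False
    for n_iter in range(1, max_iter + 1):
        cepstrum = np.fft.ifft(target).real
        env = np.fft.fft(cepstrum * lifter).real  # cepstrally smoothed target
        if np.max(target - env) <= tol:
            converged = True
            break
        # Raise the target wherever the envelope already exceeds the spectrum,
        # so the next smoothing pass is pulled up towards the spectral peaks.
        target = np.maximum(log_amp, env)
    return env, n_iter, converged


if __name__ == "__main__":
    # Synthetic symmetric log spectrum with regularly spaced peaks.
    n = 512
    k = np.arange(n)
    peaks = np.abs(np.cos(2 * np.pi * k / 32)) ** 8
    log_amp = np.log(1e-3 + peaks * (1.0 + 0.5 * np.cos(2 * np.pi * k / n)))
    env, n_iter, converged = true_envelope(log_amp, order=24)
    print("iterations:", n_iter, "converged:", converged)
```

Passing `hamming=False` switches to the rectangular lifter, which makes it easy to compare the two smoothing filters on the same spectrum. The plain iteration can need many passes, which is the cost the sub-sampling and step size control described in the abstract are meant to reduce.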
Citations
Proceedings Article
Essentia: An Audio Analysis Library for Music Information Retrieval.
Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Justin Salamon, Jose R. Zapata, Xavier Serra
TL;DR: Paper presented at the 14th International Society for Music Information Retrieval Conference, held in Curitiba (Brazil), 4-8 November 2013.
Book
An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics
TL;DR: This book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike.
Proceedings Article
Detecting Depression using Vocal, Facial and Semantic Communication Cues
James R. Williamson, Elizabeth Godoy, Miriam Cha, Adrianne Schwarzentruber, Pooya Khorrami, Youngjune Gwon, Hsiang-Tsung Kung, Charlie K. Dagli, Thomas F. Quatieri
TL;DR: The authors derive biomarkers from the vocal, facial and semantic modalities, drawing first on previously developed neurophysiologically motivated speech and facial coordination and timing features, and add a novel indicator of lower vocal tract constriction in articulation that relates to vocal projection.
Proceedings Article
Multi-Modal Audio, Video and Physiological Sensor Learning for Continuous Emotion Prediction
Kevin Brady, Youngjune Gwon, Pooya Khorrami, Elizabeth Godoy, William M. Campbell, Charlie K. Dagli, Thomas S. Huang
TL;DR: This paper provides an overview of the AVEC Emotion Challenge system, which uses multi-feature learning and fusion across all available modalities, including the development of novel high- and low-level features for modeling emotion in the audio, video, and physiological channels.
Journal Article
Towards a small set of robust acoustic features for emotion recognition: challenges
Marie Tahon, Laurence Devillers
TL;DR: The goal of the present study is to select a consensual set of acoustic features for valence recognition using classification and non-classification based feature ranking and cross-corpus experiments, and to optimize emotional models simultaneously.
References
Book
Linear Prediction of Speech
John E. Markel, A. Gray
TL;DR: Topics include speech analysis and synthesis models: basic physical principles, speech synthesis structures, and considerations in the choice of analysis.
Journal Article
Improved phase vocoder time-scale modification of audio
Jean Laroche, Mark Dolson
TL;DR: This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes, and two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved.
Journal Article
The Phase Vocoder: A Tutorial
TL;DR: This article attempts to explain the operation of the phase vocoder in terms accessible to musicians, relying heavily on the familiar concepts of sine waves, filters, and additive synthesis, and employing a minimum of mathematics.
Journal Article
Discrete all-pole modeling
A. El-Jaroudi, John Makhoul
TL;DR: One result is an autocorrelation matching condition that overcomes the limitations of linear prediction and produces better fitting spectral envelopes for spectra that are representable by a relatively small discrete set of values, such as in voiced speech.
Proceedings Article
New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects
Jean Laroche, Mark Dolson
TL;DR: The phase-vocoder is usually presented as a high-quality solution for time-scale modification of signals, pitch-scale modifications usually being implemented as a combination of timescaling and sampling rate conversion.
Related Papers
Speech analysis/Synthesis based on a sinusoidal representation
R.J. McAulay, Thomas F. Quatieri