Topic

Spectrogram

About: Spectrogram is a research topic. Over its lifetime, 5,813 publications have been published within this topic, receiving 81,547 citations.

Papers

Journal Article

Adaptive window zero-crossing-based instantaneous frequency estimation

S. Chandra Sekhar¹, Thippur V. Sreenivas¹

Indian Institute of Science¹

01 Jan 2004 - EURASIP Journal on Advances in Signal Processing

TL;DR: Simulation results show that the adaptive window zero-crossing-based IF estimation method is superior to fixed window methods and is also better than adaptive spectrogram and adaptive Wigner-Ville distribution (WVD)-based IF estimators for different signal-to-noise ratio (SNR).

Abstract: We address the problem of estimating instantaneous frequency (IF) of a real-valued constant amplitude time-varying sinusoid. Estimation of polynomial IF is formulated using the zero-crossings of the signal. We propose an algorithm to estimate nonpolynomial IF by local approximation using a low-order polynomial, over a short segment of the signal. This involves the choice of window length to minimize the mean square error (MSE). The optimal window length found by directly minimizing the MSE is a function of the higher-order derivatives of the IF which are not available a priori. However, an optimum solution is formulated using an adaptive window technique based on the concept of intersection of confidence intervals. The adaptive algorithm enables minimum MSE-IF (MMSE-IF) estimation without requiring a priori information about the IF. Simulation results show that the adaptive window zero-crossing-based IF estimation method is superior to fixed window methods and is also better than adaptive spectrogram and adaptive Wigner-Ville distribution (WVD)-based IF estimators for different signal-to-noise ratio (SNR).

27 citations
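
The abstract above builds on the idea that the zero crossings of a constant-amplitude sinusoid carry its frequency. A minimal sketch of the simplest case (a constant frequency over one fixed window) is given below. It is not the authors' adaptive-window algorithm: the function names, the linear interpolation of crossing instants, and the test tone are all assumptions made for illustration.

```python
import math


def zero_crossing_times(x, fs):
    """Return interpolated zero-crossing instants (in seconds) of samples x."""
    times = []
    for n in range(len(x) - 1):
        a, b = x[n], x[n + 1]
        if a == 0.0:
            # A sample that is exactly zero is taken as a crossing.
            times.append(n / fs)
        elif a * b < 0.0:
            # Sign change: interpolate linearly between the two samples.
            times.append((n + a / (a - b)) / fs)
    return times


def zc_frequency(x, fs):
    """Estimate a constant frequency (Hz) from the spacing of zero crossings."""
    t = zero_crossing_times(x, fs)
    if len(t) < 2:
        raise ValueError("need at least two zero crossings")
    # Successive zero crossings of a sinusoid are half a period apart.
    half_periods = len(t) - 1
    return half_periods / (2.0 * (t[-1] - t[0]))


if __name__ == "__main__":
    fs = 8000.0  # hypothetical sampling rate
    tone = [math.sin(2.0 * math.pi * 440.0 * n / fs + 0.3) for n in range(800)]
    print(zc_frequency(tone, fs))
```

Applying `zc_frequency` to short sliding segments would give a fixed-window IF track; choosing the segment length per time instant is the part the paper addresses with the intersection-of-confidence-intervals rule, which this sketch does not attempt.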

Journal Article

An Artificial Talker Driven from a Phonetic Input

John L. Kelly, Louis J. Gerstman

01 Jun 1961 - Journal of the Acoustical Society of America

TL;DR: This paper describes a method of producing artificial speech from a phonetic input, i.e., symbols representing the names of phonemes corresponding to a given text are fed into a machine and the acoustic waveforms of connected speech emerge.

Abstract: This paper describes a method of producing artificial speech from a phonetic input, i.e., symbols representing the names of phonemes corresponding to a given text are fed into a machine and the acoustic waveforms of connected speech emerge. The experimental work was accomplished on an electronic computer (IBM 7090), but the scheme is simple enough to permit realization with analog hardware. The talking machine program is divided into two parts. The first part simulates a more or less conventional resonance synthesizer of the tandem variety, requiring nine control signals: buzz intensity, hiss intensity, pitch, plus the center frequencies and bandwidths of three formants. Initially, this part of the program was used alone in experiments for which the inputs were detailed specifications of the control signals derived from spectrograms and physiological data, sampled at approximately three times the phonemic rate. Results from this phase were later combined with known results in speech perception to produce ...

27 citations
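
The tandem resonance synthesizer mentioned in the abstract above can be pictured as a pulse train (the buzz source) passed through three formant resonators in series. The sketch below is a hedged illustration only: it uses a standard two-pole digital resonator with unity gain at 0 Hz, which is an assumption and not necessarily the filter form used in the original IBM 7090 program. The hiss source is omitted, and all names and parameter values are hypothetical.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] for one formant."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to one
    return a, b, c


def resonate(x, freq_hz, bw_hz, fs):
    """Filter the sample list x through a single two-pole resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def buzz(pitch_hz, fs, n, amp=1.0):
    """Voiced excitation: one impulse of height amp per pitch period."""
    period = max(1, round(fs / pitch_hz))
    return [amp if i % period == 0 else 0.0 for i in range(n)]


def tandem_synth(formants, pitch_hz, fs, n, buzz_amp=1.0):
    """Pass the buzz source through the formant resonators in series."""
    sig = buzz(pitch_hz, fs, n, buzz_amp)
    for freq_hz, bw_hz in formants:
        sig = resonate(sig, freq_hz, bw_hz, fs)
    return sig


if __name__ == "__main__":
    # Hypothetical formant centre frequencies and bandwidths (Hz) for a vowel.
    vowel = [(700.0, 60.0), (1200.0, 90.0), (2600.0, 120.0)]
    samples = tandem_synth(vowel, pitch_hz=120.0, fs=10000.0, n=2000)
    print(len(samples), max(abs(s) for s in samples))
```

In this picture, buzz amplitude, pitch, and the three (frequency, bandwidth) pairs account for eight of the nine control signals listed in the abstract; a noise source scaled by hiss intensity would supply the ninth.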

Journal Article

Cardiac Doppler blood-flow signal analysis. Part 2. Time/frequency representation based on autoregressive modelling.

Zhenyu Guo, Louis-Gilles Durand, Louis Allard, Guy Cloutier, H. C. Lee, Y. E. Langlois

01 May 1993 - Medical & Biological Engineering & Computing

TL;DR: The white-noise characteristics of the AR modelling error signal indicated that the Doppler blood-flow signal can be adequately modelled as a complex AR process, and with appropriate model orders, AR modelling provided better Doppler spectrogram estimates than the periodogram.

Abstract: Doppler spectrograms obtained by using autoregressive (AR) modelling based on the Yule-Walker equations were investigated. A complex AR model using the in-phase and the quadrature components of the Doppler signal was used to provide blood-flow directions. The effect of model orders on the spectrogram estimation was studied using cardiac Doppler blood flow signals taken from 20 patients. The 'final prediction error' (FPE) and the 'Akaike's information criterion' (AIC) provided almost identical results in model-order selection. An index, the spectral envelope area (SEA), was used to evaluate the effect of window duration and sampling frequency on AR Doppler spectrogram estimation. The statistical analysis revealed that the SEA obtained from AR modelling was not sensitive to window duration and sampling frequency. This result verified the consistency of the AR Doppler spectrogram. The white-noise characteristics of the AR modelling error signal indicated that the Doppler blood-flow signal can be adequately modelled as a complex AR process. With appropriate model orders, AR modelling provided better Doppler spectrogram estimates than the periodogram.

27 citations
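
An AR spectrogram of the kind described in the abstract above fits an autoregressive model to each short window and evaluates the model spectrum instead of a periodogram. The following sketch solves the Yule-Walker equations with the Levinson-Durbin recursion. It makes simplifying assumptions that differ from the paper: the signal is real-valued (the paper uses a complex AR model on in-phase and quadrature components), the model order is fixed by the caller with no FPE or AIC selection, and every function name is hypothetical.

```python
import cmath
import math


def autocorr(x, max_lag):
    """Biased sample autocorrelation r[0..max_lag]."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker(x, order):
    """Solve the Yule-Walker equations by Levinson-Durbin recursion.

    Returns (a, sigma2) for x[n] + a[1]*x[n-1] + ... + a[p]*x[n-p] = e[n],
    with a[0] == 1 and sigma2 the prediction-error variance.
    """
    r = autocorr(x, order)
    a = [1.0]
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err  # reflection coefficient
        new_a = a + [0.0]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def ar_psd(a, sigma2, freqs, fs):
    """AR model spectrum sigma2 / |A(e^{jw})|^2 at the given frequencies (Hz)."""
    psd = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        psd.append(sigma2 / abs(denom) ** 2)
    return psd


def ar_spectrogram(x, fs, order, win, hop, freqs):
    """One AR spectrum per window of length win, advanced by hop samples."""
    frames = []
    for start in range(0, len(x) - win + 1, hop):
        a, sigma2 = yule_walker(x[start:start + win], order)
        frames.append(ar_psd(a, sigma2, freqs, fs))
    return frames
```

Each row returned by `ar_spectrogram` is one time slice. Comparing rows obtained with different `win` values is a rough way to explore the window-duration sensitivity that the abstract evaluates through its spectral envelope area index.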

Proceedings Article

Are Sparse Representations Rich Enough for Acoustic Modeling

Oriol Vinyals¹, Li Deng²

University of California, Berkeley¹, Microsoft²

01 Jan 2012

TL;DR: This study computes the local representation on the speech spectrogram as the raw “signal”, uses it as the local sparse code to perform a standard phone classification task, and demonstrates meaningful acoustic-phonetic properties that are captured by a collection of the dictionary entries.

Abstract: We propose a novel approach to acoustic modeling based on recent advances in sparse representations. The key idea in sparse coding is to compute a compressed local representation of a signal via an over-complete basis or dictionary that is learned in an unsupervised way. In this study, we compute the local representation on speech spectrogram as the raw “signal” and use it as the local sparse code to perform a standard phone classification task. A linear classifier is used that directly receives the coding space for making the classification decision. The simplicity of the linear classifier allows us to assess whether the sparse representations are sufficiently rich to serve as effective acoustic features for discriminating speech classes. Our experiments demonstrate competitive error rates when compared to other shallow approaches. An examination of the dictionary learned in sparse feature extraction demonstrates meaningful acoustic-phonetic properties that are captured by a collection of the dictionary entries.

27 citations
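
The sparse coding step described in the abstract above represents a spectrogram patch as a combination of a few atoms from an over-complete dictionary. One common way to compute such a code is the iterative shrinkage-thresholding algorithm (ISTA) for the L1-penalised least-squares problem. This is a sketch under stated assumptions: the abstract does not say which solver the authors used, the dictionary is taken as given instead of being learned, and the names and toy values are hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v towards zero by t (the proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(D, x, lam, iters=200):
    """ISTA for min_z 0.5 * ||x - D z||^2 + lam * ||z||_1.

    D is a list of rows (features x atoms); x is a feature vector.
    """
    n_feat, n_atoms = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of D^T D,
    # so 1 / L is a safe (if conservative) step size.
    L = sum(d * d for row in D for d in row)
    z = [0.0] * n_atoms
    for _ in range(iters):
        resid = [x[i] - sum(D[i][j] * z[j] for j in range(n_atoms))
                 for i in range(n_feat)]
        grad = [sum(D[i][j] * resid[i] for i in range(n_feat))
                for j in range(n_atoms)]
        z = [soft_threshold(z[j] + grad[j] / L, lam / L) for j in range(n_atoms)]
    return z


if __name__ == "__main__":
    # Toy over-complete dictionary: 2 features, 3 atoms (hypothetical values).
    D = [[1.0, 0.0, 0.7],
         [0.0, 1.0, 0.7]]
    patch = [1.0, 1.0]
    print(sparse_code(D, patch, lam=0.1))
```

In the setting of the abstract, the resulting code vector `z` would be the feature passed to the linear classifier; training that classifier and learning `D` are outside this sketch.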

Journal Article

Coupled dictionaries for exemplar-based speech enhancement and automatic speech recognition

Deepak Baby¹, Tuomas Virtanen², Jort F. Gemmeke¹, Hugo Van hamme¹

Katholieke Universiteit Leuven¹, Tampere University of Technology²

01 Nov 2015 - IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: This paper proposes an efficient way to directly compute the full-resolution frequency estimates of speech and noise using coupled dictionaries, which yields improved word error rates for speech recognition tasks using HMM-GMM and deep neural network (DNN) based systems.

Abstract: Exemplar-based speech enhancement systems work by decomposing the noisy speech as a weighted sum of speech and noise exemplars stored in a dictionary and use the resulting speech and noise estimates to obtain a time-varying filter in the full-resolution frequency domain to enhance the noisy speech. To obtain the decomposition, exemplars sampled in lower dimensional spaces are preferred over the full-resolution frequency domain for their reduced computational complexity and the ability to better generalize to unseen cases. But the resulting filter may be sub-optimal as the mapping of the obtained speech and noise estimates to the full-resolution frequency domain yields a low-rank approximation. This paper proposes an efficient way to directly compute the full-resolution frequency estimates of speech and noise using coupled dictionaries: an input dictionary containing atoms from the desired exemplar space to obtain the decomposition and a coupled output dictionary containing exemplars from the full-resolution frequency domain. We also introduce modulation spectrogram features for the exemplar-based tasks using this approach. The proposed system was evaluated for various choices of input exemplars and yielded improved speech enhancement performances on the AURORA-2 and AURORA-4 databases. We further show that the proposed approach also results in improved word error rates (WERs) for the speech recognition tasks using HMM-GMM and deep-neural network (DNN) based systems.

27 citations
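
The coupled-dictionary idea in the abstract above can be sketched in three steps: decompose the noisy observation over an input dictionary in a low-dimensional feature space, reuse the same activations with an output dictionary whose atoms live in the full-resolution frequency domain, and form a filter from the resulting speech and noise estimates. The code below is an illustration under assumptions not taken from the paper: it uses Euclidean multiplicative updates for the non-negative decomposition, a Wiener-style gain, and hypothetical names and toy matrices.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def nonneg_activations(D_in, y, iters=200, eps=1e-12):
    """Multiplicative updates for min ||y - D_in h||^2 subject to h >= 0."""
    Dt = transpose(D_in)
    h = [1.0] * len(Dt)
    num = matvec(Dt, y)  # D^T y does not change across iterations
    for _ in range(iters):
        den = matvec(Dt, matvec(D_in, h))  # D^T D h
        h = [hi * n / (d + eps) for hi, n, d in zip(h, num, den)]
    return h


def coupled_gain(D_in, D_out, n_speech, y, eps=1e-12):
    """Full-resolution filter gain from activations found in the input space.

    Columns 0..n_speech-1 of both dictionaries are speech atoms, the rest are
    noise atoms; column j of D_out is the full-resolution partner of column j
    of D_in.
    """
    h = nonneg_activations(D_in, y)
    n_noise = len(h) - n_speech
    h_speech = h[:n_speech] + [0.0] * n_noise
    h_noise = [0.0] * n_speech + h[n_speech:]
    speech = matvec(D_out, h_speech)  # full-resolution speech estimate
    noise = matvec(D_out, h_noise)    # full-resolution noise estimate
    return [s / (s + n + eps) for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Toy example: 2 input features, 4 output bins, 1 speech and 1 noise atom.
    D_in = [[1.0, 0.0],
            [0.0, 1.0]]
    D_out = [[1.0, 0.0],
             [1.0, 1.0],
             [0.0, 1.0],
             [2.0, 2.0]]
    print(coupled_gain(D_in, D_out, n_speech=1, y=[3.0, 1.0]))
```

Multiplying the noisy full-resolution spectrum bin by bin with the returned gain would give the enhanced spectrum for that frame. Because the output atoms are stored at full resolution, no mapping back from the low-dimensional space is needed, which is the low-rank approximation the abstract says the approach avoids.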

Network Information

Performance Metrics

Papers: 7,848
Citations: 107,060

No. of papers in the topic in previous years
Year	Papers
2024	1
2023	627
2022	1,396
2021	488
2020	595
2019	593
