Spectrogram
About: Spectrogram is a research topic. Over its lifetime, 5813 publications have appeared within this topic, receiving 81547 citations.
Papers
01 Jan 2020
TL;DR: In this paper, a mixed frequency masking data augmentation method is proposed for audio classification, which adopts a nonlinear combination method to construct new samples and a linear method for constructing labels.
Abstract: Deep learning focuses on the representation of the input data and the generalization of the model. It is well known that data augmentation can combat overfitting and improve the generalization ability of deep neural networks. In this paper, we summarize and compare multiple data augmentation methods for audio classification. These strategies include traditional methods applied to the raw audio signal, as well as the currently popular augmentations based on linear interpolation and nonlinear mixing of the spectrum. We explore the generation of new samples, the transformation of labels, and the combination patterns of samples and labels for each data augmentation method. Finally, inspired by SpecAugment and Mixup, we propose an effective and easy-to-implement data augmentation method, which we call Mixed frequency Masking data augmentation. This method adopts a nonlinear combination method to construct new samples and a linear method to construct labels. All methods are verified on the Freesound Dataset Kaggle2018 dataset, and ResNet is adopted as the classifier. The baseline system uses the log-mel spectrogram feature as the input. We use mean Average Precision @3 (mAP@3) as the evaluation metric for all data augmentation methods.
37 citations
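The abstract above does not spell out how the mixed sample is built, so the following is only a sketch of one plausible reading: a random band of frequency bins in one spectrogram is replaced by the same band from a second spectrogram (a nonlinear, region-wise combination), and the labels are mixed linearly in proportion to the replaced bins. The function name `mixed_frequency_mask`, its parameters, and the label-weighting rule are all assumptions made for illustration, not the authors' published procedure.

```python
import numpy as np

def mixed_frequency_mask(spec_a, spec_b, label_a, label_b, max_width, rng):
    """Hypothetical helper: fill one random frequency band of spec_a from spec_b.

    spec_a, spec_b: arrays of shape (n_freq, n_time), e.g. log-mel spectrograms.
    label_a, label_b: one-hot (or soft) label vectors.
    max_width: largest band height in bins (must not exceed n_freq).
    """
    n_freq = spec_a.shape[0]
    width = int(rng.integers(1, max_width + 1))        # band height in bins
    start = int(rng.integers(0, n_freq - width + 1))   # first bin of the band

    # Sample: region-wise replacement rather than a weighted sum.
    mixed = spec_a.copy()
    mixed[start:start + width, :] = spec_b[start:start + width, :]

    # Label: linear mix, weighted by the share of bins kept from spec_a
    # (an assumed weighting rule).
    lam = 1.0 - width / n_freq
    label = lam * label_a + (1.0 - lam) * label_b
    return mixed, label, (start, width)

# Usage with toy inputs standing in for real log-mel features.
rng = np.random.default_rng(0)
spec_a = np.zeros((64, 100))
spec_b = np.ones((64, 100))
label_a = np.array([1.0, 0.0])
label_b = np.array([0.0, 1.0])
mixed, label, (start, width) = mixed_frequency_mask(
    spec_a, spec_b, label_a, label_b, max_width=16, rng=rng
)
```

With all-zero and all-one toy spectrograms, the replaced band is easy to inspect: `mixed` is one inside the band and zero elsewhere, and `label` still sums to one.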
TL;DR: An approach to extracting features from EEG signals is proposed, based on spectrograms computed with the STFT to obtain time-frequency representations. It is evaluated on the dataset from Bonn University, with the main task of distinguishing the healthy-person class from the epileptic-attack class.
37 citations
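As a rough illustration of the STFT-based time-frequency representation mentioned in the entry above, the sketch below computes a power spectrogram with NumPy. The signal is a synthetic 10 Hz sinusoid standing in for an EEG trace, and the sampling rate, frame length, and hop size are arbitrary assumptions rather than values taken from the paper.

```python
import numpy as np

def stft_spectrogram(x, frame_len, hop):
    """Power spectrogram of shape (n_freq, n_frames) from a Hann-windowed STFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # One-sided FFT per frame; squared magnitude gives power.
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

fs = 256                              # assumed sampling rate in Hz
t = np.arange(4 * fs) / fs            # 4 seconds of samples
x = np.sin(2 * np.pi * 10 * t)        # synthetic 10 Hz rhythm (not real EEG)

spec = stft_spectrogram(x, frame_len=128, hop=64)
freqs = np.fft.rfftfreq(128, d=1 / fs)
peak_hz = freqs[spec.mean(axis=1).argmax()]   # frequency bin with most energy
```

Each column of `spec` is the spectrum of one short frame, so feature extraction can then operate on the resulting frequency-by-time array.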
25 Oct 1994
TL;DR: A general methodology providing a better readability of any bilinear distribution, referred to as reassignment, is essentially a generalization of an improvement of the spectrogram proposed by Kodera, Gendrin and de Villedary (1978).
Abstract: A general methodology providing a better readability of any bilinear distribution has been proposed. This methodology, referred to as reassignment, is essentially a generalization of an improvement of the spectrogram proposed by Kodera, Gendrin and de Villedary (1978). After a presentation of this original work, its generalization to a wide range of distributions is shown. The close connections of this method with some related approaches are also underlined.
37 citations
TL;DR: Two novel algorithms based on GLA and the alternating direction method of multipliers (ADMM) are proposed for better recovery with fewer iterations.
Abstract: Recovering a signal from its amplitude spectrogram, or phase recovery, has many applications in acoustic signal processing. When only an amplitude spectrogram is available and no explicit information is given for the phases, the Griffin–Lim algorithm (GLA) is one of the most widely used methods for phase recovery. However, GLA often requires many iterations and results in low perceptual quality in some cases. In this letter, we propose two novel algorithms based on GLA and the alternating direction method of multipliers (ADMM) for better recovery with fewer iterations. Some interpretation of the existing methods and their relation to the proposed method is also provided. Evaluations are performed with both an objective measure and a subjective test.
37 citations
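For context, the baseline Griffin–Lim iteration that the entry above builds on can be sketched as follows. This is the classical GLA only, not the ADMM-based variants proposed in the letter, and the STFT helpers, window, hop size, and test signal are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def stft(x, win, hop):
    """Complex STFT of shape (n_frames, n_bins)."""
    n = len(win)
    n_frames = 1 + (len(x) - n) // hop
    frames = np.stack([x[i * hop : i * hop + n] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, win, hop, length):
    """Least-squares inverse STFT (windowed overlap-add, normalized by sum of win**2)."""
    n = len(win)
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(np.fft.irfft(spec, n=n, axis=1)):
        x[i * hop : i * hop + n] += frame * win
        norm[i * hop : i * hop + n] += win ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(magnitude, win, hop, length, n_iter, rng):
    """Classical GLA: alternate between the given magnitudes and a consistent STFT."""
    spec = magnitude * np.exp(2j * np.pi * rng.random(magnitude.shape))  # random phase
    for _ in range(n_iter):
        proj = stft(istft(spec, win, hop, length), win, hop)        # consistency step
        spec = magnitude * proj / np.maximum(np.abs(proj), 1e-8)    # restore magnitudes
    return istft(spec, win, hop, length)

# Toy example: recover a sinusoid from its amplitude spectrogram alone.
win, hop = np.hanning(256), 64
length = 256 + 30 * hop
signal = np.sin(2 * np.pi * 0.05 * np.arange(length))
mag = np.abs(stft(signal, win, hop))

def spectral_error(x):
    """Relative mismatch between the target magnitudes and those of x."""
    return np.linalg.norm(mag - np.abs(stft(x, win, hop))) / np.linalg.norm(mag)

x0 = griffin_lim(mag, win, hop, length, n_iter=0, rng=np.random.default_rng(0))
x50 = griffin_lim(mag, win, hop, length, n_iter=50, rng=np.random.default_rng(0))
err_before, err_after = spectral_error(x0), spectral_error(x50)
```

Comparing `err_before` with `err_after` shows how far 50 iterations reduce the magnitude mismatch from a random-phase start. The need for many such iterations is the limitation the abstract refers to.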
TL;DR: A new method for VFR using the norm of the derivative parameters in deciding to retain or to discard a frame is introduced, and informal inspection of speech spectrograms shows that this new method puts more emphasis on the transient regions of the speech signal.
Abstract: Variable frame rate (VFR) analysis is a technique used in speech processing and recognition for discarding frames that are too much alike. The article introduces a new method for VFR. Instead of calculating the distance between frames, the norm of the derivative parameters is used in deciding to retain or to discard a frame. Informal inspection of speech spectrograms shows that this new method puts more emphasis on the transient regions of the speech signal. Experimental results with a hidden Markov model (HMM) based system show that the new method outperforms the classical method.
37 citations
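The abstract above states only that the norm of the derivative parameters decides whether a frame is kept, so the sketch below fills in the rest with assumptions: derivatives are approximated by `np.gradient` along the time axis, and a frame is retained when the Euclidean norm of its derivative vector exceeds a fixed threshold. The function name `select_frames`, the thresholding rule, and the toy features are hypothetical and may differ from the article's actual decision rule.

```python
import numpy as np

def select_frames(features, threshold):
    """Hypothetical VFR rule: keep frames whose derivative norm exceeds threshold.

    features: array of shape (n_frames, n_coeffs), e.g. cepstral vectors.
    Returns the indices of retained frames and the per-frame derivative norms.
    """
    deltas = np.gradient(features, axis=0)      # finite-difference time derivative
    norms = np.linalg.norm(deltas, axis=1)      # one norm per frame
    return np.flatnonzero(norms > threshold), norms

# Toy features: a steady segment, an abrupt transition, then another steady segment.
features = np.vstack([np.zeros((10, 3)), np.ones((10, 3))])
kept, norms = select_frames(features, threshold=0.1)
```

In this toy input only the two frames around the transition have a non-zero derivative norm, which mirrors the emphasis on transient regions described in the abstract.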