Topic
Spectrogram
About: Spectrogram is a research topic. Over its lifetime, 5,813 publications have been published within this topic, receiving 81,547 citations.
Papers
••
21 Sep 2015
TL;DR: Preliminary experiments show that the proposed emotion recognition system, based on deep convolutional neural networks (DCNNs), achieves about 40% classification accuracy and outperforms SVM-based classification using hand-crafted acoustic features.
Abstract: Speech emotion recognition (SER) is a challenging task because it is unclear which features reflect the characteristics of human emotion in speech. Moreover, traditional feature extraction performs inconsistently across different emotion recognition tasks. Different spectrograms carry information reflecting different emotions. This paper proposes a systematic approach to implementing an effective emotion recognition system based on deep convolutional neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed, and principal component analysis (PCA) is used to reduce the dimensionality and suppress interference. The PCA-whitened spectrogram is then split into non-overlapping segments. The DCNN is constructed to learn a representation of emotion from the segments using labeled training speech data. Our preliminary experiments show that the proposed DCNN-based emotion recognition system (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. It also outperforms SVM-based classification using hand-crafted acoustic features.
154 citations
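The preprocessing chain described in the abstract above (log-spectrogram, PCA whitening, non-overlapping segments) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, number of PCA components, and segment length are all assumed values, the function names are hypothetical, and random noise stands in for real speech.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT; rows are time frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-8)  # small offset avoids log(0)


def pca_whiten(features, n_components):
    """Project frames onto the top principal components, scaled to unit variance."""
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / (len(centered) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return centered @ eigvecs[:, order] / np.sqrt(eigvals[order] + 1e-8)


def split_segments(frames, segment_len):
    """Cut the frame sequence into non-overlapping segments, dropping the remainder."""
    n_segments = len(frames) // segment_len
    return frames[:n_segments * segment_len].reshape(n_segments, segment_len, -1)


rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)  # placeholder for one second of 16 kHz speech
spec = log_spectrogram(audio)
whitened = pca_whiten(spec, n_components=40)  # 40 components is an assumption
segments = split_segments(whitened, segment_len=20)  # 20 frames is an assumption
```

Each entry of `segments` is a fixed-size time-by-component patch of the kind that could be fed to a convolutional network; the network itself is outside the scope of this sketch.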
••
TL;DR: A new strategy for reliable automatic classification of local seismic signals and volcano-tectonic earthquakes (vt) is presented, based on a supervised neural network in which a new approach for feature extraction from short period seismic signals is applied.
Abstract: We present a new strategy for reliable automatic classification of local seismic signals and volcano-tectonic earthquakes (vt). The method is based on a supervised neural network in which a new approach for feature extraction from short period seismic signals is applied. To reduce the number of records required for the analysis, we set up a specialized neural classifier, able to distinguish two classes of signals, for each of the selected stations. The neural network architecture is a multilayer perceptron (mlp) with a single hidden layer. Spectral features of the signals and the parameterized attributes of their waveforms have been used as input for this network. Feature extraction is done by using both the linear predictive coding technique for computing the spectrograms, and a function of the amplitude for characterizing waveforms. Compared to strategies that use only spectral signatures, the inclusion of properly normalized amplitude features improves the performance of the classifiers and allows the network to generalize better. To train the mlp network, we compared the performance of the quasi-Newton algorithm with the scaled conjugate gradient method. We found that the scaled conjugate gradient approach is the faster of the two, with nearly equal performance. Our method was tested on a dataset recorded by four selected stations of the Mt. Vesuvius monitoring network, for the discrimination of low magnitude vt events and transient signals caused by either artificial (quarry blasts, underwater explosions) or natural (thunder) sources. In this test application we obtained 100% correct classification for one of the possible pairs of signal types (vt versus quarry blasts). Because this method was developed independently of this particular discrimination task, it can be applied to a broad range of other applications.
153 citations
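As a rough illustration of computing spectrograms by linear prediction, as the abstract above mentions, the sketch below fits an all-pole model to each frame with the autocorrelation method and stacks the resulting spectra. The frame length, hop, model order, sampling rate, and test signal are assumptions chosen for the example, and the function names are hypothetical; the abstract does not state the settings used in the study.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC: returns a_1..a_p with x[n] ~ sum_k a_k * x[n-k]."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order (zero lag sits at index n - 1 in "full" mode).
    autocorr = np.correlate(windowed, windowed, mode="full")[n - 1:n + order]
    # Toeplitz normal equations R a = r, with R[i, j] = autocorr[|i - j|].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    return np.linalg.solve(autocorr[lags], autocorr[1:order + 1])


def lpc_spectrum(frame, order, n_fft=512):
    """All-pole power spectrum 1 / |A(f)|^2, where A is the prediction-error filter."""
    a = lpc_coefficients(frame, order)
    error_filter = np.concatenate(([1.0], -a))
    return 1.0 / np.abs(np.fft.rfft(error_filter, n_fft)) ** 2


def lpc_spectrogram(signal, frame_len=256, hop=128, order=4, n_fft=512):
    """Stack per-frame LPC spectra; rows are time frames, columns are frequency bins."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([lpc_spectrum(signal[i * hop:i * hop + frame_len], order, n_fft)
                     for i in range(n_frames)])


fs = 8000.0  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
t = np.arange(2048) / fs
# Synthetic test signal: a 1 kHz tone plus weak noise (not seismic data).
signal = np.sin(2 * np.pi * 1000.0 * t) + 0.05 * rng.standard_normal(len(t))

spectrogram = lpc_spectrogram(signal)
freqs = np.fft.rfftfreq(512, d=1.0 / fs)
peak_hz = freqs[np.argmax(spectrogram.mean(axis=0))]
```

For this synthetic tone, `peak_hz` should land close to 1 kHz. In a classifier of the kind described, each row of `spectrogram` (or a compressed version of it) could serve as part of the input feature vector alongside amplitude features.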
••
TL;DR: A new kernel for the design of a high resolution time-frequency distribution (TFD) is introduced and it is shown that this distribution can solve problems that the Wigner-Ville distribution (WVD) or the spectrogram cannot.
Abstract: The paper introduces a new kernel for the design of a high resolution time-frequency distribution (TFD). We show that this distribution can solve problems that the Wigner-Ville distribution (WVD) or the spectrogram cannot. In particular, the proposed distribution can resolve two close signals in the time-frequency domain that the two other distributions cannot. Moreover, we show that the proposed distribution is more accurate than the WVD and the spectrogram in the estimation of the instantaneous frequency of a stepped FM signal embedded in additive Gaussian noise. Synthetic and real data collected from real-world applications are shown to validate the proposed distribution.
150 citations
••
Abstract: Seismic methods used in the study of snow avalanches may be employed to detect and characterize landslides and other mass movements, using standard spectrogram/sonogram analysis. For snow avalanches, the spectrogram for a station that is approached by a sliding mass exhibits a triangular time/frequency signature due to an increase over time in the higher-frequency constituents. Recognition of this characteristic footprint in a spectrogram suggests a useful metric for identifying other mass-movement events such as landslides. The 1 June 2005 slide at Laguna Beach, California is examined using data obtained from the Caltech/USGS Regional Seismic Network. This event exhibits the same general spectrogram features observed in studies of Alpine snow avalanches. We propose that these features are due to the systematic relative increase in high-frequency energy transmitted to a seismometer in the path of a mass slide owing to a reduction of distance from the source signal. This phenomenon is related to the path of the waves whose high frequencies are less attenuated as they traverse shorter source-receiver paths. Entrainment of material in the course of the slide may also contribute to the triangular time/frequency signature as a consequence of the increase in the energy involved in the process; in this case the contribution would be a source effect. By applying this commonly observed characteristic to routine monitoring algorithms, along with custom adjustments for local site effects, we seek to contribute to the improvement in automatic detection and monitoring methods of landslides and other mass movements.
149 citations
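The path effect proposed in the abstract above can be illustrated with the standard anelastic attenuation factor exp(-pi * f * r / (Q * v)), under which high frequencies survive better over shorter source-receiver distances. The sketch below is a toy model only: the quality factor, wave speed, detection threshold, and distances are assumed values not taken from the paper, and geometric spreading and source effects such as entrainment are ignored.

```python
import numpy as np

Q = 50.0           # assumed quality factor of the medium
VELOCITY = 2000.0  # assumed wave speed in m/s
THRESHOLD = 0.1    # assumed fraction of source amplitude still considered visible


def attenuation(freq_hz, distance_m):
    """Fraction of amplitude remaining after anelastic attenuation along the path."""
    return np.exp(-np.pi * freq_hz * distance_m / (Q * VELOCITY))


def cutoff_frequency(distance_m):
    """Highest frequency whose remaining amplitude is still at least THRESHOLD."""
    return -np.log(THRESHOLD) * Q * VELOCITY / (np.pi * distance_m)


# Source approaching the station: distance shrinks from 5 km to 0.5 km.
distances = np.linspace(5000.0, 500.0, 10)
cutoffs = cutoff_frequency(distances)

for r, f in zip(distances, cutoffs):
    print(f"distance {r:7.1f} m -> visible up to about {f:6.1f} Hz")
```

In this toy model the visible bandwidth widens steadily as the distance shrinks, which is the widening upper edge that gives a spectrogram its triangular appearance.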
••
15 Sep 2019
TL;DR: A novel system is presented that separates the voice of a target speaker from multi-speaker signals by making use of a reference signal from the target speaker; this is achieved by training two separate neural networks.
149 citations