Topic
Spectrogram
About: Spectrogram is a research topic. Over its lifetime, 5813 publications have been published within this topic, receiving 81547 citations.
Papers
TL;DR: A hybrid framework that leverages the trade-off between temporal and frequency precision in audio representations to improve speech enhancement, yielding better performance and robustness than using each model individually.
Abstract: We present a hybrid framework that leverages the trade-off between temporal and frequency precision in audio representations to improve the performance of the speech enhancement task. We first show that conventional approaches using specific representations such as raw audio and spectrograms are each effective at targeting different types of noise. By integrating both approaches, our model can learn multi-scale and multi-domain features, effectively removing noise in different regions of the time-frequency space in a complementary way. Experimental results show that the proposed hybrid model yields better performance and robustness than using each model individually.
23 citations
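The trade-off between temporal and frequency precision named in the abstract above can be illustrated with the standard short-time Fourier transform relationship: a longer analysis window gives finer frequency bins but covers a longer stretch of time. The sketch below is not taken from the paper; the function name `stft_resolution` and the 16 kHz sample rate are assumptions chosen only for illustration.

```python
def stft_resolution(sample_rate_hz, window_length):
    """Return (window duration in seconds, frequency bin spacing in Hz).

    For a window of N samples at sample rate fs, one frame spans N / fs
    seconds and the DFT bins are fs / N Hz apart, so improving one
    resolution necessarily worsens the other.
    """
    if sample_rate_hz <= 0 or window_length <= 0:
        raise ValueError("sample rate and window length must be positive")
    window_duration_s = window_length / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_length
    return window_duration_s, bin_spacing_hz


if __name__ == "__main__":
    # Hypothetical settings: 16 kHz audio with three window lengths.
    for n in (256, 1024, 4096):
        duration_s, spacing_hz = stft_resolution(16000, n)
        print(f"N={n}: {duration_s * 1000:.1f} ms per frame, "
              f"{spacing_hz:.2f} Hz per bin")
```

A raw-audio model sits at one extreme of this trade-off (full temporal precision, no explicit frequency axis), which is one way to read why the abstract treats the two representations as complementary.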
TL;DR: A Deep Belief Network based approach that aims to provide an accurate and flexible tool for consistently executing and managing the unmanned monitoring of construction sites using distributed acoustic sensors.
Abstract: In this paper, we propose a Deep Belief Network (DBN) based approach for the classification of audio signals to improve work activity identification and remote surveillance of construction projects. The aim of the work is to obtain an accurate and flexible tool for consistently executing and managing the unmanned monitoring of construction sites by using distributed acoustic sensors. Ten classes of construction equipment and tools, frequently and broadly used on construction sites, have been collected and examined to conduct and validate the proposed approach. The input provided to the DBN consists of the concatenation of several statistics computed from a set of spectral features, such as MFCCs and the mel-scaled spectrogram. The proposed architecture, along with the preprocessing and feature extraction steps, is described in detail, while the effectiveness of the proposed idea is demonstrated by numerical results obtained on real-world recordings. The final overall accuracy on the test set is up to 98%, a significant improvement over other state-of-the-art approaches. A practical, real-time application of the presented method has also been proposed in order to apply the classification scheme to sound data recorded in different environmental scenarios.
23 citations
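The abstract above describes the network input as a concatenation of statistics over spectral features, without listing which statistics are used. A minimal sketch of that idea follows, assuming mean, standard deviation, minimum, and maximum per feature band; the choice of statistics and the name `summarize_features` are hypothetical, and the feature frames are assumed to have been computed elsewhere (for example, MFCC or mel-spectrogram frames).

```python
from statistics import mean, pstdev


def summarize_features(frames):
    """Collapse a sequence of per-frame feature vectors into one fixed vector.

    frames: list of equal-length lists, one list of band values per frame.
    Returns the per-band means, standard deviations, minima, and maxima,
    concatenated in that order (an assumed set of statistics).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_bands = len(frames[0])
    if any(len(frame) != n_bands for frame in frames):
        raise ValueError("all frames must have the same number of bands")

    # Transpose so that each tuple holds one band's values over time.
    bands = list(zip(*frames))
    means = [mean(band) for band in bands]
    stds = [pstdev(band) for band in bands]
    mins = [min(band) for band in bands]
    maxs = [max(band) for band in bands]
    return means + stds + mins + maxs


if __name__ == "__main__":
    toy_frames = [[1.0, 2.0], [3.0, 6.0]]  # two frames, two bands
    print(summarize_features(toy_frames))
```

The output length depends only on the number of bands and statistics, not on the clip duration, which is what lets recordings of different lengths feed a fixed-size network input.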
TL;DR: An objective function is defined that exploits the temporal/spectral continuities of harmonic/percussive sounds and the sparsity of vocal sounds in the spectrogram domain and produces a performance significantly better than that of conventional algorithms or comparable to the state-of-the-art algorithms.
Abstract: In this letter, we describe a novel approach for separating a vocal signal from monaural music. We assume that the accompaniment in a music signal can be represented as the sum of the sustained harmonic and percussive sounds. Based on the observation that singing voices usually contain rapidly changing harmonic signals such as fast vibratos, slides, and/or glissandos, we propose a statistical model for the separation of harmonic/percussive and vocal sounds. To this end, we define an objective function that exploits the temporal/spectral continuities of harmonic/percussive sounds and the sparsity of vocal sounds in the spectrogram domain. Experimental results show that the proposed algorithm successfully separates the vocal from the accompaniment, resulting in a performance significantly better than that of conventional algorithms or comparable to the state-of-the-art algorithms.
23 citations
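The continuity assumption in the abstract above can be made concrete on a toy spectrogram: a sustained harmonic tone is smooth along time, while a percussive hit is smooth along frequency. The sketch below only measures those two kinds of smoothness with squared differences; it is not the paper's objective function, and the names `temporal_roughness` and `spectral_roughness` are hypothetical.

```python
def temporal_roughness(spec):
    """Sum of squared differences between neighbouring time frames.

    spec is indexed as spec[frequency_bin][time_frame].
    """
    return sum(
        (row[t + 1] - row[t]) ** 2
        for row in spec
        for t in range(len(row) - 1)
    )


def spectral_roughness(spec):
    """Sum of squared differences between neighbouring frequency bins."""
    n_frames = len(spec[0])
    return sum(
        (spec[f + 1][t] - spec[f][t]) ** 2
        for f in range(len(spec) - 1)
        for t in range(n_frames)
    )


# Toy magnitude spectrograms with 3 frequency bins and 4 time frames.
harmonic = [[0, 0, 0, 0],
            [1, 1, 1, 1],   # one sustained tone: a horizontal ridge
            [0, 0, 0, 0]]
percussive = [[0, 1, 0, 0],
              [0, 1, 0, 0],  # one broadband hit: a vertical ridge
              [0, 1, 0, 0]]

if __name__ == "__main__":
    for name, spec in (("harmonic", harmonic), ("percussive", percussive)):
        print(name, temporal_roughness(spec), spectral_roughness(spec))
```

A component that is rough in both directions fits neither pattern, which is consistent with the abstract modelling rapidly changing vocal sounds separately through sparsity.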
10 Feb 2014
TL;DR: A system capable of monitoring food intake by means of a throat microphone, classifying the data based on the food being consumed among several categories through spectrogram analysis, and providing user feedback in the form of a mobile application is described.
Abstract: Acoustic monitoring of food intake in an unobtrusive, wearable form-factor can encourage healthy dietary choices by enabling individuals to monitor their eating patterns, maintain regularity in their meal times, and ensure adequate hydration levels. In this paper, we describe a system capable of monitoring food intake by means of a throat microphone, classifying the data based on the food being consumed among several categories through spectrogram analysis, and providing user feedback in the form of a mobile application. We are able to classify sandwich swallows, sandwich chewing, water swallows, and none, with an F-Measure of 0.836.
23 citations
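For readers unfamiliar with the metric reported in the abstract above, the F-measure is the harmonic mean of precision and recall. The sketch below computes it for a single class from hypothetical counts; the abstract does not say how the score was averaged across its four classes, so this is only the per-class definition, and the function name `f_measure` is an assumption.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Return the F1 score, the harmonic mean of precision and recall.

    Equivalent to 2 * TP / (2 * TP + FP + FN). Returns 0.0 when there are
    no true positives, false positives, or false negatives to score.
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


if __name__ == "__main__":
    # Hypothetical counts for one class, not taken from the paper.
    print(f_measure(true_positives=8, false_positives=2, false_negatives=2))
```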
TL;DR: In this paper, a Doppler feature matching search (DFMS) algorithm is proposed to locate the time centers of different bearings in the TFD spectrogram. With the determined time centers, time-frequency filters (TFFs) are designed with thresholds to separate the acoustic signals in the time-frequency domain.
23 citations