scispace - formally typeset
Topic

Spectrogram

About: Spectrogram is a research topic. Over its lifetime, 5,813 publications have been published on this topic, receiving 81,547 citations.


Papers
Posted Content
TL;DR: A hybrid framework that leverages the trade-off between temporal and frequency precision in audio representations to improve speech enhancement, yielding better performance and robustness than either model used individually.
Abstract: We present a hybrid framework that leverages the trade-off between temporal and frequency precision in audio representations to improve performance on the speech enhancement task. We first show that conventional approaches using specific representations, such as raw audio and spectrograms, are each effective at targeting different types of noise. By integrating both approaches, our model can learn multi-scale and multi-domain features, removing noise in different regions of the time-frequency space in a complementary way. Experimental results show that the proposed hybrid model yields better performance and robustness than using each model individually.

23 citations
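
The trade-off between temporal and frequency precision mentioned in the abstract above can be illustrated with a small spectrogram sketch. This is not the paper's model; it is a minimal, standard-library-only illustration, and the names `spectrogram` and `resolution`, the 8 kHz sample rate, and the window sizes are assumptions chosen for the example. A longer window gives finer frequency bins but fewer, coarser time frames.

```python
import cmath
import math


def spectrogram(signal, window_length, hop_length):
    """Magnitude spectrogram from a naive DFT with a periodic Hann window.

    Returns a list of frames; each frame holds window_length // 2 + 1 magnitudes.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_length)
              for n in range(window_length)]
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop_length):
        segment = [signal[start + n] * window[n] for n in range(window_length)]
        frame = []
        for k in range(window_length // 2 + 1):
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_length)
                      for n in range(window_length))
            frame.append(abs(acc))
        frames.append(frame)
    return frames


def resolution(sample_rate_hz, window_length):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate_hz / window_length, window_length / sample_rate_hz


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
sample_rate_hz = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate_hz) for n in range(1024)]

short_win = spectrogram(tone, window_length=64, hop_length=32)    # many frames, coarse bins
long_win = spectrogram(tone, window_length=256, hop_length=128)   # few frames, fine bins

for name, spec, size in (("short", short_win, 64), ("long", long_win, 256)):
    df, dt = resolution(sample_rate_hz, size)
    print(f"{name}: {len(spec)} frames x {len(spec[0])} bins, "
          f"{df} Hz per bin, {dt} s per window")
```

The product of bin spacing and window duration is always 1 in this sketch, which is why no single window size is precise in both time and frequency.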

Journal ArticleDOI
TL;DR: The aim of the work is to obtain an accurate and flexible tool for consistently executing and managing the unmanned monitoring of construction sites with distributed acoustic sensors, using a Deep Belief Network based approach.
Abstract: In this paper, we propose a Deep Belief Network (DBN) based approach for the classification of audio signals to improve work activity identification and remote surveillance of construction projects. The aim of the work is to obtain an accurate and flexible tool for consistently executing and managing the unmanned monitoring of construction sites by using distributed acoustic sensors. In this paper, ten classes of construction equipment and tools, frequently and broadly used on construction sites, have been collected and examined to conduct and validate the proposed approach. The input provided to the DBN consists of the concatenation of several statistics computed from a set of spectral features, such as MFCCs and the mel-scaled spectrogram. The proposed architecture, along with the preprocessing and feature extraction steps, is described in detail, and the effectiveness of the proposed idea is demonstrated by numerical results obtained on real-world recordings. The final overall accuracy on the test set is up to 98%, a significant improvement over other state-of-the-art approaches. A practical, real-time application of the presented method has also been proposed in order to apply the classification scheme to sound data recorded in different environmental scenarios.

23 citations
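
The abstract above describes the classifier input as a concatenation of statistics computed over spectral features. A minimal sketch of that idea follows. The abstract does not say which statistics were used, so the choice of mean, standard deviation, minimum, and maximum here is an assumption, as are the function name `summarize_features` and the toy input values.

```python
import statistics


def summarize_features(frames):
    """Collapse per-frame feature vectors into one fixed-length vector.

    frames: list of equal-length feature vectors, one per time frame
            (for example, MFCCs or mel-band energies for each frame).
    Returns the concatenation [means..., stdevs..., mins..., maxs...],
    with one value per feature coefficient in each group.
    """
    columns = list(zip(*frames))  # one tuple of values per coefficient
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return means + stdevs + mins + maxs


# Hypothetical input: 2 frames with 2 coefficients each.
toy_frames = [[1.0, 10.0], [3.0, 30.0]]
feature_vector = summarize_features(toy_frames)
print(feature_vector)
```

Summarizing over time this way yields a vector whose length depends only on the number of coefficients, not on the clip duration, which is convenient for a fixed-size network input.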

Journal ArticleDOI
TL;DR: An objective function is defined that exploits the temporal/spectral continuities of harmonic/percussive sounds and the sparsity of vocal sounds in the spectrogram domain and produces a performance significantly better than that of conventional algorithms or comparable to the state-of-the-art algorithms.
Abstract: In this letter, we describe a novel approach for separating a vocal signal from monaural music. We assume that the accompaniment in a music signal can be represented as the sum of the sustained harmonic and percussive sounds. Based on the observation that singing voices usually contain rapidly changing harmonic signals such as fast vibratos, slides, and/or glissandos, we propose a statistical model for the separation of harmonic/percussive and vocal sounds. To this end, we define an objective function that exploits the temporal/spectral continuities of harmonic/percussive sounds and the sparsity of vocal sounds in the spectrogram domain. Experimental results show that the proposed algorithm successfully separates the vocal from the accompaniment, resulting in a performance significantly better than that of conventional algorithms or comparable to the state-of-the-art algorithms.

23 citations
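
The continuity assumptions in the abstract above (harmonic sounds are smooth along time, percussive sounds are smooth along frequency) can be illustrated with median filtering. Note that this is a different and simpler technique than the objective-function approach the letter proposes, and it does not model the vocal component at all. The names `median_filter_1d` and `hpss_masks`, the filter width, and the toy spectrogram are assumptions made for this sketch.

```python
import statistics


def median_filter_1d(values, width):
    """Median filter with edge replication; width should be odd."""
    half = width // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        window = [values[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        out.append(statistics.median(window))
    return out


def hpss_masks(spec, width=3):
    """Binary harmonic/percussive masks for a magnitude spectrogram spec[f][t].

    Filtering along time keeps horizontal (harmonic) ridges;
    filtering along frequency keeps vertical (percussive) ridges.
    Ties, including silent bins, are assigned to the percussive mask.
    """
    n_freq, n_time = len(spec), len(spec[0])
    harm = [median_filter_1d(row, width) for row in spec]
    filtered_cols = [median_filter_1d([spec[f][t] for f in range(n_freq)], width)
                     for t in range(n_time)]
    perc = [[filtered_cols[t][f] for t in range(n_time)] for f in range(n_freq)]
    harmonic_mask = [[1 if harm[f][t] > perc[f][t] else 0 for t in range(n_time)]
                     for f in range(n_freq)]
    percussive_mask = [[1 - m for m in row] for row in harmonic_mask]
    return harmonic_mask, percussive_mask


# Toy spectrogram: a sustained tone in frequency row 1, a click in time column 3.
toy_spec = [[1.0 if (f == 1 or t == 3) else 0.0 for t in range(5)] for f in range(5)]
harmonic_mask, percussive_mask = hpss_masks(toy_spec)
for row in harmonic_mask:
    print(row)
```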

Proceedings ArticleDOI
10 Feb 2014
TL;DR: A system capable of monitoring food intake by means of a throat microphone, classifying the data by the food being consumed among several categories through spectrogram analysis, and providing user feedback in the form of a mobile application is described.
Abstract: Acoustic monitoring of food intake in an unobtrusive, wearable form factor can encourage healthy dietary choices by enabling individuals to monitor their eating patterns, maintain regularity in their meal times, and ensure adequate hydration levels. In this paper, we describe a system capable of monitoring food intake by means of a throat microphone, classifying the data by the food being consumed among several categories through spectrogram analysis, and providing user feedback in the form of a mobile application. We are able to classify sandwich swallows, sandwich chewing, water swallows, and none, with an F-Measure of 0.836.

23 citations
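
For readers unfamiliar with the F-Measure reported in the abstract above, the usual definition is the harmonic mean of precision and recall. The sketch below shows that formula for a single class. The abstract does not state how the score was averaged across its four classes, so this is only an illustration of the metric itself; the function name `f_measure` and the counts are hypothetical.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """F1 score: harmonic mean of precision and recall for one class."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one class: 8 correct detections, 2 false alarms, 2 misses.
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
print(round(score, 3))
```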

Journal ArticleDOI
TL;DR: In this paper, a Doppler feature matching search (DFMS) algorithm is proposed to locate the time centers of different bearings in the TFD spectrogram. With the determined time centers, time-frequency filters (TFF) are designed with thresholds to separate the acoustic signals in the time-frequency domain.

23 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 79% related
Convolutional neural network: 74.7K papers, 2M citations, 78% related
Feature extraction: 111.8K papers, 2.1M citations, 77% related
Wavelet: 78K papers, 1.3M citations, 76% related
Support vector machine: 73.6K papers, 1.7M citations, 75% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    627
2022    1,396
2021    488
2020    595
2019    593