scispace - formally typeset
Topic: Spectrogram

About: Spectrogram is a research topic. Over its lifetime, 5,813 publications on this topic have been published, receiving 81,547 citations.


Papers
Proceedings ArticleDOI
22 May 2011
TL;DR: A semi-supervised source separation methodology is presented that denoises speech by modeling speech as one source and noise as the other, separating the two with the recently proposed non-negative factorial hidden Markov model.
Abstract: We present a semi-supervised source separation methodology to denoise speech by modeling speech as one source and noise as the other source. We model speech using the recently proposed non-negative hidden Markov model, which uses multiple non-negative dictionaries and a Markov chain to jointly model spectral structure and temporal dynamics of speech. We perform separation of the speech and noise using the recently proposed non-negative factorial hidden Markov model. Although the speech model is learned from training data, the noise model is learned during the separation process and requires no training data. We show that the proposed method achieves superior results to using non-negative spectrogram factorization, which ignores the non-stationarity and temporal dynamics of speech.
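The non-negative spectrogram factorization baseline that the paper compares against can be sketched with standard multiplicative-update NMF. The function below is a minimal illustration, not the authors' code; the function name, rank, and iteration count are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Factor a magnitude spectrogram V (freq x time) into a non-negative
    dictionary W and activations H via multiplicative updates that
    decrease the Euclidean distance ||V - W @ H||."""
    rng = np.random.default_rng(0)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update dictionary
    return W, H
```

In the semi-supervised setting the abstract describes, a speech dictionary would be learned from clean training data and held fixed during separation, while the noise dictionary is estimated from the mixture itself; the paper's contribution is to replace this static factorization with a model that also captures temporal dynamics.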

96 citations

Journal ArticleDOI
TL;DR: Analyses and comparisons of the spectrogram, Wigner distribution, and wavelet transform applied to the phonocardiogram (PCG) signal are presented, in order to assess each technique's ability to separate and clearly display the internal components of these sounds.

95 citations

Journal ArticleDOI
TL;DR: An algorithm is presented for the detection of frequency contour sounds: whistles of dolphins and many other odontocetes, moans of baleen whales, chirps of birds, and numerous other animal and non-animal sounds.
Abstract: An algorithm is presented for the detection of frequency contour sounds: whistles of dolphins and many other odontocetes, moans of baleen whales, chirps of birds, and numerous other animal and non-animal sounds. The algorithm works by tracking spectral peaks over time, grouping together peaks in successive time slices in a spectrogram if the peaks are sufficiently near in frequency and form a smooth contour over time. The algorithm has nine parameters, including the ones needed for spectrogram calculation and normalization. Finding optimal values for all of these parameters simultaneously requires a search of parameter space, and a grid search technique is described. The frequency contour detection method and parameter optimization technique are applied to the problem of detecting "boing" sounds of minke whales from near Hawaii. The test data set contained many humpback whale sounds in the frequency range of interest. Detection performance is quantified, and the method is found to work well at detecting boings, with a false-detection rate of 3% for the target missed-call rate of 25%. It has also worked well anecdotally for other marine and some terrestrial species, and could be applied to any species that produces a frequency contour, or to non-animal sounds as well.
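The peak-tracking step described above can be sketched as follows. This is an illustrative reimplementation under simplified assumptions (a fixed amplitude threshold and nearest-peak linking), not the authors' nine-parameter algorithm; all names and defaults are hypothetical.

```python
import numpy as np

def track_contours(spec, freqs, peak_thresh=0.5, max_jump=2.0, min_len=3):
    """Link spectral peaks in successive time slices of a spectrogram
    (freq x time) into frequency contours. A peak continues a contour
    if its frequency is within max_jump of the contour's last peak."""
    finished, active = [], []
    for t in range(spec.shape[1]):
        col = spec[:, t]
        # local maxima above threshold (interior bins only)
        peaks = [freqs[i] for i in range(1, len(col) - 1)
                 if col[i] > peak_thresh
                 and col[i] >= col[i - 1] and col[i] >= col[i + 1]]
        used, still_active = set(), []
        for contour in active:
            last_f = contour[-1][1]
            cands = [f for f in peaks
                     if f not in used and abs(f - last_f) <= max_jump]
            if cands:
                f = min(cands, key=lambda x: abs(x - last_f))
                used.add(f)
                contour.append((t, f))
                still_active.append(contour)
            elif len(contour) >= min_len:  # contour ends here
                finished.append(contour)
        # unmatched peaks start new contours
        still_active += [[(t, f)] for f in peaks if f not in used]
        active = still_active
    finished += [c for c in active if len(c) >= min_len]
    return finished
```

Each returned contour is a list of (time-slice, frequency) pairs; a real detector would additionally enforce contour smoothness and tune the thresholds by the grid search the abstract describes.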

95 citations

Journal Article
TL;DR: The detection system is capable of picking out a high proportion of right whale calls logged by a human operator, while at the same time working at a false alarm rate of only one or two calls per day, even in the presence of background noise from humpback whales and seismic exploration.
Abstract: A detector has been developed which can reliably detect right whale calls and distinguish them from those of other marine mammals and industrial noise. Detection is a two stage process. In the first, the spectrogram is smoothed by convolving it with a Gaussian kernel and the 'outlines' of sounds are extracted using an edge detection algorithm. This allows a number of parameters to be measured for each sound, including duration, bandwidth and details of the frequency contour such as the positions of maximum and minimum frequency. In the second stage, these parameters are used in a classification function in order to determine which sounds are from right whales. The classifier has been tuned by comparing data from a period when large numbers of right whales were known to be in the vicinity of bottom mounted recorders with data collected on days when it was believed, based on ship and aerial surveys, that no right whales were present. Overall, the detection system is capable of picking out a high proportion of right whale calls logged by a human operator, while at the same time working at a false alarm rate of only one or two calls per day, even in the presence of background noise from humpback whales and seismic exploration. Although it is impossible to reduce the false alarm rate for individual calls to zero whilst still maintaining adequate efficiency, by requiring the detection of several calls within a set waiting time, it is possible to reduce false alarm rate to a negligible level.
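The first stage above (smoothing the spectrogram with a Gaussian kernel, then outlining a sound and measuring parameters such as duration, bandwidth, and maximum/minimum frequency) can be sketched as below. The connected-region labeling and the second-stage classifier are omitted, and all names, sigma, and thresholds are illustrative assumptions rather than the detector's actual settings.

```python
import numpy as np

def gaussian_kernel2d(sigma):
    """Normalized 2-D Gaussian kernel with radius 3*sigma."""
    r = max(1, int(3 * sigma))
    ax = np.arange(-r, r + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def sound_features(spec, sigma=1.0, thresh=0.2):
    """Smooth a spectrogram (freq x time) with a Gaussian kernel, then
    threshold it and measure duration and bandwidth (in bins) of the
    energy above threshold, plus min and max frequency bins."""
    k = gaussian_kernel2d(sigma)
    r = k.shape[0] // 2
    padded = np.pad(spec.astype(float), r, mode="edge")
    smooth = np.empty_like(spec, dtype=float)
    for i in range(spec.shape[0]):
        for j in range(spec.shape[1]):
            # the kernel is symmetric, so correlation equals convolution
            smooth[i, j] = np.sum(padded[i:i + 2*r + 1, j:j + 2*r + 1] * k)
    f_idx, t_idx = np.nonzero(smooth > thresh)
    if f_idx.size == 0:
        return None
    return {"duration": int(t_idx.max() - t_idx.min() + 1),
            "bandwidth": int(f_idx.max() - f_idx.min() + 1),
            "f_min": int(f_idx.min()), "f_max": int(f_idx.max())}
```

Measurements like these would then feed the classification function that separates right whale calls from other sounds.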

95 citations

Proceedings ArticleDOI
12 May 2019
TL;DR: A neural network model for visual object segmentation and sound source separation is proposed that learns from natural videos through self-supervision; assigning semantic categories to network feature channels enables independent segmentation and separation after audio-visual training.
Abstract: Segmenting objects in images and separating sound sources in audio are challenging tasks, in part because traditional approaches require large amounts of labeled data. In this paper we develop a neural network model for visual object segmentation and sound source separation that learns from natural videos through self-supervision. The model is an extension of recently proposed work that maps image pixels to sounds [1]. Here, we introduce a learning approach to disentangle concepts in the neural networks, and assign semantic categories to network feature channels to enable independent image segmentation and sound source separation after audio-visual training on videos. Our evaluations show that the disentangled model outperforms several baselines in semantic segmentation and sound source separation.

95 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations (79% related)
Convolutional neural network: 74.7K papers, 2M citations (78% related)
Feature extraction: 111.8K papers, 2.1M citations (77% related)
Wavelet: 78K papers, 1.3M citations (76% related)
Support vector machine: 73.6K papers, 1.7M citations (75% related)
Performance Metrics
No. of papers in the topic in previous years:

Year   Papers
2024   1
2023   627
2022   1,396
2021   488
2020   595
2019   593