Topic

Spectrogram

About: Spectrogram is a research topic. Over the lifetime of the topic, 5,813 publications have been published, receiving 81,547 citations.


Papers
Journal ArticleDOI
26 Apr 2018 - J3ea
TL;DR: A visual approach to the analysis of environmental recordings using long-duration false-colour (LDFC) spectrograms, prepared from combinations of spectral indices, that bridges the gap between bioacoustics and ecoacoustics, encompassing temporal scales across three orders of magnitude.
Abstract: Long-duration recordings of the natural environment have many advantages in passive monitoring of animal diversity. Technological advances now enable the collection of far more audio than can be listened to, necessitating the development of scalable approaches for distinguishing signal from noise. Computational methods, using automated species recognisers, have improved in accuracy but require considerable coding expertise. The content of environmental recordings is unconstrained, and the creation of labelled datasets required for machine learning purposes is a time-consuming, expensive enterprise. Here, we describe a visual approach to the analysis of environmental recordings using long-duration false-colour (LDFC) spectrograms, prepared from combinations of spectral indices. The technique was originally developed to visualize 24-hour “soundscapes.” A soundscape is an ecoacoustics concept that encompasses the totality of sound in an ecosystem. We describe three case studies to demonstrate how LDFC spectrograms can be used, not only to study soundscapes, but also to monitor individual species within them. In the first case, LDFC spectrograms help to solve a “needle in the haystack” problem—to locate vocalisations of the furtive Lewin’s Rail (Tasmanian), Lewinia pectoralis brachipus. We extend the technique by using a machine learning method to scan multiple days of LDFC spectrograms. In the second case study, we demonstrate that frog choruses are easily identified in LDFC spectrograms because of their extended time-scale. Although calls of individual frogs are lost in the cacophony of sound, spectral indices can distinguish different chorus characteristics. Third, we demonstrate that the method can be extended to the detection of bat echolocation calls. By converting complex acoustic data into readily interpretable images, our practical approach bridges the gap between bioacoustics and ecoacoustics, encompassing temporal scales across three orders of magnitude. Using the one methodology, it is possible to monitor entire soundscapes and individual species within those soundscapes.
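The construction of an LDFC spectrogram can be sketched as follows (a minimal illustration, not the authors' released pipeline; the choice of indices, the one-minute resolution, the sample rate, and all function names below are assumptions): three per-minute spectral-index matrices are normalised and mapped to the red, green, and blue channels of one image, so that many hours of audio collapse into a single false-colour picture.

```python
# Minimal sketch of a long-duration false-colour (LDFC) spectrogram.
# Assumption: three spectral-index matrices of shape (freq_bins, minutes)
# have already been computed, one value per frequency bin per minute of audio,
# with row 0 taken to be the lowest frequency bin.
import numpy as np
import matplotlib.pyplot as plt

def normalise(index_matrix, lo=2, hi=98):
    """Rescale an index matrix to [0, 1] using robust percentile bounds."""
    p_lo, p_hi = np.percentile(index_matrix, [lo, hi])
    return np.clip((index_matrix - p_lo) / (p_hi - p_lo + 1e-9), 0.0, 1.0)

def ldfc_image(index_red, index_green, index_blue):
    """Stack three normalised spectral indices into an RGB false-colour image."""
    return np.dstack([normalise(index_red),
                      normalise(index_green),
                      normalise(index_blue)])

# Toy data: 256 frequency bins x 1440 minutes (24 hours), purely illustrative.
rng = np.random.default_rng(0)
aci, ent, evn = (rng.random((256, 1440)) for _ in range(3))

plt.imshow(ldfc_image(aci, ent, evn), aspect="auto", origin="lower",
           extent=[0, 24, 0, 11.025])   # hours on x, kHz on y (22.05 kHz rate assumed)
plt.xlabel("Time (hours)"); plt.ylabel("Frequency (kHz)")
plt.title("LDFC spectrogram (toy data)")
plt.show()
```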

42 citations

Patent
22 Feb 1999
TL;DR: In this patent, a method is described for identifying mutations, if any, present in a biological sample, from a pre-selected set of known mutations; the method can be applied to DNA, RNA and peptide nucleic acid (PNA) microarrays.
Abstract: A technique is described for identifying mutations, if any, present in a biological sample, from a pre-selected set of known mutations. The method can be applied to DNA, RNA and peptide nucleic acid (PNA) microarrays. The method analyzes a dot spectrogram representative of quantized hybridization activity of oligonucleotides in the sample to identify the mutations. In accordance with the method, a resonance pattern is generated which is representative of nonlinear resonances between a stimulus pattern associated with the set of known mutations and the dot spectrogram. The resonance pattern is interpreted to yield a set of confirmed mutations by comparing resonances found therein with predetermined resonances expected for the selected set of mutations. In a particular example, the resonance pattern is generated by iteratively processing the dot spectrogram by performing a convergent reverberation to yield a resonance pattern representative of resonances between a predetermined set of selected Quantum Expressor Functions and the dot spectrogram until a predetermined degree of convergence is achieved between the resonances found in the resonance pattern and resonances expected for the set of mutations. The resonance pattern is analyzed to yield a set of confirmed mutations by mapping the confirmed mutations to known diseases associated with the pre-selected set of known mutations to identify diseases, if any, indicated by the biological sample. By exploiting a resonant interaction, mutation signatures may be robustly identified even in circumstances involving low signal-to-noise ratios or, in some cases, negative signal-to-noise ratios.
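The patent text leaves the "convergent reverberation" step abstract; the toy sketch below is only one possible reading of the iterative matching loop, and every name, the correlation-based notion of "resonance", and the convergence test are assumptions rather than the patent's actual procedure.

```python
# Highly simplified, hypothetical reading of the patent's iterative matching loop.
# The dot spectrogram is modelled as a 2-D array of quantised hybridisation
# intensities; each "expressor" is a same-sized template for one known mutation.
import numpy as np

def resonance_scores(dot_spectrogram, expressors, n_iter=50, tol=1e-6):
    """Iteratively correlate templates with the renormalised dot spectrogram
    until the score vector stops changing; return one score per mutation."""
    x = dot_spectrogram.astype(float)
    scores = np.zeros(len(expressors))
    for _ in range(n_iter):
        x = x / (np.linalg.norm(x) + 1e-12)          # renormalisation step (assumed)
        new = np.array([np.sum(x * e) for e in expressors])
        if np.max(np.abs(new - scores)) < tol:       # predetermined degree of convergence
            break
        scores = new
    return scores

def confirmed_mutations(scores, expected, threshold=0.8):
    """Compare observed resonances with the resonances expected for each mutation."""
    return [i for i, (s, exp) in enumerate(zip(scores, expected))
            if s >= threshold * exp]
```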

42 citations

Journal ArticleDOI
TL;DR: This work investigates CNNs for the task of speech-music discrimination; it is the first to exploit transfer learning across very different domains for audio modeling with deep learning, fine-tuning a deep architecture originally trained for the ImageNet classification task.
Abstract: Speech-music discrimination is a traditional task in audio analytics, useful for a wide range of applications, such as automatic speech recognition and radio broadcast monitoring, that focuses on segmenting audio streams and classifying each segment as either speech or music. In this paper we investigate the capabilities of Convolutional Neural Networks (CNNs) with regard to the speech-music discrimination task. Instead of representing the audio content using handcrafted audio features, as traditional methods do, we use deep structures to learn visual feature dependencies as they appear in the spectrogram domain (i.e. we train a CNN using audio spectrograms as input images). The main contribution of our work focuses on the potential of using pre-trained deep architectures along with transfer learning to train robust audio classifiers for the particular task of speech-music discrimination. We highlight the superiority of the proposed methods, compared both to typical audio-based and deep-learning methods that adopt handcrafted features, and we evaluate our system in terms of classification success and run-time execution. To our knowledge this is the first work that investigates CNNs for the task of speech-music discrimination and the first that exploits transfer learning across very different domains for audio modeling using deep learning in general. In particular, we fine-tune a deep architecture originally trained for the ImageNet classification task, using a relatively small amount of data (almost 80 min of training audio samples) along with data augmentation. We evaluate our system through extensive experimentation against three different datasets. Firstly, we experiment on a real-world dataset of more than 10 h of uninterrupted radio broadcasts and secondly, for comparison purposes, we evaluate our best method on two publicly available datasets that were designed specifically for the task of speech-music discrimination. Our results indicate that CNNs can significantly outperform the current state of the art on all three test datasets, especially when transfer learning is applied. All the discussed methods, along with the whole experimental setup and the respective datasets, are openly provided for reproduction and further experimentation.
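A minimal sketch of the transfer-learning recipe the abstract describes, assuming a PyTorch setup and an ImageNet-pretrained ResNet-18 (the paper does not specify this particular backbone, these hyper-parameters, or the spectrogram_batch placeholder): spectrogram images are fed to a pretrained CNN whose final layer is replaced by a two-class speech/music head, and only that head is fine-tuned.

```python
# Hedged sketch: fine-tune an ImageNet-pretrained CNN on spectrogram images
# for two-class speech/music discrimination. Backbone choice, learning rate,
# and the spectrogram_batch placeholder are assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

for param in model.parameters():                 # freeze the pretrained feature extractor
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, 2)    # new speech-vs-music head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(spectrogram_batch, labels):
    """spectrogram_batch: (N, 3, H, W) tensor of RGB spectrogram images."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(spectrogram_batch), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```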

42 citations

Journal ArticleDOI
TL;DR: In this article, the authors developed a method that uses the dynamic Allan variance and the spectrogram to detect and to identify the typical anomalies of an atomic clock, and applied the method to simulated data.
Abstract: When an anomaly occurs in an atomic clock, its stability and frequency spectrum change with time. The variation with time of the stability can be evaluated with the dynamic Allan variance. The variation with time of the frequency spectrum can be described with the spectrogram, a time–frequency distribution. We develop a method that uses the dynamic Allan variance and the spectrogram to detect and to identify the typical anomalies of an atomic clock. We apply the method to simulated data.
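A minimal sketch of the two diagnostics the abstract combines, assuming a simulated fractional-frequency series sampled at interval tau0 (the authors' simulation parameters, the dynamic windowing, and any detection thresholds are not reproduced here): the spectrogram tracks how the frequency spectrum changes over time, and the Allan deviation tracks how the stability changes with averaging time.

```python
# Hedged sketch: spectrogram plus overlapping Allan deviation for a simulated
# fractional-frequency series of an atomic clock. Sampling interval, noise level,
# and the injected anomaly are illustrative assumptions.
import numpy as np
from scipy.signal import spectrogram

tau0 = 1.0                                   # sampling interval in seconds (assumed)
rng = np.random.default_rng(1)
y = rng.normal(0, 1e-12, 100_000)            # white frequency noise
y[60_000:] += 5e-12                          # injected frequency jump (toy anomaly)

# Time-frequency view: how the spectrum evolves around the anomaly.
f, t, Sxx = spectrogram(y, fs=1.0 / tau0, nperseg=4096)

def allan_deviation(y, tau0, m):
    """Overlapping Allan deviation at averaging time m * tau0."""
    avg = np.convolve(y, np.ones(m) / m, mode="valid")   # m-sample frequency averages
    d = avg[m:] - avg[:-m]                                # differences of adjacent averages
    return np.sqrt(0.5 * np.mean(d ** 2))

taus = [1, 2, 4, 8, 16, 32, 64]
adev = [allan_deviation(y, tau0, m) for m in taus]
```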

42 citations

Proceedings ArticleDOI
05 Mar 2017
TL;DR: A novel optimization problem involving the minimization of nuclear norms and matrix ℓ1-norms is solved, and the proposed method is evaluated on 1) visual localization and audio separation and 2) visual-assisted audio denoising.
Abstract: The ability to localize visual objects that are associated with an audio source and at the same time separate the audio signal is a cornerstone in several audio-visual signal processing applications. Past efforts usually focused on localizing only the visual objects, without audio separation abilities. Besides, they often rely on computationally expensive pre-processing steps to segment image pixels into object regions before applying localization approaches. We aim to address the problem of audio-visual source localization and separation in an unsupervised manner. The proposed approach employs low-rank modelling in order to capture the background visual and audio information and sparsity in order to extract the sparsely correlated components between the audio and visual modalities. In particular, this model decomposes each dataset into a sum of two terms: the low-rank matrices capture the background uncorrelated information, while the sparse correlated components model the sound source in the visual modality and the associated sound in the audio modality. To this end, a novel optimization problem involving the minimization of nuclear norms and matrix ℓ1-norms is solved. We evaluated the proposed method on 1) visual localization and audio separation and 2) visual-assisted audio denoising. The experimental results demonstrate the effectiveness of the proposed method.
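The decomposition the abstract describes is in the spirit of robust PCA; the sketch below shows only the generic nuclear-norm-plus-ℓ1 building block for a single modality (singular-value thresholding for the low-rank background plus soft-thresholding for the sparse, source-related part). The coupling between the audio and visual matrices used in the paper, and all parameter values, are assumptions or omitted.

```python
# Hedged sketch: decompose one data matrix D into a low-rank background L and a
# sparse component S by alternating singular-value and soft thresholding.
# This is the generic robust-PCA building block, not the paper's exact
# audio-visual coupled formulation; lam, mu, and n_iter are illustrative.
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def svd_threshold(x, t):
    """Singular-value thresholding (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(soft_threshold(s, t)) @ vt

def low_rank_sparse(D, lam=None, mu=None, n_iter=100):
    """Split D ~= L + S with L low-rank and S sparse (simple alternating scheme)."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(D.shape))    # standard robust-PCA weighting
    if mu is None:
        mu = 0.25 * np.abs(D).mean()         # step-size heuristic (assumed)
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(n_iter):
        L = svd_threshold(D - S, mu)         # keep the low-rank background
        S = soft_threshold(D - L, lam * mu)  # keep the sparse, source-related part
    return L, S
```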

42 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 79% related
Convolutional neural network: 74.7K papers, 2M citations, 78% related
Feature extraction: 111.8K papers, 2.1M citations, 77% related
Wavelet: 78K papers, 1.3M citations, 76% related
Support vector machine: 73.6K papers, 1.7M citations, 75% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2024    1
2023    627
2022    1,396
2021    488
2020    595
2019    593