Topic
Spectrogram
About: Spectrogram is a research topic. Over the lifetime, 5813 publications have been published within this topic receiving 81547 citations.
Papers
16 Jun 2011
TL;DR: An alternative approach to music genre classification that converts the audio signal into spectrograms and extracts features from this visual representation; the classifier trained on texture features performs comparably to results reported in the literature.
Abstract: In this paper we present an alternative approach for music genre classification which converts the audio signal into spectrograms and then extracts features from this visual representation. The idea is that, by treating the time-frequency representation as a texture image, we can extract features to build reliable music genre classification systems. The proposed approach also uses a zoning mechanism to perform local feature extraction, which has proven to be quite efficient. On a very challenging dataset of 900 music pieces divided among 10 music genres, we demonstrate that the classifier trained with texture features performs comparably to results in the literature. In addition, when it was combined with other classifiers trained with short-term, low-level characteristics of the music audio signal, the recognition rate improved by about 7 percentage points.
95 citations
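
The zoning idea in the abstract above (split the spectrogram image into regions and extract local features from each) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the frame length, hop size, number of zones, and the use of per-zone mean and standard deviation as stand-ins for texture descriptors are all assumptions, and the function names `spectrogram` and `zoned_features` are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, n_frames)
    return np.log1p(magnitude)

def zoned_features(spec, n_zones=4):
    """Split the frequency axis into zones and describe each zone locally.

    Mean and standard deviation are placeholder descriptors (an assumption);
    any texture descriptor could be computed per zone instead.
    """
    zones = np.array_split(spec, n_zones, axis=0)
    return np.array([[z.mean(), z.std()] for z in zones]).ravel()

# Synthetic one-second 440 Hz tone sampled at 8 kHz (illustrative input only).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(tone)
features = zoned_features(spec)  # one (mean, std) pair per frequency zone
```

The resulting feature vector concatenates the per-zone descriptors, so a classifier can weight low-frequency and high-frequency regions differently.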
27 Apr 2007
TL;DR: This paper examines the problem of human target detection and identification using single-channel, airborne, synthetic aperture radar (SAR) using a MATLAB simulation environment and shows that spectrograms have some ability to detect and identify human targets in low noise.
Abstract: Radar offers unique advantages over other sensors, such as visual or seismic sensors, for human target detection. Many situations, especially military applications, prevent the placement of video cameras or the implantation of seismic sensors in the area being observed, because of security or other threats. However, radar can operate far away from potential targets, and functions during daytime as well as nighttime, in virtually all weather conditions. In this paper, we examine the problem of human target detection and identification using single-channel, airborne, synthetic aperture radar (SAR). Human targets are differentiated from other detected slow-moving targets by analyzing the spectrogram of each potential target. Human spectrograms are unique, and can be used not just to identify targets as human, but also to determine features about the human target being observed, such as size, gender, action, and speed. A 12-point human model, together with kinematic equations of motion for each body part, is used to calculate the expected target return and spectrogram. A MATLAB simulation environment is developed including ground clutter, human and non-human targets for the testing of spectrogram-based detection and identification algorithms. Simulations show that spectrograms have some ability to detect and identify human targets in low noise. An example gender discrimination system correctly detected 83.97% of males and 91.11% of females. The problems and limitations of spectrogram-based methods in high clutter environments are discussed. The SNR loss inherent to spectrogram-based methods is quantified. An alternate detection and identification method that will be used as a basis for future work is proposed.
94 citations
06 Jul 2011
TL;DR: Experimental results show that using masks after NMF improves the separation process even when calculating NMF with fewer iterations, which yields a faster separation process.
Abstract: A single channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with spectral masks is proposed in this work. The proposed algorithm uses training data of speech and music signals with nonnegative matrix factorization followed by masking to separate the mixed signal. In the training stage, NMF uses the training data to train a set of basis vectors for each source. These bases are trained using NMF in the magnitude spectrum domain. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a linear combination of the trained bases for both sources. The decomposition results are used to build a mask, which explains the contribution of each source in the mixed signal. Experimental results show that using masks after NMF improves the separation process even when calculating NMF with fewer iterations, which yields a faster separation process.
94 citations
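
The train, decompose, then mask pipeline described in the NMF abstract above might look roughly like the sketch below. It rests on several assumptions not stated in the abstract: a Euclidean cost with multiplicative updates, a soft ratio mask, random non-negative matrices standing in for magnitude spectrograms, and hypothetical helper names `nmf` and `separate`.

```python
import numpy as np

def nmf(V, n_bases=None, n_iter=200, W=None, seed=0, eps=1e-9):
    """Multiplicative-update NMF for ||V - WH||^2; a supplied W is held fixed."""
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], n_bases)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W_speech, W_music, n_iter=200, eps=1e-9):
    """Decompose the mixture on the trained bases, then apply a soft mask."""
    W = np.hstack([W_speech, W_music])
    _, H = nmf(V_mix, n_iter=n_iter, W=W)  # only the activations are updated
    k = W_speech.shape[1]
    est_speech = W_speech @ H[:k]
    est_music = W_music @ H[k:]
    # Soft ratio mask (an assumed choice): each source's share of the mixture.
    mask = est_speech / (est_speech + est_music + eps)
    return mask * V_mix, (1.0 - mask) * V_mix

# Random non-negative stand-ins for magnitude spectrograms (freq_bins x frames).
rng = np.random.default_rng(1)
V_speech_train = rng.random((20, 50))
V_music_train = rng.random((20, 50))
V_mix = rng.random((20, 30))

# Training stage: one set of bases per source.
W_speech, _ = nmf(V_speech_train, n_bases=3)
W_music, _ = nmf(V_music_train, n_bases=3)

# Separation stage: the two masked estimates sum back to the mixture.
speech_hat, music_hat = separate(V_mix, W_speech, W_music)
```

Because the mask redistributes the observed mixture rather than using the NMF reconstruction directly, the two estimates always add up to the mixture magnitude, whatever the quality of the decomposition.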
01 Jan 2000
TL;DR: In this article, the authors propose a method to solve the problem of "uniformity" and "uncertainty" in the context of education.
94 citations
07 May 1996
TL;DR: This paper describes techniques to automatically morph from one sound to another, representations for morphing, techniques for matching, and algorithms for interpolating and morphing each sound component.
Abstract: This paper describes techniques to automatically morph from one sound to another. Audio morphing is accomplished by representing the sound in a multi-dimensional space that is warped or modified to produce a desired result. The multi-dimensional space encodes the spectral shape and pitch on orthogonal axes. After matching components of the sound, a morph smoothly interpolates the amplitudes to describe a new sound in the same perceptual space. Finally, the representation is inverted to produce a sound. This paper describes representations for morphing, techniques for matching, and algorithms for interpolating and morphing each sound component. Spectrographic images of a complete morph are shown at the end.
94 citations