Topic
Spectrogram
About: Spectrogram is a research topic. Over its lifetime, 5813 publications have been published within this topic, receiving 81547 citations.
Papers published on a yearly basis
Papers
27 Aug 2008
TL;DR: In this article, the authors obtained a separated signal from an audio signal based on the anisotropy of smoothness of spectral elements in the time-frequency domain, where a spectrogram of the audio signal is assumed to be a sum of a plurality of sub-spectrograms.
Abstract: The present invention obtains a separated signal from an audio signal based on the anisotropy of smoothness of spectral elements in the time-frequency domain. A spectrogram of the audio signal is assumed to be a sum of a plurality of sub-spectrograms, and smoothness of spectral elements of each sub-spectrogram in the time-frequency domain has directionality on the time-frequency plane. The method comprises obtaining a distribution coefficient for distributing spectral elements of said audio signal in the time-frequency domain to at least one sub-spectrogram based on the directionality of the smoothness of each sub-spectrogram on the time-frequency plane, and separating at least one sub-spectrogram from said spectral elements of said audio signal using said distribution coefficient.
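The "distribution coefficient" idea above can be illustrated with a common related technique: deriving soft masks from directional smoothness of the spectrogram. The sketch below is not the patent's method; it uses median filtering along time vs. frequency as a stand-in for the smoothness model, and all function names are my own.

```python
import numpy as np

def stft_mag(x, n_fft=256, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win for i in range(0, len(x) - n_fft, hop)]
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq, time)

def running_median(a, size, axis):
    """Running median along one axis (edges padded by repetition)."""
    pad = size // 2
    a = np.moveaxis(a, axis, 0)
    padded = np.pad(a, ((pad, pad),) + ((0, 0),) * (a.ndim - 1), mode="edge")
    out = np.stack([np.median(padded[i:i + size], axis=0)
                    for i in range(a.shape[0])])
    return np.moveaxis(out, 0, axis)

def separate(spec, size=9, eps=1e-12):
    """Split a spectrogram into time-smooth and frequency-smooth parts.

    The soft mask plays the role of a distribution coefficient: each
    spectral element is distributed to a sub-spectrogram according to
    which smoothness direction dominates locally.
    """
    smooth_t = running_median(spec, size, axis=1)  # smooth along time
    smooth_f = running_median(spec, size, axis=0)  # smooth along frequency
    mask = smooth_t / (smooth_t + smooth_f + eps)
    return mask * spec, (1.0 - mask) * spec
```

A steady tone is smooth along time, so it is routed almost entirely to the first sub-spectrogram, while broadband clicks (smooth along frequency) go to the second.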
29 citations
TL;DR: This paper uses a single ultrasonic sensor to detect staircases with an electronic cane; using a multiclass SVM approach, a recognition rate of 82.4% is achieved.
Abstract: Blind people need aids to interact with their environment more safely. A new device is proposed to let them perceive the world through their ears. Considering both system requirements and technology cost, our tool uses ultrasonic sensors and one monocular camera to make the user aware of the presence and nature of potential obstacles. In this paper, we use only one ultrasonic sensor to detect staircases with the electronic cane; to our knowledge, no previous work has addressed this challenge. Since the performance of an object recognition system depends on both the object representation and the classification algorithm, our system uses three frequency-domain representations of the ultrasonic signal: the spectrogram, which shows how the spectral density of the signal varies with time; the spectrum, which shows amplitude as a function of frequency; and the periodogram, which estimates the spectral density of the signal. Features extracted from each representation contribute to the classification process. The system was evaluated on a set of ultrasonic signals in which staircases of different shapes occur. Using a multiclass SVM approach, a recognition rate of 82.4% has been achieved.
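The feature-extraction pipeline described above (spectrogram, spectrum, and periodogram features feeding a classifier) can be sketched as follows. This is an illustration under my own assumptions, not the paper's code: the feature choices are hypothetical, and a toy nearest-centroid classifier stands in for the multiclass SVM.

```python
import numpy as np

def periodogram(x):
    """Power spectral density estimate from the squared FFT magnitude."""
    X = np.fft.rfft(x * np.hanning(len(x)))
    return (np.abs(X) ** 2) / len(x)

def spectrogram_features(x, n_fft=64, hop=32):
    """Per-band mean log energy of a short-time spectrogram."""
    win = np.hanning(n_fft)
    frames = np.stack([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft, hop)])
    S = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(S.mean(axis=0) + 1e-12)  # average over time frames

def features(x):
    """Concatenate spectrogram and low-frequency periodogram features."""
    return np.concatenate([spectrogram_features(x),
                           np.log(periodogram(x)[:16] + 1e-12)])

class NearestCentroid:
    """Toy stand-in for the paper's multiclass SVM classifier."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return self.classes_[d.argmin(axis=1)]
```

With echoes whose spectral content differs by obstacle shape, these frequency-domain features separate the classes well even with a very simple classifier.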
29 citations
TL;DR: A data augmentation algorithm based on the imaging principle of the retina and a convex lens is proposed to acquire spectrograms of different sizes, increasing the amount of training data by changing the distance between the spectrogram and the convex lens.
Abstract: Speech emotion recognition (SER) studies the formation and change of a speaker's emotional state from the speech-signal perspective, so as to make interaction between humans and computers more intelligent. SER is a challenging task that suffers from scarce training data and low prediction accuracy. Here we propose a data augmentation algorithm based on the imaging principle of the retina and a convex lens, which acquires spectrograms of different sizes and increases the amount of training data by changing the distance between the spectrogram and the convex lens. Meanwhile, using deep learning to obtain high-level features, we propose Deep Retinal Convolution Neural Networks (DRCNNs) for SER and achieve an average accuracy over 99%. The experimental results indicate that DRCNNs outperform previous studies in terms of both the number of emotions and recognition accuracy. Predictably, our results will dramatically improve human-computer interaction.
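The lens-based augmentation amounts to producing rescaled copies of each spectrogram, with the scale factor playing the role of the lens distance. A minimal sketch under that assumption (nearest-neighbor resizing; the paper's actual resampling scheme may differ, and the function names are my own):

```python
import numpy as np

def resize_spectrogram(S, scale):
    """Nearest-neighbor resize of a 2-D spectrogram by a zoom factor."""
    f, t = S.shape
    fi = np.clip((np.arange(int(f * scale)) / scale).astype(int), 0, f - 1)
    ti = np.clip((np.arange(int(t * scale)) / scale).astype(int), 0, t - 1)
    return S[np.ix_(fi, ti)]

def augment(S, scales=(0.8, 1.0, 1.25)):
    """One resized copy per 'lens distance' (scale factor)."""
    return [resize_spectrogram(S, s) for s in scales]
```

Each input spectrogram yields several training examples of different sizes; scale 1.0 returns the original unchanged.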
29 citations
TL;DR: This paper introduces a novel technique for reconstructing the phase of modified spectrograms of audio signals from the analysis of mixtures of sinusoids, and presents an audio restoration framework in which the technique outperforms traditional methods.
Abstract: This paper introduces a novel technique for reconstructing the phase of modified spectrograms of audio signals. From the analysis of mixtures of sinusoids we obtain relationships between phases of successive time frames in the Time-Frequency (TF) domain. To obtain similar relationships over frequencies, in particular within onset frames, we study an impulse model. Instantaneous frequencies and attack times are estimated locally to encompass the class of non-stationary signals such as vibratos. These techniques ensure both the vertical coherence of partials (over frequencies) and the horizontal coherence (over time). The method is tested on a variety of data and demonstrates better performance than traditional consistency-based approaches. We also introduce an audio restoration framework and observe that our technique outperforms traditional methods.
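The "horizontal coherence" part of the method (relating phases of successive time frames through a sinusoid's frequency) can be illustrated with a minimal phase-propagation recursion. This sketch is a simplification of the paper's model: it assumes each bin's instantaneous frequency equals its nominal center frequency and omits the vertical (across-frequency) coherence enforced at onsets.

```python
import numpy as np

def propagate_phase(mag, phase0, hop, sr, n_fft):
    """Propagate phases frame-to-frame from each bin's nominal frequency.

    For a stationary sinusoid in bin k, the phase advances by
    2*pi * f_k * hop / sr between consecutive frames.
    """
    n_bins, n_frames = mag.shape
    bin_freq = np.arange(n_bins) * sr / n_fft      # assumed inst. frequency (Hz)
    dphi = 2 * np.pi * bin_freq * hop / sr         # phase advance per hop
    phases = np.empty((n_bins, n_frames))
    phases[:, 0] = phase0
    for t in range(1, n_frames):
        phases[:, t] = phases[:, t - 1] + dphi     # horizontal coherence only
    return mag * np.exp(1j * phases)
```

Given a modified magnitude spectrogram and initial phases, this yields a complex spectrogram whose magnitudes are untouched and whose phases are consistent over time for stationary partials.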
29 citations
08 Sep 2016
TL;DR: In this paper, the authors use convolutional filters to push past the inherent tradeoff of temporal and frequency resolution that exists for spectral representations, and show that increasing temporal resolution via reduced stride and increasing frequency resolution via additional filters delivers significant performance improvements.
Abstract: Deep learning has dramatically improved the performance of speech recognition systems through learning hierarchies of features optimized for the task at hand. However, true end-to-end learning, where features are learned directly from waveforms, has only recently reached the performance of hand-tailored representations based on the Fourier transform. In this paper, we detail an approach to use convolutional filters to push past the inherent tradeoff of temporal and frequency resolution that exists for spectral representations. At increased computational cost, we show that increasing temporal resolution via reduced stride and increasing frequency resolution via additional filters delivers significant performance improvements. Further, we find more efficient representations by simultaneously learning at multiple scales, leading to an overall decrease in word error rate on a difficult internal speech test set by 20.7% relative to networks with the same number of parameters trained on spectrograms.
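The stride/filter-count tradeoff described above can be made concrete with a tiny waveform frontend: a strided 1-D convolution whose stride sets the temporal resolution and whose filter count sets the number of "frequency" channels. This is an illustrative stand-in, not the paper's network; random filters play the role of learned ones.

```python
import numpy as np

def conv1d_frontend(x, n_filters=32, width=128, stride=64, seed=0):
    """Strided 1-D convolution over a raw waveform.

    Halving `stride` roughly doubles the number of output time steps
    (finer temporal resolution); raising `n_filters` adds channels
    (finer frequency-like resolution), both at extra compute cost.
    """
    rng = np.random.default_rng(seed)
    filters = rng.standard_normal((n_filters, width)) / np.sqrt(width)
    frames = np.stack([x[i:i + width]
                       for i in range(0, len(x) - width + 1, stride)])
    return np.abs(frames @ filters.T)  # shape: (time_steps, n_filters)
```

For a 1024-sample input with width 128, stride 64 gives 15 time steps; stride 32 gives 29, nearly double, while the channel count is set independently by `n_filters`.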
29 citations