Book Chapter

Noise Removal from Audio Using CNN and Denoiser

TL;DR: In this article, the authors present an efficient noise-detection algorithm based on deep learning, specifically convolutional neural networks (CNNs), and remove the detected noise from the audio using the Python module 'noise reducer.'
Abstract: As artificial intelligence advances in the speech domain, noise-removal models that are more efficient and less complex become a necessity. The presence of noise in audio signals poses a great complication when working on speech recognition, enhancement, improvement, and transmission. Hence, there is a need to develop an efficient noise-reduction algorithm that works in real time and removes as much noise as possible. To overcome this difficulty, this paper presents an efficient algorithm for noise detection based on deep learning, specifically convolutional neural networks (CNNs), together with the removal of the detected noise from the audio using the Python module 'noise reducer.'
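A minimal sketch of the removal stage described above, assuming the 'noise reducer' module refers to the publicly available noisereduce package and that a separately trained CNN classifier (here a hypothetical noise_detector, not part of the paper's published code) has already flagged the clip as noisy:

    import numpy as np
    import noisereduce as nr                      # assumed stand-in for 'noise reducer'
    from scipy.io import wavfile

    rate, data = wavfile.read("input.wav")        # noisy recording (path illustrative)
    data = data.astype(np.float32)

    # A CNN-based detector would decide here whether denoising is needed, e.g.:
    # if noise_detector.predict(spectrogram(data)): ...
    reduced = nr.reduce_noise(y=data, sr=rate)    # spectral-gating noise reduction
    wavfile.write("denoised.wav", rate, reduced.astype(np.int16))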
References
Proceedings Article
15 Sep 2019
TL;DR: In this article, a fully-convolutional context aggregation network using a deep feature loss is proposed to denoise speech signals by processing the raw waveform directly, which achieves state-of-the-art performance in objective speech quality metrics and in large-scale perceptual experiments with human listeners.
Abstract: We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly. Given input audio containing speech corrupted by an additive background signal, the system aims to produce a processed signal that contains only the speech content. Recent approaches have shown promising results using various deep network architectures. In this paper, we propose to train a fully-convolutional context aggregation network using a deep feature loss. That loss is based on comparing the internal feature activations in a different network, trained for acoustic environment detection and domestic audio tagging. Our approach outperforms the state-of-the-art in objective speech quality metrics and in large-scale perceptual experiments with human listeners. It also outperforms an identical network trained using traditional regression losses. The advantage of the new approach is particularly pronounced for the hardest data with the most intrusive background noise, for which denoising is most needed and most challenging.
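As an illustration only (the authors' exact architecture and loss network are not reproduced here), a deep feature loss can be sketched in PyTorch as an L1 distance between activations of a frozen, sequential "loss network"; the names loss_network and tagging_net below are hypothetical:

    import torch
    import torch.nn as nn

    class DeepFeatureLoss(nn.Module):
        """L1 distance between internal activations of a frozen, sequential loss network."""

        def __init__(self, loss_network, layers):
            super().__init__()
            self.net = loss_network.eval()        # e.g. a network trained for audio tagging
            for p in self.net.parameters():
                p.requires_grad = False           # loss network stays fixed
            self.layers = set(layers)             # indices of layers whose activations are compared

        def forward(self, denoised, clean):
            loss, x, y = 0.0, denoised, clean
            for i, layer in enumerate(self.net.children()):
                x, y = layer(x), layer(y)
                if i in self.layers:
                    loss = loss + torch.mean(torch.abs(x - y))
            return loss

A training loop would then compute DeepFeatureLoss(tagging_net, layers=[2, 4, 6])(denoiser(noisy), clean) and backpropagate through the denoiser only.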

152 citations

Journal Article
TL;DR: In this article, the authors describe novel Bayesian models for time–frequency inverse modelling of non-stationary signals, based on the idea of a Gabor regression, in which a time series is represented as a superposition of translated, modulated versions of a window function exhibiting good time–frequency concentration.
Abstract: Summary. We describe novel Bayesian models for time–frequency inverse modelling of non-stationary signals. These models are based on the idea of a Gabor regression, in which a time series is represented as a superposition of translated, modulated versions of a window function exhibiting good time–frequency concentration. As a necessary consequence, the resultant set of potential predictors is in general overcomplete—constituting a frame rather than a basis—and hence the resultant models require careful regularization through appropriate choices of variable selection schemes and prior distributions. We introduce prior specifications that are tailored to representative time series, and we develop effective Markov chain Monte Carlo methods for inference. To highlight the potential applications of such methods, we provide examples using two of the most distinctive time–frequency surfaces—speech and music signals—as well as standard test functions from the wavelet regression literature.
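For intuition, the Gabor regression idea, representing a signal as a weighted superposition of translated, modulated windows drawn from an overcomplete frame, can be sketched as follows; the window width, grids, and random coefficients are illustrative, whereas the paper places Bayesian priors on the coefficients and infers them by MCMC:

    import numpy as np

    def gabor_atom(n, center, freq, width, sr):
        t = (np.arange(n) - center) / sr
        return np.exp(-0.5 * (t / width) ** 2) * np.cos(2 * np.pi * freq * t)

    sr, n = 8000, 8000                          # one second at 8 kHz (illustrative)
    atoms = [gabor_atom(n, c, f, 0.02, sr)
             for c in range(0, n, 400)          # translation grid (overcomplete frame)
             for f in (200, 400, 800)]          # modulation frequencies in Hz
    rng = np.random.default_rng(0)
    coeffs = rng.normal(0, 0.1, len(atoms))     # the paper would place priors on these
    signal = sum(c * a for c, a in zip(coeffs, atoms))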

112 citations

Proceedings Article
22 May 2011
TL;DR: Improved signal-to-noise ratios and perceived audio quality are shown by explicitly modelling impulses with a discrete switching process and a new heavy-tailed amplitude model.
Abstract: We present a method for the removal of noise, including non-Gaussian impulses, from a signal. Impulse noise is removed jointly with a homogeneous Gaussian noise floor using a Gabor regression model [1]. The problem is formulated in a joint Bayesian framework and we use a Gibbs MCMC sampler to estimate parameters. We show how to deal with variable-magnitude impulses using a shifted inverse gamma distribution for their variance. Our results show improved signal-to-noise ratios and perceived audio quality when impulses are explicitly modelled with a discrete switching process and a new heavy-tailed amplitude model.
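A toy version of the generative model described above (a Bernoulli switching process, a Gaussian noise floor, and impulse amplitudes whose variances are inverse-gamma distributed); all parameter values are illustrative and the Gibbs-sampler inference step is omitted:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    clean = np.sin(2 * np.pi * 5 * np.arange(n) / n)          # stand-in audio signal
    floor = rng.normal(0.0, 0.01, n)                          # homogeneous Gaussian noise floor
    switch = rng.random(n) < 0.01                             # discrete switching process
    imp_var = 1.0 / rng.gamma(shape=2.0, scale=1.0, size=n)   # inverse-gamma variances
    impulses = switch * rng.normal(0.0, np.sqrt(imp_var))     # heavy-tailed impulse amplitudes
    observed = clean + floor + impulses                       # signal to be restored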

21 citations

Proceedings Article
01 Nov 2017
TL;DR: To remove white Gaussian noise, the discrete wavelet transform technique is used, and each of the techniques shows an increased signal-to-noise ratio (SNR) after processing, as seen in the simulation results.
Abstract: For greater advancement in future communication, efficient noise-reduction algorithms with lesser complexity are a necessity. Noise in audio signals poses a great challenge in speech recognition, speech communication, speech enhancement and transmission. Hence the most efficient algorithm for noise reduction must be chosen in such a way that the cost of noise removal is as low as possible while a large portion of the noise is removed. The common method for the removal of noise is optimal linear filtering, and some algorithms in this category are Wiener filtering, Kalman filtering and the spectral subtraction technique. Here, the noisy signal is passed through a filter or transformation. However, due to the complexity of these algorithms, there are better alternatives such as the Signal Dependent Rank Order Mean (SD-ROM) algorithm, which removes noise from audio signals while retaining the characteristics of the signal. The algorithm can also be adjusted depending on the characteristics of the noise. To remove white Gaussian noise, the discrete wavelet transform technique is used. After each of the techniques is applied to the samples, SNR and elapsed time are calculated. All of the above techniques show an increased signal-to-noise ratio (SNR) after processing, as seen in the simulation results.
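A hedged sketch of the wavelet-thresholding step and the SNR comparison described above, assuming the PyWavelets package and a standard universal-threshold rule (the paper's exact wavelet, threshold, and test signals are not specified here):

    import numpy as np
    import pywt

    def snr_db(reference, estimate):
        return 10 * np.log10(np.sum(reference ** 2) / np.sum((reference - estimate) ** 2))

    rng = np.random.default_rng(1)
    t = np.linspace(0, 1, 4096)
    clean = np.sin(2 * np.pi * 440 * t)                       # stand-in clean signal
    noisy = clean + rng.normal(0, 0.2, t.size)                # added white Gaussian noise

    coeffs = pywt.wavedec(noisy, "db8", level=5)              # discrete wavelet transform
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745            # noise estimate from finest level
    thr = sigma * np.sqrt(2 * np.log(noisy.size))             # universal threshold
    den_coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    denoised = pywt.waverec(den_coeffs, "db8")[: noisy.size]

    print(f"SNR before: {snr_db(clean, noisy):.1f} dB, after: {snr_db(clean, denoised):.1f} dB")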

18 citations

Proceedings Article
16 Sep 1997
TL;DR: In this article, a signal separation-based approach is proposed to remove low frequency transient noise from old gramophone recordings and film sound tracks, where audio signals and noise transients are modelled as autoregressive processes which are additively superimposed to give the observed waveform.
Abstract: This paper is concerned with the removal of low frequency transient noise from old gramophone recordings and film sound tracks. Low frequency transients occur as a result of large breakages or discontinuities in the recorded medium which excite a long-term resonance in the playback apparatus. We present a signal separation-based approach to this problem. Audio signals and noise transients are modelled as autoregressive (AR) processes which are additively superimposed to give the observed waveform. A maximum a posteriori method is presented for separation of the two processes. A modification of this scheme allows for modelling of the large discontinuity at the start of each noise transient and successful restorations are demonstrated. A more practical scheme is then developed which uses a Kalman filter to implement the separation. In order to avoid low frequency distortions to the audio signal, the excitation variance of the noise transient model is tapered exponentially to zero away from the discontinuity. The method is fully automated and more practical to implement than existing schemes for removal of such defects. Results indicate a high level of performance.
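A minimal sketch of the separation idea only, not the authors' full scheme: the audio and the transient are each modelled as AR(1) processes, their sum is observed, and a Kalman filter estimates the two components; the coefficients and variances are illustrative, and the discontinuity modelling and tapered excitation variance are omitted:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 500
    a_s, a_t = 0.95, 0.99                          # AR(1) coefficients (illustrative)
    q_s, q_t = 0.1, 0.02                           # excitation variances
    s, v = np.zeros(n), np.zeros(n)
    for k in range(1, n):
        s[k] = a_s * s[k - 1] + rng.normal(0, np.sqrt(q_s))   # audio component
        v[k] = a_t * v[k - 1] + rng.normal(0, np.sqrt(q_t))   # low-frequency transient
    y = s + v                                      # observed waveform

    A = np.array([[a_s, 0.0], [0.0, a_t]])         # state transition for [signal, transient]
    Q = np.diag([q_s, q_t])
    H = np.array([[1.0, 1.0]])                     # only the sum is observed
    R = 1e-6                                       # near-noiseless observation
    x, P = np.zeros(2), np.eye(2)
    est = np.zeros((n, 2))
    for k in range(n):
        x, P = A @ x, A @ P @ A.T + Q              # predict
        K = P @ H.T / (H @ P @ H.T + R)            # Kalman gain, shape (2, 1)
        x = x + K @ (y[k] - H @ x)                 # update with the innovation
        P = (np.eye(2) - K @ H) @ P
        est[k] = x                                 # [signal estimate, transient estimate]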

17 citations

Trending Questions (2)
How effective are these methods in reducing noise in realtime audio processing?

The paper mentions the development of an efficient algorithm for noise reduction in real-time audio using CNNs, but it does not explicitly state the effectiveness of these methods in reducing noise in real-time audio processing.

How can background noise and echo be removed or effects be added to audio with ai?

The provided paper focuses on the development of an efficient algorithm for noise detection and removal using convolutional neural networks (CNNs). It does not specifically mention the removal of echo or the addition of effects to audio.