Book Chapter

FPGA-Based Novel Speech Enhancement System Using Microphone Activity Detector

01 Jan 2019, pp. 117-127
TL;DR: Evaluation of the enhanced speech quality, and of the MAD's accuracy in detecting a single- or dual-microphone configuration, indicates that the proposed hardware can serve as an embedded component for hardware-based speech enhancement.
Abstract: This paper proposes a field-programmable gate array (FPGA) based design and implementation of a novel speech enhancement system that works with both single-microphone and dual-microphone devices and provides immunity to background noise. We propose a microphone activity detector (MAD) that detects whether a single- or dual-microphone scenario is present. Once the microphone configuration is detected, a multiband spectral subtraction technique enhances the speech signal in different noisy environments. We implemented the proposed design on a Spartan 6 LX45 FPGA using Xilinx System Generator tools. Evaluation of the enhanced speech quality, and of the MAD's accuracy in detecting the single- or dual-microphone configuration, indicates that the proposed hardware can serve as an embedded component for hardware-based speech enhancement.
Citations
Journal Article
TL;DR: In this article, a low-power hardware implementation of the log-minimum mean square error (LMMSE) algorithm with statistical-model-based voice activity detection noise estimator is presented.
Abstract: To improve voice quality and decrease the power consumption of audio electronic products, a noise reduction algorithm and its low-power hardware implementation are studied. Several common noise reduction algorithms are analyzed and simulated, and the log-minimum mean square error algorithm with a statistical-model-based voice activity detection noise estimator is selected, which achieves the best signal-to-noise ratio improvement, 5.56 dB. The ability of the algorithm to track nonstationary noise is analyzed, a failure of noise reduction caused by an extreme situation is prevented, and the amount of calculation is reduced by multiplexing at the algorithm level. In the hardware implementation, look-up tables are used to implement special operations. A high-significant-bits search is used to simplify the look-up table and reduce power consumption, and the speed of the search circuit is optimized by a parallel design. The exponential integral and the exponential operation are combined into one operation with lower precision requirements, leading to lower power consumption. The design presented in this article passed FPGA verification and was taped out with a digital hearing aid chip on the SMIC 0.13 μm process. The area and the power consumption of the noise reduction module are 0.206 mm² and 15.3 μW, respectively, which makes the design suitable for low-power audio chip applications.
References
Journal Article
S. Boll
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital wave-form. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.

4,862 citations
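
The spectral subtraction step described in the abstract above can be sketched for a single frame as follows. This is a minimal illustration in standard-library Python, not the paper's implementation: the function names (`dft`, `idft`, `estimate_noise_magnitude`, `spectral_subtract`) are hypothetical, the naive DFT stands in for a real FFT, and the secondary residual-noise procedures mentioned in the abstract are omitted.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT; returns the real part, assuming a real-valued signal."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def estimate_noise_magnitude(noise_frames):
    """Average the magnitude spectrum over frames assumed to contain no speech."""
    spectra = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    n = len(spectra[0])
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]

def spectral_subtract(frame, noise_mag):
    """Subtract the noise magnitude bias per bin and resynthesize the frame."""
    cleaned = []
    for c, n_mag in zip(dft(frame), noise_mag):
        mag = max(abs(c) - n_mag, 0.0)   # clamp negative magnitudes to zero
        cleaned.append(cmath.rect(mag, cmath.phase(c)))  # keep the noisy phase
    return idft(cleaned)
```

Reusing the noisy phase is what allows the waveform to be resynthesized, so the output can be passed on to a downstream speech processor. A practical version would add windowing, overlap-add, and a detector for nonspeech frames.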

Journal Article
TL;DR: In this paper, a maximum likelihood estimator is developed for determining time delay between signals received at two spatially separated sensors in the presence of uncorrelated noise, where the role of the prefilters is to accentuate the signal passed to the correlator at frequencies for which the signal-to-noise (S/N) ratio is highest and suppress the noise power.
Abstract: A maximum likelihood (ML) estimator is developed for determining time delay between signals received at two spatially separated sensors in the presence of uncorrelated noise. This ML estimator can be realized as a pair of receiver prefilters followed by a cross correlator. The time argument at which the correlator achieves a maximum is the delay estimate. The ML estimator is compared with several other proposed processors of similar form. Under certain conditions the ML estimator is shown to be identical to one proposed by Hannan and Thomson [10] and MacDonald and Schultheiss [21]. Qualitatively, the role of the prefilters is to accentuate the signal passed to the correlator at frequencies for which the signal-to-noise (S/N) ratio is highest and, simultaneously, to suppress the noise power. The same type of prefiltering is provided by the generalized Eckart filter, which maximizes the S/N ratio of the correlator output. For low S/N ratio, the ML estimator is shown to be equivalent to Eckart prefiltering.

4,317 citations
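
The core idea in the abstract above, taking the delay estimate as the time argument at which a cross correlator peaks, can be sketched as follows. This is a simplified illustration under stated assumptions: it omits the receiver prefilters that make the estimator maximum likelihood, works on sample-spaced lags only, and the function name `cross_correlation_delay` and the `max_lag` parameter are hypothetical.

```python
def cross_correlation_delay(x1, x2, max_lag):
    """Return the lag in samples that maximizes the cross correlation.

    A positive result means x2 is a delayed copy of x1. No prefiltering is
    applied, so this is plain cross correlation, not the ML estimator.
    """
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = 0.0
        for t in range(len(x1)):
            u = t + lag
            if 0 <= u < len(x2):      # ignore samples shifted out of range
                val += x1[t] * x2[u]
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag
```

For example, with a short pulse `x1 = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]` and the same pulse shifted three samples later in `x2`, calling `cross_correlation_delay(x1, x2, 5)` returns 3. Adding the prefilters would weight the correlation toward frequencies where the signal-to-noise ratio is highest.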

Journal Article
TL;DR: NOISEX-92 specifies a carefully controlled experiment on artificially noisy speech data, examining performance for a limited digit recognition task but with a relatively wide range of noises and signal-to-noise ratios.

1,960 citations

Proceedings Article
13 May 2002
TL;DR: This paper proposes a multi-band spectral subtraction approach which takes into account the fact that colored noise affects the speech spectrum differently at various frequencies, resulting in superior speech quality and largely reduced musical noise.
Abstract: The spectral subtraction method is a well-known noise reduction technique. Most implementations and variations of the basic technique advocate subtraction of the noise spectrum estimate over the entire speech spectrum. However, real world noise is mostly colored and does not affect the speech signal uniformly over the entire spectrum. In this paper, we propose a multi-band spectral subtraction approach which takes into account the fact that colored noise affects the speech spectrum differently at various frequencies. This method outperforms the standard power spectral subtraction method resulting in superior speech quality and largely reduced musical noise.

554 citations
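
The multi-band idea in the abstract above, subtracting a different amount of noise in each frequency band because colored noise does not affect the spectrum uniformly, can be sketched as follows. The abstract does not give the subtraction rule, so everything specific here is an assumption made for illustration: the mapping from band SNR to an over-subtraction factor, the parameters `alpha_min`, `alpha_max`, and `floor`, and all function names.

```python
import math

def band_snr_db(noisy_power, noise_power, lo, hi):
    """SNR in dB of noisy power to estimated noise power over bins [lo, hi)."""
    num = sum(noisy_power[lo:hi])
    den = sum(noise_power[lo:hi])
    if den == 0:
        return float("inf")
    if num == 0:
        return float("-inf")
    return 10.0 * math.log10(num / den)

def oversubtraction_factor(snr_db, alpha_min=1.0, alpha_max=5.0):
    """Hypothetical rule: subtract more in low-SNR bands, less in high-SNR ones."""
    if snr_db <= -5.0:
        return alpha_max
    if snr_db >= 20.0:
        return alpha_min
    # Linear interpolation between (-5 dB, alpha_max) and (20 dB, alpha_min).
    return alpha_max - (snr_db + 5.0) * (alpha_max - alpha_min) / 25.0

def multiband_subtract(noisy_power, noise_power, band_edges, floor=0.002):
    """Power spectral subtraction with a separate factor for each band."""
    out = list(noisy_power)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        alpha = oversubtraction_factor(
            band_snr_db(noisy_power, noise_power, lo, hi))
        for k in range(lo, hi):
            diff = noisy_power[k] - alpha * noise_power[k]
            # A spectral floor keeps bins from going negative.
            out[k] = max(diff, floor * noisy_power[k])
    return out
```

With `band_edges = [0, 4, 8]`, bins 0 to 3 and 4 to 7 are treated as two bands. A band dominated by noise gets a larger factor and is pushed down to the floor, while a band with strong speech is left nearly untouched.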

Journal Article
TL;DR: Synthetic microphone signals generated with the image model technique are used to study the effects of room reverberation on the performance of the maximum likelihood estimator of the time delay, in which the estimate is obtained by maximizing the cross correlation between filtered versions of the microphone signals.
Abstract: Synthetic microphone signals generated with the image model technique are used to study the effects of room reverberation on the performance of the maximum likelihood (ML) estimator of the time delay, in which the estimate is obtained by maximizing the cross correlation between filtered versions of the microphone signals. The results underscore the adverse effects of reverberation on the bias, variance and probability of anomaly of the ML estimator. Explanations of these effects are provided.

189 citations