Topic

Noise

About: Noise is a research topic. Over its lifetime, 5,111 publications have been published within this topic, receiving 69,407 citations. The topic is also known as: Pugs dancing to the bandit radio from Stalker for 10 hours.


Papers
Proceedings Article
Wupeng Wang1, Chao Xing1, Dong Wang2, Xiao Chen1, Fengyu Sun1 
04 May 2020
TL;DR: A safe AVSE approach is presented that uses late fusion to let the visual stream contribute safely to audio speech enhancement (ASE) across a range of SNRs.
Abstract: Most existing audio-visual speech enhancement (AVSE) methods work well in conditions with strong noise; however, when applied to conditions with a medium SNR, serious performance degradations are often observed. These degradations can be partly attributed to the feature-fusion (early fusion, etc.) architecture that tightly couples the audio information, which is very strong, and the visual information, which is relatively weak. In this paper, we present a safe AVSE approach that can make the visual stream contribute to audio speech enhancement (ASE) safely in conditions of various SNRs by late fusion. The key novelty is two-fold. Firstly, we define power binary masks (PBMs) as a rough representation of speech signals. This rough representation admits the weakness of the visual information and so can be easily predicted from the visual stream. Secondly, we design a posterior augmentation architecture that integrates the visual-derived PBMs into the audio-derived masks via a gating network. With this architecture, the overall performance is lower-bounded by the audio-based component. Our experiments on the Grid dataset demonstrated that this new approach consistently outperforms the audio-based system in all noise conditions, confirming that it is a safe way to incorporate visual knowledge in speech enhancement.
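The late-fusion idea in this abstract can be pictured with a minimal sketch. It assumes the gate is a per-bin weight in [0, 1] and that fusion is a convex combination of the audio-derived mask and the visual-derived PBM. The paper's actual gating network and mask definitions are not given in this listing, so `fuse_masks` and its arguments are hypothetical names used only for illustration.

```python
def fuse_masks(audio_mask, visual_pbm, gate):
    """Blend an audio-derived mask with a visual-derived binary mask.

    All three inputs are flat lists with one value per time-frequency bin.
    Assumption (not from the paper): gate values lie in [0, 1] and the
    fusion is a convex combination, so gate == 0 keeps the audio mask.
    """
    if not (len(audio_mask) == len(visual_pbm) == len(gate)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, v, g in zip(audio_mask, visual_pbm, gate):
        if not 0.0 <= g <= 1.0:
            raise ValueError("gate values must lie in [0, 1]")
        fused.append((1.0 - g) * a + g * v)
    return fused


# Toy values for four time-frequency bins (illustrative only).
audio_mask = [0.9, 0.2, 0.6, 0.1]
visual_pbm = [1.0, 0.0, 1.0, 1.0]

audio_only = fuse_masks(audio_mask, visual_pbm, [0.0] * 4)
half_trust = fuse_masks(audio_mask, visual_pbm, [0.5] * 4)
```

With the gate closed (all zeros) the output is exactly the audio-derived mask, which is one simple way to read the claim that performance is lower-bounded by the audio-based component.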

26 citations

Proceedings Article
17 May 2004
TL;DR: Experimental results show that the capacity of the proposed watermarking scheme is relatively high compared with existing spread-spectrum-based audio watermarking schemes.
Abstract: A new method is proposed for robust audio watermarking using direct-sequence spread spectrum in combination with the subband decomposition of the audio signal. The method exploits the frequency masking characteristics of the human auditory system (HAS) and inserts the watermark into a randomly selected frequency band of the input audio signal. Performance of the proposed system is evaluated for robustness to signal manipulations such as contamination with additive noise, resampling, compression, filtering, multiple watermark insertion, and random chopping. Experimental results show that the capacity of the proposed watermarking scheme is relatively high compared with existing spread spectrum based audio watermarking schemes.
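As a rough illustration of the direct-sequence spread spectrum part of this abstract, the sketch below spreads each watermark bit over a pseudo-random ±1 chip sequence, adds it to the host at low amplitude, and recovers it by correlation. It is a simplified sketch under stated assumptions: it omits the subband decomposition and auditory masking that the paper relies on, the host is synthetic Gaussian noise, and the names `make_pn_sequence`, `embed_bits`, `detect_bits`, the chip length, and the embedding strength are all hypothetical.

```python
import random


def make_pn_sequence(length, seed):
    """Pseudo-random +/-1 chip sequence shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_bits(host, bits, pn, strength):
    """Add +strength*pn for a 1 bit and -strength*pn for a 0 bit."""
    chips = len(pn)
    marked = list(host)
    for i, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        for j in range(chips):
            marked[i * chips + j] += strength * sign * pn[j]
    return marked


def detect_bits(signal, num_bits, pn):
    """Correlate each segment with the chip sequence and take the sign."""
    chips = len(pn)
    decoded = []
    for i in range(num_bits):
        corr = sum(signal[i * chips + j] * pn[j] for j in range(chips))
        decoded.append(1 if corr > 0 else 0)
    return decoded


rng = random.Random(0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
pn = make_pn_sequence(4096, seed=42)

# Synthetic host audio (unit-variance noise) standing in for a real signal.
host = [rng.gauss(0.0, 1.0) for _ in range(len(bits) * len(pn))]
marked = embed_bits(host, bits, pn, strength=0.2)

# Simulate contamination with additive noise before detection.
noisy = [s + rng.gauss(0.0, 0.5) for s in marked]
recovered = detect_bits(noisy, len(bits), pn)
```

The watermark is far weaker than the host per sample, but correlating over 4096 chips accumulates the watermark coherently while the host and added noise average out, which is why detection tolerates additive noise.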

26 citations

Journal Article
TL;DR: Normal hearing listeners' ability to localize the backup alarm in 360-degrees azimuth did not improve when wearing augmented hearing protectors as compared to when wearing conventional passive earmuffs or earplugs of the foam or flanged types, and these results have implications for the updating of backup alarm standards.
Abstract: A human factors experiment employed a hemi-anechoic sound field in which listeners were required to localize a vehicular backup alarm warning signal (both a standard and a frequency-augmented alarm) in 360-degrees azimuth in pink noise of 60 dBA and 90 dBA. Measures of localization performance included: (1) percentage correct localization, (2) percentage of right-left localization errors, (3) percentage of front-rear localization errors, and (4) localization absolute deviation in degrees from the alarm's actual location. In summary, the data demonstrated that, with some exceptions, normal hearing listeners' ability to localize the backup alarm in 360-degrees azimuth did not improve when wearing augmented hearing protectors (including dichotic sound transmission earmuffs, flat attenuation earplugs, and level-dependent earplugs) as compared to when wearing conventional passive earmuffs or earplugs of the foam or flanged types. Exceptions were that in the 90 dBA pink noise, the flat attenuation earplug yielded significantly better accuracy than the polyurethane foam earplug and both the dichotic and the custom-made diotic electronic sound transmission earmuffs. However, the flat attenuation earplug showed no benefit over the standard pre-molded earplug, the arc earplug, and the passive earmuff. Confusions of front-rear alarm directions were most significant in the 90 dBA noise condition, wherein two types of triple-flanged earplugs exhibited significantly fewer front-rear confusions than either of the electronic muffs. On all measures, the diotic sound transmission earmuff resulted in the poorest localization of any of the protectors because its single-microphone design did not enable interaural cues to be heard. Localization was consistently more degraded in the 90 dBA pink noise as compared with the relatively quiet condition of the 60 dBA pink noise.
A frequency-augmented backup alarm, which incorporated 400 Hz and 4000 Hz components to exploit the benefits of interaural phase and intensity cues respectively, slightly but significantly improved localization compared with the standard, narrower-bandwidth backup alarm, and these results have implications for the updating of backup alarm standards.
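Two of the localization measures listed in this abstract, absolute deviation in degrees and front-rear errors, can be sketched as follows. The scoring rules are assumptions made for illustration, not the study's protocol: angles are taken as azimuth in degrees with 0 straight ahead and 180 directly behind, and a front-rear error is any response in the opposite front/rear hemisphere from the target. The function names and the trial data are hypothetical.

```python
import math


def angular_deviation(response_deg, target_deg):
    """Smallest absolute angle between two azimuths, in [0, 180]."""
    diff = abs(response_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)


def is_front(angle_deg):
    """Assumed convention: 0 degrees is straight ahead, 180 is behind."""
    return math.cos(math.radians(angle_deg)) > 0


def is_front_rear_error(response_deg, target_deg):
    """True when response and target fall in opposite hemispheres."""
    return is_front(response_deg) != is_front(target_deg)


# Made-up (target, response) pairs in degrees, for illustration only.
trials = [(30.0, 150.0), (350.0, 10.0), (200.0, 190.0)]

deviations = [angular_deviation(resp, tgt) for tgt, resp in trials]
mean_deviation = sum(deviations) / len(deviations)
front_rear_pct = 100.0 * sum(
    is_front_rear_error(resp, tgt) for tgt, resp in trials
) / len(trials)
```

The modulo step matters for targets near straight ahead: a response at 10 degrees to a target at 350 degrees is a 20-degree miss, not a 340-degree one.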

26 citations

Journal Article
TL;DR: By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, several new and promising approaches to LP-based audio modeling are obtained.
Abstract: While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.
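The claim that a noise-free sum of sinusoids is perfectly predicted by an all-pole model of order twice the number of sinusoids can be checked numerically. Each sinusoid at angular frequency w satisfies x[n] = 2cos(w)x[n-1] - x[n-2], and multiplying these second-order factors gives a predictor for the sum. The sketch below builds the predictor directly from known frequencies rather than estimating it from data, so it illustrates the identity only, not an LP estimation method; the function names and test frequencies are hypothetical.

```python
import math


def lp_coefficients_for_sinusoids(freqs):
    """Predictor a[1..2K] with x[n] = sum_k a[k] * x[n-k] for K sinusoids.

    freqs are normalized angular frequencies in radians per sample.
    """
    # Build A(z) = prod_k (1 - 2cos(w_k) z^-1 + z^-2) by polynomial multiplication.
    poly = [1.0]
    for w in freqs:
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        product = [0.0] * (len(poly) + 2)
        for i, p in enumerate(poly):
            for j, f in enumerate(factor):
                product[i + j] += p * f
        poly = product
    # A(z) annihilates the signal, so x[n] = -sum_{k>=1} poly[k] * x[n-k].
    return [-c for c in poly[1:]]


def max_prediction_error(signal, coeffs):
    """Largest absolute one-step prediction error over the signal."""
    order = len(coeffs)
    return max(
        abs(signal[n] - sum(coeffs[k] * signal[n - 1 - k] for k in range(order)))
        for n in range(order, len(signal))
    )


freqs = [0.3, 1.1]
signal = [math.sin(0.3 * n + 0.5) + 0.7 * math.cos(1.1 * n) for n in range(200)]
coeffs = lp_coefficients_for_sinusoids(freqs)
error = max_prediction_error(signal, coeffs)
```

Two sinusoids give a fourth-order predictor whose error is at the level of floating-point rounding, regardless of the amplitudes and phases. This exactness holds only without noise, which is the limitation the abstract goes on to discuss.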

26 citations

Patent
24 May 1999
TL;DR: An audio signal noise reduction system comprises a noise detecting circuit 11 for detecting a noise from an audio signal and outputting a detection signal indicating a start time and an end time of a noise period of the noise.
Abstract: An audio signal noise reduction system comprises a noise detecting circuit 11 for detecting a noise from an audio signal and outputting a detection signal indicating a start time and an end time of a noise period of the noise, an LPF 12 for extracting a low frequency component of the audio signal, an HPF 14 for extracting intermediate and high frequency components of the audio signal, a polynomial interpolation circuit 13 for polynomial-interpolating the noise period of the low frequency component being extracted, a mute circuit 15 for muting an output level of the noise period of the intermediate and high frequency components being extracted, and a signal synthesizing circuit 16 for synthesizing the low frequency component whose noise period is polynomial-interpolated and the intermediate and high frequency components whose level during the noise period is suppressed, to thus output the audio signal.

26 citations


Network Information
Related Topics (5)
Speech processing: 24.2K papers, 637K citations, 73% related
Noise: 110.4K papers, 1.3M citations, 72% related
Signal processing: 73.4K papers, 983.5K citations, 69% related
Piston: 176.1K papers, 825.4K citations, 69% related
Hidden Markov model: 28.3K papers, 725.3K citations, 67% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2022    1
2021    125
2020    217
2019    224
2018    243
2017    214