Journal ArticleDOI

Sudden Noise Reduction Based on GMM with Noise Power Estimation

30 Apr 2010-Journal of Software Engineering and Applications (Scientific Research Publishing)-Vol. 03, Iss: 04, pp 341-346
TL;DR: In this article, a method for reducing sudden noise is described that combines noise detection, noise classification, and noise power estimation; the method achieved good performance in recognizing utterances overlapped by sudden noises.
Abstract: This paper describes a method for reducing sudden noise using noise detection and classification methods, and noise power estimation. Sudden noise detection and classification have been dealt with in our previous study. In this paper, GMM-based noise reduction is performed using the detection and classification results. As a result of classification, we can determine the kind of noise we are dealing with, but the power is unknown. In this paper, this problem is solved by combining an estimation of noise power with the noise reduction method. In our experiments, the proposed method achieved good performance for recognition of utterances overlapped by sudden noises.
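The paper's exact estimator is not reproduced here, but the idea of pairing a clean-speech GMM with noise power estimation can be sketched as follows. This is a minimal toy illustration, assuming a two-component diagonal GMM over two-bin power spectra and a known noise spectral shape (as would come from the classification step); the gain grid search is a stand-in for the paper's estimation procedure.

```python
import numpy as np

# Toy "clean-speech GMM": two diagonal Gaussian components over
# two-bin power spectra (illustrative values, not from the paper).
means = np.array([[1.0, 4.0], [3.0, 1.5]])
var = 0.5
weights = np.array([0.5, 0.5])

def gmm_loglik(x):
    # Log-likelihood of a spectrum x under the toy clean-speech GMM.
    d = x[None, :] - means
    comp = -0.5 * np.sum(d * d, axis=1) / var
    return np.log(np.sum(weights * np.exp(comp)))

def estimate_noise_power(noisy, noise_shape, gains):
    # Choose the noise gain whose subtraction leaves the most
    # speech-like residual under the GMM (a stand-in for the
    # paper's noise power estimation).
    scores = [gmm_loglik(np.maximum(noisy - g * noise_shape, 1e-3))
              for g in gains]
    return gains[int(np.argmax(scores))]

clean = means[0].copy()
noise_shape = np.array([1.0, 0.2])   # noise type known from classification
noisy = clean + 2.0 * noise_shape    # true (unknown) noise gain is 2.0

g_hat = estimate_noise_power(noisy, noise_shape, np.linspace(0.0, 4.0, 41))
denoised = np.maximum(noisy - g_hat * noise_shape, 1e-3)
print(g_hat)       # recovers the true gain, 2.0
print(denoised)    # close to the clean spectrum [1.0, 4.0]
```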


Citations
Proceedings ArticleDOI
28 Sep 2009
TL;DR: A method for reducing impact noise mixed into speech is proposed: it first detects the parts of the input signal contaminated with impact noise, using a nonlinear digital filter called a stationary-nonstationary separating filter, and then applies time-frequency domain masking only to those noisy parts.
Abstract: A method for reducing impact noise mixed into speech is proposed. The method first detects the noisy parts of the input signal, i.e., those contaminated with impact noise, using a nonlinear digital filter called a stationary-nonstationary separating filter, and then applies time-frequency domain masking only to those parts. The time-frequency masking is realized with a voice model and a noise model. The voice model is generated both from training speech data and from the parts of the input signal judged to be clean, i.e., free of noise. The noise model is generated from training noise data. These two models are used to determine the masking function. Computer simulations verify the high performance of the proposed method.
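A minimal sketch of the masking step, assuming per-bin voice and noise power models and a precomputed detection flag; the Wiener-style ratio mask here is an illustrative stand-in for the paper's masking function:

```python
import numpy as np

# Toy per-bin power models for voice and impact noise (illustrative).
voice_model = np.array([4.0, 1.0, 0.2])
noise_model = np.array([0.5, 0.5, 5.0])

def tf_mask(spec, detected):
    # Wiener-style ratio mask from the two models, applied only to
    # bins the detector flagged as contaminated by impact noise.
    mask = voice_model / (voice_model + noise_model)
    return np.where(detected, mask * spec, spec)

spec = np.array([4.5, 1.5, 5.2])           # noisy power spectrum frame
detected = np.array([False, False, True])  # detector output
print(tf_mask(spec, detected))             # only the flagged bin is attenuated
```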

11 citations

Journal ArticleDOI
TL;DR: This tutorial article reviews noise suppression techniques, including adaptive-filter noise cancellers, spectral subtraction, and MMSE (minimum mean square error) estimation of the short-time spectral amplitude (STSA), and surveys recent methods for suppressing sudden noise.
Abstract: Our everyday environment is full of noise; microphones in mobile phones, hearing aids, and similar devices pick up noise along with the desired speech signal, which hinders conversation. To secure comfortable call quality, noise suppression techniques such as microphone arrays, spectral subtraction, and MMSE (Minimum Mean Square Error)-STSA (Short Time Spectral Amplitude) estimation have long been studied. Realizing noise suppression also requires adaptive signal processing techniques for estimating signals and sound fields. This article first explains the noise canceller based on an adaptive filter, a foundation of noise suppression technology. It then focuses on speech enhancement, describing the MMSE-STSA method and noise suppression techniques that use adaptive signal processing. Conventional speech enhancement methods achieve good suppression of highly stationary noise, but suppressing sudden noise with rapid temporal variation, such as impact sounds, has been difficult. Recently, however, methods for suppressing such sudden noise have been proposed. Finally, the article describes the characteristics of sudden noise and recent research trends in sudden noise suppression methods.
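Spectral subtraction, one of the classic suppression techniques surveyed in this article, can be sketched in a few lines (the flooring constant below is an illustrative choice, not from the article):

```python
import numpy as np

def spectral_subtraction(noisy_power, noise_est, floor=0.01):
    # Subtract the estimated noise power per bin, flooring the result
    # to a fraction of the noisy power so no bin goes negative
    # (the floor trades residual noise against musical noise).
    return np.maximum(noisy_power - noise_est, floor * noisy_power)

noisy = np.array([2.0, 1.2, 0.9])   # noisy power spectrum
noise = np.array([1.0, 1.0, 1.0])   # estimated noise power
print(spectral_subtraction(noisy, noise))
```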

5 citations

Proceedings ArticleDOI
01 Oct 2012
TL;DR: In this article, a speech enhancement method based on a linear predictor with a fourth-order-cumulant adaptive algorithm is proposed to reduce the impulsive noise suddenly generated by hitting an object.
Abstract: Although speech enhancement methods have been proposed, it remains difficult to reduce impulsive noise whose power changes rapidly. In this paper, a speech enhancement method based on a linear predictor with a fourth-order-cumulant adaptive algorithm is proposed to reduce the impulsive noise suddenly generated by hitting an object. The proposed method takes advantage of the high kurtosis of impulsive noise. The proposed linear predictor converges such that its output becomes the high-kurtosis component. Impulsive noise has high kurtosis, whereas speech has low kurtosis; therefore, the proposed linear predictor can estimate only the impulsive noise. In addition, a backward linear predictor is used to improve the convergence rate.
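The kurtosis gap that the method exploits is easy to demonstrate. The sketch below computes sample excess kurtosis directly on synthetic signals; it does not implement the paper's cumulant-based adaptive predictor, only the statistic that separates speech from impacts:

```python
import numpy as np

rng = np.random.default_rng(1)

def excess_kurtosis(x):
    # Sample excess kurtosis: ~0 for Gaussian-like signals,
    # large and positive for heavy-tailed (impulsive) signals.
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2)**2 - 3.0

speech_like = rng.normal(size=4000)   # stand-in for speech (low kurtosis)
impulsive = speech_like.copy()
impulsive[::400] += 20.0              # occasional impacts (high kurtosis)

print(excess_kurtosis(speech_like))   # near 0
print(excess_kurtosis(impulsive))     # large and positive
```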

4 citations

References
Journal ArticleDOI
01 Aug 1997
TL;DR: The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone–Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
Abstract: In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weight-update Littlestone–Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games, and prediction of points in R^n. In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of the new boosting algorithm to the problem of learning functions whose range, rather than being binary, is an arbitrary finite set or a bounded segment of the real line.
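The multiplicative weight-update rule discussed in the first part of the paper can be sketched as follows; the learning rate eta and the loss matrix are illustrative assumptions, not values from the paper:

```python
import numpy as np

def hedge(losses, eta=0.5):
    # Multiplicative weight update (Littlestone-Warmuth style):
    # each round, every expert's weight is multiplied by
    # exp(-eta * loss), so low-loss experts accumulate weight.
    n = losses.shape[1]
    w = np.ones(n) / n
    for round_losses in losses:
        w *= np.exp(-eta * round_losses)
        w /= w.sum()
    return w

# Three experts over ten rounds; expert 0 is consistently best.
losses = np.array([[0.0, 1.0, 0.5]] * 10)
w = hedge(losses)
print(w)   # most of the weight ends up on expert 0
```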

15,813 citations

Book
01 Oct 2004
TL;DR: Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts, and discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining.
Abstract: The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data. Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts. In order to present a unified treatment of machine learning problems and solutions, it discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining. All learning algorithms are explained so that the student can easily move from the equations in the book to a computer program. The text covers such topics as supervised learning, Bayesian decision theory, parametric methods, multivariate methods, multilayer perceptrons, local models, hidden Markov models, assessing and comparing classification algorithms, and reinforcement learning. New to the second edition are chapters on kernel machines, graphical models, and Bayesian estimation; expanded coverage of statistical tests in a chapter on design and analysis of machine learning experiments; case studies available on the Web (with downloadable results for instructors); and many additional exercises. All chapters have been revised and updated. Introduction to Machine Learning can be used by advanced undergraduates and graduate students who have completed courses in computer programming, probability, calculus, and linear algebra. It will also be of interest to engineers in the field who are concerned with the application of machine learning methods. Adaptive Computation and Machine Learning series

3,950 citations

Proceedings ArticleDOI
07 May 1996
TL;DR: This work introduces the use of a vector Taylor series (VTS) expansion to characterize efficiently and accurately the effects on speech statistics of unknown additive noise and unknown linear filtering in a transmission channel.
Abstract: In this paper we introduce a new analytical approach to environment compensation for speech recognition. Previous attempts at solving analytically the problem of noisy speech recognition have either used an overly-simplified mathematical description of the effects of noise on the statistics of speech or they have relied on the availability of large environment-specific adaptation sets. Some of the previous methods required the use of adaptation data that consists of simultaneously-recorded or "stereo" recordings of clean and degraded speech. In this work we introduce the use of a vector Taylor series (VTS) expansion to characterize efficiently and accurately the effects on speech statistics of unknown additive noise and unknown linear filtering in a transmission channel. The VTS approach is computationally efficient. It can be applied either to the incoming speech feature vectors, or to the statistics representing these vectors. In the first case the speech is compensated and then recognized; in the second case HMM statistics are modified using the VTS formulation. Both approaches use only the actual speech segment being recognized to compute the parameters required for environmental compensation. We evaluate the performance of two implementations of VTS algorithms using the CMU SPHINX-II system on the 100-word alphanumeric CENSUS database and on the 1993 5000-word ARPA Wall Street Journal database. Artificial white Gaussian noise is added to both databases. The VTS approaches provide significant improvements in recognition accuracy compared to previous algorithms.
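A first-order VTS approximation of the standard log-spectral mixing model, exp(y) = exp(x) + exp(n), can be sketched as follows; the scalar example and expansion point are illustrative, not taken from the paper:

```python
import numpy as np

def mix(x, n):
    # Log-spectral mixing of clean speech x and noise n:
    # exp(y) = exp(x) + exp(n), i.e. y = x + log(1 + exp(n - x)).
    return x + np.log1p(np.exp(n - x))

def vts_first_order(x, n, x0, n0):
    # First-order vector Taylor series of mix() around (x0, n0).
    y0 = mix(x0, n0)
    g = 1.0 / (1.0 + np.exp(x0 - n0))   # d(mix)/dn at the expansion point
    return y0 + (1.0 - g) * (x - x0) + g * (n - n0)

x0, n0 = 2.0, 0.0    # expansion point (illustrative)
x, n = 2.1, 0.2      # true values near the expansion point
exact = mix(x, n)
approx = vts_first_order(x, n, x0, n0)
print(exact, approx)   # the two agree closely near (x0, n0)
```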

480 citations

Proceedings Article
01 May 2000
TL;DR: LREC2000: the 2nd International Conference on Language Resources and Evaluation, May 31 - June 2, 2000, Athens, Greece.
Abstract: LREC2000: the 2nd International Conference on Language Resources and Evaluation, May 31 - June 2, 2000, Athens, Greece.

259 citations

Journal ArticleDOI
TL;DR: A novel speech feature enhancement technique is presented, based on a probabilistic, nonlinear acoustic environment model that effectively incorporates the phase relationship (hence "phase-sensitive") between the clean speech and the corrupting noise in the acoustic distortion process.
Abstract: This paper presents a novel speech feature enhancement technique based on a probabilistic, nonlinear acoustic environment model that effectively incorporates the phase relationship (hence phase sensitive) between the clean speech and the corrupting noise in the acoustic distortion process. The core of the enhancement algorithm is the MMSE (minimum mean square error) estimator for the log Mel power spectra of clean speech based on the phase-sensitive environment model, using highly efficient single-point, second-order Taylor series expansion to approximate the joint probability of clean and noisy speech modeled as a multivariate Gaussian. Since a noise estimate is required by the MMSE estimator, a high-quality, sequential noise estimation algorithm is also developed and presented. Both the noise estimation and speech feature enhancement algorithms are evaluated on the Aurora2 task of connected digit recognition. Noise-robust speech recognition results demonstrate that the new acoustic environment model which takes into account the relative phase in speech and noise mixing is superior to the earlier environment model which discards the phase under otherwise identical experimental conditions. The results also show that the sequential MAP (maximum a posteriori) learning for noise estimation is better than the sequential ML (maximum likelihood) learning, both evaluated under the identical phase-sensitive MMSE enhancement condition.
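The phase-sensitive mixing model at the core of this approach can be illustrated in the power domain, where the cross term carries the phase information that phase-insensitive models discard (the specific values below are illustrative):

```python
import numpy as np

def noisy_power(x_pow, n_pow, alpha):
    # Phase-sensitive mixing in the power domain:
    #   |Y|^2 = |X|^2 + |N|^2 + 2 * alpha * |X| * |N|,
    # where alpha is the cosine of the speech-noise phase difference.
    # A phase-insensitive model corresponds to alpha = 0.
    return x_pow + n_pow + 2.0 * alpha * np.sqrt(x_pow * n_pow)

x_pow, n_pow = 4.0, 1.0
print(noisy_power(x_pow, n_pow, 0.0))    # alpha = 0: phase ignored, 5.0
print(noisy_power(x_pow, n_pow, 1.0))    # in phase: 9.0
print(noisy_power(x_pow, n_pow, -1.0))   # out of phase: 1.0
```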

131 citations