Author

Rainer Martin

Other affiliations: AT&T Labs, Information Technology Institute, Siemens ...
Bio: Rainer Martin is an academic researcher at Ruhr University Bochum. The author has contributed to research on speech enhancement and noise reduction. The author has an h-index of 31 and has co-authored 208 publications receiving 6199 citations. Previous affiliations of Rainer Martin include AT&T Labs and Information Technology Institute.


Papers
Journal ArticleDOI
TL;DR: An unbiased noise estimator is developed which derives the optimal smoothing parameter for recursive smoothing of the power spectral density of the noisy speech signal by minimizing a conditional mean square estimation error criterion in each time step.
Abstract: We describe a method to estimate the power spectral density of nonstationary noise when a noisy speech signal is given. The method can be combined with any speech enhancement algorithm which requires a noise power spectral density estimate. In contrast to other methods, our approach does not use a voice activity detector. Instead it tracks spectral minima in each frequency band without any distinction between speech activity and speech pause. By minimizing a conditional mean square estimation error criterion in each time step we derive the optimal smoothing parameter for recursive smoothing of the power spectral density of the noisy speech signal. Based on the optimally smoothed power spectral density estimate and the analysis of the statistics of spectral minima an unbiased noise estimator is developed. The estimator is well suited for real time implementations. Furthermore, to improve the performance in nonstationary noise we introduce a method to speed up the tracking of the spectral minima. Finally, we evaluate the proposed method in the context of speech enhancement and low bit rate speech coding with various noise types.

1,731 citations
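
The minimum-tracking idea in the abstract above can be illustrated with a much simplified sketch for a single frequency bin. It is not the paper's estimator: the smoothing constant `alpha` is fixed here (the paper derives an optimal, time-varying one), the bias compensation is omitted, and the function name, parameters, and default values are hypothetical.

```python
from collections import deque


def track_noise_floor(periodograms, alpha=0.85, window=8):
    """Simplified minimum-tracking noise estimate for one frequency bin.

    periodograms: per-frame |Y|^2 values of the noisy signal in this bin.
    alpha: fixed recursive smoothing constant (assumption; the paper
           derives a time-varying optimum instead).
    window: number of most recent smoothed frames searched for the minimum.
    Returns one noise-floor estimate per frame, without bias compensation.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if window < 1:
        raise ValueError("window must be at least 1")

    smoothed = None
    recent = deque(maxlen=window)  # sliding window of smoothed power values
    estimates = []
    for power in periodograms:
        # First-order recursive smoothing of the noisy power.
        if smoothed is None:
            smoothed = power
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * power
        recent.append(smoothed)
        # The window minimum serves as the noise floor; no speech/pause
        # decision is made at any point.
        estimates.append(min(recent))
    return estimates
```

Because the minimum of a smoothed power is on average smaller than its mean, an estimate of this kind is biased low, which is why the paper analyses the statistics of spectral minima to obtain an unbiased estimator.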

01 Jan 2001
TL;DR: An unbiased noise power estimator based on minimum statistics is derived and its statistical properties and its performance in the context of spectral subtraction are discussed.
Abstract: This contribution presents and analyses an algorithm for the enhancement of noisy speech signals by means of spectral subtraction. In contrast to the standard spectral subtraction algorithm, the proposed method needs neither a speech activity detector nor histograms to learn signal statistics. The algorithm is capable of tracking nonstationary noise signals and compares favorably with standard spectral subtraction methods in terms of performance and computational complexity. Our noise estimation method is based on the observation that a noise power estimate can be obtained using minimum values of a smoothed power estimate of the noisy speech signal. Thus, the use of minimum statistics eliminates the problem of speech activity detection. The proposed method is conceptually simple and well suited for real time implementations. In this paper we derive an unbiased noise power estimator based on minimum statistics and discuss its statistical properties and its performance in the context of spectral subtraction.

645 citations
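
As a rough illustration of how a noise power estimate feeds spectral subtraction, the sketch below computes a per-bin amplitude gain from noisy and noise power values. It is a generic textbook form of power spectral subtraction, not the specific algorithm of the paper above; the `oversubtraction` factor, the spectral `floor`, and the function name are assumptions introduced for illustration.

```python
import math


def spectral_subtraction_gain(noisy_power, noise_power,
                              oversubtraction=1.0, floor=0.01):
    """Per-bin amplitude gain for generic power spectral subtraction.

    noisy_power: |Y|^2 per frequency bin for the current frame.
    noise_power: noise power estimate per bin (from any estimator).
    oversubtraction: scales the subtracted noise power (assumed parameter).
    floor: lower bound on the squared gain, limiting over-attenuation
           (assumed parameter).
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        if y <= 0.0:
            # No signal power in this bin: fall back to the floor gain.
            gains.append(math.sqrt(floor))
            continue
        # Squared gain 1 - a * N / |Y|^2, clamped from below by the floor.
        squared_gain = 1.0 - oversubtraction * n / y
        gains.append(math.sqrt(max(squared_gain, floor)))
    return gains
```

The enhanced spectrum would then be obtained by multiplying each noisy DFT coefficient by its gain while keeping the noisy phase.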

Journal ArticleDOI
TL;DR: Compared to algorithms based on the Gaussian assumption, such as the Wiener filter or the Ephraim and Malah (1984) MMSE short-time spectral amplitude estimator, the estimators based on these supergaussian densities deliver an improved signal-to-noise ratio.
Abstract: This paper presents a class of minimum mean-square error (MMSE) estimators for enhancing short-time spectral coefficients of a noisy speech signal. In contrast to most of the presently used methods, we do not assume that the spectral coefficients of the noise or of the clean speech signal obey a (complex) Gaussian probability density. We derive analytical solutions to the problem of estimating discrete Fourier transform (DFT) coefficients in the MMSE sense when the prior probability density function of the clean speech DFT coefficients can be modeled by a complex Laplace or by a complex bilateral Gamma density. The probability density function of the noise DFT coefficients may be modeled either by a complex Gaussian or by a complex Laplacian density. Compared to algorithms based on the Gaussian assumption, such as the Wiener filter or the Ephraim and Malah (1984) MMSE short-time spectral amplitude estimator, the estimators based on these supergaussian densities deliver an improved signal-to-noise ratio.

352 citations

Book
03 Mar 2006
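TL;DR: This book covers models of speech production and hearing, spectral transformations, filter banks, stochastic signals and estimation, linear prediction, quantization, speech coding, error concealment, bandwidth extension, noise reduction, and acoustic echo control.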
Abstract: 1 Introduction. 2 Models of Speech Production and Hearing. 2.1 Organs of Speech Production. 2.2 Characteristics of Speech Signals. 2.3 Model of Speech Production. 2.4 Anatomy of Hearing. 2.5 Performance of the Auditory Organs. Bibliography. 3 Spectral Transformations. 3.1 Fourier Transform of Continuous Signals. 3.2 Fourier Transform of Discrete Signals. 3.3 Linear Shift Invariant Systems. 3.4 The z-Transform. 3.5 The Discrete Fourier Transform. 3.6 Fast Convolution. 3.7 Cepstral Analysis. Bibliography. 4 Filter Banks for Spectral Analysis and Synthesis. 4.1 Spectral Analysis Using Narrow-Band Filters. 4.2 Polyphase Network Filter Banks. 4.3 Quadrature Mirror Filter Banks. Bibliography. 5 Stochastic Signals and Estimation. 5.1 Basic Concepts. 5.2 Expectations and Moments. 5.3 Bivariate Statistics. 5.4 Probability and Information. 5.5 Multivariate Statistics. 5.6 Stochastic Processes. 5.7 Estimation of Statistical Quantities by Time Averages. 5.8 Power Spectral Densities. 5.9 Estimation of the Power Spectral Density. 5.10 Statistical Properties of Speech Signals. 5.11 Statistical Properties of DFT Coefficients. 5.12 Optimal Estimation. Bibliography. 6 Linear Prediction. 6.1 Vocal Tract Models and Short-Term Prediction. 6.2 Optimal Prediction Coefficients for Stationary Signals. 6.3 Predictor Adaptation. 6.4 Long-Term Prediction. Bibliography. 7 Quantization. 7.1 Analog Samples and Digital Presentation. 7.2 Uniform Quantization. 7.3 Non-uniform Quantization. 7.4 Optimal Quantization. 7.5 Adaptive Quantization. 7.6 Vector Quantization. 7.6.1 Principle. Bibliography. 8 Speech Coding. 8.1 Classification of Speech Coding Algorithms. 8.2 Model-Based Predictive Coding. 8.3 Differential Waveform Coding. 8.4 Parametric Coding. 8.5 Hybrid Coding. 8.6 Adaptive Postfiltering. Bibliography. 9 Error Concealment and Softbit Decoding. 9.1 Hardbit Source Decoding. 9.2 Conventional Error Concealment. 9.3 Softbits and L-Values. 9.4 Softbit Source Decoding (SD). 9.5 Application to Model Parameters. 9.6 Further Improvements. Bibliography. 10 Bandwidth Extension of Speech Signals (BWE). 10.1 Narrowband versus Wideband Telephony. 10.2 Speech Coding with Integrated BWE. 10.3 BWE without Auxiliary Transmission. Bibliography. 11 Single and Dual Channel Noise Reduction. 11.1 Introduction. 11.2 Linear MMSE Estimators. 11.3 Speech Enhancement in the DFT Domain. 11.4 Optimal Non-Linear Estimators. 11.5 Joint Optimum Detection and Estimation of Speech. 11.6 Computation of Likelihood Ratios. 11.7 Estimation of the A Priori Probability of Speech Presence. 11.8 VAD and Noise Estimation Techniques. 11.9 Dual-Channel Noise Reduction. Bibliography. 12 Multi-Channel Noise Reduction. 12.1 Introduction. 12.2 Spatial Sampling of Sound Fields. 12.3 Beamforming. 12.4 Performance Measures and Spatial Aliasing. 12.5 Design of Fixed Beamformers. 12.6 Adaptive Beamformers. Bibliography. 13 Acoustic Echo Control. 13.1 The Echo Control Problem. 13.2 Evaluation Criteria. 13.3 The Wiener Solution. 13.4 The LMS and NLMS Algorithm. 13.5 Convergence Analysis and Control of the LMS Algorithm. 13.6 Geometric Projection Interpretation of the NLMS Algorithm. 13.7 The Affine Projection Algorithm. 13.8 Least-Squares and Recursive Least-Squares Algorithms. 13.9 Block Processing and Frequency-Domain Adaptive Filters. 13.9.1 Block LMS Algorithm. 13.10 Additional Measures for Echo Control. 13.11 Stereophonic Acoustic Echo Control. A Codec Standards. B Speech Quality Assessment. Bibliography.

276 citations

Proceedings ArticleDOI
13 May 2002
TL;DR: Compared to the state-of-the-art Wiener or MMSE short time amplitude estimators, the new estimators deliver improved signal-to-noise ratios; with a Laplacian noise model, the enhanced speech shows less annoying random fluctuations in the residual noise than with a Gaussian noise model.
Abstract: In this paper we consider optimal estimators for speech enhancement in the Discrete Fourier Transform (DFT) domain. We present an analytical solution for estimating complex DFT coefficients in the MMSE sense when the clean speech DFT coefficients are Gamma distributed and the DFT coefficients of the noise are Gaussian or Laplace distributed. Compared to the state-of-the-art Wiener or MMSE short time amplitude estimators the new estimators deliver improved signal-to-noise ratios. When the noise model is a Laplacian density the enhanced speech shows less annoying random fluctuations in the residual noise than for a Gaussian density.

217 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 2016

4,085 citations

Journal ArticleDOI
TL;DR: A short-time objective intelligibility measure (STOI) is presented that correlates highly with the intelligibility of noisy and time-frequency weighted noisy speech (e.g., resulting from noise reduction) in three different listening experiments, and that correlates better with speech intelligibility than five other reference objective intelligibility models.
Abstract: In the development process of noise-reduction algorithms, an objective machine-driven intelligibility measure which shows high correlation with speech intelligibility is of great interest. Besides reducing time and costs compared to real listening experiments, an objective intelligibility measure could also help provide answers on how to improve the intelligibility of noisy unprocessed speech. In this paper, a short-time objective intelligibility measure (STOI) is presented, which shows high correlation with the intelligibility of noisy and time-frequency weighted noisy speech (e.g., resulting from noise reduction) of three different listening experiments. In general, STOI showed better correlation with speech intelligibility compared to five other reference objective intelligibility models. In contrast to other conventional intelligibility models which tend to rely on global statistics across entire sentences, STOI is based on shorter time segments (386 ms). Experiments indeed show that it is beneficial to take segment lengths of this order into account. In addition, a free Matlab implementation is provided.

1,847 citations

Journal ArticleDOI
01 Oct 1980

1,565 citations