Showing papers on "Noise measurement" published in 2009


Journal ArticleDOI
TL;DR: This article considers the application of variational Bayesian methods to joint recursive estimation of the dynamic state and the time-varying measurement noise parameters in linear state space models and proposes an adaptive Kalman filtering method based on forming a separable variational approximation to the joint posterior distribution of states and noise parameters.
Abstract: This article considers the application of variational Bayesian methods to joint recursive estimation of the dynamic state and the time-varying measurement noise parameters in linear state space models. The proposed adaptive Kalman filtering method is based on forming a separable variational approximation to the joint posterior distribution of states and noise parameters on each time step separately. The result is a recursive algorithm, where on each step the state is estimated with Kalman filter and the sufficient statistics of the noise variances are estimated with a fixed-point iteration. The performance of the algorithm is demonstrated with simulated data.

508 citations
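
The recursion described in the entry above (a Kalman update for the state, plus a fixed-point iteration for the noise-variance statistics) can be sketched for a scalar model. This is a minimal illustration, not the authors' code: it assumes an Inverse-Gamma parameterisation (`alpha`, `beta`) of the measurement noise variance, and the forgetting factor `rho`, the iteration count `n_iter`, and the random-walk test model in `simulate` are all hypothetical choices made for the example.

```python
import random

def vb_adaptive_kalman_step(m, P, alpha, beta, y, A=1.0, H=1.0, Q=0.01,
                            rho=0.98, n_iter=5):
    """One predict/update step of a scalar variational-Bayes adaptive Kalman filter."""
    # Predict the state; "forget" part of the noise statistics (assumed heuristic).
    m_pred = A * m
    P_pred = A * P * A + Q
    alpha_pred = rho * alpha
    beta_pred = rho * beta

    # Update: alpha is set once, beta is refined by fixed-point iteration.
    alpha_new = alpha_pred + 0.5
    beta_new = beta_pred
    m_new, P_new = m_pred, P_pred
    for _ in range(n_iter):
        R = beta_new / alpha_new                  # current noise variance estimate
        S = H * P_pred * H + R                    # innovation variance
        K = P_pred * H / S                        # Kalman gain
        m_new = m_pred + K * (y - H * m_pred)
        P_new = P_pred - K * S * K
        beta_new = beta_pred + 0.5 * (y - H * m_new) ** 2 + 0.5 * H * P_new * H
    return m_new, P_new, alpha_new, beta_new

def simulate(n_steps=200, true_r=4.0, seed=0):
    """Track a random walk and return the noise variance estimate at each step."""
    rng = random.Random(seed)
    x, m, P, alpha, beta = 0.0, 0.0, 1.0, 1.0, 1.0
    r_estimates = []
    for _ in range(n_steps):
        x += rng.gauss(0.0, 0.1)                  # process noise variance 0.01
        y = x + rng.gauss(0.0, true_r ** 0.5)     # noisy measurement
        m, P, alpha, beta = vb_adaptive_kalman_step(m, P, alpha, beta, y)
        r_estimates.append(beta / alpha)
    return r_estimates
```

Calling `simulate()` returns the sequence of variance estimates `beta / alpha`; how closely it tracks `true_r` depends on `rho` and on the chosen prior.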


Journal ArticleDOI
TL;DR: An 11-bit, 50-MS/s time-to-digital converter (TDC) using a multipath gated ring oscillator with 6 ps of effective delay per stage demonstrates 1st-order noise shaping.
Abstract: An 11-bit, 50-MS/s time-to-digital converter (TDC) using a multipath gated ring oscillator with 6 ps of effective delay per stage demonstrates 1st-order noise shaping. At frequencies below 1 MHz, the TDC error integrates to 80 fs (rms) for a dynamic range of 95 dB with no calibration required. The 157 × 258 µm TDC is realized in 0.13 µm CMOS and, depending on the time difference between input edges, consumes 2.2 to 21 mA from a 1.5 V supply.

340 citations


Journal ArticleDOI
TL;DR: Boundedness and ultimate boundedness of the closed-loop system under switched-gain output feedback is argued and a high-gain observer that switches between two gain values is proposed.

315 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel method capable of dividing an investigated image into various partitions with homogeneous noise levels and introduces a segmentation method detecting changes in noise level using the additive white Gaussian noise.

303 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: The results suggest that classical benchmark images used in low-level vision are actually noisy and can be cleaned up, and the results on noise estimation on two sets of 50 and 100 natural images are significantly better than the state-of-the-art.
Abstract: Natural images are known to have scale invariant statistics. While some earlier studies have reported the kurtosis of marginal bandpass filter response distributions to be constant throughout scales, other studies have reported that the kurtosis values are lower for high frequency filters than for lower frequency ones. In this work we propose a resolution for this discrepancy and suggest that this change in kurtosis values is due to noise present in the image. We suggest that this effect is consistent with a clean, natural image corrupted by white noise. We propose a model for this effect, and use it to estimate noise standard deviation in corrupted natural images. In particular, our results suggest that classical benchmark images used in low-level vision are actually noisy and can be cleaned up. Our results on noise estimation on two sets of 50 and 100 natural images are significantly better than the state-of-the-art.

247 citations


Book ChapterDOI
01 Jan 2009
TL;DR: This chapter surveys common image noise models. Additive Gaussian noise, which is a part of almost any signal, is widely used to model thermal noise and, under some often reasonable conditions, is the limiting behavior of other noises, e.g., photon counting noise and film grain noise.
Abstract: Noise occurs in images for many reasons. Probably the most frequently occurring noise is additive Gaussian noise. It is widely used to model thermal noise and, under some often reasonable conditions, is the limiting behavior of other noises, e.g., photon counting noise and film grain noise. Gaussian noise is a part of almost any signal. For example, the familiar white noise on a weak television station is well modeled as Gaussian. Since image sensors must count photons, especially in low-light situations, and the number of photons counted is a random quantity, images often have photon counting noise. The grain noise in photographic films is sometimes modeled as Gaussian and sometimes as Poisson. Many images are corrupted by salt and pepper noise, as if someone had sprinkled black and white dots on the image. Other noises include quantization noise and speckle in coherent light situations.

216 citations
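
Three of the noise models named in the entry above (additive Gaussian, photon counting, salt and pepper) can be simulated in a few lines. This is a minimal sketch under stated assumptions: images are plain lists of lists, photon counts are drawn from a Poisson distribution whose mean is the pixel value, and all function names are hypothetical. The Poisson sampler uses Knuth's multiplication method, which is only suitable for small means.

```python
import math
import random

def add_gaussian(img, sigma, rng):
    """Additive Gaussian noise with standard deviation sigma."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) sample (Knuth's method, fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def add_photon_noise(img, rng):
    """Photon counting noise: each pixel becomes a Poisson count with that mean."""
    return [[poisson_sample(p, rng) for p in row] for row in img]

def add_salt_and_pepper(img, density, rng, low=0, high=255):
    """Replace a fraction `density` of pixels with black or white dots."""
    out = [row[:] for row in img]
    for row in out:
        for j in range(len(row)):
            if rng.random() < density:
                row[j] = high if rng.random() < 0.5 else low
    return out
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the experiments repeatable.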


Journal ArticleDOI
TL;DR: In this article, a simple ray-theory derivation is presented that facilitates an understanding of how cross correlations of seismic noise can be used to make direct travel-time measurements, even if the conditions assumed by previous derivations do not hold.
Abstract: It has previously been shown that the Green's function between two receivers can be retrieved by cross-correlating time series of noise recorded at the two receivers. This property has been derived assuming that the energy in normal modes is uncorrelated and perfectly equipartitioned, or that the distribution of noise sources is uniform in space and the waves measured satisfy a high frequency approximation. Although a number of authors have successfully extracted travel-time information from seismic surface-wave noise, the reason for this success of noise tomography remains unclear since the assumptions inherent in previous derivations do not hold for dispersive surface waves on the Earth. Here, we present a simple ray-theory derivation that facilitates an understanding of how cross correlations of seismic noise can be used to make direct travel-time measurements, even if the conditions assumed by previous derivations do not hold. Our new framework allows us to verify that cross-correlation measurements of isotropic surface-wave noise give results in accord with ray-theory expectations, but that if noise sources have an anisotropic distribution or if the velocity structure is non-uniform then significant differences can sometimes exist. We quantify the degree to which the sensitivity kernel is different from the geometric ray and find, for example, that the kernel width is period-dependent and that the kernel generally has non-zero sensitivity away from the geometric ray, even within our ray theoretical framework. These differences lead to usually small (but sometimes large) biases in models of seismic-wave speed and we show how our theoretical framework can be used to calculate the appropriate corrections. Even when these corrections are small, calculating the errors within a theoretical framework would alleviate fears traditional seismologists may have regarding the robustness of seismic noise tomography.

185 citations
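
The basic idea behind the entry above, that cross-correlating noise recorded at two receivers exposes the travel time between them, can be shown with a toy example. This sketch is not the paper's ray-theory derivation: it assumes a single noise wavefield that reaches receiver B a fixed number of samples after receiver A, with no dispersion and no source anisotropy. The function names and the synthetic setup are hypothetical.

```python
import random

def cross_correlation_lag(a, b, max_lag):
    """Return the lag k (in samples) that maximizes sum_n a[n] * b[n + k]."""
    best_lag, best_val = 0, float("-inf")
    for k in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(a)):
            j = i + k
            if 0 <= j < len(b):
                total += a[i] * b[j]
        if total > best_val:
            best_lag, best_val = k, total
    return best_lag

def synthetic_pair(n=400, delay=7, seed=0):
    """White noise seen at two receivers; B records it `delay` samples after A."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n + delay)]
    rec_a = source[delay:]      # receiver A: the wavefield arrives first
    rec_b = source[:n]          # receiver B: same wavefield, delayed
    return rec_a, rec_b

rec_a, rec_b = synthetic_pair()
lag = cross_correlation_lag(rec_a, rec_b, max_lag=20)
```

A positive `lag` means the signal at B trails the signal at A; multiplying by the sampling interval would give a travel-time estimate in this idealized setting.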


Journal ArticleDOI
TL;DR: A two-stage algorithm, called switching-based adaptive weighted mean filter, is proposed to remove salt-and-pepper noise from the corrupted images by replacing each noisy pixel with the weighted mean of its noise-free neighbors in the filtering window.
Abstract: A two-stage algorithm, called switching-based adaptive weighted mean filter, is proposed to remove salt-and-pepper noise from the corrupted images. First, the directional difference based noise detector is used to identify the noisy pixels by comparing the minimum absolute value of four mean differences between the current pixel and its neighbors in four directional windows with a predefined threshold. Then, the adaptive weighted mean filter is adopted to remove the detected impulses by replacing each noisy pixel with the weighted mean of its noise-free neighbors in the filtering window. Numerous simulations demonstrate that the proposed filter outperforms many other existing algorithms in terms of effectiveness in noise detection, image restoration and computational efficiency.

183 citations
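
The two stages described in the entry above (directional impulse detection, then a weighted mean over noise-free neighbors) can be sketched as follows. The abstract does not give the window sizes, the weights, or the threshold, so everything concrete here is an assumption: a reach of 2 pixels along each of four directions, inverse-squared-distance weights, and a window that grows until it contains at least one noise-free pixel. Images are lists of lists of at least 2 × 2 pixels.

```python
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, two diagonals

def detect_impulses(img, threshold, reach=2):
    """Flag a pixel when even its smallest directional mean difference is large."""
    h, w = len(img), len(img[0])
    noisy = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            diffs = []
            for dr, dc in DIRECTIONS:
                vals = []
                for s in range(-reach, reach + 1):
                    rr, cc = r + s * dr, c + s * dc
                    if s != 0 and 0 <= rr < h and 0 <= cc < w:
                        vals.append(img[rr][cc] - img[r][c])
                if vals:
                    diffs.append(abs(sum(vals) / len(vals)))
            noisy[r][c] = min(diffs) > threshold
    return noisy

def weighted_mean_restore(img, noisy, max_radius=3):
    """Replace each flagged pixel by a weighted mean of its noise-free neighbors."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(h):
        for c in range(w):
            if not noisy[r][c]:
                continue
            for radius in range(1, max_radius + 1):   # grow window if needed
                num = den = 0.0
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        if not noisy[rr][cc]:
                            wgt = 1.0 / ((rr - r) ** 2 + (cc - c) ** 2)
                            num += wgt * img[rr][cc]
                            den += wgt
                if den > 0:
                    out[r][c] = num / den
                    break
    return out
```

Pixels that are not flagged are copied through unchanged, which is the "switching" part of the scheme.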


Proceedings ArticleDOI
01 Nov 2009
TL;DR: It is demonstrated that a simple algorithm, which is dubbed Justice Pursuit (JP), can achieve exact recovery from measurements corrupted with sparse noise.
Abstract: Compressive sensing provides a framework for recovering sparse signals of length N from M ≪ N measurements. If the measurements contain noise bounded by ε, then standard algorithms recover sparse signals with error at most Cε. However, these algorithms perform suboptimally when the measurement noise is also sparse. This can occur in practice due to shot noise, malfunctioning hardware, transmission errors, or narrowband interference. We demonstrate that a simple algorithm, which we dub Justice Pursuit (JP), can achieve exact recovery from measurements corrupted with sparse noise. The algorithm handles unbounded errors, has no input parameters, and is easily implemented via standard recovery techniques.

181 citations


Proceedings ArticleDOI
29 May 2009
TL;DR: In this article, an LNA-less mixer-first receiver is proposed, which achieves a remarkably high spurious-free dynamic range (SFDR) of 79dB in 1MHz bandwidth over a decade of RF frequencies.
Abstract: Spurious-free dynamic range (SFDR) is a key specification of radio receivers and spectrum analyzers, characterizing the maximum distance between signal and noise+distortion. SFDR is limited by the linearity (intercept point IIP3 mostly, sometimes IIP2) and the noise floor. As receivers already have low noise figure (NF) there is more room for improving the SFDR by increasing the linearity. As there is a strong relation between distortion and voltage swing, it is challenging to maintain or even improve linearity intercept points in future CMOS processes with lower supply voltages. Circuits can be linearized with feedback but loop gain at RF is limited [1]. Moreover, after LNA gain, mixer linearity becomes even tougher. If the amplification is postponed to IF, much more loop gain is available to linearize the amplifier. This paper proposes such an LNA-less mixer-first receiver. By careful analysis and optimization of a passive mixer core [2,3] for low conversion loss and low noise folding it is shown that it is possible to realize IIP3 > 11 dBm and NF < 6.5 dB, i.e. a remarkably high SFDR > 79 dB in 1 MHz bandwidth over a decade of RF frequencies.

171 citations


Proceedings ArticleDOI
30 Sep 2009
TL;DR: Using the replica method, the outcome of inferring about any fixed collection of signal elements is shown to be asymptotically decoupled, and the single-letter characterization is rigorously justified in the special case of sparse measurement matrices where belief propagation becomes asymptotically optimal.
Abstract: Compressed sensing deals with the reconstruction of a high-dimensional signal from far fewer linear measurements, where the signal is known to admit a sparse representation in a certain linear space. The asymptotic scaling of the number of measurements needed for reconstruction as the dimension of the signal increases has been studied extensively. This work takes a fundamental perspective on the problem of inferring about individual elements of the sparse signal given the measurements, where the dimensions of the system become increasingly large. Using the replica method, the outcome of inferring about any fixed collection of signal elements is shown to be asymptotically decoupled, i.e., those elements become independent conditioned on the measurements. Furthermore, the problem of inferring about each signal element admits a single-letter characterization in the sense that the posterior distribution of the element, which is a sufficient statistic, becomes asymptotically identical to the posterior of inferring about the same element in scalar Gaussian noise. The result leads to simple characterization of all other elemental metrics of the compressed sensing problem, such as the mean squared error and the error probability for reconstructing the support set of the sparse signal. Finally, the single-letter characterization is rigorously justified in the special case of sparse measurement matrices where belief propagation becomes asymptotically optimal.

Patent
12 Jun 2009
TL;DR: An active noise cancellation system is described that reduces, at a listening position, the power of a noise signal radiated from a noise source to the listening position. The system includes an adaptive filter, at least one acoustic actuator and a signal processing device.
Abstract: An active noise cancellation system that reduces, at a listening position, power of a noise signal radiated from a noise source to the listening position. The system includes an adaptive filter, at least one acoustic actuator and a signal processing device. The adaptive filter receives a reference signal representing the noise signal, and provides a compensation signal. The at least one acoustic actuator radiates the compensation signal to the listening position. The signal processing device evaluates and assesses the stability of the adaptive filter.

Proceedings ArticleDOI
29 Jul 2009
TL;DR: A no-reference objective sharpness metric detecting both blur and noise is proposed, based on the local gradients of the image and does not require any edge detection.
Abstract: A no-reference objective sharpness metric detecting both blur and noise is proposed in this paper. This metric is based on the local gradients of the image and does not require any edge detection. Its value drops either when the test image becomes blurred or corrupted by random noise. It can be thought of as an indicator of the signal to noise ratio of the image. Experiments using synthetic, natural, and compressed images are presented to demonstrate the effectiveness and robustness of this metric. Its statistical properties are also provided.

Journal ArticleDOI
TL;DR: In this article, a method for the early detection of an impending voltage instability from the system states provided by synchronized phasor measurements is presented. The method fits a set of algebraic equations to the sampled states and performs an efficient sensitivity computation in order to identify when a combination of load powers has passed through a maximum.
Abstract: This two-part paper deals with the early detection of an impending voltage instability from the system states provided by synchronized phasor measurements. Recognizing that voltage instability detection requires assessing a multidimensional system, the method fits a set of algebraic equations to the sampled states, and performs an efficient sensitivity computation in order to identify when a combination of load powers has passed through a maximum. This second part of the paper presents simulation results obtained from detailed time-domain simulation of the Nordic32 test system, without and with measurement noise, respectively. Several practical improvements are described such as anticipation of overexcitation limiter activation, and use of a moving average filter. Robustness to load behavior, non-updated topology and unobservability is also shown. Finally a comparison with Thevenin impedance matching criterion is provided.

Journal ArticleDOI
TL;DR: In this article, noise analysis for comparator-based circuits is presented and applied to a comparator-based switched-capacitor pipeline analog-to-digital converter (ADC); the results show that the noise from the virtual ground threshold detection comparator dominates the overall ADC noise performance.
Abstract: Noise analysis for comparator-based circuits is presented. The goal is to gain insight into the different sources of noise in these circuits for design purposes. After the general analysis techniques are established, they are applied to different noise sources in the comparator-based switched-capacitor pipeline analog-to-digital converter (ADC). The results show that the noise from the virtual ground threshold detection comparator dominates the overall ADC noise performance. The noise from the charging current can also be significant, depending on the size of the capacitors used, but the contribution was small in the prototype. The other noise sources have contributions comparable to those in op-amp-based designs, and their effects can be managed through appropriate design. In the prototype, folded flicker noise was found to be a significant contributor to the broadband noise because the flicker noise of the comparator extends beyond the Nyquist rate of the converter.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: A system is presented that detects human falls in the home environment, distinguishing them from competing noise, by using only the audio signal from a single far-field microphone. Each fall or noise segment is modeled by a Gaussian mixture model (GMM) supervector, whose Euclidean distance measures the pairwise difference between audio segments.
Abstract: We present a system that detects human falls in the home environment, distinguishing them from competing noise, by using only the audio signal from a single far-field microphone. The proposed system models each fall or noise segment by means of a Gaussian mixture model (GMM) supervector, whose Euclidean distance measures the pairwise difference between audio segments. A support vector machine built on a kernel between GMM supervectors is employed to classify audio segments into falls and various types of noise. Experiments on a dataset of human falls, collected as part of the Netcarity project, show that the method improves fall classification F-score to 67% from 59% of a baseline GMM classifier. The approach also effectively addresses the more difficult fall detection problem, where audio segment boundaries are unknown. Specifically, we employ it to reclassify confusable segments produced by a dynamic programming scheme based on traditional GMMs. Such post-processing improves a fall detection accuracy metric by 5% relative.

Journal ArticleDOI
27 Jul 2009
TL;DR: This paper introduces a noise based on sparse convolution and the Gabor kernel that enables all of these properties of noise, and introduces setup-free surface noise, a method for mapping noise onto a surface, complementary to solid noise, that maintains the appearance of the noise pattern along the object and does not require a texture parameterization.
Abstract: Noise is an essential tool for texturing and modeling. Designing interesting textures with noise calls for accurate spectral control, since noise is best described in terms of spectral content. Texturing requires that noise can be easily mapped to a surface, while high-quality rendering requires anisotropic filtering. A noise function that is procedural and fast to evaluate offers several additional advantages. Unfortunately, no existing noise combines all of these properties. In this paper we introduce a noise based on sparse convolution and the Gabor kernel that enables all of these properties. Our noise offers accurate spectral control with intuitive parameters such as orientation, principal frequency and bandwidth. Our noise supports two-dimensional and solid noise, but we also introduce setup-free surface noise. This is a method for mapping noise onto a surface, complementary to solid noise, that maintains the appearance of the noise pattern along the object and does not require a texture parameterization. Our approach requires only a few bytes of storage, does not use discretely sampled data, and is nonperiodic. It supports anisotropy and anisotropic filtering. We demonstrate our noise using an interactive tool for noise design.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: This work addresses the problem of distributed estimation of the poses of N cameras in a camera sensor network using image measurements only by minimizing a cost function on SE(3)^N in a distributed fashion using a generalization of the classical consensus algorithm for averaging Euclidean data.
Abstract: We consider the problem of distributed estimation of the poses of N cameras in a camera sensor network using image measurements only. The relative rotation and translation (up to a scale factor) between pairs of neighboring cameras can be estimated using standard computer vision techniques. However, due to noise in the image measurements, these estimates may not be globally consistent. We address this problem by minimizing a cost function on SE(3)^N in a distributed fashion using a generalization of the classical consensus algorithm for averaging Euclidean data. We also derive a condition for convergence, which relates the step-size of the consensus algorithm and the degree of the camera network graph. While our methods are designed with the camera sensor network application in mind, our results are applicable to other localization problems in a more general setting. We also provide synthetic simulations to test the validity of our approach.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: This paper presents a method which incorporates a constraint for the WNG into a least-squares beamformer design and still leads to a convex optimization problem that can be solved directly, e.g. by Sequential Quadratic Programming.
Abstract: Broadband data-independent beamforming designs aiming at constant beamwidth often lead to superdirective beamformers for low frequencies, if the sensor spacing is small relative to the wavelengths. Superdirective beamformers are extremely sensitive to spatially white noise and to small errors in the array characteristics. These errors are nearly uncorrelated from sensor to sensor and affect the beamformer in a manner similar to spatially white noise. Hence the White Noise Gain (WNG) is a commonly used measure for the robustness of beamformer designs. In this paper, we present a method which incorporates a constraint for the WNG into a least-squares beamformer design and still leads to a convex optimization problem that can be solved directly, e.g. by Sequential Quadratic Programming. The effectiveness of this method is demonstrated by design examples.
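
The White Noise Gain mentioned in the abstract above is commonly defined as WNG = |w^H d|^2 / (w^H w) for weights w and steering vector d. The sketch below only evaluates that quantity; it does not reproduce the paper's constrained least-squares design. The uniform linear array geometry, the speed of sound, and the function names are assumptions made for illustration.

```python
import cmath
import math

def steering_vector(n_sensors, spacing_m, freq_hz, angle_rad, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    k = 2.0 * math.pi * freq_hz / c                     # wavenumber
    return [cmath.exp(-1j * k * spacing_m * i * math.cos(angle_rad))
            for i in range(n_sensors)]

def white_noise_gain(weights, steering):
    """WNG = |w^H d|^2 / (w^H w)."""
    response = sum(w.conjugate() * d for w, d in zip(weights, steering))
    norm_sq = sum(abs(w) ** 2 for w in weights)
    return abs(response) ** 2 / norm_sq

# Delay-and-sum weights reach the maximum WNG, equal to the number of sensors.
d = steering_vector(8, 0.04, 500.0, math.pi / 3)
w_das = [x / len(d) for x in d]
wng_das = white_noise_gain(w_das, d)
```

A superdirective design would show a much smaller value for the same array, which is why a lower bound on this quantity serves as a robustness constraint.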

Journal ArticleDOI
TL;DR: It is theoretically and experimentally pointed out that ICA is proficient in noise estimation under a non-point-source noise condition rather than in speech estimation, and a new blind spatial subtraction array (BSSA) is proposed that utilizes ICA as a noise estimator.
Abstract: We propose a new blind spatial subtraction array (BSSA) consisting of a noise estimator based on independent component analysis (ICA) for efficient speech enhancement. In this paper, first, we theoretically and experimentally point out that ICA is proficient in noise estimation under a non-point-source noise condition rather than in speech estimation. Therefore, we propose BSSA that utilizes ICA as a noise estimator. In BSSA, speech extraction is achieved by subtracting the power spectrum of noise signals estimated using ICA from the power spectrum of the partly enhanced target speech signal with a delay-and-sum beamformer. This "power-spectrum-domain subtraction" procedure enables better noise reduction than the conventional ICA with estimation-error robustness. Another benefit of BSSA architecture is "permutation robustness". Although the ICA part in BSSA suffers from a source permutation problem, the BSSA architecture can reduce the negative effect when permutation arises. The results of various speech enhancement tests reveal that the noise reduction and speech recognition performance of the proposed BSSA are superior to those of conventional methods.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: Experiments show that the statistically-based U/V classifier can reduce VDE and FFE for the pitch tracker TEMPO in both white and babble noise conditions, and that minimizing FFE instead of VDE results in a reduction in error rates for a number of F0 tracking algorithms, especially in babble noise.
Abstract: In this paper, we propose an F0 Frame Error (FFE) metric which combines Gross Pitch Error (GPE) and Voicing Decision Error (VDE) to objectively evaluate the performance of fundamental frequency (F0) tracking methods. A GPE-VDE curve is then developed to show the trade-off between GPE and VDE. In addition, we introduce a model-based Unvoiced/Voiced (U/V) classification frontend which can be used by any F0 tracking algorithm. In the U/V classification, we train speaker independent U/V models, and then adapt them to speaker dependent models in an unsupervised fashion. The U/V classification result is taken as a mask for F0 tracking. Experiments using the KEELE corpus with additive noise show that our statistically-based U/V classifier can reduce VDE and FFE for the pitch tracker TEMPO [1] in both white and babble noise conditions, and that minimizing FFE instead of VDE results in a reduction in error rates for a number of F0 tracking algorithms, especially in babble noise.
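
The three error measures named in the abstract above can be computed from a reference and an estimated F0 track. The abstract does not spell out the formulas, so this sketch relies on the commonly used definitions, stated here as assumptions: a frame is unvoiced when its F0 is 0, a gross pitch error is a deviation of more than 20% on frames that both tracks mark as voiced, and FFE counts every frame with either kind of error. The function name is hypothetical.

```python
def f0_frame_metrics(ref_f0, est_f0, tol=0.2):
    """Return (GPE, VDE, FFE); an F0 value of 0 marks an unvoiced frame."""
    n = len(ref_f0)
    both_voiced = gross = voicing_err = 0
    for ref, est in zip(ref_f0, est_f0):
        ref_v, est_v = ref > 0, est > 0
        if ref_v != est_v:
            voicing_err += 1                  # voiced/unvoiced decision is wrong
        elif ref_v:
            both_voiced += 1
            if abs(est - ref) > tol * ref:
                gross += 1                    # pitch is off by more than tol
    gpe = gross / both_voiced if both_voiced else 0.0
    vde = voicing_err / n
    ffe = (gross + voicing_err) / n
    return gpe, vde, ffe

ref = [0, 100, 100, 100, 0, 200, 200, 0, 0, 150]
est = [0, 101, 130, 0, 90, 205, 100, 0, 0, 150]
gpe, vde, ffe = f0_frame_metrics(ref, est)
```

In this toy track two of the five jointly voiced frames are gross errors and two of the ten frames have a wrong voicing decision, so FFE covers four frames out of ten.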

Patent
11 Sep 2009
TL;DR: Automatic low noise frequency selection for a touch sensitive device is disclosed. Device logic, without intervention of the device processor, selects a set of low noise frequencies based on the amount of noise at various frequencies and then picks one of them as the stimulation frequency used to sense a touch event.
Abstract: Automatic low noise frequency selection for a touch sensitive device is disclosed. A low noise stimulation frequency can be automatically selected by device logic without intervention of the device processor to stimulate the device to sense a touch event at the device. The device logic can automatically select a set of low noise frequencies from among various frequencies based on the amount of noise introduced by the device at the various frequencies, where the frequencies with the lower noise amounts can be selected. The device logic can also automatically select a low noise frequency from among the selected set as the low noise stimulation frequency. The device logic can be implemented partially or entirely in hardware.

Proceedings ArticleDOI
Xuchu Hou, Shengnan Guo, Huij Cui, K. Tang, Ye Li
28 Dec 2009
TL;DR: Objective tests show that the proposed algorithm is superior to the traditional MMSE-LSA method in noise-tracking and Mean Opinion Score.
Abstract: This paper proposes an improved version of the MMSE-LSA algorithm. It suits non-stationary noise environments better than the traditional algorithm. The main part of this method is the estimation of noise, which is updated using time-frequency smoothing factors calculated from the speech presence probability in each frequency bin of the noisy speech spectrum. The estimate can therefore follow changes in the noise closely. Objective tests show that the proposed algorithm is superior to the traditional MMSE-LSA method in noise-tracking and Mean Opinion Score.
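
A noise update driven by a per-bin speech probability, as described in the abstract above, is often written as a recursive average whose smoothing factor moves toward 1 when speech is likely. The exact rule used in the paper is not given here, so the form below is an assumption borrowed from common practice, and the base factor `alpha` and the function name are hypothetical.

```python
def update_noise_psd(noise_psd, noisy_power, speech_prob, alpha=0.85):
    """Per-bin recursive noise update with probability-dependent smoothing."""
    updated = []
    for lam, y2, p in zip(noise_psd, noisy_power, speech_prob):
        a = alpha + (1.0 - alpha) * p          # p = 1 freezes the estimate
        updated.append(a * lam + (1.0 - a) * y2)
    return updated

# Bin 0 is judged to contain speech, bin 1 is judged to be noise only.
new_psd = update_noise_psd([1.0, 1.0], [5.0, 5.0], [1.0, 0.0], alpha=0.5)
```

With this rule the estimate is left untouched in bins where speech is certain and adapts at the base rate in bins where speech is absent.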

Journal ArticleDOI
TL;DR: This paper shows that if the image contains a sufficiently large proportion of low-variability areas, the variance of noise and the coefficient of variation of noise can be estimated as the mode of the distribution of local variances in the image.
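
The estimator summarized above, taking the mode of the local variance distribution as the noise variance, can be sketched with a sliding window and a histogram. The window size, the number of bins, and the choice to limit the histogram to three times the median local variance are implementation assumptions for this example, not details from the paper. The function names are hypothetical.

```python
import random
import statistics

def local_variances(img, win=5):
    """Sample variance of every win x win sliding window."""
    h, w = len(img), len(img[0])
    n = win * win
    out = []
    for r in range(h - win + 1):
        for c in range(w - win + 1):
            patch = [img[r + i][c + j] for i in range(win) for j in range(win)]
            mean = sum(patch) / n
            out.append(sum((p - mean) ** 2 for p in patch) / (n - 1))
    return out

def estimate_noise_variance(img, win=5, n_bins=30):
    """Mode of the local variance histogram, taken as the noise variance."""
    variances = local_variances(img, win)
    upper = 3.0 * statistics.median(variances)   # ignore edge-dominated windows
    counts = [0] * n_bins
    for v in variances:
        if v < upper:
            counts[min(int(v / upper * n_bins), n_bins - 1)] += 1
    peak = counts.index(max(counts))
    return (peak + 0.5) * upper / n_bins         # centre of the fullest bin

def step_image(size=80, sigma=5.0, seed=0):
    """Two flat regions separated by an edge, plus Gaussian noise."""
    rng = random.Random(seed)
    return [[(50.0 if c < size // 2 else 150.0) + rng.gauss(0.0, sigma)
             for c in range(size)] for r in range(size)]

estimate = estimate_noise_variance(step_image())
```

Windows that straddle the edge have large variances but are few, so they do not move the histogram peak; the estimate should land near the true variance of 25, with a small downward bias because the sample variance distribution is skewed.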

Proceedings ArticleDOI
06 Oct 2009
TL;DR: A strategy is explored to efficiently denoise multi-images or video using a complex image processing chain involving accurate registration, video equalization, noise estimation and state-of-the-art denoising methods; a key feature is that the noise model can be estimated accurately from the image burst.
Abstract: Taking photographs under low light conditions with a hand-held camera is problematic. A long exposure time can cause motion blur due to the camera shaking and a short exposure time gives a noisy image. We consider the new technical possibility offered by cameras that take image bursts. Each image of the burst is sharp but noisy. In this preliminary investigation, we explore a strategy to efficiently denoise multi-images or video. The proposed algorithm is a complex image processing chain involving accurate registration, video equalization, noise estimation and the use of state-of-the-art denoising methods. Yet, we show that this complex chain may become risk free thanks to a key feature: the noise model can be estimated accurately from the image burst. Preliminary tests will be presented. On the technical side, the method can already be used to estimate a non parametric camera noise model from any image burst.

Journal ArticleDOI
TL;DR: In the method based on maximization of k, the improvement of performance is obtained by decreasing DRCZ from 1 to the value corresponding to the minimum of the integrated absolute error (IAE), which is achieved without deteriorating robustness to the model uncertainties.
Abstract: This technical note presents a new, simple and effective, four-parameter proportional-integral-derivative (PID) optimization method. The set of adjustable parameters is defined by the proportional gain k, integral gain ki, damping ratio of the controller zeros (DRCZ), and desired value of the sensitivity to measurement noise Mn. Given Mn and desired value of the maximum sensitivity Ms, for both maximization of k and maximization of ki, only three nonlinear algebraic equations need to be solved for a few values of DRCZ. Contrary to the method based on maximization of ki, in the method based on maximization of k the improvement of performance is obtained by decreasing DRCZ from 1 to the value corresponding to the minimum of the integrated absolute error (IAE). Moreover, this is achieved without deteriorating robustness to the model uncertainties, for a large class of stable processes. Compared to the recently proposed PID optimization methods, for the same Ms and Mn, lower values of IAE and Mp are obtained by using the method presented here.

Journal ArticleDOI
TL;DR: It is proved that, in the absence of a permanent cross-layer information path, packet drop should be designed to balance information loss and communication noise in order to optimize the performance.
Abstract: It is the general assumption that in estimation and control over wireless links, the receiver should drop any erroneous packets. While this approach is appropriate for non real-time data-network applications, it can result in instability and loss of performance in networked control systems. In this technical note we consider estimation of a multiple-input multiple-output dynamical system over a mobile fading communication channel using a Kalman filter. We show that the communication protocols suitable for other already-existing applications like data networks may not be entirely applicable for estimation and control of a rapidly changing dynamical system. We then develop new design paradigms in terms of handling noisy packets for such delay-sensitive applications. We reformulate the estimation problem to include the impact of stochastic communication noise in the erroneous packets. We prove that, in the absence of a permanent cross-layer information path, packet drop should be designed to balance information loss and communication noise in order to optimize the performance.

Journal ArticleDOI
TL;DR: The adaptive noise-reduction system that includes the UNANR model can effectively eliminate random noise in ambulatory ECG recordings, leading to a higher SNR improvement than that with the same system using the popular least-mean-square (LMS) filter.

Journal ArticleDOI
01 Jan 2009
TL;DR: The properties of existing non-parametric methods for estimating the plant and noise transfer functions of a linear dynamic system are studied based on the recent insight that leakage errors in the frequency domain have a smooth nature that is completely similar to the initial transients in the time domain.
Abstract: In this paper we study the properties of existing non-parametric methods for estimating the plant and noise transfer functions of a linear dynamic system. The analysis is based on the recent insight that leakage errors in the frequency domain have a smooth nature that is completely similar to the initial transients in the time domain. This not only allows us to understand better the existing classic methods, but also opens the road to new better performing algorithms. The paper includes the output error setup, the errors-in-variables setup, and measurements under feedback conditions. Eventually, some of the methods are illustrated in the analysis of a vibrating metal beam.

Proceedings Article
01 Aug 2009
TL;DR: A nearly ideal VAD algorithm is proposed that is both easy to implement and noise robust compared to some previous methods; it uses short-term features such as Spectral Flatness and Short-term Energy.
Abstract: Voice Activity Detection (VAD) is a very important front-end processing step in all Speech and Audio processing applications. The performance of most if not all speech/audio processing methods is crucially dependent on the performance of Voice Activity Detection. An ideal voice activity detector needs to be independent from application area and noise condition and have the least parameter tuning in real applications. In this paper a nearly ideal VAD algorithm is proposed which is both easy to implement and noise robust, compared to some previous methods. The proposed method uses short-term features such as Spectral Flatness (SF) and Short-term Energy. This helps the method to be appropriate for online processing tasks. The proposed method was evaluated on several speech corpora with additive noise and is compared with some of the most recently proposed algorithms. The experiments show satisfactory performance in various noise conditions.