Showing papers on "Spectrogram published in 2001"


Journal ArticleDOI
TL;DR: A criterion that can provide a measure of time–frequency distribution concentration is proposed that does not need normalization in order to behave properly when cross-terms are present and does not discriminate low concentrated components with respect to the highly concentrated ones within the same distribution.

366 citations


Journal ArticleDOI
TL;DR: This work set out to find an objective spectral measure for discontinuity, and studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities.
Abstract: A common problem in diphone synthesis is discussed, viz., the occurrence of audible discontinuities at diphone boundaries. Informal observations show that spectral mismatch is most likely the cause of this phenomenon. We first set out to find an objective spectral measure for discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. Then, we studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels on the basis of the best performing distance measure. A listening experiment has shown that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.

283 citations


Journal ArticleDOI
TL;DR: The utility of using TFRs to quantitatively resolve changes in the frequency content of these nonstationary signals, as a function of time, is illustrated.
Abstract: The objective of this study is to establish the effectiveness of four different time-frequency representations (TFRs), namely the reassigned spectrogram, the reassigned scalogram, the smoothed Wigner–Ville distribution, and the Hilbert spectrum, by comparing their ability to resolve the dispersion relationships for Lamb waves generated and detected with optical techniques. This paper illustrates the utility of using TFRs to quantitatively resolve changes in the frequency content of these nonstationary signals, as a function of time. While each technique has certain strengths and weaknesses, the reassigned spectrogram appears to be the best choice to characterize multimode Lamb waves.

253 citations


Proceedings ArticleDOI
01 Aug 2001
TL;DR: The beat spectrum is a measure of acoustic self-similarity versus lag time, computed from a representation of spectral similarity, which has a variety of applications, including music retrieval by similarity and automatically generating music videos.
Abstract: We introduce the beat spectrum, a new method of automatically characterizing the rhythm and tempo of music and audio. The beat spectrum is a measure of acoustic self-similarity as a function of time lag. Highly structured or repetitive music will have strong beat spectrum peaks at the repetition times. This reveals both tempo and the relative strength of particular beats, and therefore can distinguish between different kinds of rhythms at the same tempo. We also introduce the beat spectrogram which graphically illustrates rhythm variation over time. Unlike previous approaches to tempo analysis, the beat spectrum does not depend on particular attributes such as energy or frequency, and thus will work for any music or audio in any genre. We present tempo estimation results which are accurate to within 1% for a variety of musical genres. This approach has a variety of applications, including music retrieval by similarity and automatically generating music videos. Anyone who has ever tapped a foot in time to music has performed rhythm analysis. Though simple for humans, this task is considerably more difficult to automate. We introduce a new measure of tempo analysis called the beat spectrum. This is a measure of acoustic self-similarity versus lag time, computed from a representation of spectral similarity. Peaks in the beat spectrum correspond to major rhythmic components of the source audio. The repetition time of each component can be determined by the lag time of the corresponding peak, while the relative amplitudes of different peaks reflect the strengths of their corresponding rhythmic components. We also present the beat spectrogram which graphically illustrates rhythmic variation over time. The beat spectrogram is an image formed from the beat spectrum over successive windows. Strong rhythmic components are visible as bright bars in the beat spectrogram, making changes in tempo or time signature visible.
In addition, a measure of audio novelty can be computed that measures how novel the source audio is at any time [2]. Instances when this measure is large correspond to significant audio changes. Periodic peaks correspond to rhythmic periodicity in the music. In the final section, we present various applications of the beat spectrum, including music retrieval by rhythmic similarity, an “automatic DJ” that can smoothly sequence music with similar tempos and automatic music video generation.
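The idea of self-similarity versus lag can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function name `beat_spectrum`, the use of cosine similarity between frames, and the averaging along each diagonal are choices made here for clarity.

```python
import numpy as np

def beat_spectrum(features, max_lag):
    """Average frame-to-frame similarity as a function of lag.

    features: (n_frames, n_dims) array of per-frame spectral features
    (hypothetical input; any spectral representation could be used).
    """
    # Cosine similarity between every pair of frames (assumed measure).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    n = sim.shape[0]
    # The mean along the lag-th superdiagonal is the similarity at that lag.
    return np.array([np.trace(sim, offset=lag) / (n - lag)
                     for lag in range(max_lag)])
```

For a feature sequence that repeats every 8 frames, the result peaks at lags 0, 8, 16 and so on, so the first non-zero peak gives the repetition time in frames.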

212 citations


Journal ArticleDOI
TL;DR: A new kernel for the design of a high resolution time-frequency distribution (TFD) is introduced and it is shown that this distribution can solve problems that the Wigner-Ville distribution (WVD) or the spectrogram cannot.
Abstract: The paper introduces a new kernel for the design of a high resolution time-frequency distribution (TFD). We show that this distribution can solve problems that the Wigner-Ville distribution (WVD) or the spectrogram cannot. In particular, the proposed distribution can resolve two close signals in the time-frequency domain that the two other distributions cannot. Moreover, we show that the proposed distribution is more accurate than the WVD and the spectrogram in the estimation of the instantaneous frequency of a stepped FM signal embedded in additive Gaussian noise. Synthetic and real data collected from real-world applications are shown to validate the proposed distribution.

150 citations


Patent
19 Jun 2001
TL;DR: A method and apparatus for identifying authorized users by comparing speech features retrieved from a spectrogram of the user's speech with a reference template, computing a distance score from the pattern match, and comparing that score with a threshold.
Abstract: A method for identifying authorized users and the apparatus of the same, which identifies users by comparison with specific spectrograms of authorized users. The method comprises the steps of: (i) detecting the end point of a verbalized sample from the user requesting access; (ii) retrieving speech features from a spectrogram of the speech; (iii) determining whether training is necessary, and if so, taking the speech features as a reference template, setting a threshold and going back to (i), otherwise going on to next step; (iv) matching patterns of the speech features and the reference template; (v) computing a distance between the speech features and the reference template according to the matching result of (iv) to obtain a distance scoring; (vi) comparing the distance scoring with the threshold; (vii) determining whether the user is authorized according to the compared result of (vi).

134 citations


Journal ArticleDOI
TL;DR: This work proposes smoothing regular quadratic TFRs to retain only that information that is essential for classification, and calls the resulting quadratic TFRs class-dependent TFRs.
Abstract: In many pattern recognition applications, features are traditionally extracted from standard time-frequency representations (TFRs). This assumes that the implicit smoothing of, say, a spectrogram is appropriate for the classification task. Making such assumptions may degrade classification performance. In general, any time-frequency classification technique that uses a singular quadratic TFR (e.g., the spectrogram) as a source of features will never surpass the performance of the same technique using a regular quadratic TFR (e.g., Rihaczek or Wigner-Ville). Any TFR that is not regular is said to be singular. Use of a singular quadratic TFR implicitly discards information without explicitly determining if it is germane to the classification task. We propose smoothing regular quadratic TFRs to retain only that information that is essential for classification. We call the resulting quadratic TFRs class-dependent TFRs. This approach makes no a priori assumptions about the amount and type of time-frequency smoothing required for classification. The performance of our approach is demonstrated on simulated and real data. The simulated study indicates that the performance can approach the Bayes optimal classifier. The real-world pilot studies involved helicopter fault diagnosis and radar transmitter identification.

91 citations


Journal ArticleDOI
TL;DR: This paper derives two large families of estimators for this spectrum: one based on a diagonal-Toeplitz-diagonal (dTd) factorization of smoothing kernels, which produces noncoherent averages of the time-varying spectrogram, and the other based on a diagonal-Hankel-diagonal (dHd) factorization, which produces coherent averages.
Abstract: For a nonstationary random process, the dual-time correlation function and the dual frequency Loeve spectrum are complete theoretical descriptions of second-order behavior. That is, each may be used to synthesize the random process itself, according to the Cramer-Loeve spectral representation. When suitably transformed on one of its two variables, each of these descriptions produces a time-varying spectrum. This spectrum is, in fact, the expected value of the Rihaczek distribution. In this paper, we derive two large families of estimators for this spectrum: one based on a diagonal-Toeplitz-diagonal (dTd) factorization of smoothing kernels and the other based on a diagonal-Hankel-diagonal (dHd) factorization. The dTd factorization produces noncoherent averages of the time-varying spectrogram, and the dHd factorization produces coherent averages. Some of the dTd estimators may be called time-varying power spectrum estimators, and some of the dHd estimators may be called time-varying Wigner-Ville (WV) estimators. The former may always be implemented as multiwindow spectrum estimators, and in some cases, they are true time variations on the Blackman-Tukey-Rosenblatt-Grenander (BTGR) spectrogram. The latter are variations on the Stankovic class of WV estimators.

65 citations


01 Jan 2001
TL;DR: Attempts to classify music into three broad categories (rock, classical and jazz) are discussed, including the particular choice of features used: spectrograms and mel-scaled cepstral coefficients (MFCC).
Abstract: With the huge increase in the availability of digital music, it has become more important to automate the task of querying a database of musical pieces. At the same time, a computational solution of this task might give us an insight into how humans perceive and classify music. In this paper, we discuss our attempts to classify music into three broad categories: rock, classical and jazz. We discuss the feature extraction process and the particular choice of features that we used: spectrograms and mel-scaled cepstral coefficients (MFCC). We use the texture-of-texture models to generate feature vectors out of these. Together, these features are capable of capturing the frequency-power profile of the sound as the song proceeds. Finally, we attempt to classify the generated data using a variety of classifiers. We discuss our results and the inferences that can be drawn from them.

63 citations


Journal ArticleDOI
TL;DR: A multiwindow method for generating a time-varying spectrum of nonstationary signals is presented and examples are provided, with performance criteria measures, to demonstrate and quantify the effectiveness of the method.
Abstract: A multiwindow method for generating a time-varying spectrum of nonstationary signals is presented. The time-varying spectrum is computed from an optimally weighted average of multiple orthogonal windowed spectrograms. The weights are determined using linear least squares estimation with respect to a reference time-frequency distribution. Examples are provided, with performance criteria measures, to demonstrate and quantify the effectiveness of the method.
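A minimal sketch of the weighted multiwindow idea follows. It is not the paper's implementation: the sine tapers are an assumed choice of orthogonal windows, and the function names, window length, hop size, and the caller-supplied `reference` array are all hypothetical.

```python
import numpy as np

def sine_tapers(length, count):
    """Orthonormal sine tapers (an assumed choice of orthogonal windows)."""
    n = np.arange(1, length + 1)
    k = np.arange(1, count + 1)[:, None]
    return np.sqrt(2.0 / (length + 1)) * np.sin(np.pi * k * n / (length + 1))

def spectrogram(signal, window, hop):
    """Squared-magnitude STFT, shape (n_frames, n_bins)."""
    length = len(window)
    starts = range(0, len(signal) - length + 1, hop)
    frames = np.array([signal[s:s + length] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def multiwindow_spectrum(signal, reference, length=64, count=4, hop=16):
    """Least-squares weighted sum of orthogonal-window spectrograms.

    reference: a time-frequency distribution with the same shape as
    spectrogram(signal, window, hop); how it is obtained is left open here.
    """
    specs = [spectrogram(signal, w, hop) for w in sine_tapers(length, count)]
    # One column per window; solve for the weights that best fit the reference.
    design = np.stack([s.ravel() for s in specs], axis=1)
    weights, *_ = np.linalg.lstsq(design, reference.ravel(), rcond=None)
    combined = sum(w * s for w, s in zip(weights, specs))
    return combined, weights
```

If the reference happens to be an exact linear combination of the individual spectrograms, the least-squares step recovers those combination weights.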

54 citations


Patent
27 Jul 2001
TL;DR: An encoder for encoding a stegotext and a decoder for decoding the encoded stegotext are described, the stegotext being generated by modulating the log power spectrogram of a covertext signal with at least one key, the or each key having been added or subtracted in the log domain to the covertext power spectrogram.
Abstract: The invention comprises an encoder for encoding a stegotext and a decoder for decoding the encoded stegotext, the stegotext being generated by modulating the log power spectrogram of a covertext signal with at least one key, the or each key having been added or subtracted in the log domain to the covertext power spectrogram in accordance with the data of the watermark code with which the stegotext was generated, and the modulated power spectrogram having been returned into the original domain of the covertext. The decoder carries out Fast Fourier Transformation and rectangular-to-polar conversion of the stegotext signal so as to transform the stegotext signal into the log power spectrogram domain; subtracts in the log power domain positive and negative multiples of the key or keys from blocks of the log power spectrogram and evaluates the probability of the results of such subtractions representing an unmodified block of covertext in accordance with a predetermined statistical model.

Journal ArticleDOI
TL;DR: In this paper, exact expressions for the variance and bias of the instantaneous frequency (IF) estimate using a spectrogram are derived and simple approximative formulae are provided and theoretical results are statistically confirmed.
Abstract: Exact expressions for the variance and bias of the instantaneous frequency (IF) estimate using a spectrogram are derived. Simple approximative formulae are provided and theoretical results are statistically confirmed.
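For context, a common spectrogram-based IF estimator takes the frequency of the spectrogram maximum in each time frame. The sketch below assumes that estimator, since the abstract does not spell it out; the function name, Hann window, and frame parameters are hypothetical.

```python
import numpy as np

def spectrogram_if_estimate(signal, fs, win_len=128, hop=32):
    """Estimate instantaneous frequency as the spectrogram peak per frame."""
    window = np.hanning(win_len)
    starts = np.arange(0, len(signal) - win_len + 1, hop)
    frames = np.array([signal[s:s + win_len] * window for s in starts])
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    # Each estimate is assigned to the centre of its analysis window.
    times = (starts + (win_len - 1) / 2) / fs
    return times, freqs[np.argmax(spec, axis=1)]
```

The estimate is quantized to the FFT bin spacing `fs / win_len`, so shorter windows track fast frequency changes better but resolve frequency more coarsely.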

Journal ArticleDOI
TL;DR: This work builds on Cohen's work on instantaneous bandwidth and frequency by extending it to a multiwindow framework for polynomial phase signals, and develops a method utilizing this new multiwindow time-varying spectral technique for estimating the instantaneous frequency of a signal.
Abstract: We build on Cohen's work (Cohen and Lee 1988, 1989; Cohen 1990, 1995) on instantaneous bandwidth and frequency by extending it to a multiwindow framework for polynomial phase signals. Unlike the case with a single spectrogram, which Cohen considered, our multiwindow framework allows one to obtain a time-varying spectral estimate that simultaneously satisfies instantaneous bandwidth and frequency constraints. We then develop a method utilizing this new multiwindow time-varying spectral technique for estimating the instantaneous frequency of a signal. The method is computationally simple, asymptotically unbiased for noise-free signals, and provides a signal-to-noise ratio (SNR) improvement of more than 3 dB over other estimators, including the cross-polynomial Wigner distribution method, for quadratic and cubic FM signals.

Journal ArticleDOI
TL;DR: In this paper, the period and length of a fiber grating structure can be reconstructed from its corresponding complex reflection coefficient using time-frequency signal analysis based on Wigner-Ville and spectrogram distributions.
Abstract: The period and length of a fiber grating structure can be reconstructed from its corresponding complex reflection coefficient using time-frequency signal analysis based on Wigner-Ville and spectrogram distributions. We provide an experimental demonstration of this synthesis technique on two fiber grating structures and obtain good agreement between the reconstructed values and those expected based on the parameters used in their fabrication. We then propose and numerically demonstrate how this technique can be applied to distributed strain (or temperature) sensing.

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper identifies a hop free subset of data by discarding high-entropy spectral slices from the spectrogram, then performs low-rank decomposition of four-way data generated by capitalizing on both spatial and temporal shift invariance for high resolution direction of arrival (DOA) recovery.
Abstract: This paper considers the problem of blind localization and tracking of multiple frequency-hopped spread-spectrum (FHSS) signals using an antenna array, without knowledge of hopping patterns. We first identify a hop free subset of data by discarding high-entropy spectral slices from the spectrogram, then perform low-rank decomposition of four-way data generated by capitalizing on both spatial and temporal shift invariance for high resolution direction of arrival (DOA) recovery. After MMSE beamforming, a dynamic programming approach is developed for joint ML estimation of signal frequencies and hopping instants for signal user tracking.
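The slice-selection step can be illustrated with a short sketch. It assumes Shannon entropy of each normalized time slice and a quantile threshold; both are illustrative choices rather than details taken from the paper, and the function names are hypothetical.

```python
import numpy as np

def slice_entropy(spec):
    """Shannon entropy of each time slice of a spectrogram.

    spec: (n_frames, n_bins) array of nonnegative values.
    """
    p = spec / np.maximum(spec.sum(axis=1, keepdims=True), 1e-12)
    return -np.sum(p * np.log(p + 1e-12), axis=1)

def hop_free_mask(spec, quantile=0.8):
    """True for slices whose entropy is at or below the chosen quantile."""
    h = slice_entropy(spec)
    return h <= np.quantile(h, quantile)
```

A slice that straddles a hop spreads its energy over more frequency bins than a slice with a single dwell frequency, so its entropy is higher and it is dropped by the mask.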

Journal Article
TL;DR: This paper proposes methods that apply 3-D microphone arrays, directional analysis of measured room responses, and visualization of data, yielding useful information about the time-frequency-direction properties of the responses.
Abstract: Room impulse responses are inherently multidimensional, including components in three coordinate directions, each one further being described as a time-frequency representation. Such 5-dimensional data is difficult to visualize and interpret. We propose methods that apply 3-D microphone arrays, directional analysis of measured room responses, and visualization of data, yielding useful information about the time-frequency-direction properties of the responses. The applicability of the methods is demonstrated with three different cases of real measurements. INTRODUCTION A room impulse response, measured from a source to a receiver position, is inherently multidimensional. Traditionally, the evolution of an omnidirectional sound pressure response in a single point has been studied as a function of time and frequency. However, dividing the response further into directional components can reveal much more information about the actual propagation of sound in the room, as well as about its perceptual aspects. In this paper we propose methods that are based on 3-D microphone arrays, directional analysis of the measured responses, and visualization of such data in a way that yields maximal information about the time-frequency-direction properties of the response. MERIMAA ET AL. Measurement, Analysis, and Visualization of Directional Room Responses The measurement of directional room responses is made with a special 3-D microphone probe which basically consists of two intensity probes in each x-, y-, and z-coordinate directions and is constructed of small electret capsules. The responses are analyzed either with a uniform or an auditorily motivated time-frequency resolution. The analysis results in a significant amount of 5-dimensional data that is hard to visualize and interpret.
Based on measured x/y/z-intensity components, intensity vectors (magnitude and direction) can be plotted in a spectrogram-like map, one vector for each time-frequency bin, illustrating the directional evolution of the field in time and frequency. Additionally, a pressure-related time-frequency spectrogram can be overlaid with the vectors, in gray levels or colors, illustrating for example a perceptually motivated spectrogram with no directional information. One such map can be used to illustrate the horizontal information and another one can be added for the elevation information. This technique is a part of a Matlab visualization toolbox for directional room responses developed by the authors, and it includes several other possibilities to analyze and represent room acoustical data. Traditional parameters and presentations are also available, some of them in 3-D versions, such as energy-time plots in desired directions. The paper starts with a discussion on measurements of directional room responses and sound intensity. This is followed by descriptions of the visualization method and the auditorily motivated time-frequency analysis. Finally, the applicability of the methods is demonstrated with three different cases of real measurements. DIRECTIONAL SOUND PRESSURE COMPONENTS Existing literature on room acoustics discusses mainly omnidirectional measurements with the exception of some special directional parameters. Directional room responses can be measured with either directional microphones or arrays of microphones. However, an array of omnidirectional microphones has some distinct advantages compared to directional microphones. Omnidirectional capsules can be made smaller and they usually behave more like ideal transducers. Further, if the omnidirectional signals are stored at the measurement time, it is possible to afterwards create varying directivity patterns based on a single measurement. 
Typical directivity patterns can be formed with an array of two or more closely spaced omnidirectional microphones and some equalization to compensate for the resulting non-flat magnitude response. For example the difference of two microphone signals gives a dipole pattern and adding an appropriate delay to one of the signals changes the pattern to a cardioid. Okubo et al. [1] have also proposed a method that uses a product of cardioid and dipole signals to achieve a directivity pattern more suitable for some directional room acoustics measurements. Various directional sound pressure responses can be used to plot traditional impulse responses, energy-time-curves or spectrograms that give information about the directional properties of the room responses. With larger microphone arrays it is also possible to form directivity patterns with very narrow beams and thus good spatial resolution. However, groups of similar plots for several different directions are not very visual or easy to interpret. Sound intensity as a vector quantity can solve some of the visualization problems in the method we are proposing in this paper. SOUND INTENSITY Sound intensity [2] describes the propagation of energy in a sound field. Instantaneous intensity vector is defined as the product of instantaneous sound pressure p(t) and particle velocity u(t): $I(t) = p(t)\,u(t)$ (1). Based on the linearized fluid momentum equation, particle velocity in the direction n can be written in the form

Journal ArticleDOI
TL;DR: It is shown that the applied dynamic neural model of the ECT sensor offers very high speed of operation and guarantees reliability of the recognition results.
Abstract: In this paper a Multi-Frequency Excitation and Spectrogram Eddy Current System and an inverse neural model were used to detect and identify natural flaws in steam generator tubes. It is shown that the applied dynamic neural model of the ECT sensor offers very high speed of operation and guarantees reliability of the recognition results.

Proceedings ArticleDOI
21 Oct 2001
TL;DR: The main benefits of the proposed reassignment stage are that it yields an improved time-frequency localisation estimate relative to standard methods, and that it produces a measure of the variance of these estimates to be used as an aid in later processing.
Abstract: The reassignment method for the short-time Fourier transform is proposed as a technique for improving the time and frequency estimates of musical audio data. Based on this representation, four classes of expected objects (sinusoid, unresolved sinusoid, transient and noise) are proposed and explained. Pattern classification methods are then used to extract objects conforming to these classes from individual frames of the reassigned spectrogram, with each frame being examined independently. Results for several simple real-world examples are presented, showing the capability of this method even without the aid of tracking from frame to frame. The main benefits of the proposed reassignment stage are that it yields an improved time-frequency localisation estimate relative to standard methods, and that it produces a measure of the variance of these estimates to be used as an aid in later processing.

Proceedings ArticleDOI
07 May 2001
TL;DR: This work proposes a sequential design of two dimensional discriminants (CLDs) and shows that these CLDs are similar to the first few JLDs and that the discriminant features derived from the CLDs outperform those obtained from JLDs in the continuous-digit recognition task.
Abstract: We study the information in the joint time-frequency domain using a 1515-dimensional block of the spectrogram (15 spectral energies over a temporal span of 1 s) as features. In this feature space, we first derive 20 joint linear discriminants (JLDs) using linear discriminant analysis (LDA). Using principal component analysis (PCA), we conclude that information in this block of the spectrogram can be analyzed independently across the time and frequency domains. Under this assumption, we propose a sequential design of two dimensional discriminants (CLDs), i.e., spectral discriminants followed by temporal discriminants. We show that these CLDs are similar to the first few JLDs and that the discriminant features derived from the CLDs outperform those obtained from JLDs in the continuous-digit recognition task.

Journal ArticleDOI
TL;DR: Several TFR methods can be used to measure the magnitude of the turbulence fluctuations and different parameters must be used for each method to minimize the velocity variance of the estimator, to optimize the detection of the turbulent frequency fluctuations, and to estimate the Kolmogorov spectrum.
Abstract: The current processing performed by commercial instruments to obtain the time-frequency representation (TFR) of pulsed-wave Doppler signals may not be adequate to characterize turbulent flow motions. The assessment of the intensity of turbulence is of high clinical importance and measuring high-frequency (small-scale) flow motions, using Doppler ultrasound (US), is a difficult problem that has been studied very little. The objective was to optimize the performance of the spectrogram (SPEC), autoregressive modeling (AR), Choi-Williams distribution (CWD), Choi-Williams reduced interference distribution (CW-RID), Bessel distribution (BD), and matching pursuit method (MP) for mean velocity waveform estimation and turbulence detection. The intensity of turbulence was measured from the fluctuations of the Doppler mean velocity obtained from a simulation model under pulsatile flow. The Kolmogorov spectrum, which is used to determine the frequency of the fluctuations and, thus, the scale of the turbulent motions, was also computed for each method. The best set of parameters for each TFR method was determined by minimizing the error of the absolute frequency fluctuations and Kolmogorov spectral bandwidth measured from the simulated and computed Doppler spectra. The results showed that different parameters must be used for each method to minimize the velocity variance of the estimator, to optimize the detection of the turbulent frequency fluctuations, and to estimate the Kolmogorov spectrum. To minimize the variance and to measure the absolute turbulent frequency fluctuations, four methods provided similar results: SPEC (10-ms sine-cosine windows), AR (10-ms rectangular windows, model order = 8), CWD (w(N) and w(M) = 10-ms rectangular windows, sigma = 0.01), and BD (w(N) = 10-ms rectangular windows, alpha = 16). 
The velocity variance in the absence of turbulence was on the order of 0.04 m/s (coefficient of variation ranging from 8.0% to 14.5%, depending on the method). With these spectral techniques, the peak of the turbulence intensity was adequately estimated (velocity bias < 0.01 m/s). To track the frequency of turbulence, the best method was BD (w(N) = 2-ms rectangular windows, alpha = 2). The bias in the estimate of the -10 dB bandwidth of the Kolmogorov spectrum was 354 +/- 51 Hz in the absence of turbulence (the true bandwidth should be 0 Hz), and -193 +/- 371 Hz with turbulence (the simulated -10-dB bandwidth was estimated at 1256 Hz instead of 1449 Hz). In conclusion, several TFR methods can be used to measure the magnitude of the turbulent fluctuations. To track eddies ranging from large vortex to small turbulent fluctuations (wide Kolmogorov spectrum), the Bessel distribution with appropriate set of parameters is recommended.

Journal ArticleDOI
TL;DR: In this article, higher harmonics in the acoustic spectrogram of an unmanned air vehicle in flight, as obtained from a ground-based microphone measurement, are shown to be useful for estimating the vehicle's altitude, speed and true engine revolutions per minute.
Abstract: Higher harmonics in the acoustic spectrogram of an unmanned air vehicle in flight, as obtained from a ground-based microphone measurement, are shown to be useful for estimating the vehicle's altitude, speed and true engine revolutions per minute. Specifically, the Doppler-shifted frequency time histories derived from spectrogram contours are used in this estimation approach, which is based on a least mean square error fit to an instantaneous frequency model proposed in the literature. Benefits of employing higher harmonics, rather than the fundamental only, in the computations are brought out. The results obtained are satisfactory. A possible system configuration for automatic detection and localisation of similar aircraft is briefly discussed here.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: This paper presents a procedure that allows the analyst to select a time-frequency distribution (TFD) automatically: using the resolution performance measure for TFDs, it optimises all distributions considered and selects the one that gives the best concentration of signal components around their instantaneous frequency laws.
Abstract: Selecting a time-frequency distribution (TFD) which represents a signal in an optimal way is commonly done by visually comparing plots of different TFDs. This paper presents a procedure that allows the analyst to perform the same task in an automatic way. Using the resolution performance measure for TFDs, the procedure optimises all distributions considered and selects the one which results in the best concentration of signal components around their instantaneous frequency laws, as well as the best suppression of the interference terms in the time-frequency plane. To do this requires us to define a methodology to measure the time-frequency characteristics of a signal from its optimal TFD. An algorithm which implements this methodology is described and results are presented.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: The time-frequency (TF) distributions robust with respect to the heavy-tailed impulse noise are introduced and the robust spectrogram and robust Wigner distribution are considered.
Abstract: The time-frequency (TF) distributions robust with respect to the heavy-tailed impulse noise are introduced. The robust spectrogram (SPEC) and robust Wigner distribution (WD) are considered. The calculation procedure and properties of these representations are given.

Journal ArticleDOI
TL;DR: In this paper, the estimation of motion parameters of moving objects by using variable μ-propagation and time-frequency representations is proposed, where the spectrogram and the Wigner distribution, two basic time-frequency distributions, are used.
Abstract: The estimation of motion parameters of moving objects by using variable μ-propagation and time-frequency representations is proposed. The spectrogram and the Wigner distribution, two basic time-frequency distributions, are used. Both the velocity and the initial position can be accurately estimated by this approach.

Journal ArticleDOI
TL;DR: A new technique for the measurement of the velocity of individual solid particles moving in fluid flows is proposed, which relies on the ability to resolve in time the Doppler shift of the sound scattered by the continuously insonified particle.
Abstract: It is known that ultrasound techniques yield non-intrusive measurements of hydrodynamic flows. For example, the study of the echoes produced by a large number of particles insonified by pulsed wavetrains has led to a now standard velocimetry technique. In this paper, we propose to extend the method to the continuous tracking of one single particle embedded in a complex flow. This gives a Lagrangian measurement of the fluid motion, which is of importance in mixing and turbulence studies. The method relies on the ability to resolve in time the Doppler shift of the sound scattered by the continuously insonified particle. For this signal processing problem two classes of approaches are used: time-frequency analysis and parametric high resolution methods. In the first class we consider the spectrogram and reassigned spectrogram, and we apply them to detect the motion of a small bead settling in a fluid at rest. In more non-stationary turbulent flows, where methods in the second class are more robust, we have adapted an Approximated Maximum Likelihood technique coupled with a generalized Kalman filter.

01 Jan 2001
TL;DR: It is shown that although the classifier-based compensation methods are superior when recognition is performed with spectrographic features, feature-based compensation methods provide better recognition performance overall, since cepstra derived from the reconstructed spectrogram can now be used for recognition.
Abstract: Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing unreliable noise-corrupted components of a spectrographic representation of the noisy speech and performing recognition with the remaining reliable components. Conventional classifier compensation methods modify the recognition system to work with the incomplete representation so obtained. This constrains them to perform recognition using spectrographic features, which are known to be suboptimal to cepstra. In previous work we have proposed an alternative feature-compensation approach whereby the unreliable components are replaced by estimates derived from the reliable components and the known statistics of clean speech. In this paper we perform a detailed comparison of various aspects of classifier-based and feature-based compensation methods. We show that although the classifier-based compensation methods are superior when recognition is performed with spectrographic features, feature-based compensation methods provide better recognition performance overall, since cepstra derived from the reconstructed spectrogram can now be used for recognition. In addition, they have the added advantages of being computationally less expensive and not requiring modification of the recognizer.

Journal ArticleDOI
TL;DR: Peak matched multiple windows (PM MW) were used to estimate the spectrogram of the electroencephalogram (EEG) and were shown to give estimates with good resolution and low variance.
Abstract: Peak matched multiple windows (PM MW) were used to estimate the spectrogram of the electroencephalogram (EEG). The authors focussed on the ability to estimate frequency changes, and especially on resolving close peaks. A peak of known frequency was evoked in the EEG in a predetermined time interval. The PM MW spectrogram was compared to the commonly used single Hanning window and to weighted overlapped segment averaging, both in simulations and on real data. The PM MW were shown to give estimates with good resolution and low variance.
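
For orientation, the single Hanning window baseline mentioned in the abstract can be sketched with NumPy as below. This is a minimal sketch of the baseline only, not of the PM MW method. The test signal (a tone that switches from 32 Hz to 64 Hz halfway through), the sampling rate, the frame length and the name `hann_spectrogram` are all assumptions chosen for illustration.

```python
import numpy as np


def hann_spectrogram(x, fs, frame_len=64, hop=64):
    """Power spectrogram from Hann-windowed frames: returns (freqs, power[frame, bin])."""
    window = np.hanning(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.array([x[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return freqs, power


# Hypothetical signal: the tone frequency changes at a known instant.
fs = 256.0
n = np.arange(512)
tone_hz = np.where(n < 256, 32.0, 64.0)
signal = np.sin(2 * np.pi * tone_hz * n / fs)

freqs, power = hann_spectrogram(signal, fs)
peak_hz = freqs[np.argmax(power, axis=1)]  # dominant frequency in each frame
```

With these settings the bin spacing is 4 Hz, so two peaks closer than a few bins would merge in this baseline. That is the resolution limit against which multiple-window estimates are compared.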

Proceedings ArticleDOI
07 May 2001
TL;DR: An RSPEC-based instantaneous frequency (IF) estimator with a time-varying window length is presented, and simulations show that the adaptive algorithm is accurate and robust to rare high-magnitude noise values.
Abstract: The robust M-periodogram is defined for the analysis of signals with heavy-tailed distribution noise. In the form of a robust spectrogram (RSPEC) it can be used for the analysis of nonstationary signals. An RSPEC-based instantaneous frequency (IF) estimator, with a time-varying window length, is presented. The optimal choice of the window length can resolve the bias-variance tradeoff in the RSPEC-based IF estimation. However, it depends on the unknown nonlinearity of the IF. The algorithm used is able to provide accuracy close to the one that could be achieved if the IF to be estimated were known in advance. Simulations show that the adaptive algorithm is accurate and robust with respect to rare high-magnitude noise values.

Journal ArticleDOI
TL;DR: In this paper, a new technique for studying the Doppler effect quantitatively is presented, which is based on a tape recorder and software that generates spectrograms from digital sound files.
Abstract: This experiment illustrates a new technique for studying the Doppler effect quantitatively. All that is needed is a tape recorder and software that generates spectrograms from digital sound files.
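
The abstract does not describe the analysis itself. One plausible reading, given here as an assumption, is that the approach and recede frequencies are read off the spectrogram and inserted into the textbook relations for a source moving along the line to a stationary observer. The function names and the 343 m/s speed of sound are illustrative choices.

```python
def source_speed(f_approach, f_recede, c=343.0):
    """Source speed (m/s) from the frequencies heard while approaching and receding.

    Uses f_approach = f0 * c / (c - v) and f_recede = f0 * c / (c + v),
    which give v = c * (f_approach - f_recede) / (f_approach + f_recede).
    """
    return c * (f_approach - f_recede) / (f_approach + f_recede)


def emitted_frequency(f_approach, f_recede):
    """Emitted frequency f0 (Hz): the harmonic mean of the two observed frequencies."""
    return 2.0 * f_approach * f_recede / (f_approach + f_recede)
```

Neither result depends on knowing the emitted frequency in advance, which is convenient when only a recording of the pass is available.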

Journal ArticleDOI
TL;DR: Numerical results on noisy spoken words indicated that the transformed spectral pattern of the spoken words was insensitive to noise for SNR ranging from 0 to 20 dB (decibel), and spectral distances between noisy words and original words decreased after the transformation.