Showing papers on "Spectrogram" published in 1997


Patent
TL;DR: In this paper, an acoustic signature recognition and identification system receives signals from a sensor placed on a designated piece of equipment, and the acoustic data is digitized and processed, via a Fast Fourier Transform routine, to create a spectrogram image of frequency versus time.
Abstract: An acoustic signature recognition and identification system receives signals from a sensor placed on a designated piece of equipment. The acoustic data is digitized and processed, via a Fast Fourier Transform routine, to create a spectrogram image of frequency versus time. The spectrogram image is then normalized to permit acoustic pattern recognition regardless of the surrounding environment or magnitude of the acoustic signal. A feature extractor then detects, tracks and characterizes the lines which form the spectrogram. Specifically, the lines are detected via a KY process that is applied to each pixel in the line. A blob coloring process then groups spatially connected pixels into a single signal object. The harmonic content of the lines is then determined and compared with stored templates of known acoustic signatures to ascertain the type of machinery. An alert is then generated in response to the recognized and identified machinery.
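The blob-coloring step in the abstract above groups spatially connected pixels into one signal object. A minimal sketch of that idea follows. It assumes the spectrogram has already been normalized and thresholded into a binary image and that pixels are 8-connected; the abstract specifies neither, and the function name `blob_color` is hypothetical. The line-detection step is not reproduced here.

```python
from collections import deque

import numpy as np


def blob_color(binary):
    """Label 8-connected groups of True pixels; label 0 means background."""
    binary = np.asarray(binary, dtype=bool)
    labels = np.zeros(binary.shape, dtype=int)
    rows, cols = binary.shape
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c] or labels[r, c]:
                continue
            # Start a new blob and flood-fill it breadth-first.
            next_label += 1
            labels[r, c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny, nx] and not labels[ny, nx]):
                            labels[ny, nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label


# Toy binary "spectrogram": rows are frequency bins, columns are time frames.
image = np.array([
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
], dtype=bool)
labels, count = blob_color(image)
```

Each labeled blob could then be treated as one candidate spectral line for the harmonic-content comparison the abstract describes.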

399 citations


Proceedings ArticleDOI
21 Apr 1997
TL;DR: This work proposes a new representational format, the modulation spectrogram, that discards much of the spectro-temporal detail in the speech signal and instead focuses on the underlying, stable structure incorporated in the low-frequency portion of the modulation spectrum distributed across critical-band-like channels.
Abstract: Understanding the human ability to reliably process and decode speech across a wide range of acoustic conditions and speaker characteristics is a fundamental challenge for current theories of speech perception. Conventional speech representations such as the sound spectrogram emphasize many spectro-temporal details that are not directly germane to the linguistic information encoded in the speech signal and which consequently do not display the perceptual stability characteristic of human listeners. We propose a new representational format, the modulation spectrogram, that discards much of the spectro-temporal detail in the speech signal and instead focuses on the underlying, stable structure incorporated in the low-frequency portion of the modulation spectrum distributed across critical-band-like channels. We describe the representation and illustrate its stability with color-mapped displays and with results from automatic speech recognition experiments.

211 citations


Journal ArticleDOI
TL;DR: In this article, a phase-retrieval algorithm that retrieves both the probe and the gate pulses independently by converting the frequency-resolved optical gating (FROG) phase retrieval problem to an eigenvector problem is presented.
Abstract: Frequency-resolved optical gating (FROG) is a technique that produces a spectrogram of an ultrashort laser pulse. The intensity and phase of the ultrashort laser pulse can be determined through solving for the phase of the spectrogram with an iterative, phase-retrieval algorithm. This work presents a new phase-retrieval algorithm that retrieves both the probe and the gate pulses independently by converting the FROG phase-retrieval problem to an eigenvector problem. The new algorithm is robust and general. It is tested theoretically by use of synthetic data sets and experimentally by use of single-shot, polarization-gate FROG. We independently and simultaneously characterize the electric field amplitude and phase of a pulse (probe) that was passed through 200 mm of BK7 glass and the amplitude of an unchanged pulse (gate) from an amplified Ti:sapphire laser. When the effect of the 200 mm of BK7 glass was removed mathematically from the probe, there was good agreement between the measured gate and the calculated, prechirped probe.

107 citations


Journal ArticleDOI
TL;DR: A modified version of the dynamic spectrogram is developed that permits complete characterization of an ultrashort optical pulse with fewer approximations than do other schemes and yields an intuitive spectrogram with no inherent symmetry constraints or ambiguity.
Abstract: We develop a modified version of the dynamic spectrogram that permits complete characterization of an ultrashort optical pulse with fewer approximations than do other schemes. The experimental procedure for such a measurement uses a second-order optical nonlinearity and yet yields an intuitive spectrogram with no inherent symmetry constraints or ambiguity. We describe and demonstrate pulse-shape reconstruction from this type of spectrogram. Phase retrieval is performed by means of an iterative algorithm.

89 citations


Journal ArticleDOI
TL;DR: Methods for the automatic recognition of low‐frequency sounds of baleen whales are presented and spectrogram correlation is implemented and found effective at detection of blue whale vocalizations in the presence of interfering sounds.
Abstract: Methods for the automatic recognition of low‐frequency sounds of baleen whales are presented. Matched filtering is implemented with a synthetic filter kernel derived from measurements of whale sounds, and this method is found effective at detecting blue whale (Balaenoptera musculus) sounds in white background noise. Spectrogram correlation is implemented and found effective at detection of blue whale vocalizations in the presence of interfering sounds. Spectrogram correlation employs image correlation using spectrograms and carefully designed image kernels to give good performance in the presence of noise. The two methods are briefly compared. The effectiveness of the spectrogram correlator is also demonstrated with finback whale (B. physalus) sounds. A supplementary method for detecting regular, repetitive sequences of sounds is applied to minke whale (B. acutorostrata) pulse trains, and found to improve detection in conditions of high ambient noise.
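Spectrogram correlation, as described in the abstract above, slides an image kernel along the time axis of a spectrogram and scores the match at each offset. The sketch below illustrates the idea under loudly stated assumptions: the kernel is simply the spectrogram of a clean template with its mean removed (the paper's carefully designed kernels are not reproduced), the 100 Hz tone burst is a stand-in for a whale call, and the helper names `spectrogram` and `correlate_kernel` are hypothetical.

```python
import numpy as np


def spectrogram(x, n_fft=64, hop=16):
    """Squared-magnitude STFT with a Hann window, shaped (frequency, time)."""
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2


def correlate_kernel(spec, kernel):
    """Slide a zero-mean kernel along time; return one score per time offset."""
    kernel = kernel - kernel.mean()
    width = kernel.shape[1]
    n_offsets = spec.shape[1] - width + 1
    return np.array([np.sum(spec[:, t:t + width] * kernel)
                     for t in range(n_offsets)])


# Synthetic test: a tone burst (assumed stand-in for a call) buried in noise.
rng = np.random.default_rng(0)
fs = 1000.0
burst = np.sin(2 * np.pi * 100.0 * np.arange(512) / fs)
x = 0.2 * rng.standard_normal(4096)
x[2000:2512] += burst          # burst starts at sample 2000, i.e. frame 2000 / 16 = 125

kernel = spectrogram(burst)    # kernel built from the clean template
scores = correlate_kernel(spectrogram(x), kernel)
best = int(np.argmax(scores))  # time offset (in frames) with the strongest match
```

Removing the kernel mean keeps broadband noise from inflating the score, since noise energy spread evenly over frequency contributes roughly zero on average.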

73 citations


Journal ArticleDOI
TL;DR: An architecture of the system for time-frequency signal analysis based on the S-method, whose special cases are the two most important distributions, the spectrogram and the Wigner distribution, is presented.
Abstract: An architecture of the system for time-frequency signal analysis is presented. This system is based on the S-method, whose special cases are the two most important distributions: the spectrogram and the Wigner distribution. Systems with constant and signal-dependent window widths are presented.
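The abstract gives no formula, so the following is a sketch of the discrete S-method in the form commonly stated for a rectangular frequency window of half-width L: SM[k] = |STFT[k]|^2 + 2 Re(sum over i = 1..L of STFT[k+i] conj(STFT[k-i])). With L = 0 it reduces to the spectrogram, and widening L moves the result toward the pseudo-Wigner distribution. The function name `s_method`, the test chirp, and the window settings are illustrative assumptions.

```python
import numpy as np


def s_method(stft, L):
    """S-method along the frequency axis (axis 0) with half-width L.

    Cross terms whose bin indices fall outside the array are skipped.
    """
    n_bins = stft.shape[0]
    sm = np.abs(stft) ** 2                      # L = 0 term: the spectrogram
    for i in range(1, L + 1):
        if 2 * i >= n_bins:
            break                               # no valid bin pairs remain
        cross = stft[2 * i:] * np.conj(stft[:n_bins - 2 * i])
        sm[i:n_bins - i] += 2.0 * np.real(cross)
    return sm


# Illustrative input: two-sided STFT of a linear chirp (all settings assumed).
fs, n, n_fft, hop = 1000.0, 1024, 64, 16
t = np.arange(n) / fs
x = np.exp(2j * np.pi * (50.0 * t + 100.0 * t ** 2))
window = np.hanning(n_fft)
frames = np.stack([x[s:s + n_fft] * window
                   for s in range(0, n - n_fft + 1, hop)])
stft = np.fft.fft(frames, axis=1).T             # shape (frequency, time)

sm0 = s_method(stft, 0)                         # identical to the spectrogram
sm3 = s_method(stft, 3)                         # sharper concentration along the chirp
```

A constant L corresponds to the constant-window-width case mentioned in the abstract; a signal-dependent width would choose L per time-frequency point, which this sketch does not attempt.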

63 citations


Patent
17 Jul 1997
TL;DR: In this article, a triangular interpolation function having a time length twice that of a fundamental period is used to produce a smoothed spectrogram having the space between grid points on the time-frequency plane filled with the surface of a bilinear function.
Abstract: At a smoothing spectrogram calculation portion, a triangular interpolation function having a frequency width twice that of the fundamental frequency of a signal is obtained based on information on the fundamental frequency of the signal. The interpolation function and a spectrum obtained at an adaptive frequency analysis portion are convoluted in the direction of frequency. Then, using a triangular interpolation function having a time length twice that of a fundamental period, the spectrum interpolated in the frequency direction described above is further interpolated in the temporal direction, in order to produce a smoothed spectrogram having the space between grid points on the time-frequency plane filled with the surface of a bilinear function. Using the smoothed spectrogram, a speech sound is transformed. Therefore, the influence of periodicity in the frequency direction and the temporal direction can be reduced.

58 citations


Journal ArticleDOI
TL;DR: Digital spectrographic cross-correlation (SPCC), a technique described by Clark et al. (1987), simultaneously analyses frequency, amplitude and time components of a signal, and returns a single peak correlation coefficient.
Abstract: Digital spectrographic cross-correlation (SPCC), a technique described by Clark et al. (1987), simultaneously analyses frequency, amplitude and time components of a signal, and returns a single peak correlation coefficient. The procedure is objective and uses all the information in the spectrogram. As such, it is a candidate to replace and/or supplement visual spectrogram comparison and multivariate analysis as the technique of choice for comparing sounds. With the increasing availability of sound analysis software with built-in cross-correlation routines, the procedure is becoming readily available to biologists who may not have extensive knowledge of acoustics. This ease of access increases the potential for misapplication of the technique or misinterpretation of results. To assess the utility of SPCC and to highlight pitfalls that need to be avoided in its implementation, we performed a series of tests designed to reveal the sensitivity of the peak cross-correlation coefficient to a variety of...

54 citations


Journal ArticleDOI
TL;DR: This study examines the appropriateness of FDs combined with 17 other general features for classifying objects contained in binary spectrogram images using principal component analysis for feature reduction on a speaker-dependent data set.

52 citations



Journal ArticleDOI
TL;DR: The authors propose a simulated first heart sound (S1) signal that can be used as a reference signal to evaluate the accuracy of time-frequency representation techniques for studying multicomponent signals.
Abstract: The authors propose a simulated first heart sound (S1) signal that can be used as a reference signal to evaluate the accuracy of time-frequency representation techniques for studying multicomponent signals. The composition of this simulated S1 is based on the hypothesis that an S1 recorded on the thorax over the apical area of the heart is composed of constant frequency vibrations from the mitral valve and a frequency modulated vibration from the myocardium. Essentially, the simulated S1 consists of a valvular component and a myocardial component. The valvular component is modelled as two exponentially decaying sinusoids of 50 Hz and 150 Hz and the myocardial component is modelled by a frequency modulated wave between 20 Hz and 100 Hz. The study shows that the simulated S1 has temporal and spectral characteristics similar to S1 recorded in humans and dogs. It also shows that the spectrogram cannot resolve the three components of the simulated S1. It is concluded that it is necessary to search for a better time-frequency representation technique for studying the time-frequency distribution of multicomponent signals such as the simulated S1.
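The signal model in the abstract above can be sketched in a few lines. Only the component frequencies (50 Hz and 150 Hz decaying sinusoids, and a frequency-modulated wave between 20 Hz and 100 Hz) come from the abstract. The sampling rate, duration, decay rate, amplitudes, and the choice of a linear upward sweep are assumptions made purely for illustration.

```python
import numpy as np

fs = 1000.0                       # sampling rate in Hz (assumed)
n_samples = 100                   # 100 ms of signal (assumed)
t = np.arange(n_samples) / fs
duration = n_samples / fs

# Valvular component: two exponentially decaying sinusoids.
# 50 Hz and 150 Hz are from the abstract; decay rate and amplitudes are assumed.
decay = 60.0                      # 1/s
valvular = np.exp(-decay * t) * (np.sin(2 * np.pi * 50.0 * t)
                                 + 0.5 * np.sin(2 * np.pi * 150.0 * t))

# Myocardial component: frequency-modulated wave between 20 Hz and 100 Hz.
# The range is from the abstract; the linear upward sweep is assumed.
f_low, f_high = 20.0, 100.0
inst_freq = f_low + (f_high - f_low) * t / duration
phase = 2 * np.pi * np.cumsum(inst_freq) / fs   # integrate frequency to get phase
myocardial = np.sin(phase)

s1 = valvular + myocardial        # simulated first heart sound
```

Because the three components overlap in both time and frequency over such a short record, a spectrogram of `s1` has to trade time resolution against frequency resolution, which is consistent with the abstract's finding that the spectrogram cannot resolve them.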

Journal ArticleDOI
TL;DR: In this article, gated spontaneous emission and four-wave-mixing signals are expressed using a mixed time-frequency representation of the fields (Wigner spectrograms) and of the material response functions.
Abstract: Gated spontaneous emission and four-wave-mixing signals are expressed using a mixed time-frequency representation of the fields (Wigner spectrograms) and of the material response functions. Well-separated and overlapping pulses are described using two-sided (noncausal) and one-sided (causal) spectrograms, respectively. Pump–probe and fluorescence spectra are recast in an analogous form which facilitates the direct comparison of the underlying microscopic dynamics.

Journal ArticleDOI
G. Jones, Boualem Boashash
TL;DR: The usefulness of the generalized instantaneous parameters is demonstrated in their application to optimal selection of windows for spectrograms through window matching in the time-frequency plane.
Abstract: The concept of instantaneous parameters, which has previously been associated exclusively with 1-D measures like the instantaneous frequency and the group delay, is extended to the 2-D time-frequency plane. Such generalized instantaneous parameters are associated with the short-time Fourier transform. They may also be interpreted as local moments of certain time-frequency distributions. It is shown that these measures enable local signal behavior to be characterized in the time-frequency plane for nonstationary deterministic signals. The usefulness of the generalized instantaneous parameters is demonstrated in their application to optimal selection of windows for spectrograms. This is achieved through window matching in the time-frequency plane. An algorithm is provided that illustrates the performance of this window matching. Results based on simulated and real data are presented.

Journal ArticleDOI
TL;DR: The results of the comparative study show that, although important limitations were found for all five TFRs tested, the CKD appears to be the best technique for the time-frequency analysis of multicomponent signals such as the simulated S1.
Abstract: A simulated first heart sound (S1) signal is used to determine the best technique for analysing physiological S1 from the following five time-frequency representations (TFR): the spectrogram, time-varying autoregressive modelling, binomial reduced interference distribution, Bessel distribution and cone-kernel distribution (CKD). To provide information on the time and frequency resolutions of each TFR technique, the instantaneous frequency and the -3 dB bandwidth as functions of time were computed for each simulated component of the S1. The performance index for selecting the best technique was based on the relative error and the correlation coefficient of the instantaneous frequency function between the theoretical distribution and the computed TFR. This index served to select the best technique. The sensitivity of each technique to noise and to small variations of the signal parameters was also evaluated. The results of the comparative study show that, although important limitations were found for all five TFRs tested, the CKD appears to be the best technique for the time-frequency analysis of multicomponent signals such as the simulated S1.

Journal ArticleDOI
TL;DR: The study shows that although a single technique cannot be optimal for all six murmurs, the spectrogram using a Hamming window of 30 ms is an acceptable compromise to detect the six simulated heart murmurs.
Abstract: The basic parameters of the spectrogram, the Choi-Williams, and the Bessel distributions are adjusted to provide the best time-frequency representations (TFRs) of the simulated murmur signals of mitral stenosis, mitral regurgitation, aortic stenosis, aortic regurgitation, and of two musical murmurs. The initial adjustment of the parameters of each TFR technique is performed by computing and minimising the relative averaged absolute error between the frequency contours at −3 dB and −10 dB of each TFR of the simulated murmurs and those of the theoretical distribution of the same signals. The results show that the spectrogram generally provides very good to excellent performance in representing the TFRs of stenotic and regurgitant murmurs. Improvements provided by the Choi-Williams and the Bessel distributions are minor but not systematic for the two signal-to-noise ratios tested (0 and 30 dB) and for the two frequency contours estimated. The Bessel and the Choi-Williams distributions provide the best performance for the musical murmurs. The study shows that although a single technique cannot be optimal for all six murmurs, the spectrogram using a Hamming window of 30 ms is an acceptable compromise to detect the six simulated heart murmurs.

Journal ArticleDOI
TL;DR: A new fast algorithm is introduced which allows the recursive evaluation of classical spectrograms and of spectrograms modified by the reassignment method; it can be extended to CTFDs and used to compute recursively reassigned smoothed pseudo-Wigner-Ville distributions.


Proceedings ArticleDOI
M. Hauenstein
21 Apr 1997
TL;DR: This paper describes how loudness patterns can be efficiently calculated with an allpass-transformed polyphase-filterbank based on a mixed radix FFT and three subsequent non-linear stages that model masking effects in the frequency and time domain as well as loudness compression.
Abstract: Loudness patterns are closer to the human perception of sound waves than spectrograms. This paper describes how loudness patterns can be efficiently calculated with an allpass-transformed polyphase-filterbank based on a mixed radix FFT and three subsequent non-linear stages that model masking effects in the frequency and time domain as well as loudness compression.

Proceedings ArticleDOI
24 Oct 1997
TL;DR: In this article, the authors consider the definition and interpretation of instantaneous frequency and other time-varying frequencies of a signal, and related concepts of instantaneous amplitude, instantaneous bandwidth and the time-varying spectrum of the signal.
Abstract: We consider the definition and interpretation of instantaneous frequency and other time-varying frequencies of a signal, and related concepts of instantaneous amplitude, instantaneous bandwidth and the time-varying spectrum of a signal. A definition for the average frequency at each time is given, and we show that spectrograms and Cohen-Posch time-frequency distributions can yield this result for the first conditional moment in frequency. For some signals this result equals the instantaneous frequency, but generally instantaneous frequency is not the average frequency at each time in the signal. We discuss monocomponent versus multicomponent signals, and give an estimate of the time-varying spectrum given the instantaneous frequencies and bandwidths of the components. We also consider the role of the complex signal in defining instantaneous amplitude, frequency and bandwidth, and ways to obtain a complex signal satisfying certain physical properties, given a real signal (or its time-varying spectrum). Depending upon the physical properties desired (e.g., the instantaneous amplitude of a magnitude-bounded signal should itself be bounded), one obtains different complex representations -- and hence different instantaneous amplitudes, frequencies and bandwidths -- of the given signal.
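The "average frequency at each time" in the abstract above is the first conditional moment of a time-frequency distribution in frequency: the sum over f of f P(t, f), divided by the sum over f of P(t, f). The sketch below computes it from a spectrogram of a linear chirp, a case in which it can be compared against the known instantaneous frequency. The test signal, window, and hop size are assumptions for illustration, and the spectrogram stands in for the more general distributions the abstract discusses.

```python
import numpy as np

fs = 1000.0
n = 2000
t = np.arange(n) / fs

# Test chirp: instantaneous frequency rises linearly from 100 Hz to 300 Hz.
f_start, f_end = 100.0, 300.0
inst_freq = f_start + (f_end - f_start) * t / t[-1]
x = np.cos(2 * np.pi * np.cumsum(inst_freq) / fs)

# Spectrogram: squared magnitude of a Hann-windowed STFT (time x frequency).
n_fft, hop = 128, 32
window = np.hanning(n_fft)
starts = np.arange(0, n - n_fft + 1, hop)
frames = np.stack([x[s:s + n_fft] * window for s in starts])
power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

# First conditional moment in frequency for each time frame.
mean_freq = (power @ freqs) / power.sum(axis=1)

# Instantaneous frequency at the centre of each frame, for comparison.
centres = starts + n_fft // 2
reference = inst_freq[centres]
```

For this single-component, constant-amplitude chirp the two curves track each other closely. For a multicomponent signal the conditional moment falls between the components, which illustrates the abstract's point that the average frequency at each time is generally not the instantaneous frequency.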

Patent
21 Aug 1997
TL;DR: In this article, a time-varying multi-frequency primary electromagnetic field is generated, preferably over the range 100 Hz to 300 kHz, and the secondary magnetic field strength as a function of frequency and spatial relationship is used to identify hidden objects.
Abstract: A spectrogram of secondary magnetic field strength as a function of frequency and spatial relationship is used to identify hidden objects. A time-varying multi-frequency primary electromagnetic field is generated, preferably over the range 100 Hz to 300 kHz, which induces a time-varying multi-frequency secondary electromagnetic field about the hidden object. The strength of the secondary field, typically inphase and quadrature, is plotted as a spectrogram over a low frequency broadband spectrum as a function of frequency and spatial relationship between the hidden object and the secondary field strength detector. From this spectrogram, indications may be had of the hidden object's characteristics such as location, size and shape, and material composition. Preferably, the measured spectrogram is compared against a library of reference spectrograms by a computer to identify the hidden object.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: A new method for generating speech spectrograms based on an autocorrelation function whose parameters are chosen to provide processing gain and formant resolution, while minimizing pitch artifacts in the spectrum.
Abstract: A new method for generating speech spectrograms is presented. This algorithm is based on an autocorrelation function whose parameters are chosen to provide processing gain and formant resolution, while minimizing pitch artifacts in the spectrum. Crisp formants are produced, and the power ratio of the formants can be adjusted by pre-filtering the data. The autocorrelation process is functionally equivalent to a time-smoothed, windowed Wigner distribution. The process is an improvement over the normal FFT implementation since it requires much less data to resolve the speech formants, and it is an improvement over the un-smoothed Wigner distribution since the cross-terms normally associated with the Wigner distribution are greatly attenuated by the smoothing operation.


Journal ArticleDOI
TL;DR: Recently developed time-frequency distributions, the Wigner Distribution and the Choi-Williams Distribution, are investigated to provide high-resolution representations of transient evoked OAEs, to estimate the cross-products, and to provide a relatively artefact-free time-frequency distribution of OAEs.
Abstract: Otoacoustic emissions (OAE) are non-stationary signals that vary in time depending on the characteristics of the stimulus. Traditional spectral analysis using Fourier methods ignores the effects of time and can miss important temporal information. Therefore, a better form of spectral analysis requires the use of time-frequency distribution methods. Traditionally, short-time Fourier transforms (STFT), commonly known as spectrograms, are used to provide such time-frequency representations. STFTs, however, suffer from poor resolution and do not provide enough detail about the characteristics of the emissions. In this study, recently developed time-frequency distributions, the Wigner Distribution (WD) and the Choi-Williams Distribution (CWD), are investigated to provide high-resolution representations of transient evoked OAEs. Although WD has excellent properties for time-frequency analysis, it suffers from cross-term artefacts generated when multiple sinusoids are present. CWD provides a solution to thi...

Journal ArticleDOI
TL;DR: By providing a joint distribution of signal intensity at any frequency along time, TFDs preserve details of the temporal structure of the EEG waveform, and can extract its time-varying frequency and amplitude features.
Abstract: The EEG is a time-varying or nonstationary signal. Frequency and amplitude are two of its significant characteristics, and are valuable clues to different states of brain activity. Detection of these temporal features is important in understanding EEGs. Commonly, spectrograms and AR models are used for EEG analysis. However, their accuracy is limited by their inherent assumption of stationarity and their trade-off between time and frequency resolution. We investigate EEG signal processing using existing compound kernel time-frequency distributions (TFDs). By providing a joint distribution of signal intensity at any frequency along time, TFDs preserve details of the temporal structure of the EEG waveform, and can extract its time-varying frequency and amplitude features. We expect that this will have significant implications for EEG analysis and medical diagnosis.

Proceedings ArticleDOI
02 Nov 1997
TL;DR: The quadratic time-frequency representations (TFRs) that may be called time-varying spectrum estimators are derived from first principles, and they turn out to be time-varying multiwindow spectrum estimators.
Abstract: The quadratic time-frequency representations (TFRs) that may be called time-varying spectrum estimators are derived from first principles. They turn out to be time-varying multiwindow spectrum estimators. In special cases they are time-varying spectrograms that may be written as Fourier transforms of lag-windowed, time-varying correlation sequences or as spectrally smoothed time-varying periodograms. These are not ad-hoc variations on stationary ideas to accommodate time variation. Rather they are the only variations one can obtain for time-varying spectrum analysis.

Proceedings ArticleDOI
23 Jun 1997
TL;DR: In this paper, the scattering interaction of dolphin-emitted acoustic pulses (clicks) with various elastic shells located, underwater, in front of the animal in a large test site in Kaneohe Bay, Hawaii was studied.
Abstract: We study the scattering interaction of dolphin-emitted acoustic pulses ('clicks') with various elastic shells located, underwater, in front of the animal in a large test site in Kaneohe Bay, Hawaii. A carefully instrumented analog-to-digital system continuously captured the emitted clicks and also the returned, backscattered echoes. Using standard conditioning techniques and food reinforcers, the dolphin is taught to push an underwater paddle when the 'correct' target -- the one he has been trained to identify -- is presented to him. He communicates to us his consistently correct identifying choices in this manner. By means of several time-frequency distributions (TFD) of the Wigner-type, or Cohen class, we examine echoes returned by three types of cylindrical shells. The time-frequency distributions we compare in this survey are the pseudo-Wigner distribution (PWD), the Choi-Williams distribution (CWD), the adaptive spectrogram (AS), the cone-shaped distribution (CSD), the Gabor spectrogram (GS), and the spectrogram (SPEC). To be satisfactory for target identification purposes, a time-frequency representation of the echoes should display a sufficient amount of distinguishing features, and still be robust enough to suppress the interference of noise contained in the received signals. Both these properties in a time-frequency distribution depend on the distribution's capability of concentrating the features in time and frequency and of handling cross-term interference. With some time-frequency distributions there is a trade-off between the concentration of features and the suppression of cross-term interference. The results of our investigation serve the twofold purposes of (1) advancing the understanding of the amazing target identification capability of dolphins, and (2) assisting in assessing the possibility of identifying submerged targets using active sonar and a classifier based on target signatures in the combined time-frequency domain.

Journal ArticleDOI
TL;DR: The algorithm performance in individuating and tracking the modifications of the cardiac autonomic control is presented and the signal speed variation is used to draw the attention of the physician to transient episodes.
Abstract: A multiple weighted-least-square (WLS) identification process is presented for recognizing changes in ICU patient status. An adaptive scheme for the WLS is proposed in which the forgetting factor is automatically driven by the signal characteristics. Generally, adaptive algorithms are more complex and time-consuming than standard WLS, but they show a high tracking performance combined with the benefit of parameter smoothing. Nevertheless, the use of parameter-explicit filtering significantly reduces the computation time. This is a relevant advantage for real-time implementation. This adaptive approach also provides additional information to identify the signal variation speed, which can be used to localize transient phenomena. This article presents the algorithm performance in individuating and tracking the modifications of the cardiac autonomic control. To make data interpretation easier, the time-frequency distributions obtained are displayed as spectrograms. In addition, the signal speed variation is used to draw the attention of the physician to transient episodes.

Journal ArticleDOI
TL;DR: In this article, the potential of the Gabor spectrogram as a tool for the study of the temporal behavior of nonlinear oscillators was examined, and numerical simulations of a snap-through system and homogeneous Duffing equation were analyzed by this method.
Abstract: We examine the potential of the Gabor spectrogram as a tool for the study of the temporal behavior of nonlinear oscillators. Numerical simulations of a snap-through system and homogeneous Duffing equation are analyzed by this method. Gabor spectrograms show dynamic properties of nonlinear oscillators in situations where the conventional Fourier analysis is not appropriate.

Proceedings ArticleDOI
29 Jan 1997
TL;DR: In this paper, Qian and Chen's "Joint Time-Frequency Analysis-Methods and Applications" (Pub. Prentice-Hall, 1996) is an excellent introduction to the subject and to its applications.
Abstract: Time-frequency analysis preserves time and frequency information for non-stationary signals. The Wigner-Ville, Choi-Williams and Cone Shaped distributions (all part of Cohen's class) all produce 'cross-product' terms which are not in the original signal. The Gabor spectrogram also suffers increasingly from spurious terms as its integer order parameter, D, increases. Only the short-time Fourier transform (STFT), and the adaptive algorithm don't suffer from cross products. The STFT is by far the quickest to compute but has the poorest time-frequency resolution of all the above joint time frequency analysis techniques. It is concluded that S. Qian and D. Chen's "Joint Time-Frequency Analysis-Methods and Applications" (Pub. Prentice-Hall, 1996) is an excellent introduction to the subject and to its applications.

Book ChapterDOI
17 Sep 1997
TL;DR: This paper developed a rule set for the recognition of isolated Hungarian vowels represented by Prolog clauses that were refined by the IMPUT Inductive Logic Programming method.
Abstract: Current speech recognition systems can be categorized into two broad classes: the knowledge-based approach and the stochastic one. In this paper we present a rule-based method for the recognition of Hungarian vowels. A spectrogram model was used as a front-end module and some acoustic features were extracted (e.g. locations, intensities and shapes of local maxima) from spectrograms by using a genetic algorithm method. On the basis of these features we developed a rule set for the recognition of isolated Hungarian vowels. These rules, represented by Prolog clauses, were refined by the IMPUT Inductive Logic Programming method.