IEEE Transactions on Acoustics, Speech, and Signal Processing
About: IEEE Transactions on Acoustics, Speech, and Signal Processing is an academic journal. The journal publishes majorly in the area(s): Digital filter & Adaptive filter. It has an ISSN identifier of 0096-3518. Over the lifetime, 3035 publication(s) have been published receiving 262961 citation(s).
Topics: Digital filter, Adaptive filter, Signal processing, Filter design, Filter (signal processing)
Papers published on a yearly basis
J. Proakis1•Institutions (1)
TL;DR: Although discussed in the context of direction-of-arrival estimation, ESPRIT can be applied to a wide variety of problems including accurate detection and estimation of sinusoids in noise.
Abstract: An approach to the general problem of signal parameter estimation is described. The algorithm differs from its predecessor in that a total least-squares rather than a standard least-squares criterion is used. Although discussed in the context of direction-of-arrival estimation, ESPRIT can be applied to a wide variety of problems including accurate detection and estimation of sinusoids in noise. It exploits an underlying rotational invariance among signal subspaces induced by an array of sensors with a translational invariance structure. The technique, when applicable, manifests significant performance and computational advantages over previous algorithms such as MEM, Capon's MLM, and MUSIC. >
TL;DR: This paper reports on an optimum dynamic progxamming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.
Abstract: This paper reports on an optimum dynamic progxamming (DP) based time-normalization algorithm for spoken word recognition. First, a general principle of time-normalization is given using time-warping function. Then, two time-normalized distance definitions, called symmetric and asymmetric forms, are derived from the principle. These two forms are compared with each other through theoretical discussions and experimental studies. The symmetric form algorithm superiority is established. A new technique, called slope constraint, is successfully introduced, in which the warping function slope is restricted so as to improve discrimination between words in different categories. The effective slope constraint characteristic is qualitatively analyzed, and the optimum slope constraint condition is determined through experiments. The optimized algorithm is then extensively subjected to experimental comparison with various DP-algorithms, previously applied to spoken word recognition by different research groups. The experiment shows that the present algorithm gives no more than about two-thirds errors, even compared to the best conventional algorithm.
S. Boll1•Institutions (1)
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital wave-form. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: Several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system. The vocabulary included many phonetically similar monosyllabic words, therefore the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations. For each parameter set (based on a mel-frequency cepstrum, a linear frequency cepstrum, a linear prediction cepstrum, a linear prediction spectrum, or a set of reflection coefficients), word templates were generated using an efficient dynamic warping method, and test data were time registered with the templates. A set of ten mel-frequency cepstrum coefficients computed every 6.4 ms resulted in the best performance, namely 96.5 percent and 95.0 percent recognition with each of two speakers. The superior performance of the mel-frequency cepstrum coefficients may be attributed to the fact that they better represent the perceptually relevant aspects of the short-term speech spectrum.
Related Journals (5)
IEEE Transactions on Communications
16.2K papers, 741.1K citations
IEEE Transactions on Image Processing
8.6K papers, 734.8K citations
IEEE Transactions on Signal Processing
13.6K papers, 822.8K citations
IEEE Journal on Selected Areas in Communications
7K papers, 595.1K citations
IEEE Transactions on Information Theory
15.9K papers, 1.2M citations