
Showing papers by "Richard P. Lippmann" published in 1997


Journal ArticleDOI
TL;DR: Comparisons suggest that the human-machine performance gap can be reduced by basic research on improving low-level acoustic-phonetic modeling, on improving robustness with noise and channel variability, and on more accurately modeling spontaneous speech.

606 citations


Proceedings Article
01 Jan 1997
TL;DR: A new and simple approach to compensate for speech recognizer degradations is presented which uses mel-filter-bank (MFB) magnitudes as input features and missing feature theory to dynamically modify the probability computations performed in Hidden Markov Model recognizers.
Abstract: Speech recognizers trained with quiet wide-band speech degrade dramatically with high-pass, low-pass, and notch filtering, with noise, and with interruptions of the speech input. A new and simple approach to compensate for these degradations is presented which uses mel-filter-bank (MFB) magnitudes as input features and missing feature theory to dynamically modify the probability computations performed in Hidden Markov Model recognizers. When the identity of features missing due to filtering or masking is provided, recognition accuracy on a large talker-independent digit recognition task often rises from below 50% to above 95%. These promising results suggest future work to continuously estimate SNRs within MFB bands for dynamic adaptation of speech recognizers.

98 citations
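
A minimal sketch of the missing-feature idea described above, assuming diagonal-covariance Gaussian observation densities (for which marginalizing out missing channels reduces to dropping them from the likelihood sum); the names and values are illustrative, not the authors' implementation:

    import numpy as np

    def log_likelihood_present(x, mask, mean, var):
        # Diagonal-Gaussian log-likelihood computed over reliable
        # channels only; missing channels are marginalized out, which
        # for a diagonal covariance means simply omitting them.
        x, mean, var = x[mask], mean[mask], var[mask]
        return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

    # Example: a 20-channel MFB frame whose upper half is marked missing,
    # as low-pass filtering of the input might require.
    rng = np.random.default_rng(0)
    frame = rng.normal(size=20)
    mask = np.arange(20) < 10
    print(log_likelihood_present(frame, mask, np.zeros(20), np.ones(20)))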


Journal ArticleDOI
TL;DR: A committee classifier combining the best neural network and logistic regression provided the best model calibration, but the receiver operating characteristic curve area was only 76% irrespective of which predictive model was used.

88 citations
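
The entry above does not specify how the committee combines its members; a common rule, assumed in this sketch, is to average the models' predicted probabilities (names are illustrative, not the paper's method):

    import numpy as np

    def committee_predict(p_nn, p_lr):
        # Average the predicted probabilities of the two member models
        # (assumed combination rule; the paper may weight members differently).
        return 0.5 * (np.asarray(p_nn) + np.asarray(p_lr))

    # Example: per-case risk estimates from the two models.
    print(committee_predict([0.80, 0.30], [0.60, 0.50]))  # -> [0.7 0.4]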


Journal ArticleDOI
TL;DR: The implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system, and the mapping of the discrete-time state decoding algorithm into the continuous domain are described.
Abstract: We describe the implementation of a hidden Markov model state decoding system, a component for a wordspotting speech recognition system. The key specification for this state decoder design is microwatt power dissipation: this requirement led to a continuous-time, analog circuit implementation. We describe the tradeoffs inherent in the choice of an analog design and explain the mapping of the discrete-time state decoding algorithm into the continuous domain. We characterize the operation of a ten-word (81-state) state decoder test chip.

21 citations
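
For reference, the discrete-time algorithm the analog decoder above is derived from is the standard Viterbi state-score recursion; the sketch below shows it in log space and is a generic textbook version, not the chip's circuit-level formulation:

    import numpy as np

    def viterbi_scores(log_b, log_A, log_pi):
        # delta_t(j) = max_i [ delta_{t-1}(i) + log A[i, j] ] + log b_j(x_t)
        # log_b: (T, N) per-frame log observation likelihoods,
        # log_A: (N, N) log transition matrix, log_pi: (N,) log initial probs.
        T, N = log_b.shape
        delta = log_pi + log_b[0]
        for t in range(1, T):
            delta = np.max(delta[:, None] + log_A, axis=0) + log_b[t]
        return delta  # per-state scores; argmax gives the best final state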


Proceedings ArticleDOI
04 Apr 1997
TL;DR: An approach to compensate for variable unknown sharp filtering and noise is presented which uses mel-filter-bank magnitudes as input features, estimates the signal-to-noise ratio (SNR) for each filter, and uses missing feature theory to dynamically modify the probability computations performed using Gaussian Mixture or Radial Basis Function neural network classifiers embedded within Hidden Markov Model recognizers.
Abstract: Despite dramatic recent advances in speech recognition technology, speech recognizers still perform much worse than humans. The difference in performance between humans and machines is most dramatic when variable amounts and types of filtering and noise are present during testing. For example, humans readily understand speech that is low-pass filtered below 3 kHz or high-pass filtered above 1 kHz. Machines trained with wide-band speech, however, degrade dramatically under these conditions. An approach to compensate for variable unknown sharp filtering and noise is presented which uses mel-filter-bank magnitudes as input features, estimates the signal-to-noise ratio (SNR) for each filter, and uses missing feature theory to dynamically modify the probability computations performed using Gaussian Mixture or Radial Basis Function neural network classifiers embedded within Hidden Markov Model (HMM) recognizers. The approach was successfully demonstrated using a talker-independent digit recognition task. It was found that recognition accuracy across many conditions rises from below 50% to above 95% with this approach. These promising results suggest future work to dynamically estimate SNRs and to explore the dynamics of human adaptation to channel and noise variability.
Keywords: speech recognition, speech perception, missing features, filtering, noise, robust, neural network

14 citations
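
One simple way to form the per-band reliability decisions described above, assuming the noise floor is estimated from speech-free frames (the paper's actual SNR estimator may differ, and all names are illustrative); such a mask could drive marginalized likelihoods like the one sketched earlier in this listing:

    import numpy as np

    def reliability_mask(mfb_mag, noise_mag, snr_threshold_db=0.0):
        # Per-band noise power from speech-free frames; signal power by
        # spectral subtraction, floored to stay positive.
        noise_power = np.mean(noise_mag ** 2, axis=0)
        signal_power = np.maximum(mfb_mag ** 2 - noise_power, 1e-12)
        snr_db = 10.0 * np.log10(signal_power / noise_power)
        return snr_db > snr_threshold_db  # True where a band is deemed reliable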


Journal ArticleDOI
TL;DR: A high-performance low-complexity neural network wordspotter was developed using radial basis function (RBF) neural networks in a hidden Markov model (HMM) framework, and two new complementary approaches substantially improve performance on the talker-independent Switchboard corpus.
Abstract: A high-performance low-complexity neural network wordspotter was developed using radial basis function (RBF) neural networks in a hidden Markov model (HMM) framework. Two new complementary approaches substantially improve performance on the talker-independent Switchboard corpus. Figure of merit (FOM) training adapts wordspotter parameters to directly improve the FOM performance metric, and voice transformations generate additional training examples by warping the spectra of training data to mimic across-talker vocal tract length variability.

2 citations
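
The voice-transformation idea above warps training spectra to mimic vocal tract length differences across talkers. Below is a minimal sketch using a linear frequency-axis warp (the paper's actual warping function is not given in this entry, so the warp form and factors are assumptions for illustration):

    import numpy as np

    def warp_spectrum(spectrum, alpha):
        # Resample the magnitude spectrum at warped frequencies alpha * k:
        # alpha < 1 stretches the spectrum upward in frequency,
        # alpha > 1 compresses it downward.
        k = np.arange(len(spectrum), dtype=float)
        return np.interp(alpha * k, k, spectrum)

    # Example: synthesize extra training spectra at several warp factors.
    rng = np.random.default_rng(1)
    spectrum = np.abs(np.fft.rfft(rng.normal(size=512)))
    warped = [warp_spectrum(spectrum, a) for a in (0.9, 0.95, 1.05, 1.1)]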