
Richard M. Stern

Researcher at Carnegie Mellon University

Publications - 260
Citations - 8820

Richard M. Stern is an academic researcher at Carnegie Mellon University. The author has contributed to research on the topics of speech processing and speaker recognition. The author has an h-index of 46 and has co-authored 235 publications receiving 8489 citations. Previous affiliations of Richard M. Stern include New York University and the University of Pittsburgh.

Papers
Journal Article

An approach to cardiac arrhythmia analysis using hidden Markov models

TL;DR: A new approach to ECG arrhythmia analysis is described, based on hidden Markov modeling (HMM), a technique successfully used since the mid-1970s to model speech waveforms for automatic speech recognition.
Proceedings Article

A vector Taylor series approach for environment-independent speech recognition

TL;DR: This work introduces the use of a vector Taylor series (VTS) expansion to efficiently and accurately characterize the effects of unknown additive noise and unknown linear filtering in a transmission channel on speech statistics.
Proceedings Article

Environmental robustness in automatic speech recognition

TL;DR: Initial efforts to make Sphinx, a continuous-speech speaker-independent recognition system, robust to changes in the environment are reported, and two novel methods based on additive corrections in the cepstral domain are proposed.
Journal Article

Power-normalized cepstral coefficients (PNCC) for robust speech recognition

TL;DR: Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for speech in the presence of various types of additive noise and in reverberant environments, with only slightly greater computational cost than conventional MFCC processing.
Proceedings Article

Power-Normalized Cepstral Coefficients (PNCC) for robust speech recognition

TL;DR: Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for speech in the presence of various types of additive noise and in reverberant environments, with only slightly greater computational cost than conventional MFCC processing.