
Speech processing

About: Speech processing is a research topic. Over its lifetime, 24,203 publications have been published within this topic, receiving 637,093 citations. The topic is also known as speech technology.


Book (open access)
01 Jan 1993
Abstract: 1. Fundamentals of Speech Recognition. 2. The Speech Signal: Production, Perception, and Acoustic-Phonetic Characterization. 3. Signal Processing and Analysis Methods for Speech Recognition. 4. Pattern Comparison Techniques. 5. Speech Recognition System Design and Implementation Issues. 6. Theory and Implementation of Hidden Markov Models. 7. Speech Recognition Based on Connected Word Models. 8. Large Vocabulary Continuous Speech Recognition. 9. Task-Oriented Applications of Automatic Speech Recognition.

Topics: Speech processing (81%), Voice activity detection (73%), Speech corpus (70%)

8,351 Citations

Journal Article, DOI: 10.1109/TASSP.1979.1163209
S. Boll
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital waveform. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
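The core step described in this abstract (subtracting a noise magnitude estimate taken during nonspeech activity, then resynthesizing a waveform) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it processes a single frame with a naive DFT, reuses the noisy phase, floors negative magnitudes at zero, and omits windowing, overlap-add, and the secondary residual-noise procedures. All function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def estimate_noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames assumed to contain no speech."""
    spectra = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    n = len(spectra[0])
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]


def spectral_subtract(frame, noise_mag):
    """Subtract the noise magnitude estimate from one frame and resynthesize.

    Assumption: the noisy phase is kept unchanged and negative magnitudes
    are clipped to zero.
    """
    cleaned = []
    for c, mu in zip(dft(frame), noise_mag):
        mag = max(abs(c) - mu, 0.0)          # subtract bias, floor at zero
        cleaned.append(cmath.rect(mag, cmath.phase(c)))  # reuse noisy phase
    return idft(cleaned)


if __name__ == "__main__":
    n = 16
    speech = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
    noise = [0.1 * (-1) ** t for t in range(n)]
    noisy = [s + v for s, v in zip(speech, noise)]
    noise_mag = estimate_noise_magnitude([noise])
    print(spectral_subtract(noisy, noise_mag))
```

In practice the frames would be windowed and overlapped, and an FFT would replace the naive transform; the sketch only shows where the subtraction and resynthesis happen.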

Topics: Speech processing (70%), Voice activity detection (68%), Speech enhancement (67%)

4,550 Citations

Journal Article, DOI: 10.1109/TASSP.1985.1164550
Yariv Ephraim, David Malah
Abstract: This paper focuses on the class of speech enhancement systems which capitalize on the major importance of the short-time spectral amplitude (STSA) of the speech signal in its perception. A system which utilizes a minimum mean-square error (MMSE) STSA estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm. In this paper we derive the MMSE STSA estimator, based on modeling speech and noise spectral components as statistically independent Gaussian random variables. We analyze the performance of the proposed STSA estimator and compare it with a STSA estimator derived from the Wiener estimator. We also examine the MMSE STSA estimator under uncertainty of signal presence in the noisy observations. In constructing the enhanced signal, the MMSE STSA estimator is combined with the complex exponential of the noisy phase. It is shown here that the latter is the MMSE estimator of the complex exponential of the original phase, which does not affect the STSA estimation. The proposed approach results in a significant reduction of the noise, and provides enhanced speech with colorless residual noise. The complexity of the proposed algorithm is approximately that of other systems in the discussed class.
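For orientation, the gain function commonly associated with this MMSE STSA estimator can be written as below, with the Wiener gain alongside for the comparison the abstract mentions. The notation is supplied here and does not appear in the abstract: $\xi_k$ is the a priori SNR, $\gamma_k$ the a posteriori SNR, $R_k$ and $\alpha_k$ the amplitude and phase of the noisy spectral component, and $I_0$, $I_1$ the modified Bessel functions of order zero and one.

```latex
\begin{aligned}
\hat{A}_k &= G_{\mathrm{MMSE}}(\xi_k,\gamma_k)\,R_k, \\
G_{\mathrm{MMSE}}(\xi_k,\gamma_k) &= \frac{\sqrt{\pi}}{2}\,\frac{\sqrt{v_k}}{\gamma_k}\,
  \exp\!\left(-\frac{v_k}{2}\right)
  \left[(1+v_k)\,I_0\!\left(\frac{v_k}{2}\right) + v_k\,I_1\!\left(\frac{v_k}{2}\right)\right],
  \qquad v_k = \frac{\xi_k}{1+\xi_k}\,\gamma_k, \\
G_{\mathrm{Wiener}}(\xi_k) &= \frac{\xi_k}{1+\xi_k}, \\
\hat{X}_k &= \hat{A}_k\,e^{j\alpha_k}.
\end{aligned}
```

The last line reflects the abstract's statement that the amplitude estimate is combined with the complex exponential of the noisy phase.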

Topics: Minimum mean square error (61%), Estimator (58%), Speech enhancement (56%)

3,775 Citations

Journal Article, DOI: 10.1038/NRN2113
Gregory Hickok, David Poeppel
Abstract: Despite decades of research, the functional neuroanatomy of speech processing has been difficult to characterize. A major impediment to progress may have been the failure to consider task effects when mapping speech-related processing systems. We outline a dual-stream model of speech processing that remedies this situation. In this model, a ventral stream processes speech signals for comprehension, and a dorsal stream maps acoustic speech signals to frontal lobe articulatory networks. The model assumes that the ventral stream is largely bilaterally organized (although there are important computational differences between the left- and right-hemisphere systems) and that the dorsal stream is strongly left-hemisphere dominant.

3,694 Citations

Topic's top 5 most impactful authors

John H. L. Hansen: 129 papers, 6.8K citations
DeLiang Wang: 80 papers, 5.6K citations
Shrikanth S. Narayanan: 61 papers, 3.4K citations
Li Deng: 54 papers, 2.2K citations
B. Yegnanarayana: 46 papers, 2K citations

Network Information
Related Topics (5)
Voice activity detection: 12.7K papers, 272.6K citations, 93% related
Speech coding: 14.2K papers, 271.9K citations, 93% related
Speaker recognition: 14.9K papers, 310K citations, 92% related
Speech synthesis: 13.3K papers, 261.9K citations, 92% related
Linear predictive coding: 6.5K papers, 142.9K citations, 91% related