Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.

Papers

Patent•

Method and apparatus for obtaining complete speech signals for speech recognition applications

Victor Abrash¹, Federico Cesari¹, Horacio Franco¹, Christopher George¹, Jing Zheng¹•Institutions (1)

SRI International¹

01 Sep 2005

TL;DR: A method and apparatus for obtaining complete speech signals for speech recognition applications: an audio stream is recorded continuously as a sequence of frames to a circular buffer, and at least one speech endpoint is located using a Hidden Markov Model (HMM).

Abstract: The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
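
The circular-buffer step described in this abstract can be sketched as follows. This is a minimal illustration and not the patented implementation: the `FrameRecorder` class, the `augmented_signal` helper, and the `pre_roll` parameter are hypothetical names, frames are stood in for by integers, and HMM-based endpoint detection is not shown.

```python
from collections import deque


class FrameRecorder:
    """Continuously records frames into a fixed-size circular buffer."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest frame when full.
        self.buffer = deque(maxlen=capacity)

    def record(self, frame):
        self.buffer.append(frame)

    def frames_before(self, n):
        """Return up to n of the most recent frames (audio before a command)."""
        if n <= 0:
            return []
        return list(self.buffer)[-n:]


def augmented_signal(recorder, pre_roll, later_frames):
    """Frames preceding the user command followed by frames recorded after it."""
    return recorder.frames_before(pre_roll) + list(later_frames)


recorder = FrameRecorder(capacity=5)
for i in range(8):  # frames 0..7 arrive; only frames 3..7 stay buffered
    recorder.record(i)

# The user command arrives now; keep 3 frames of pre-roll plus two new frames.
signal = augmented_signal(recorder, pre_roll=3, later_frames=[8, 9])
print(signal)  # [5, 6, 7, 8, 9]
```

Keeping a short pre-roll means speech that began slightly before the user command is still present in the signal handed to the recognizer.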

78 citations

Patent•

Systems and methods for protecting a speaker

Jie Su¹, Samuel Oyetunji¹•Institutions (1)

Cirrus Logic¹

08 Mar 2013

TL;DR: A controller configured to be coupled to an audio speaker receives an audio input signal and, based on a displacement transfer function associated with the audio speaker, processes the audio input signal to generate an output audio signal communicated to the speaker.

Abstract: In accordance with these and other embodiments of the present disclosure, systems and methods may include a controller configured to be coupled to an audio speaker, wherein the controller receives an audio input signal, and based on a displacement transfer function associated with the audio speaker, processes the audio input signal to generate an output audio signal communicated to the audio speaker, wherein the displacement transfer function correlates an amplitude and a frequency of the audio input signal to an expected displacement of the audio speaker in response to the amplitude and the frequency of the audio input signal.
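
One way to picture a displacement transfer function is sketched below. The abstract does not specify the model, so everything here is an assumption: the second-order (mass-spring-damper) response, the parameters `f0`, `q`, and `gain_mm_per_volt`, the `max_displacement_mm` limit, and the simple gain-scaling rule in `protect` are all hypothetical.

```python
import math


def expected_displacement(amplitude, freq_hz, f0=100.0, q=0.7,
                          gain_mm_per_volt=0.5):
    """Hypothetical displacement transfer function (second-order low-pass).

    Maps the amplitude and frequency of a sinusoidal input to an expected
    cone displacement in millimetres. All parameter values are assumed.
    """
    r = freq_hz / f0
    magnitude = 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)
    return amplitude * gain_mm_per_volt * magnitude


def protect(amplitude, freq_hz, max_displacement_mm=1.0):
    """Scale the input amplitude so the expected displacement stays in range."""
    x = expected_displacement(amplitude, freq_hz)
    if x <= max_displacement_mm:
        return amplitude
    # The model is linear in amplitude, so proportional scaling hits the limit.
    return amplitude * max_displacement_mm / x


print(protect(4.0, 50.0))    # attenuated: low frequencies move the cone most
print(protect(4.0, 1000.0))  # unchanged: displacement is small well above f0
```

In this assumed model the same input amplitude produces far more displacement near and below `f0` than above it, which is why the limit depends on frequency as well as amplitude.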

78 citations

Journal Article•DOI•

Hidden Markov model-based packet loss concealment for voice over IP

C.A. Rodbro¹, Manohar N. Murthi¹, Soren Vang Andersen², Søren Holdt Jensen²•Institutions (2)

University of Miami¹, Aalborg University²

01 Sep 2006-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: With a hidden Markov model (HMM) tracking the evolution of speech signal parameters, it is demonstrated how PLC is performed within a statistical signal processing framework and how the HMM is used to index a specially designed PLC module for the particular signal context, leading to signal-contingent PLC.

Abstract: As voice over IP proliferates, packet loss concealment (PLC) at the receiver has emerged as an important factor in determining voice quality of service. Through the use of heuristic variations of signal and parameter repetition and overlap-add interpolation to handle packet loss, conventional PLC systems largely ignore the dynamics of the statistical evolution of the speech signal, possibly leading to perceptually annoying artifacts. To address this problem, we propose the use of hidden Markov models for PLC. With a hidden Markov model (HMM) tracking the evolution of speech signal parameters, we demonstrate how PLC is performed within a statistical signal processing framework. Moreover, we show how the HMM is used to index a specially designed PLC module for the particular signal context, leading to signal-contingent PLC. Simulation examples, objective tests, and subjective listening tests are provided showing the ability of an HMM-based PLC built with a sinusoidal analysis/synthesis model to provide better loss concealment than a conventional PLC based on the same sinusoidal model for all types of speech signals, including onsets and signal transitions.
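
The general idea of tracking signal parameters with an HMM and using the state belief when a packet is lost can be sketched as follows. This is not the authors' system, which is built on a sinusoidal analysis/synthesis model with context-specific PLC modules. The two-state model, the transition matrix, and the per-state parameter means below are made-up values for illustration.

```python
def normalize(p):
    total = sum(p)
    return [x / total for x in p]


def predict(belief, transition):
    """Propagate the state belief one frame ahead through the transitions."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def forward_update(belief, transition, likelihood):
    """Forward-algorithm step for a received packet: predict, then weight
    each state by the likelihood of the observed parameters."""
    predicted = predict(belief, transition)
    return normalize([p * l for p, l in zip(predicted, likelihood)])


def conceal(belief, transition, state_means):
    """Lost packet: no observation, so use the predicted belief to estimate
    the missing parameter and to pick the most likely signal context."""
    predicted = predict(belief, transition)
    estimate = sum(p * m for p, m in zip(predicted, state_means))
    context = max(range(len(predicted)), key=lambda j: predicted[j])
    return predicted, estimate, context


# Hypothetical two-state model: state 0 = unvoiced, state 1 = voiced.
transition = [[0.9, 0.1],
              [0.2, 0.8]]
state_means = [0.0, 1.0]  # assumed mean of some speech parameter per state

belief = forward_update([0.5, 0.5], transition, likelihood=[0.0, 1.0])
belief, estimate, context = conceal(belief, transition, state_means)
print(belief, estimate, context)
```

The `context` index plays the role of selecting a concealment strategy suited to the current signal type, while `estimate` is a belief-weighted guess at the lost parameter.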

78 citations

Journal Article•DOI•

Processing of reverberant speech for time-delay estimation

B. Yegnanarayana¹, S. R. M. Prasanna², Ramani Duraiswami³, Dmitry N. Zotkin³•Institutions (3)

Indian Institute of Technology Madras¹, Indian Institute of Technology Guwahati², University of Maryland, College Park³

17 Oct 2005-IEEE Transactions on Speech and Audio Processing

TL;DR: The proposed method for time-delay estimation is found to perform better than the generalized cross-correlation (GCC) approach, and a method for enhancement of speech is also proposed using the knowledge of the time-delay and the information of the excitation source.

Abstract: In this paper, we present a method of extracting the time-delay between speech signals collected at two microphone locations. Time-delay estimation from microphone outputs is the first step for many sound localization algorithms, and also for enhancement of speech. For time-delay estimation, speech signals are normally processed using short-time spectral information (either magnitude or phase or both). The spectral features are affected by degradations in speech caused by noise and reverberation. Features corresponding to the excitation source of the speech production mechanism are robust to such degradations. We show that these source features can be extracted reliably from the speech signal. The time-delay estimate can be obtained using the features extracted even from short segments (50-100 ms) of speech from a pair of microphones. The proposed method for time-delay estimation is found to perform better than the generalized cross-correlation (GCC) approach. A method for enhancement of speech is also proposed using the knowledge of the time-delay and the information of the excitation source.
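
For orientation, time-delay estimation between two microphone signals can be sketched with plain time-domain cross-correlation. This is neither the authors' excitation-source method nor a weighted GCC variant. It is only the simplest member of the correlation family, and the signals and the 8 kHz sampling rate below are made up.

```python
def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes sum_n x[n] * y[n + lag].

    A positive result means y is a delayed copy of x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy signals: mic2 receives the same pulse 3 samples after mic1.
mic1 = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic2 = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]

lag = estimate_delay(mic1, mic2, max_lag=5)
sample_rate_hz = 8000  # assumed sampling rate
print(lag, lag / sample_rate_hz)  # 3 samples, i.e. 0.000375 s
```

Noise and reverberation flatten or add spurious peaks to this correlation, which is the degradation the abstract says source-based features are more robust to.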

78 citations

Patent•DOI•

Digital transmission of acoustic signals over a noisy communication channel

John C. Hardwick, J.S. Lim

30 Nov 1992-Journal of the Acoustical Society of America

TL;DR: The effect of uncorrectable bit errors is reduced by adaptively smoothing the spectral parameters in a speech decoder, with the amount of smoothing depending on the number of errors detected during error control decoding of the received data.

Abstract: The performance of digital communication over a noisy communication channel is improved. An encoder combines bit modulation with error control encoding to allow the decoder to use the redundancy in the error control codes to detect uncorrectable bit errors. This method improves the efficiency of the communication system since fewer bits are required for error control, leaving more bits available for data. In the context of a speech coding system, speech quality is improved without sacrificing robustness to bit errors. A bit prioritization method further improves performance over noisy channels. Individual bits in a set of quantizer values are arranged according to their sensitivity to bit errors. Error control codes having higher levels of redundancy are used to protect the most sensitive (highest priority) bits, while lower levels of redundancy are used to protect less sensitive bits. This method improves efficiency of the error control system, since only the highest priority data is encoded with the highest levels of redundancy. The effect of uncorrectable bit errors is reduced by adaptively smoothing the spectral parameters in a speech decoder. The amount of smoothing is varied depending upon the number of errors detected during the error control decoding of the received data. More smoothing is used when a large number of errors are detected, thereby reducing the perceived effect of any uncorrectable bit errors which may be present.
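
The adaptive smoothing step at the end of this abstract can be sketched as follows. The patent's actual smoothing rule is not given here, so the linear mapping from error count to weight, the `max_errors` cap, and the blend between the previous and current frame are all assumptions.

```python
def smoothing_weight(errors_detected, max_errors=4):
    """Map the detected error count to a weight in [0, 1].

    Hypothetical linear rule: 0 errors gives no smoothing, and max_errors
    or more gives full reliance on the previous frame.
    """
    clamped = min(max(errors_detected, 0), max_errors)
    return clamped / max_errors


def smooth_spectrum(previous, current, errors_detected):
    """Blend the current frame's spectral parameters toward the previous
    frame's, more strongly when more errors were detected."""
    w = smoothing_weight(errors_detected)
    return [w * p + (1.0 - w) * c for p, c in zip(previous, current)]


previous = [1.0, 1.0]
current = [3.0, 5.0]
print(smooth_spectrum(previous, current, errors_detected=0))  # [3.0, 5.0]
print(smooth_spectrum(previous, current, errors_detected=2))  # [2.0, 3.0]
print(smooth_spectrum(previous, current, errors_detected=4))  # [1.0, 1.0]
```

With no detected errors the decoded parameters pass through unchanged, so the smoothing only costs responsiveness on frames that are likely to be corrupted.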

78 citations

Performance

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
