scispace - formally typeset
Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations. The topic is also known as LPC.


Papers
Patent
10 Aug 1999
TL;DR: In this article, a speech or voice activity detector (VAD) is provided for detecting whether speech signals are present in individual time frames of an input signal, together with a state machine that is coupled to the VAD and has a plurality of states.
Abstract: A system and method for removing noise from a signal containing speech (or a related, information carrying signal) and noise. A speech or voice activity detector (VAD) is provided for detecting whether speech signals are present in individual time frames of an input signal. The VAD comprises a speech detector that receives as input the input signal and examines the input signal in order to generate a plurality of statistics that represent characteristics indicative of the presence or absence of speech in a time frame of the input signal, and generates an output based on the plurality of statistics representing a likelihood of speech presence in a current time frame; and a state machine coupled to the speech detector and having a plurality of states. The state machine receives as input the output of the speech detector and transitions between the plurality of states based on a state at a previous time frame and the output of the speech detector for the current time frame. The state machine generates as output a speech activity status signal based on the state of the state machine, which provides a measure of the likelihood of speech being present during the current time frame. The VAD may be used in a noise reduction system.

104 citations
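The patent does not disclose its exact statistics or transition rules; as a rough illustration of the general pattern it describes (a frame-level statistic driving a small state machine, with a hangover counter so brief energy dips do not end a speech segment), here is a minimal Python sketch. The energy statistic and all thresholds are hypothetical placeholders.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame (a simple speech statistic)."""
    return sum(x * x for x in frame) / len(frame)

class VadStateMachine:
    """Two states (SILENCE, SPEECH); a hangover counter delays the
    transition back to SILENCE so short pauses stay inside speech."""
    SILENCE, SPEECH = 0, 1

    def __init__(self, on_threshold=0.01, off_threshold=0.005, hangover=5):
        self.state = self.SILENCE
        self.on_threshold = on_threshold    # energy needed to enter SPEECH
        self.off_threshold = off_threshold  # energy below which we may leave
        self.hangover = hangover            # quiet frames to wait before leaving
        self.count = 0

    def step(self, frame):
        """Consume one frame; return True if speech is judged present."""
        e = frame_energy(frame)
        if self.state == self.SILENCE:
            if e > self.on_threshold:
                self.state = self.SPEECH
                self.count = self.hangover
        else:
            if e < self.off_threshold:
                self.count -= 1
                if self.count <= 0:
                    self.state = self.SILENCE
            else:
                self.count = self.hangover
        return self.state == self.SPEECH
```

A real VAD of this kind would combine several statistics (not just energy) and condition the transitions on the previous state, as the abstract describes.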

Journal ArticleDOI
TL;DR: This work approaches the problem of speaker recognition from severely degraded audio data by judiciously combining two commonly used features: Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC), and concludes that MFCC and LPC capture two distinct aspects of speech, viz., speech perception and speech production.
Abstract: Speaker recognition algorithms are negatively impacted by the quality of the input speech signal. In this work, we approach the problem of speaker recognition from severely degraded audio data by judiciously combining two commonly used features: Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC). Our hypothesis rests on the observation that MFCC and LPC capture two distinct aspects of speech, viz., speech perception and speech production. A carefully crafted 1D Triplet Convolutional Neural Network (1D-Triplet-CNN) is used to combine these two features in a novel manner, thereby enhancing the performance of speaker recognition in challenging scenarios. Extensive evaluation on multiple datasets, different types of audio degradations, multi-lingual speech, varying length of audio samples, etc. convey the efficacy of the proposed approach over existing speaker recognition methods, including those based on iVector and xVector.

104 citations
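For readers unfamiliar with the LPC features the paper combines with MFCC, here is a minimal sketch of LPC analysis by the autocorrelation method with the Levinson-Durbin recursion. The order and the absence of windowing are illustrative simplifications, not the paper's configuration.

```python
def autocorr(frame, lag):
    """Autocorrelation of the frame at a given lag."""
    return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))

def lpc(frame, order):
    """Return LPC coefficients a[1..order] such that x[n] is
    approximated by sum(a[k] * x[n - k]), plus the prediction error."""
    r = [autocorr(frame, k) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # prediction error shrinks each order
    return a[1:], err
```

For a first-order autoregressive signal x[n] = 0.9 x[n-1], an order-1 analysis recovers a coefficient close to 0.9, which is the "speech production" modeling view the paper contrasts with MFCC's perceptual view.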

PatentDOI
Yu-Jih Liu1
TL;DR: In this paper, a speech coding system employs measurements of robust features of speech frames, whose distributions are not strongly affected by noise levels, to make voicing decisions for input speech occurring in a noisy environment.
Abstract: A speech coding system employs measurements of robust features of speech frames whose distributions are not strongly affected by noise levels to make voicing decisions for input speech occurring in a noisy environment. Linear programming analysis of the robust features and respective weights are used to determine an optimum linear combination of these features. The input speech vectors are matched to a vocabulary of codewords in order to select the corresponding, optimally matching codeword. Adaptive vector quantization is used in which a vocabulary of words obtained in a quiet environment is updated based upon a noise estimate of a noisy environment in which the input speech occurs, and the "noisy" vocabulary is then searched for the best match with an input speech vector. The corresponding clean codeword index is then selected for transmission and for synthesis at the receiver end. The results are better spectral reproduction and significant intelligibility enhancement over prior coding approaches. Robust features found to allow robust voicing decisions include: low-band energy; zero-crossing counts adapted for noise level; AMDF ratio (speech periodicity) measure; low-pass filtered backward correlation; low-pass filtered forward correlation; inverse-filtered backward correlation; and inverse-filtered pitch prediction gain measure.

103 citations
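Two of the robust features listed above, the zero-crossing count and the AMDF (average magnitude difference function) periodicity measure, can be sketched generically as follows. These are textbook versions; the patent's noise-level adaptations are not reproduced.

```python
def zero_crossings(frame):
    """Count sign changes across the frame; noise and unvoiced speech
    cross zero far more often than voiced speech."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def amdf(frame, lag):
    """Average magnitude difference at a given lag; small values mean
    the frame repeats with roughly that period."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def amdf_ratio(frame, min_lag, max_lag):
    """Ratio of the AMDF minimum over a candidate pitch range to its
    maximum; low values suggest voiced (periodic) speech."""
    vals = [amdf(frame, lag) for lag in range(min_lag, max_lag + 1)]
    return min(vals) / max(vals)
```

On a pure sinusoid with a 20-sample period, the AMDF dips to nearly zero at lag 20, so the ratio is small, which is exactly the behavior a voicing decision exploits.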

Patent
Hao Jiang1, Hong-Jiang Zhang1
TL;DR: In this paper, a portion of an audio signal is separated into multiple frames from which one or more different features are extracted, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence).
Abstract: A portion of an audio signal is separated into multiple frames from which one or more different features are extracted. These different features are used, in combination with a set of rules, to classify the portion of the audio signal into one of multiple different classifications (for example, speech, non-speech, music, environment sound, silence, etc.). In one embodiment, these different features include one or more of line spectrum pairs (LSPs), a noise frame ratio, periodicity of particular bands, spectrum flux features, and energy distribution in one or more of the bands. The line spectrum pairs are also optionally used to segment the audio signal, identifying audio classification changes as well as speaker changes when the audio signal is speech.

102 citations
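The patent's full feature set and rule set are not given here; as a toy sketch of the rule-based pattern it describes (extract frame features, then apply hand-written threshold rules to pick a class), consider the following. The features chosen and every threshold are placeholders.

```python
def classify_frame(energy, zcr, periodicity):
    """Classify one audio frame by simple threshold rules.
    energy: mean squared amplitude; zcr: zero-crossing rate in [0, 1];
    periodicity: 1.0 = strongly periodic, 0.0 = aperiodic."""
    if energy < 1e-4:
        return "silence"
    if periodicity > 0.8 and zcr < 0.1:
        return "music"              # sustained, cleanly periodic content
    if periodicity > 0.4:
        return "speech"             # voiced speech is moderately periodic
    return "environment sound"      # energetic but aperiodic
```

A production classifier of this kind would use the richer features the abstract names (line spectrum pairs, noise frame ratio, spectrum flux, band energy distribution) rather than these three scalars.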

Proceedings ArticleDOI
07 Apr 1986
TL;DR: The development and application of a new voicing algorithm used in the U.S. Government's 2400 bit per second Enhanced Linear Predictive Coder (LPC-10E), which improves upon other 2400 bps LPC voicing algorithms by providing higher quality synthesized speech.
Abstract: This paper describes the development and application of a new voicing algorithm used in the 2400 bit per second U.S. Government's Enhanced Linear Predictive Coder (LPC-10E). Correct voicing is crucial to perceived quality and naturalness of LPC systems and therefore to user acceptance of LPC systems. This new voicing algorithm uses a smoothed adaptive linear discriminator to classify the signal as voiced or unvoiced speech. The classifier was determined using Fisher's method of linear discriminant analysis. The voicing decision smoother is a modified median smoother that uses both the linear discriminant and speech onsets to determine its smoothing. The voicing classifier adapts to various acoustic noise levels and features a powerful new set of signal measurements: biased zero crossing rate, energy measures, reflection coefficients, and prediction gains. The LPC-10E voicing algorithm improves upon other 2400 bps LPC voicing algorithms by providing higher quality synthesized speech. Higher quality is due to halving of the error rate and graceful degradation in the presence of acoustic noise.

102 citations


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations (86% related)
Noise: 110.4K papers, 1.3M citations (81% related)
Feature extraction: 111.8K papers, 2.1M citations (81% related)
Feature vector: 48.8K papers, 954.4K citations (80% related)
Filter (signal processing): 81.4K papers, 1M citations (79% related)
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 9
2022: 25
2021: 26
2020: 42
2019: 25
2018: 37