Topic
Linear predictive coding
About: Linear predictive coding is a research topic. Over the lifetime, 6565 publications have been published within this topic, receiving 142991 citations. The topic is also known as LPC.
Papers published on a yearly basis
Papers
15 Apr 2007
TL;DR: The AMI transcription system for speech in meetings developed in collaboration by five research groups includes generic techniques such as discriminative and speaker adaptive training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, maximum likelihood linear regression, and phone posterior based features, as well as techniques specifically designed for meeting data.
Abstract: This paper describes the AMI transcription system for speech in meetings developed in collaboration by five research groups. The system includes generic techniques such as discriminative and speaker adaptive training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, maximum likelihood linear regression, and phone posterior based features, as well as techniques specifically designed for meeting data. These include segmentation and cross-talk suppression, beam-forming, domain adaptation, Web-data collection, and channel adaptive training. The system was improved by more than 20% relative in word error rate compared to our previous system and was used in the NIST RT106 evaluations where it was found to yield competitive performance.
137 citations
IBM1
TL;DR: The currently used method of maximum likelihood, while heuristic, is shown to be superior under certain assumptions to another heuristic: the method of conditional maximum likelihood.
Abstract: The choice of method for training a speech recognizer is posed as an optimization problem. The currently used method of maximum likelihood, while heuristic, is shown to be superior under certain assumptions to another heuristic: the method of conditional maximum likelihood.
135 citations
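The contrast between the two training objectives can be sketched numerically. The toy model below is an illustration under assumed data (two-class 1-D Gaussians), not the paper's recognizer: maximum likelihood (ML) fits the joint p(x, y) in closed form, while conditional maximum likelihood (CML) refines the class means by ascent on the conditional log-likelihood p(y | x), the quantity a classifier actually uses.

```python
import numpy as np

# Toy data: two classes with 1-D Gaussian features (hypothetical setup).
rng = np.random.default_rng(0)
x0 = rng.normal(-1.0, 1.0, 200)                  # class 0 samples
x1 = rng.normal(+1.0, 1.0, 200)                  # class 1 samples
x = np.concatenate([x0, x1])
y = np.concatenate([np.zeros(200, int), np.ones(200, int)])

mu = np.array([x0.mean(), x1.mean()])            # ML class means (closed form)
var = np.concatenate([x0 - mu[0], x1 - mu[1]]).var()   # shared ML variance

def cond_log_lik(means):
    """Sum of log p(y_i | x_i) under equal priors and shared variance."""
    log_num = -(x[:, None] - means[None, :]) ** 2 / (2 * var)
    log_post = log_num - np.logaddexp(log_num[:, 0], log_num[:, 1])[:, None]
    return log_post[np.arange(len(x)), y].sum()

# CML refinement: greedy coordinate ascent on the conditional objective,
# starting from the ML solution (each step only ever accepts improvements).
mu_cml = mu.copy()
for _ in range(100):
    for k in range(2):
        trials = [mu_cml + d * np.eye(2)[k] for d in (-0.02, 0.0, 0.02)]
        mu_cml = max(trials, key=cond_log_lik)

print("ML means:", mu.round(3), " CML means:", mu_cml.round(3))
print("cond. log-lik ML: %.2f  CML: %.2f"
      % (cond_log_lik(mu), cond_log_lik(mu_cml)))
```

By construction the CML means score at least as well as the ML means on the conditional objective, which mirrors the paper's point that the "best" heuristic depends on which likelihood the task actually cares about.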
TL;DR: A speech synthesizing system comprising a speech segment storage section where speech segments are stored, a speech segment selection information storage section, a speech segment selecting section, and a waveform generating section that generates speech waveform data from the combination of speech segments selected by the selecting section.
Abstract: A speech synthesizing system that produces speech of improved voice quality by selecting the combination of speech segments most suitable for a synthesis speech unit sequence. The system comprises: a speech segment storage section, in which speech segments are stored; a speech segment selection information storage section, which holds, for an arbitrary speech unit sequence, combinations of the stored speech segments together with appropriateness information representing how suitable each combination is; a speech segment selecting section, which selects the combination of speech segments most suitable for a synthesis parameter according to the stored selection information; and a waveform generating section, which generates speech waveform data from the selected combination of speech segments.
135 citations
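The selection step described in the abstract amounts to a best-path search over candidate segments. The sketch below is a hypothetical illustration (segment ids, scores, and the join rule are invented, not the patent's API): pick one segment per unit so that the summed segment "appropriateness" plus join "appropriateness" is maximal, Viterbi-style.

```python
# candidates[i] = list of (segment_id, target_appropriateness) for unit i
candidates = [
    [("a0", 0.9), ("a1", 0.6)],
    [("b0", 0.5), ("b1", 0.8)],
    [("c0", 0.7), ("c1", 0.7)],
]

def join_score(seg_a, seg_b):
    # Hypothetical concatenation appropriateness: reward matching variants.
    return 0.3 if seg_a[-1] == seg_b[-1] else 0.1

def select(candidates):
    """Pick one segment per unit maximizing summed appropriateness (Viterbi)."""
    # best[j] = (total score, path) for paths ending in candidate j
    best = [(s, [seg]) for seg, s in candidates[0]]
    for unit in candidates[1:]:
        new_best = []
        for seg, s in unit:
            score, path = max(
                ((b_score + join_score(b_path[-1], seg), b_path)
                 for b_score, b_path in best),
                key=lambda t: t[0],
            )
            new_best.append((score + s, path + [seg]))
        best = new_best
    return max(best, key=lambda t: t[0])

score, path = select(candidates)
print(path, round(score, 2))   # → ['a0', 'b1', 'c1'] 2.8
```

Note that the winning path sacrifices the locally best first join to reach the higher-scoring "b1"/"c1" pair, which is exactly why combinations, rather than individual segments, are scored.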
07 May 2001
TL;DR: Results show that the speaker identity of speech whose LPC spectrum has been converted can be recognized as the target speaker with the same level of performance as discriminating between LPC coded speech, however, the level of discrimination of converted utterances produced by the full VC system is significantly below that of speaker discrimination of natural speech.
Abstract: The purpose of a voice conversion (VC) system is to change the perceived speaker identity of a speech signal. We propose an algorithm based on converting the LPC spectrum and predicting the residual as a function of the target envelope parameters. We conduct listening tests based on speaker discrimination of same/difference pairs to measure the accuracy by which the converted voices match the desired target voices. To establish the level of human performance as a baseline, we first measure the ability of listeners to discriminate between original speech utterances under three conditions: normal, fundamental frequency and duration normalized, and LPC coded. Additionally, the spectral parameter conversion function is tested in isolation by listening to source, target, and converted speakers as LPC coded speech. The results show that the speaker identity of speech whose LPC spectrum has been converted can be recognized as the target speaker with the same level of performance as discriminating between LPC coded speech. However, the level of discrimination of converted utterances produced by the full VC system is significantly below that of speaker discrimination of natural speech.
135 citations
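The source-filter decomposition this conversion relies on can be sketched in a few lines. The example below uses illustrative synthetic signals and a guessed filter order, not the paper's system: it computes an LPC envelope by the autocorrelation method, extracts the residual by inverse filtering, and passes that residual through a different (target) envelope.

```python
import numpy as np
from scipy.signal import lfilter

def lpc(frame, order=10):
    """Autocorrelation-method LPC: solve the normal equations R a = r."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate([[1.0], -a])           # analysis filter A(z)

rng = np.random.default_rng(1)
n = np.arange(400)
# Hypothetical "source" and "target" frames: same low-frequency excitation,
# different resonances, standing in for two speakers' spectral envelopes.
src = np.sin(0.05 * n) + 0.4 * np.sin(0.21 * n) + 0.01 * rng.normal(size=400)
tgt = np.sin(0.05 * n) + 0.4 * np.sin(0.34 * n) + 0.01 * rng.normal(size=400)

A_src, A_tgt = lpc(src), lpc(tgt)
residual = lfilter(A_src, [1.0], src)             # whiten src with its envelope
reconstructed = lfilter([1.0], A_src, residual)   # exact round trip through 1/A
converted = lfilter([1.0], A_tgt, residual)       # residual via target envelope

print(np.allclose(reconstructed, src))            # True: analysis is invertible
```

Filtering the source residual through the target's all-pole filter is only the spectral-envelope half of the conversion; the paper additionally predicts the residual as a function of the target envelope parameters, which this sketch omits.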
TL;DR: The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals, and excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.
Abstract: Voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. This signal can be reconstructed by interpolation from a downsampled sequence of pitch-cycle waveforms with a rate of one prototype waveform per 20-30 ms interval. The prototype waveform is described by a set of linear-prediction (LP) filter coefficients describing the formant structure and a prototype excitation waveform, quantized with analysis-by-synthesis procedures. The speech signal is reconstructed by filtering an excitation signal consisting of the concatenation of (infinitesimal) sections of the instantaneous excitation waveforms. To obtain the correct level of periodicity, the short-term and the long-term correlations between the instantaneous excitation waveforms can be controlled explicitly. Thus, distortions such as noise, reverberation, and buzziness can be prevented. The coding method is easily combined with existing LP-based speech coders, such as CELP, for unvoiced signals. Excellent voiced speech quality is obtained at rates between 3.0 and 4.0 kb/s.
133 citations
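The decoding idea above, reconstructing voiced speech by interpolating between downsampled prototype pitch cycles, can be sketched as follows. The cycle length and prototype waveforms are hypothetical illustrations, not the paper's coder.

```python
import numpy as np

P = 80                            # samples per pitch cycle (hypothetical)
t = np.arange(P) / P
proto_a = np.sin(2 * np.pi * t)                                # prototype, frame k
proto_b = np.sin(2 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)  # prototype, frame k+1

def decode(proto_a, proto_b, n_cycles):
    """Reconstruct n_cycles pitch cycles by crossfading the two prototypes."""
    out = []
    for i in range(n_cycles):
        w = i / (n_cycles - 1)    # interpolation weight, 0 → 1 across the frame
        out.append((1 - w) * proto_a + w * proto_b)
    return np.concatenate(out)

signal = decode(proto_a, proto_b, n_cycles=6)
print(signal.shape)               # → (480,)
```

The first reconstructed cycle equals the stored prototype at frame k and the last equals the prototype at frame k+1, so only one prototype per 20-30 ms frame needs to be transmitted; the paper's explicit control of short- and long-term correlation between cycles is omitted here.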