Topic

Linear predictive coding

About: Linear predictive coding (LPC) is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations.

Papers

Proceedings Article

Speech scrambling based on chaotic maps and one time pad

Samah M. H. Alwahbani¹, Eihab Bashier Mohammed Bashier¹

University of Khartoum¹

17 Oct 2013

TL;DR: Experimental results (signal-to-noise ratio, key space, key sensitivity tests, statistical analysis, chosen/known-plaintext attacks and time analysis) show that the proposed speech scrambling method performs efficiently and can be applied to secure real-time speech communications.

Abstract: This paper presents a chaos-based speech scrambling system. Chaotic maps have been successfully used for large-scale data encryption such as image, audio and video data, due to their good properties such as pseudo-randomness, sensitivity to changes in initial conditions and system parameters and aperiodicity. This paper uses two chaotic maps, circle map and logistic map for speech confusion and diffusion, respectively. In the confusion stage, speech samples are divided into small segments. Then, indices of ordered generated sequence of circle map are used to shuffle the positions of the speech signal segments. Then, a one-time pad generated by the logistic map is used for the diffusion stage. Several experimental results, signal-to-noise ratio, key space, key sensitivity tests, statistical analysis, chosen/known plaintext attack and time analysis show that the proposed method for speech scrambling performs efficiently and can be applied for secure real time speech communications.

28 citations
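
The confusion and diffusion stages described in the abstract above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes 8-bit unsigned samples, a signal length that is a multiple of the segment length, the textbook forms of the circle map and logistic map, XOR as the pad operation, and pad bytes taken as floor(256 * x). The function names and the key layout (theta0, omega, k, x0, r) are hypothetical.

```python
import math

def circle_map(theta0, omega, k, n):
    """Return n iterates of the standard circle map, each in [0, 1)."""
    seq, theta = [], theta0
    for _ in range(n):
        theta = (theta + omega
                 - k / (2 * math.pi) * math.sin(2 * math.pi * theta)) % 1.0
        seq.append(theta)
    return seq

def logistic_pad(x0, r, n):
    """Return n pad bytes (0..255) derived from logistic-map iterates."""
    pad, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        pad.append(int(x * 256) % 256)  # assumed quantisation rule
    return pad

def segment_order(n_seg, key):
    """Indices that sort the circle-map sequence; used as the permutation."""
    theta0, omega, k, _, _ = key
    chaos = circle_map(theta0, omega, k, n_seg)
    return sorted(range(n_seg), key=lambda i: chaos[i])

def scramble(samples, seg_len, key):
    """Confusion (segment shuffle) followed by diffusion (XOR with pad)."""
    assert len(samples) % seg_len == 0, "length must be a multiple of seg_len"
    order = segment_order(len(samples) // seg_len, key)
    shuffled = []
    for src in order:
        shuffled.extend(samples[src * seg_len:(src + 1) * seg_len])
    pad = logistic_pad(key[3], key[4], len(samples))
    return [s ^ p for s, p in zip(shuffled, pad)]

def unscramble(cipher, seg_len, key):
    """Undo the diffusion first, then invert the segment permutation."""
    assert len(cipher) % seg_len == 0, "length must be a multiple of seg_len"
    pad = logistic_pad(key[3], key[4], len(cipher))
    shuffled = [c ^ p for c, p in zip(cipher, pad)]
    order = segment_order(len(cipher) // seg_len, key)
    restored = [0] * len(cipher)
    for dst, src in enumerate(order):
        restored[src * seg_len:(src + 1) * seg_len] = \
            shuffled[dst * seg_len:(dst + 1) * seg_len]
    return restored
```

Calling `unscramble(scramble(samples, 4, key), 4, key)` with the same key returns the original samples, because both sides regenerate the same chaotic sequences from the key.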

Proceedings Article

A hybrid barge-in procedure for more reliable turn-taking in human-machine dialog systems

R.C. Rose¹, Hong Kook Kim

AT&T Labs¹

30 Nov 2003

TL;DR: A hybrid procedure for barge-in detection is proposed and evaluated that combines a feature-based voice activity detection (VAD) algorithm with a model-based approach for verifying hypothesized speech segments and is found to obtain better detection performance than procedures that rely on the speech recognition decoder to detect speech.

Abstract: This paper investigates techniques designed to allow the users of human-machine dialog systems to interrupt or barge-in over machine generated speech messages. An experimental study was performed on utterances collected from a telephone based dialog system to analyze the effect of barge-in performance on users' speech. One result of this study was that excessive barge-in latencies resulted in disfluencies appearing in over half of users' utterances. A hybrid procedure for barge-in detection is proposed and evaluated on the utterances collected from the same domain. The procedure combines a feature-based voice activity detection (VAD) algorithm with a model-based approach for verifying hypothesized speech segments. The procedure is shown in the paper to obtain better detection performance than procedures that rely on the speech recognition decoder to detect speech. It is also found to have latencies that are comparable to those obtained by low delay feature-based speech detection algorithms.

28 citations
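
The two-stage structure described in the abstract above (a feature-based detector that proposes speech segments, followed by a model-based verifier) might be sketched as follows. The abstract does not name the features or the verification model, so everything here is an assumption: short-time energy stands in for the feature-based VAD, and `verify` is a hypothetical callable standing in for the model-based check.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def hypothesize_segments(energies, threshold, min_frames):
    """Return (start, end) frame ranges, end exclusive, where the energy
    stays above threshold for at least min_frames consecutive frames."""
    segments, start = [], None
    # A trailing -inf sentinel closes a run that reaches the last frame.
    for i, e in enumerate(energies + [float("-inf")]):
        if e > threshold and start is None:
            start = i
        elif e <= threshold and start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None
    return segments

def detect_barge_in(samples, frame_len, threshold, min_frames, verify):
    """Return the sample index of the first verified speech segment,
    or None if no hypothesized segment passes verification."""
    energies = frame_energies(samples, frame_len)
    for start, end in hypothesize_segments(energies, threshold, min_frames):
        segment = samples[start * frame_len:end * frame_len]
        if verify(segment):  # hypothetical model-based verifier
            return start * frame_len
    return None
```

In this sketch the energy stage is cheap and can run frame by frame, while the verifier only runs on the few segments that the first stage proposes.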

Journal Article

Formant estimation and tracking: A deep learning approach.

Yehoshua Dissen¹, Jacob Goldberger¹, Joseph Keshet¹

Bar-Ilan University¹

04 Feb 2019-Journal of the Acoustical Society of America

TL;DR: A network architecture is proposed, which allows model adaptation to different formant frequency ranges that were not seen at training time and which compares favorably with alternative methods for formant estimation and tracking.

Abstract: Formant frequency estimation and tracking are among the most fundamental problems in speech processing. In the estimation task, the input is a stationary speech segment such as the middle part of a vowel, and the goal is to estimate the formant frequencies, whereas in the task of tracking the input is a series of speech frames, and the goal is to track the trajectory of the formant frequencies throughout the signal. The use of supervised machine learning techniques trained on an annotated corpus of read-speech for these tasks is proposed. Two deep network architectures were evaluated for estimation: feed-forward multilayer-perceptrons and convolutional neural-networks and, correspondingly, two architectures for tracking: recurrent and convolutional recurrent networks. The inputs to the former are composed of linear predictive coding–based cepstral coefficients with a range of model orders and pitch-synchronous cepstral coefficients, where the inputs to the latter are raw spectrograms. The performance of the methods compares favorably with alternative methods for formant estimation and tracking. A network architecture is further proposed, which allows model adaptation to different formant frequency ranges that were not seen at training time. The adapted networks were evaluated on three datasets, and their performance was further improved.

28 citations
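
The abstract above mentions linear predictive coding-based cepstral coefficients computed for a range of model orders. The sketch below shows the textbook route to such features: the autocorrelation method with the Levinson-Durbin recursion, followed by the standard LPC-to-cepstrum recursion. It is not taken from the paper; the function names are hypothetical, the predictor convention is assumed to be x̂[n] = Σ a_k x[n-k], and windowing, pre-emphasis and the gain term c_0 are left out.

```python
def autocorrelation(frame, order):
    """Autocorrelation r[0..order] of one analysis frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations; assumes r[0] > 0.

    Returns ([a_1, ..., a_order], final prediction error energy).
    """
    a = [0.0] * (order + 1)  # a[0] is unused so indices match a_k
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for order i
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole model
    1 / (1 - sum_k a_k z^-k), via the standard recursion."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

To build features for several model orders, one could call `levinson_durbin(r, p)` for each order `p` of interest on the same autocorrelation sequence and concatenate the resulting cepstral vectors.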

Proceedings Article

A high quality 9.6 kbps speech coding system

D. Griffin¹, Jae Lim

Massachusetts Institute of Technology¹

01 Apr 1986

TL;DR: Preliminary results indicate that high quality reproduction can be obtained with this speech coding system for both clean and noisy speech without the "buzziness" and severe degradation in noise typically associated with vocoder speech.

Abstract: A 9.6 kbps speech coding system based on a new speech model is presented. In this model, the short-time spectrum of speech is modeled as the product of an excitation spectrum and a spectral envelope. The spectral envelope is some smoothed version of the speech spectrum and the excitation spectrum is represented by a fundamental frequency, a voiced/unvoiced (V/UV) decision for each harmonic of the fundamental, and the phase of each harmonic declared voiced. In speech analysis, the model parameters are estimated by explicit comparison between the original speech spectrum and the synthetic speech spectrum. Preliminary results indicate that high quality reproduction can be obtained with this speech coding system for both clean and noisy speech without the "buzziness" and severe degradation in noise typically associated with vocoder speech.

28 citations

Journal Article

Spectral Domain Speech Enhancement Using HMM State-Dependent Super-Gaussian Priors

Nasser Mohammadiha, Rainer Martin¹, Arne Leijon

Ruhr University Bochum¹

24 Jan 2013-IEEE Signal Processing Letters

TL;DR: A spectral domain speech enhancement algorithm is developed, and hidden Markov model (HMM) based MMSE estimators for speech periodogram coefficients are derived under this gamma assumption in both a high uniform resolution and a reduced-resolution Mel domain.

Abstract: The derivation of MMSE estimators for the DFT coefficients of speech signals, given an observed noisy signal and super-Gaussian prior distributions, has received a lot of interest recently. In this letter, we look at the distribution of the periodogram coefficients of different phonemes, and show that they have a gamma distribution with shape parameters less than one. This verifies that the DFT coefficients for not only the whole speech signal but also for individual phonemes have super-Gaussian distributions. We develop a spectral domain speech enhancement algorithm, and derive hidden Markov model (HMM) based MMSE estimators for speech periodogram coefficients under this gamma assumption in both a high uniform resolution and a reduced-resolution Mel domain. The simulations show that the performance is improved using a gamma distribution compared to the exponential case. Moreover, we show that, even though beneficial in some aspects, the Mel-domain processing does not lead to better results than the algorithms in the high-resolution domain.

28 citations
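
The abstract above contrasts a gamma model with shape parameter below one against the exponential case, which is the gamma distribution with shape exactly one. A quick way to check which side a set of periodogram values falls on is a method-of-moments fit, sketched below. How the authors estimated the shape parameters is not stated in the abstract, so this estimator and the function names are assumptions for illustration only.

```python
def gamma_moments_fit(values):
    """Method-of-moments gamma fit: returns (shape, scale).

    Uses shape = mean^2 / variance and scale = variance / mean,
    with the population variance.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if mean <= 0 or var <= 0:
        raise ValueError("need positive mean and non-zero variance")
    return mean * mean / var, var / mean

def heavier_than_exponential(values):
    """True when the fitted shape is below 1, the exponential case."""
    shape, _ = gamma_moments_fit(values)
    return shape < 1.0
```

A fitted shape below one means the variance exceeds the squared mean, so the values are more spread out than an exponential model with the same mean would predict.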

Metrics

Papers: 6,598

Citations: 148,119

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	25
2021	26
2020	42
2019	25
2018	37
