Showing papers on "Linear predictive coding" published in 1992

Patent•DOI•

Jacobs Paul E¹, Gardner William R¹, Lee Chong U¹, Gilhousen Klein S¹, Lam S Katherine¹, Ming-Chang Tsai¹•Institutions (1)

Qualcomm¹

03 Jun 1992-Journal of the Acoustical Society of America

TL;DR: A method for variable rate coding of frames of digitized speech samples is proposed, comprising the steps of determining a level of speech activity for a frame, selecting an encoding rate from a set of rates based on that level, and coding the frame according to a predetermined coding format for the selected rate, where each rate has its own coding format.

Abstract: A method of speech signal compression, by variable rate coding of frames of digitized speech samples, comprising the steps of: determining a level of speech activity for a frame of digitized speech samples; selecting an encoding rate from a set of rates based upon said determined level of speech activity within said frame; coding said frame according to a predetermined coding format for said selected rate wherein each rate has a corresponding different coding format; providing for said frame a corresponding output data packet at said selected rate.
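
The steps in this abstract (measure activity, pick a rate, code the frame at that rate) can be illustrated with a minimal sketch. The rate names, the dB thresholds, and the use of frame energy as the activity measure are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

# Hypothetical rate names and activity thresholds (dB re full scale);
# none of these values come from the patent abstract.
RATES = ("eighth", "quarter", "half", "full")
THRESHOLDS_DB = (-50.0, -40.0, -30.0)


def frame_activity_db(frame):
    """Mean-square energy of one frame in dB, used here as the activity level."""
    mean_sq = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_sq + 1e-12)  # small offset avoids log(0)


def select_rate(frame):
    """Count how many thresholds the activity level exceeds and map that to a rate."""
    level = frame_activity_db(frame)
    return RATES[sum(level > t for t in THRESHOLDS_DB)]


def encode_frame(frame):
    """Return a stand-in packet; a real coder would apply a rate-specific format."""
    rate = select_rate(frame)
    return {"rate": rate, "num_samples": len(frame)}
```

In this sketch a silent frame maps to the lowest rate and a loud frame to the highest; the payload coding for each rate is deliberately left out.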

552 citations

Journal Article•DOI•

Statistical-model-based speech enhancement systems

Yariv Ephraim¹•Institutions (1)

Bell Labs¹

01 Oct 1992

TL;DR: A unified statistical approach for the three basic problems of speech enhancement is developed, using composite source models for the signal and noise and a fairly large set of distortion measures.

Abstract: Since the statistics of the speech signal as well as of the noise are not explicitly available, and the most perceptually meaningful distortion measure is not known, model-based approaches have recently been extensively studied and applied to the three basic problems of speech enhancement: signal estimation from a given sample function of noisy speech, signal coding when only noisy speech is available, and recognition of noisy speech signals in man-machine communication. Research on the model-based approach is integrated and put into perspective with other more traditional approaches for speech enhancement. A unified statistical approach for the three basic problems of speech enhancement is developed, using composite source models for the signal and noise and a fairly large set of distortion measures.

383 citations

Proceedings Article•DOI•

RASTA-PLP speech analysis technique

Hynek Hermansky, Nelson Morgan¹, Aruna Bayya, Phil Kohn¹•Institutions (1)

International Computer Science Institute¹

23 Mar 1992

TL;DR: The authors have developed a speech analysis technique that is more robust to steady-state spectral factors, such as the frequency response of the communication channel, and that is conceptually simple and computationally efficient.

Abstract: Most speech parameter estimation techniques are easily influenced by the frequency response of the communication channel. The authors have developed a technique that is more robust to such steady-state spectral factors in speech. The approach is conceptually simple and computationally efficient. The new method is described, and experimental results are reported that show significant advantages for the proposed method.

297 citations

Journal Article•DOI•

Shape invariant time-scale and pitch modification of speech

Thomas F. Quatieri¹, R.J. McAulay¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Mar 1992-IEEE Transactions on Signal Processing

TL;DR: A time-scale modification system that preserves the shape-invariance property of the speech waveform during voicing is developed using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation.

Abstract: The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. A time-scale modification system that preserves this shape-invariance property during voicing is developed. This is done using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation. An important property of the system is its ability to perform time-varying rates of change. Extensions of the method are applied to fixed and time-varying pitch modification of speech. The sine-wave analysis-synthesis system also allows for shape-invariant joint time-scale and pitch modification, and allows for the adjustment of the time scale and pitch according to speech characteristics such as the degree of voicing.

245 citations

Journal Article•DOI•

A low-delay CELP coder for the CCITT 16 kb/s speech coding standard

J.-H. Chen¹, Richard V. Cox¹, Y.-C. Lin, Nuggehally Sampath Jayant², M.J. Melchner²•Institutions (2)

Bell Labs¹, AT&T²

01 Jun 1992-IEEE Journal on Selected Areas in Communications

TL;DR: The official CCITT laboratory tests revealed that the speech quality of this 16 kb/s LD-CELP coder is either equivalent to or better than that of the CCITT G.721 standard 32-kb/s ADPCM coder for almost all conditions tested.

Abstract: A low-delay code-excited linear prediction (LD-CELP) speech coder which is expected to be standardized in 1992 as a CCITT G Series Recommendation for universal applications of speech coding at 16 kb/s is presented. The coder achieves a one-way coding delay of less than 2 ms by making both the LPC predictor and the excitation gain backward-adaptive and by using a small excitation vector size of five samples. The official CCITT laboratory tests revealed that the speech quality of this 16 kb/s LD-CELP coder is either equivalent to or better than that of the CCITT G.721 standard 32-kb/s ADPCM coder for almost all conditions tested. A description of the LD-CELP algorithm, its implementation on the DSP32C for CCITT testing, and performance results from these tests are presented.

206 citations

Patent•DOI•

Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models

Lalit R. Bahl¹, Peter V. De Souza¹, Ponani S. Gopalakrishnan¹, Michael Picheny¹•Institutions (1)

IBM¹

10 Sep 1992-Journal of the Acoustical Society of America

TL;DR: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal.

Abstract: A speech coding apparatus compares the closeness of the feature value of a feature vector signal of an utterance to the parameter values of prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The speech coding apparatus stores a plurality of speech transition models representing speech transitions. At least one speech transition is represented by a plurality of different models. Each speech transition model has a plurality of model outputs, each comprising a prototype match score for a prototype vector signal. Each model output has an output probability. A model match score for a first feature vector signal and each speech transition model comprises the output probability for at least one prototype match score for the first feature vector signal and a prototype vector signal. A speech transition match score for the first feature vector signal and each speech transition comprises the best model match score for the first feature vector signal and all speech transition models representing the speech transition. The identification value of each speech transition and the speech transition match score for the first feature vector signal and each speech transition are output as a coded utterance representation signal of the first feature vector signal.

176 citations

Proceedings Article•DOI•

An unsupervised, sequential learning algorithm for the segmentation of speech waveforms with multiple speakers

M.-H. Siu, G. Yu, H. Gish

23 Mar 1992

TL;DR: The authors present a method for segmenting speech waveforms containing several speakers into utterances, each from one individual, and then identifying each utterance as coming from a specific individual or group of individuals.

Abstract: The authors present a method for segmenting speech waveforms containing several speakers into utterances, each from one individual, and then identifying each utterance as coming from a specific individual or group of individuals. The procedure is unsupervised in that there is no training set, and sequential in that information obtained in early stages of the process is utilized in later stages.

77 citations

Proceedings Article•DOI•

Generalized analysis-by-synthesis coding and its application to pitch prediction

Willem Bastiaan Kleijn¹, Ravi P. Ramachandran¹, P. Kroon¹•Institutions (1)

Bell Labs¹

23 Mar 1992

TL;DR: The authors discuss the application of generalized analysis-by-synthesis coding to the pitch predictor of a code excited linear predictor (CELP) coder, which makes it possible to transmit the pitch prediction parameters at a much lower rate than conventional approaches, without compromising speech quality.

Abstract: Many modifications can be applied to a speech signal without changing its perceptual quality. For a particular speech coder, the coding efficiency will differ for distinct modifications. To exploit this, the authors introduced a generalized analysis-by-synthesis procedure. In this procedure, a search is performed over a multitude of modified original signals (on a blockwise basis), and the signal which can be encoded with the least distortion is selected for transmission. At the receiver, a quantized version of this modified original signal is constructed. The authors discuss the application of generalized analysis-by-synthesis coding to the pitch predictor of a code excited linear predictor (CELP) coder. The use of this technique makes it possible to transmit the pitch predictor parameters at a much lower rate than conventional approaches, without compromising speech quality.

56 citations

Journal Article•DOI•

Detection of laryngeal function using speech and electroglottographic data

Donald G. Childers¹, K.S. Bae¹•Institutions (1)

University of Florida¹

01 Jan 1992-IEEE Transactions on Biomedical Engineering

TL;DR: Two procedures for the detection of laryngeal pathology were developed: a spectral distortion measure using pitch synchronous and asynchronous methods with linear predictive coding vectors and vector quantization, and analysis of the EGG signal using time interval and amplitude difference measures.

Abstract: The purpose of this research was to develop quantitative measures for the assessment of laryngeal function using speech and electroglottographic (EGG) data. Two procedures for the detection of laryngeal pathology were developed: (1) a spectral distortion measure using pitch synchronous and asynchronous methods with linear predictive coding (LPC) vectors and vector quantization (VQ), and (2) analysis of the EGG signal using time interval and amplitude difference measures. The VQ procedure was conjectured to offer the possibility of circumventing the need to estimate the glottal volume velocity waveform by inverse filtering techniques. The EGG procedure was to evaluate data that was 'nearly' a direct measure of vocal fold vibratory motion and thus was conjectured to offer the potential for providing an excellent assessment of laryngeal function. A threshold-based procedure gave 75.9% and 69.0% probability of pathological detection using procedures (1) and (2), respectively, for 29 patients with pathological voices and 52 normal subjects. The false alarm probability was 9.6% for the normal subjects.

50 citations

Journal Article•DOI•

On the use of line spectral frequency parameters for speech recognition

Kuldip K. Paliwal¹•Institutions (1)

Tata Institute of Fundamental Research¹

01 Apr 1992-Digital Signal Processing

TL;DR: The aim of the present paper is to extend the use of the LSF representation for more general speech recognition systems and to widen the scope of its results.

49 citations

Patent•DOI•

Speech information extractor

Robert H. Mceachern

30 Sep 1992-Journal of the Acoustical Society of America

TL;DR: A speech signal is received into a bank of bandpass filters, the instantaneous amplitude modulation and frequency modulation of each harmonic in the speech waveform are determined, and a logarithm of the speech fundamental frequency is obtained, for example, by computing a weighted average of the frequency modulations of the harmonics.

Abstract: A method and apparatus for extracting information from human speech are disclosed. A speech signal is received into a bank of bandpass filters and the instantaneous amplitude modulation and frequency modulation of each harmonic in the speech waveform is determined. A logarithm of the instantaneous frequency of the speech fundamental frequency is determined, for example, by computing a weighted average of the frequency modulations of the harmonics. An output signal is formed having the logarithm of the frequency of the thus determined speech fundamental and the logarithms of the amplitude modulation for the ten lowest frequency speech harmonics and/or the speech envelope.

Patent•DOI•

Auditory model for parametrization of speech

Hynek Hermansky¹, Nelson Morgan¹, Philip D. Kohn¹•Institutions (1)

International Computer Science Institute¹

11 Aug 1992-Journal of the Acoustical Society of America

TL;DR: In this paper, a method and system are provided for alleviating the harmful effects of convolutional distortions of speech, such as the effect of a telecommunication channel, on the performance of an automatic speech recognizer (ASR).

Abstract: A method and system are provided for alleviating the harmful effects of convolutional distortions of speech, such as the effect of a telecommunication channel, on the performance of an automatic speech recognizer (ASR). The technique is based on the filtering of time trajectories of an auditory-like spectrum derived from the Perceptual Linear Predictive (PLP) method of speech parameter estimation.

Proceedings Article•DOI•

Improved phonetically-segmented vector excitation coding at 3.4 kb/s

S. Wang¹, Allen Gersho¹•Institutions (1)

University of California, Santa Barbara¹

23 Mar 1992

TL;DR: The improved PS-VXC coder has a subjective performance closely matching that of the 4.8 kb/s DoD CELP coder.

Abstract: Several major modifications to the phonetically segmented vector excitation coding (PS-VXC) coder previously reported by the authors (1989, 1990) have resulted in enhanced speech quality while reducing the delay, complexity, and bit rate. Speech is segmented into variable-length phonetic classes and a VXC coding module is tailored to each class. Coding techniques include adaptive linear predictive coding (LPC) analysis and interpolation, two-stage excitation coding of onsets, comb filtering, modified perceptual weighting, and pitch contour smoothing. The improved PS-VXC coder operates at a peak rate of 3.4 kb/s with an average rate of 3.0 kb/s and has a subjective performance closely matching that of the 4.8 kb/s DoD CELP coder.

Proceedings Article•DOI•

Tree searched multi-stage vector quantization of LPC parameters for 4 kb/s speech coding

Bhaskar Bhattacharya¹, W.P. LeBlanc², S.A. Mahmoud², V. Cuperman¹•Institutions (2)

Simon Fraser University¹, Carleton University²

23 Mar 1992

TL;DR: The authors present a tree searched multi-stage vector quantization scheme which achieves spectral distortion lower than 1 dB with low complexity and good robustness using 24 b/frame and it is shown that TS-MSVQ significantly outperforms the split-codebook approach.

Abstract: The authors present a tree searched multi-stage vector quantization (TS-MSVQ) scheme which achieves spectral distortion lower than 1 dB with low complexity and good robustness using 24 b/frame. The M-L search is used and it is shown that it achieves performance close to that of the optimal search for a relatively small M. The best performance/complexity trade-offs are obtained with relatively small size codebooks cascaded in a three-four stage configuration. Results for log-area ratio (LAR) and line spectral pair (LSP) parameters are presented. A training technique which reduces outliers at the expense of a slight average performance degradation is introduced. The robustness across different languages and input spectral shapings is studied. Finally, it is shown that TS-MSVQ significantly outperforms the split-codebook approach.
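
To make the idea of an M-L search over cascaded codebooks concrete, here is a minimal sketch of a generic multi-stage VQ encoder that keeps the M best partial paths after each stage. The function names, the squared-error distortion, and the toy codebooks in the usage note are assumptions for illustration, not the authors' implementation or training procedure.

```python
def msvq_search(target, codebooks, m=2):
    """Multi-stage VQ with an M-best tree search.

    Each stage quantizes the residual left by the previous stages.
    Returns (indices, distortion) of the best surviving path.
    """
    # A path is (squared error so far, chosen indices, remaining residual).
    paths = [(sum(x * x for x in target), (), list(target))]
    for codebook in codebooks:
        candidates = []
        for _, indices, residual in paths:
            for j, codeword in enumerate(codebook):
                new_residual = [r - c for r, c in zip(residual, codeword)]
                dist = sum(r * r for r in new_residual)
                candidates.append((dist, indices + (j,), new_residual))
        # Keep only the m lowest-distortion partial paths.
        candidates.sort(key=lambda path: path[0])
        paths = candidates[:m]
    best_dist, best_indices, _ = paths[0]
    return best_indices, best_dist
```

With the toy one-dimensional codebooks `[[[0.0], [1.0]], [[0.6], [-0.1]]]` and target `[0.6]`, a greedy search (`m=1`) commits to the first-stage codeword `1.0` and ends with a distortion of about 0.09, whereas `m=2` keeps the other branch alive and finds the exact match through indices `(0, 0)`.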

Journal Article•DOI•

A new efficient algorithm to compute the LSP parameters for speech coding

Samir Saoudi, Jean-Marc Boucher, Alain Le Guyader¹•Institutions (1)

CNET¹

01 Aug 1992-Signal Processing

TL;DR: Two new real functions defined from the reciprocal and antireciprocal parts of the predictor polynomials obtained from the split Levinson algorithm are proposed and shown to obey three-term recurrence relations.

Journal Article•DOI•

Modeling and classification of natural sounds by product code hidden Markov models

J.P. Woodard

01 Jul 1992-IEEE Transactions on Signal Processing

TL;DR: A new structure called the product code HMM uses two independent HMMs per class, one for spectral shape and one for gain, which outperformed the conventional structure with an accuracy of over 96% for three classes.

Abstract: Linear predictive coding (LPC), vector quantization (VQ), and hidden Markov models (HMMs) are three popular techniques from speech recognition which are applied in modeling and classifying nonspeech natural sounds. A new structure called the product code HMM uses two independent HMMs per class, one for spectral shape and one for gain. Classification decisions are made by scoring shape and gain index sequences from a product code VQ. In a series of classification experiments, the product code structure outperformed the conventional structure, with an accuracy of over 96% for three classes.
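
A product code VQ emits separate shape and gain indices for each frame. As a rough illustration, the sketch below implements the textbook shape-gain form on plain vectors: choose the unit-norm shape codeword with the largest inner product, then the gain codeword closest to that inner product. The paper applies the idea to LPC spectral shape and gain, so this helper and its codebooks are assumptions and should not be read as the author's feature pipeline.

```python
def shape_gain_encode(x, shapes, gains):
    """Return (shape_index, gain_index) for vector x.

    `shapes` is assumed to hold unit-norm codewords and `gains`
    non-negative scalars; both codebooks are hypothetical.
    """
    # Correlation of x with every shape codeword.
    dots = [sum(a * b for a, b in zip(x, s)) for s in shapes]
    shape_idx = max(range(len(shapes)), key=lambda i: dots[i])
    # The best gain for the chosen shape is the one nearest the correlation.
    gain_idx = min(range(len(gains)), key=lambda j: (gains[j] - dots[shape_idx]) ** 2)
    return shape_idx, gain_idx
```

Applied frame by frame, this yields two index sequences, which is the kind of input a pair of per-class shape and gain HMMs could score.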

Proceedings Article•DOI•

Cinematic techniques for speech processing: temporal decomposition and multivariate linear prediction

C. Montacie, P. Deleglise, Frédéric Bimbot, M.-J. Caraty

23 Mar 1992

TL;DR: Using the original method developed by Laforia, a series of text-independent speaker recognition experiments, characterized by a long-term multivariate auto-regressive modelization, gives first-rate results without using more than one sentence.

Abstract: Two models, the temporal decomposition and the multivariate linear prediction, of the spectral evolution of speech signals capable of processing some aspects of the speech variability are presented. A series of acoustic-phonetic decoding experiments, characterized by the use of spectral targets of the temporal decomposition techniques and a speaker-dependent mode, gives good results compared to a reference system (i.e., 70% vs. 60% for the first choice). Using the original method developed by Laforia, a series of text-independent speaker recognition experiments, characterized by a long-term multivariate auto-regressive modelization, gives first-rate results (i.e., 98.4% recognition rate for 420 speakers) without using more than one sentence. Taking into account the interpretation of the models, these results show how interesting the cinematic models are for obtaining a reduced variability of the speech signal representation.

Proceedings Article•DOI•

Kalman filtering techniques in speech coding

S. Crisafulli¹, J.D. Mills², Robert R. Bitmead¹•Institutions (2)

Australian National University¹, Tellabs²

23 Mar 1992

TL;DR: Simulation results reveal that KF based speech coding has significant advantage over the equivalent LP based systems, particularly when used with coarsely quantized measurements.

Abstract: The use of Kalman filtering (KF) techniques in speech coding is investigated. The authors show that the common linear predictor (LP) is a special case of the KF based on an all-pole signal model. They also show that the KF algorithm provides fixed-lag smoothing at no additional complexity. Simulation results reveal that KF based speech coding has significant advantage over the equivalent LP based systems, particularly when used with coarsely quantized measurements.

Patent•DOI•

Speech recognition LSI system including recording/reproduction device

Motoaki Koyama¹•Institutions (1)

Toshiba¹

30 Sep 1992-Journal of the Acoustical Society of America

TL;DR: A speech recognition LSI system comprises a speech segment detector, a reference pattern memory for storing reference patterns, and a speech recognition section that compares the detected speech segment with the stored reference patterns and selects the most similar reference pattern.

Abstract: A speech recognition LSI system comprises a speech segment detector for detecting a speech segment from a speech signal, a reference pattern memory for storing reference patterns, and a speech recognition section for comparing the speech segment detected by the detector with the reference patterns stored in the reference pattern memory and selecting the reference pattern most similar to that of the speech segment. The system further comprises a recording/reproduction device for recording the speech signal and for reproducing only the speech segment the speech segment detector has detected, so that an operator can hear the speech segment.

Proceedings Article•DOI•

Low bit-rate quantization of LSP parameters using two-dimensional differential coding

Chih-Chung Kuo¹, Fu-Rong Jean¹, Hsiao-Chuan Wang¹•Institutions (1)

National Tsing Hua University¹

23 Mar 1992

TL;DR: A novel spectral coding method, two-dimensional differential line spectrum pair coding (2DdLSP), is proposed; it takes advantage of the strong inter-frame and intra-frame correlation of LSP parameters to reduce the variance of the parameters to be quantized.

Abstract: A novel spectral coding method, two-dimensional differential line spectrum pair coding (2DdLSP), is proposed. Taking advantage of the strong inter-frame and intra-frame correlation of LSP parameters, a two-dimensional linear prediction technique is used to reduce the variance of the parameters to be quantized. One scalar quantization and two vector quantization schemes are designed to quantize the 2-D prediction residuals. Without further buffering delay, a spectral distortion of 1 dB² can be achieved at 19 b/frame when the frame period is 10 ms. Both within- and out-of-training tests show the robustness of the method to speech data variance.
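
One plausible reading of "two-dimensional linear prediction" is that each parameter is predicted from the same parameter in the previous frame (inter-frame) and from the neighbouring parameter in the current frame (intra-frame), and only the residual is quantized. The sketch below follows that reading in a closed loop. The predictor weights `a` and `b`, the uniform scalar quantizer, and the step size are hypothetical stand-ins, not the values or quantizers designed in the paper.

```python
def quantize(value, step):
    """Uniform scalar quantizer (stand-in for the paper's SQ/VQ schemes)."""
    return step * round(value / step)


def predict(prev_frame, cur_partial, i, a, b):
    """Predict parameter i from the previous frame and from parameter i-1."""
    below = cur_partial[i - 1] if i > 0 else 0.0
    return a * prev_frame[i] + b * below


def encode(frames, a=0.5, b=0.5, step=0.01):
    """Return quantized 2-D prediction residuals, one list per frame."""
    prev = [0.0] * len(frames[0])
    residuals = []
    for frame in frames:
        cur, res = [], []
        for i, x in enumerate(frame):
            pred = predict(prev, cur, i, a, b)
            q = quantize(x - pred, step)
            res.append(q)
            cur.append(pred + q)  # track what the decoder will reconstruct
        residuals.append(res)
        prev = cur
    return residuals


def decode(residuals, a=0.5, b=0.5):
    """Rebuild the parameter frames from the quantized residuals."""
    prev = [0.0] * len(residuals[0])
    frames = []
    for res in residuals:
        cur = []
        for i, q in enumerate(res):
            cur.append(predict(prev, cur, i, a, b) + q)
        frames.append(cur)
        prev = cur
    return frames
```

Because the encoder predicts from its own reconstructed values, the per-parameter reconstruction error stays within half a quantizer step and does not accumulate across frames.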

Proceedings Article•DOI•

Improving the performance of a mixed excitation LPC vocoder in acoustic noise

Alan V. McCree¹, Thomas P. Barnwell¹•Institutions (1)

Georgia Institute of Technology¹

23 Mar 1992

TL;DR: The improved LPC vocoder performs much better in acoustic background noise, and it produces natural sounding speech in both quiet and noisy environments.

Abstract: A number of improvements to the mixed excitation linear predictive coding (LPC) vocoder are presented. First, the authors have added more sophisticated frequency shaping of the pulse and noise in the mixture. They use a bandpass filter bank to attain a staircase approximation to any desired noise shape. Voicing strength in each frequency band is controlled by periodicity analysis of both the bandpass filtered speech and the bandpass speech envelope. Second, the authors have improved their pitch detection algorithm by using separate searches on the LPC residual and the input speech signal. Finally, they have added a fixed pulse shaping filter based on a spectrally flattened synthetic glottal pulse. The improved LPC vocoder performs much better in acoustic background noise, and it produces natural sounding speech in both quiet and noisy environments.

Proceedings Article•DOI•

Phoneme based speaker verification

M. Savic¹, J. Sorensen¹•Institutions (1)

Rensselaer Polytechnic Institute¹

23 Mar 1992

TL;DR: An approach to text-independent speaker verification is presented that uses a two-stage classifier: a speaker-independent phoneme detector, followed by a stage trained to recognize the target speaker's frames admitted by the detector.

Abstract: Text-independent speaker verification systems typically depend upon averaging over a long utterance to obtain a feature set for classification. However, not all speech is equally suited to the task of speaker verification. An approach to text-independent speaker verification that uses a two-stage classifier is presented. The first stage consists of a speaker-independent phoneme detector trained to recognize a phoneme that is distinctive from speaker to speaker. The second stage is trained to recognize the frames of speech from the target speaker that are admitted by the phoneme detector. A common feature vector based on the linear predictive coding (LPC) cepstrum is projected in different directions for each of these pattern recognition tasks. Results of tests using the described speaker verification system are shown.

Proceedings Article•DOI•

Improvements in 2.4 kbps high-quality speech coding

J. Haagen, H. Nielsen, S.D. Hansen

23 Mar 1992

TL;DR: An algorithm for 2.4 kb/s speech coding is described, which results in a better compromise between bit allocation for short-term quantization and residual coding and an improved high-frequency regeneration.

Abstract: An algorithm for 2.4 kb/s speech coding is described. The main problem addressed is the coding of voiced speech. A way of coding the pitch structure is introduced. Compared with traditional coding schemes, it results in a better compromise between bit allocation for short-term quantization and residual coding. The coder uses vector quantization of the short-term parameters (line spectrum frequencies). The residual is lowpass filtered to obtain the baseband signal. Unvoiced frames are coded by means of a method based on repetition and interpolation of pitch pulses. The method exploits the high correlation between pitch pulses. Harmonic postfiltering is applied to obtain an improved high-frequency regeneration.

Proceedings Article•DOI•

Signal approximation via data-adaptive normalized Gaussian functions and its applications for speech processing

Shie Qian¹, D. Chen¹, K. Chen²•Institutions (2)

National Instruments¹, University of Maryland, Baltimore²

23 Mar 1992

TL;DR: A signal approximation via data-adaptive normalized Gaussian functions is presented, which resembles the traditional Gabor expansion, but it is more precise and efficient.

Abstract: A signal approximation via data-adaptive normalized Gaussian functions is presented. This approach resembles the traditional Gabor expansion, but it is more precise and efficient. Numerical simulations for the speech signal are included to demonstrate the effectiveness of the new scheme.

Proceedings Article•DOI•

Real-time implementation of a 9.6 kbit/s ACELP wideband speech coder

R. Salami¹, Claude Laflamme¹, J.-P. Adoul¹•Institutions (1)

Université de Sherbrooke¹

06 Dec 1992

TL;DR: The real-time implementation of a wideband ACELP speech coder at 9.6 kb/s is presented and the quality of the encoded wideband speech was judged vastly superior to that of the original narrowband speech.

Abstract: The real-time implementation of a wideband ACELP speech coder at 9.6 kb/s is presented. The coder is implemented on a TMS320C30 floating-point DSP chip. The attempt to implement an ACELP coder for wideband speech in real time results in 3-4 times more complexity than that for narrowband speech. Very efficient algorithms for searching the pitch and codebook parameters have been introduced. The pitch search was brought down to 20% of real time by the combination of an efficient open-loop approach and a decimation procedure. The excitation search complexity was significantly reduced by using two codebooks. The first models the main features in the excitation and is very efficiently searched using focused search. The second has a simple structure and does not need exhaustive search. The quality of the encoded wideband speech at 9.6 kb/s was judged vastly superior to that of the original narrowband speech.

Proceedings Article•

Segment based variable frame rate speech analysis and recognition using a spectral variation function.

Giovanni Flammia, Paul Dalsgaard, Ove Kjeld Andersen, Børge Lindberg

01 Jan 1992

Proceedings Article•DOI•

Study of voice packet reconstruction methods applied to CELP speech coding

M. Yong¹•Institutions (1)

Mansfield University of Pennsylvania¹

23 Mar 1992

TL;DR: Four voice packet reconstruction methods used for speech coded by code excited linear prediction (CELP)-type speech coders are described and their performance is discussed.

Abstract: Four voice packet reconstruction methods used for speech coded by code excited linear prediction (CELP)-type speech coders are described. In the first method, the authors generalize the waveform substitution technique originally developed for the PCM coded speech to the CELP speech coding. In the second method, a priority level is assigned to each speech frame to protect against those perceptually important and hard-to-reconstruct speech frames being lost. The third and fourth methods both split the information bits in a frame into two groups of different levels of importance. In method three, the bits for representing the filter parameters are given high priority and bits for representing the excitation signals are given low priority. Method four is an embedded coding technique based on two-stage CELP. The four methods were tested in combination with a simulated voice activity and queuing model and their performance is discussed.

Patent•

Method and apparatus for smoothing pitch-cycle waveforms

Willem Bastiaan Kleijn¹•Institutions (1)

Bell Labs¹

14 Dec 1992

TL;DR: In this article, a method and apparatus for processing a reconstructed speech signal from an analysis-by-synthesis decoder are provided to improve the quality of reconstructed speech by using smoothing techniques.

Abstract: A method and apparatus for processing a reconstructed speech signal from an analysis-by-synthesis decoder are provided to improve the quality of reconstructed speech. By operation of the invention, one or more traces in a reconstructed speech signal are identified. Traces are sequences of like-features in the reconstructed speech signal. The like-features are identified by time-distance data received from the long term predictor of the decoder. The identified traces are smoothed by one of the known smoothing techniques. A smoothed version of the reconstructed speech signal is formed by combining one or more of the smoothed traces. The original reconstructed speech signal may be that provided by a long term predictor of the decoder. Values of the reconstructed speech signal and smoothed speech signal may be combined based on a measure of periodicity in speech.

Book•

Digital speech processing : speech coding, synthesis, and recognition

A. Nejat İnce

01 Jan 1992

TL;DR: The application of Audio/Speech Recognition for Military Requirements and Quality Evaluation of Speech Processing Systems is studied.

Abstract: 1: Overview of Voice Communications and Speech Processing.- 2: The Speech Signal.- 3: Speech Coding.- 4: Voice Interactive Information Systems.- 5: Speech Recognition Based on Pattern Recognition Approaches.- 6: Quality Evaluation of Speech Processing Systems.- 7: Speech Processing Standards.- 8: Application of Audio/Speech Recognition for Military Requirements.- Selective Bibliography with Abstract.

Patent•

Prioritization method and device for speech frames coded by a linear predictive coder

Mei Yong¹•Institutions (1)

Motorola¹

21 Sep 1992

TL;DR: A priority assignment method and device are described for assigning a priority to a selected speech frame coded by a linear predictive coder, based on at least two of: the energy of the speech frame, the log spectral distance between the frame and the immediately preceding frame, and the pitch predictor coefficient for the selected frame.

Abstract: A priority assignment method and device are set forth for assigning a priority to a selected speech frame coded by a linear predictive coder based on at least two of: an energy of the speech frame, a log spectral distance between a frame and a frame immediately previous, and a pitch predictor coefficient for the selected speech frame. The invention protects against loss of perceptually important and hard-to-reconstruct speech frames.