Showing papers on "Linear predictive coding" published in 1981

Journal Article•DOI•

Cepstral analysis technique for automatic speaker verification

01 Apr 1981-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Cepstrum coefficients are extracted by LPC analysis successively throughout a fixed, sentence-long utterance to form time functions, and frequency response distortions introduced by transmission systems are removed.

Abstract: This paper describes new techniques for automatic speaker verification using telephone speech. The operation of the system is based on a set of functions of time obtained from acoustic analysis of a fixed, sentence-long utterance. Cepstrum coefficients are extracted by means of LPC analysis successively throughout an utterance to form time functions, and frequency response distortions introduced by transmission systems are removed. The time functions are expanded by orthogonal polynomial representations and, after a feature selection procedure, brought into time registration with stored reference functions to calculate the overall distance. This is accomplished by a new time warping method using a dynamic programming technique. A decision is made to accept or reject an identity claim, based on the overall distance. Reference functions and decision thresholds are updated for each customer. Several sets of experimental utterances were used for the evaluation of the system, which include male and female utterances recorded over a conventional telephone connection. Male utterances processed by ADPCM and LPC coding systems were used together with unprocessed utterances. Results of the experiment indicate that verification error rate of one percent or less can be obtained even if the reference and test utterances are subjected to different transmission conditions.

1,187 citations
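
The cepstral front end described in this abstract can be illustrated with a short sketch. It is not the paper's implementation: it assumes the predictor polynomial is written A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, uses the standard LPC-to-cepstrum recursion, and assumes that the channel's frequency response distortion is removed by subtracting the utterance-long average of each cepstral coefficient. The function names `lpc_to_cepstrum` and `remove_channel_mean` are hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n of the all-pole model 1/A(z).

    a holds [a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k (assumed convention).
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # sum over k of k * c_k * a_(n-k), restricted to valid coefficient indices
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c


def remove_channel_mean(frames):
    """Subtract the time average of each cepstral coefficient over the utterance.

    A fixed linear channel adds a constant vector to every cepstral frame,
    so removing the average cancels it (assumed normalization).
    """
    n = len(frames)
    mean = [sum(column) / n for column in zip(*frames)]
    return [[value - m for value, m in zip(frame, mean)] for frame in frames]
```

Applying `remove_channel_mean` to the sequence of per-frame outputs of `lpc_to_cepstrum` gives time functions that no longer depend on a constant channel offset.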

Journal Article•DOI•

Rate-distortion speech coding with a minimum discrimination information distortion measure

Robert M. Gray, A. Gray, G. Rebolledo¹, John E. Shore²•Institutions (2)

Stanford University¹, United States Naval Research Laboratory²

01 Nov 1981-IEEE Transactions on Information Theory

TL;DR: An information theory approach to the theory and practice of linear predictive coded speech compression systems is developed and it is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech.

Abstract: An information theory approach to the theory and practice of linear predictive coded (LPC) speech compression systems is developed. It is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech. This distortion measure is used in an algorithm for computer-aided design of block source codes subject to a fidelity criterion to obtain a 750-bits/s speech compression system that resembles an LPC system but has a much lower rate, a larger memory requirement, and requires no on-line LPC analysis. Quantitative and informal subjective comparisons are made among our system and LPC systems.

217 citations
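
The nearest-neighbor view of LPC in this abstract can be sketched as a codebook search. The abstract does not give the formula for the minimum discrimination information measure, so this sketch uses a stand-in: it ranks codewords by the prediction residual energy a^T R a, which is the quantity that likelihood-ratio (gain-normalized Itakura-Saito type) distortions compare across codewords for a fixed frame. All names are hypothetical, and each codeword is assumed to be an inverse-filter coefficient list [1, a_1, ..., a_p].

```python
def coefficient_autocorrelation(a):
    """Autocorrelation of the inverse-filter coefficients a = [1, a_1, ..., a_p]."""
    return [sum(a[i] * a[i + k] for i in range(len(a) - k)) for k in range(len(a))]


def residual_energy(r, a):
    """a^T R a, where R is the Toeplitz matrix built from frame autocorrelation r."""
    ra = coefficient_autocorrelation(a)
    return r[0] * ra[0] + 2.0 * sum(r[k] * ra[k] for k in range(1, len(a)))


def nearest_codeword(r, codebook):
    """Index of the codeword that leaves the smallest residual energy for this frame."""
    return min(range(len(codebook)), key=lambda i: residual_energy(r, codebook[i]))
```

Because the search only needs the frame autocorrelation `r`, no per-frame LPC analysis is required at encoding time in this sketch, which mirrors the "no on-line LPC analysis" property the abstract mentions.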

Journal Article•DOI•

The spectral envelope estimation vocoder

D. Paul¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 1981-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A low bit-rate vocoder designed for improved speech reproduction quality and robustness is described, designed around a new algorithm, the spectral envelope estimator, which forms the nucleus of the spectral analyzer.

Abstract: This paper describes a low bit-rate vocoder designed for improved speech reproduction quality and robustness. The vocoder is designed around a new algorithm, the spectral envelope estimator, which forms the nucleus of the spectral analyzer. In addition to estimating the speech spectrum, the spectral analyzer also allows determination of a continuous estimate of the background noise spectrum, which is used for noise suppression. A maximum-likelihood pitch estimator, which shares the signal processing of the spectral envelope estimator, has been integrated into the vocoder to yield accurate pitch estimates of noisy speech. This system is capable of good quality speech reproduction at bit rates down to 2.4 kbits/s.

130 citations

Speech analysis

A.P. Benguerel¹•Institutions (1)

University of British Columbia¹

01 Jun 1981

108 citations

Proceedings Article•DOI•

A tone oriented voice excited vocoder

Per Hedelin¹•Institutions (1)

Chalmers University of Technology¹

01 Apr 1981

TL;DR: An LPC base-band vocoder is developed in which the base-band is coded as a set of modulated tones; implementation aspects and simulation results are discussed, and experiments have shown the coder to be robust to background noise.

Abstract: An LPC base-band vocoder is developed. The novel feature concerns the coding of the base-band. A model is set up for the base-band as a set of modulated tones. Algorithms are presented for the extraction of amplitude and phase/frequency of the tones. Implementation aspects as well as simulation results are discussed. Total bit rates on the order of 3.2-4.8 kbits/s are possible, where approximately one half of the bits represent the base-band coding. Experiments have shown the coder to be robust to background noise.

67 citations

Journal Article•DOI•

Recursive windowing for generating autocorrelation coefficients for LPC analysis

Thomas P. Barnwell¹•Institutions (1)

Georgia Institute of Technology¹

01 Oct 1981-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Experimental results showed the speech quality to be comparable to and slightly better than that produced by an auto-correlation LPC vocoder using a Hamming window.

Abstract: A method for recursively computing the autocorrelation estimates needed for LPC analysis in a vocoder environment has been developed theoretically and studied experimentally. The method has three specific advantages: 1) it requires very little memory for its implementation; 2) it is realized by a structure consisting of several identical modules; and 3) the effective window length may be changed without varying the structure. Experimental results showed the speech quality to be comparable to and slightly better than that produced by an auto-correlation LPC vocoder using a Hamming window.

66 citations
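
The idea of computing autocorrelation estimates recursively can be sketched with the simplest possible recursive window. This is an assumption for illustration only: the abstract does not specify the window, and the sketch uses a first-order exponential decay, R_k(n) = decay * R_k(n-1) + x(n) x(n-k). The class name and the `decay` parameter are hypothetical. The sketch does show the three properties the abstract lists: memory is limited to a short sample history, each lag is handled by an identical update, and the effective window length changes with `decay` alone.

```python
class RecursiveAutocorrelator:
    """Running autocorrelation estimates for lags 0..order (exponential window)."""

    def __init__(self, order, decay):
        self.decay = decay                      # closer to 1.0 means a longer window
        self.history = [0.0] * (order + 1)      # x(n), x(n-1), ..., x(n-order)
        self.r = [0.0] * (order + 1)            # current estimates R_0..R_order

    def update(self, sample):
        # Shift the new sample into the short history buffer.
        self.history = [sample] + self.history[:-1]
        # One identical update per lag: decay the old value, add the new lag product.
        for k in range(len(self.r)):
            self.r[k] = self.decay * self.r[k] + sample * self.history[k]
        return list(self.r)
```

The returned list can be passed to a Levinson-Durbin solver at whatever frame rate the vocoder needs, without storing a full frame of samples.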

Journal Article•DOI•

A modified autocorrelation method of linear prediction for pitch-synchronous analysis of voiced speech

Kuldip K. Paliwal¹, P.V.S. Rao¹•Institutions (1)

Tata Institute of Fundamental Research¹

01 Apr 1981-Signal Processing

TL;DR: A modified autocorrelation method of linear prediction is proposed for pitch-synchronous analysis of voiced speech that guarantees the stability of the estimated all-pole filter and is shown to perform better than the covariance and autocorrelation methods of linear prediction.

22 citations

Proceedings Article•DOI•

An optimization algorithm for determining the endpoints of isolated utterances

Hermann Ney¹•Institutions (1)

Philips¹

01 Apr 1981

TL;DR: An optimization technique based on dynamic programming is described for locating the initial and final points of utterances, and results are presented for end-point detection in a speaker recognition system using only the speech intensity as the acoustic parameter.

Abstract: This paper describes an optimization technique for locating the initial and final points of utterances. Acoustic parameters extracted from each signal segment are converted into a cost function versus time. An overall cost for the presence of a speech signal is introduced and is to be optimized with respect to the unknown initial and final points. The optimization is carried out by means of dynamic programming. The computation grows linearly with the number of segments. In a second stage, the locations of the obtained endpoints are refined by matching transition templates against the input signal. Results are presented for end-point detection in a speaker recognition system using only the speech intensity as the acoustic parameter.

21 citations
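
The first stage described in this abstract, optimizing an overall cost over unknown start and end points in time linear in the number of segments, can be illustrated with a deliberately simplified sketch. The cost function here is an assumption, not the paper's: each segment scores its intensity minus a fixed threshold, and the endpoints are the pair that maximizes the summed score, found with a one-pass dynamic programming recursion. The function name and the `threshold` parameter are hypothetical, and the second-stage template refinement is not shown.

```python
def find_endpoints(intensity, threshold):
    """Return (start, end) segment indices maximizing sum(intensity[t] - threshold).

    Runs in one pass over the segments; returns None for empty input.
    """
    best_sum, best_span = float("-inf"), None
    run_sum, run_start = 0.0, 0
    for t, level in enumerate(intensity):
        score = level - threshold          # positive when the segment looks like speech
        if run_sum <= 0.0:
            # A non-positive running total cannot help: start a new candidate here.
            run_sum, run_start = score, t
        else:
            run_sum += score
        if run_sum > best_sum:
            best_sum, best_span = run_sum, (run_start, t)
    return best_span
```

Note that a brief dip inside the utterance, such as a stop closure, does not split the detected span as long as the surrounding speech outweighs it.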

Proceedings Article•DOI•

Pitch invariant frequency lowering with nonuniform spectral compression

B. Hicks¹, Louis D. Braida, Nathaniel I. Durlach•Institutions (1)

Massachusetts Institute of Technology¹

01 Apr 1981

TL;DR: A technique for improving speech reception for persons with high-frequency hearing loss, based on pitch-invariant frequency lowering of the short-term spectral envelope, is described, and spectrographic examples illustrating the range of transformations achieved are presented.

Abstract: A technique for improving speech reception for persons with high-frequency hearing loss, based on pitch-invariant frequency lowering of the short-term spectral envelope, is described. The speech signal is segmented pitch synchronously, processed by a technique described by Oppenheim and Johnson (1) to achieve nonuniform spectral warping, dilated in time to achieve frequency lowering, and resynthesized with the original periodicity. Both the overall bandwidth reduction and the relative compression of low and high frequency components can be specified. Spectrographic examples are presented to illustrate the range of transformations achieved and the capability of the system.

20 citations

Patent•DOI•

Digital voice processing system

Bruce Alan Fette¹, Rose M. Gibson¹, Donald P. McCabe¹•Institutions (1)

Motorola¹

08 Oct 1981-Journal of the Acoustical Society of America

TL;DR: A multiple rate voice processing system incorporating a complete linear predictive coding algorithm wherein the algorithm is partitioned among a plurality of integrated circuit chips so that all communications between chips occur at low data rates.

Abstract: A multiple rate voice processing system incorporating a complete linear predictive coding algorithm wherein the algorithm is partitioned among a plurality of integrated circuit chips so that all communications between chips occur at low data rates.

20 citations

Journal Article•DOI•

Universal tree encoding for speech

Yasuo Matsuyama¹, Robert M. Gray•Institutions (1)

Ibaraki University¹

01 Jan 1981-IEEE Transactions on Information Theory

TL;DR: A low-rate waveform coder for speech compression is designed using techniques from universal source coding, fake process tree encoding, and linear predictive coding to yield a fidelity that compares well with the best existing adaptive-waveform coder of the same rate.

Abstract: A low-rate (about one bit per sample) waveform coder for speech compression is designed using techniques from universal source coding, fake process tree encoding, and linear predictive coding (LPC). The system does not require on-line adaptation or LPC analysis, yet it yields a fidelity that compares well with the best existing adaptive-waveform coder of the same rate.

Proceedings Article•DOI•

Speech synthesis techniques

J. Solomon¹•Institutions (1)

National Semiconductor¹

01 Jan 1981

TL;DR: The panel will explore the many voices of the new IC speech synthesizers, including: 'rules'-generated speech, synthesis-by-analysis and Mozer waveform encoding.

Abstract: The panel will explore the many voices of the new IC speech synthesizers, including: 'rules'-generated speech, synthesis-by-analysis and Mozer waveform encoding. Advantages of analog sampled-data filters versus digital filters, linear prediction (LPC) versus formant encoding, and ways to beat the quality/bit-rate trade-offs will also be covered.

Patent•

Speech synthesizer with smooth linear interpolation

Bruce Alan Fette¹•Institutions (1)

Motorola¹

26 May 1981

TL;DR: A linear predictive coding (LPC) voice synthesizer is formed as an integrated circuit on a single semiconductor chip and programmed to provide the all-pole lattice filter method of speech synthesis.

Abstract: A linear predictive coding (LPC) voice synthesizer formed as an integrated circuit on a single semiconductor chip, which circuit is programmed to provide the all pole lattice filter method of speech synthesis. The apparatus smoothly interpolates between correlation coefficients during the synthesis operation.
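
The all-pole lattice filter method named in this abstract can be sketched in a few lines. This is a generic textbook lattice, not the patented circuit: the sign convention (forward error f_(m-1)(n) = f_m(n) - k_m b_(m-1)(n-1)) is an assumption, the function name is hypothetical, and the coefficient interpolation between frames is not shown.

```python
def lattice_synthesize(excitation, k):
    """Run an excitation sequence through an all-pole lattice filter.

    k holds [k_1, ..., k_M]; stability requires |k_m| < 1 for every stage.
    """
    order = len(k)
    b = [0.0] * (order + 1)        # b[m] holds the backward error b_m(n-1)
    out = []
    for e in excitation:
        f = e                      # forward error entering the top stage, f_M(n)
        new_b = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            f = f - k[m - 1] * b[m - 1]            # f_(m-1)(n)
            new_b[m] = b[m - 1] + k[m - 1] * f     # b_m(n)
        new_b[0] = f               # b_0(n) equals the output sample
        b = new_b
        out.append(f)
    return out
```

With a single stage and k_1 = 0.5, an impulse produces 1, -0.5, 0.25, which matches the direct-form filter 1 / (1 + 0.5 z^-1) under the assumed convention.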

Proceedings Article•DOI•

A high speed array computer for dynamic time warping

D. Burr¹, Bryan D. Ackland¹, Neil Weste¹•Institutions (1)

Bell Labs¹

01 Apr 1981

TL;DR: This paper describes a CMOS integrated array processor for computing the dynamic time warp algorithm which allows many popular variations including LPC and frequency domain representations of speech.

Abstract: Dynamic time warping is an established technique for time alignment and comparison of speech segments in speech recognition. This paper describes a CMOS integrated array processor for computing the dynamic time warp algorithm. It allows many popular variations including LPC and frequency domain representations of speech. High speed is obtained by extensive pipelining, parallel computation, and simultaneous matching of multiple patterns. A realistic application using 40 nine-component LPC vectors per word permits 10,000 word comparisons per second or, equivalently, real time recognition of a 10,000 word vocabulary.
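
For readers unfamiliar with the algorithm that this array processor accelerates, a minimal software sketch of dynamic time warping follows. It is a plain textbook form with unconstrained symmetric steps and a Euclidean local distance; the chip's actual local distance, path constraints, and pipelining are not described in the abstract, and the function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Local distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(ref, test, dist=euclidean):
    """Accumulated distance along the best time alignment of two vector sequences."""
    inf = float("inf")
    rows, cols = len(ref), len(test)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = dist(ref[i - 1], test[j - 1])
            # Extend the cheapest of the three predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[rows][cols]
```

If the full grid were evaluated for two 40-vector words, each comparison would need 40 x 40 = 1,600 local distances, so 10,000 comparisons per second would correspond to 16 million local distances per second. That figure assumes no path constraints and is only meant to show why dedicated hardware was attractive.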

Journal Article•DOI•

New applications of channel vocoders

B. Gold¹, P. Blankenship¹, R.J. McAulay¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Feb 1981-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Recent innovations in channel vocoders are described that address new speech-terminal requirements such as variable rate, robustness, very low rates, and low cost, weight, and power, and the conclusion is drawn that the channel vocoder has significant potential for fulfilling these needs.

Abstract: Recent work in the field of speech digitization has led to the identification of a variety of new requirements for the speech terminal. Among these are the need for variable rate, robustness, very low rates, low cost, weight, and power. Coupled with these needs is the ever-present desire for higher quality systems. In this paper recent innovations in channel vocoders are described that pertain to these requirements and the conclusion is drawn that the channel vocoder has significant potential for fulfilling the above needs.

Patent•

Method of and means for variable-rate coding of LPC parameters

Yeunung Chen¹, Michael J. McLaughlin¹•Institutions (1)

Motorola¹

10 Aug 1981

TL;DR: In this paper, a signal exhibiting redundancy is transmitted in a reduced bandwidth by performing a linear interpolation over a number of frames, where interpolated coefficients are tested against quantized values to see if they differ by no more than a threshold.

Abstract: A signal exhibiting redundancy, such as speech subjected to linear predictive coding, is transmitted in a reduced bandwidth by performing a linear interpolation over a number of frames. Interpolated coefficients are tested against quantized values to see if they differ by no more than a threshold. If they do not, only the last frame is sent and intermediate values are reconstructed by interpolation. If the interpolated values differ by more than the threshold from the quantized values, the number of frames for interpolation is reduced and the interpolation is repeated. This is continued until either interpolation is successful or else the next consecutive frame is sent. The required bandwidth for transmission can be varied by varying the threshold, the maximum number of frames for interpolation, the number of LPC coefficients, or a combination of these.
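
The frame-selection procedure in this abstract can be sketched as follows. This is one reading of the description, under stated assumptions: frames are lists of already-quantized coefficients, the first frame is always sent, and the error test is a per-coefficient absolute difference. The names `select_frames`, `threshold`, and `max_span` are hypothetical.

```python
def _interpolation_ok(frames, start, end, threshold):
    """True if linear interpolation from frames[start] to frames[end] stays within
    threshold of every quantized coefficient in the frames between them."""
    for t in range(start + 1, end):
        w = (t - start) / (end - start)
        for a, b, q in zip(frames[start], frames[end], frames[t]):
            if abs(a + w * (b - a) - q) > threshold:
                return False
    return True


def select_frames(frames, threshold, max_span):
    """Return the indices of the frames that must be transmitted."""
    sent = [0]
    last = 0
    while last < len(frames) - 1:
        span = min(max_span, len(frames) - 1 - last)
        # Shrink the interpolation span until it succeeds or only the
        # next consecutive frame is left.
        while span > 1 and not _interpolation_ok(frames, last, last + span, threshold):
            span -= 1
        last += span
        sent.append(last)
    return sent
```

Raising `threshold` or `max_span` lets more frames be skipped, which is how the sketch reflects the variable bandwidth described above.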

Proceedings Article•DOI•

Improving LPC analysis of noisy speech by autocorrelation subtraction method

C. Un¹, K. Choi•Institutions (1)

KAIST¹

01 Mar 1981

TL;DR: A robust linear predictive coding (LPC) method that can be used in noisy as well as quiet environment has been studied and a performance improvement of about 5 dB can be gained by using this method.

Abstract: A robust linear predictive coding (LPC) method that can be used in noisy as well as quiet environment has been studied. In this method, noise autocorrelation coefficients are first obtained and updated during non-speech periods. Then, the effect of additive noise in the input speech is removed by subtracting values of the noise autocorrelation coefficients from those of autocorrelation coefficients of corrupted speech in the course of computation of linear prediction coefficients. When signal-to-noise ratio of the input speech ranges from 0 to 10 dB, a performance improvement of about 5 dB can be gained by using this method. The proposed method is computationally very efficient and requires a small storage area.
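
The subtraction step described in this abstract can be sketched with a standard Levinson-Durbin solver. The sketch assumes the noise autocorrelation has already been estimated during non-speech periods and is passed in as a list; how it is updated is not shown. The function names are hypothetical, and the guard against a non-positive zero-lag value is an added assumption, since subtraction can overshoot when the noise estimate is too large.

```python
def levinson_durbin(r):
    """Solve for A(z) = 1 + sum_k a_k z^-k from autocorrelation r[0..p].

    Returns (a, err) with a = [1, a_1, ..., a_p] and err the residual energy.
    """
    a = [1.0]
    err = r[0]
    for m in range(1, len(r)):
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err                                   # reflection coefficient
        a = [1.0] + [a[i] + k * a[m - i] for i in range(1, m)] + [k]
        err *= 1.0 - k * k
    return a, err


def denoised_lpc(noisy_r, noise_r):
    """LPC analysis after subtracting the noise autocorrelation lag by lag."""
    clean_r = [s - n for s, n in zip(noisy_r, noise_r)]
    if clean_r[0] <= 0.0:
        raise ValueError("noise estimate exceeds signal energy")
    return levinson_durbin(clean_r)
```

For white noise only the zero-lag term is nonzero, so the subtraction reduces to lowering r[0], which sharpens the spectral peaks that additive noise had flattened.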

Patent•DOI•

Speech synthesis system with variable interpolation capability

Alva E. Henderson¹•Institutions (1)

Texas Instruments¹

05 Oct 1981-Journal of the Acoustical Society of America

TL;DR: A speech synthesis system whose linear predictive filter utilizes coded reflection coefficients to produce digital signals representative of human speech; a variable interpolation circuit within the linear predictive filter allows a variable number of interpolation steps to be calculated between successive values of reflection coefficients.

Abstract: A speech synthesis system with a linear predictive filter. The linear predictive filter utilizes coded reflection coefficients to produce digital signals representative of human speech. A variable interpolation circuit within the linear predictive filter allows a variable number of interpolation steps to be calculated between successive values of reflection coefficients. Additionally, a user programmable option allows the user to select a linear, nonlinear, or a combination form of interpolation based on stored scale data. The synthesizer output circuit also functions as a digital-to-analog converter.

Patent•DOI•

Speech synthesizer for use with computer and computer system with speech capability formed thereby

Leon W. Cox¹•Institutions (1)

Texas Instruments¹

21 Aug 1981-Journal of the Acoustical Society of America

TL;DR: The speech synthesizer is capable of electronically synthesizing human speech from coded speech data including parameters as stored either in a solid state memory on a permanent basis or alternatively as temporarily stored in another memory, wherein the coded speech data is made available from an external source, such as a central processing unit of a commercial or home-type computer, as coupled to the speech synthesizer.

Abstract: Speech synthesizer and a computer system having the speech synthesizer operably coupled thereto to provide speech capability for the computer system. The speech synthesizer is capable of electronically synthesizing human speech from coded speech data including parameters as stored either in a solid state memory on a permanent basis or alternatively as temporarily stored in another memory, wherein the coded speech data is made available from an external source, such as a central processing unit of a commercial or home-type computer, as coupled to the speech synthesizer. The speech synthesizer may be in the form of a speech module including a speech synthesizer processor for converting coded speech data into digital speech signals in combination with a mode selector which selectively applies either the coded speech data from a read-only-memory within the speech module or the coded speech data obtained from the external source to the speech synthesizer processor in response to a control signal provided by the external source for determining which of the two alternative operating modes will be employed in a given instance. The computer system is provided with speech capability by including the speech module as a component thereof in combination with a computer input device, the central processing unit of the computer, and an audio amplifier and speaker connected to a digital-to-analog converter of the speech module so as to generate audible human speech from the digital speech signals provided by the speech synthesizer processor of the speech module.

Journal Article•DOI•

Text-To-Speech Using LPC Allophone Stringing

Kun-Shan Lin¹, Kathleen M. Goudie¹, Gene A. Frantz¹, George L. Brantingham¹•Institutions (1)

Texas Instruments¹

01 May 1981-IEEE Transactions on Consumer Electronics

TL;DR: A low cost voice response system is presented, which performs text-to-speech conversion of any English text, built around an LPC synthesizer chip and a microprocessor.

Abstract: A low cost voice response system is presented, which performs text-to-speech conversion of any English text. The system is built around an LPC synthesizer chip and a microprocessor. Text-to-allophone rules are used to convert an input string of ASCII characters into allophonic codes. LPC parameters are then drawn from an allophone library, which takes very little storage space, and concatenated using a fast and simple algorithm to produce natural sounding speech.

Proceedings Article•DOI•

Recent developments in vector quantization for speech processing

D. Wong, Biing-Hwang Juang, A. Gray

01 Apr 1981

TL;DR: Vector Quantization is applied to modify a 2400 bps LPC vocoder to operate at 800 bps, while retaining acceptable intelligibility and naturalness of quality, and several new properties are presented.

Abstract: Vector Quantization is applied to modify a 2400 bps LPC vocoder to operate at 800 bps, while retaining acceptable intelligibility and naturalness of quality. The design of this speech compression system is discussed and compared to other very low bit rate vocoders. Advantages of vector quantization over a scalar technique are examined in detail, and several new properties are presented.
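
The advantage of vector over scalar quantization comes from coding a whole parameter vector with a single codebook index. A minimal sketch of codebook design by Lloyd iteration is given below. It uses squared error for simplicity; the distortion measure, codebook size, and training procedure of the paper are not stated in the abstract, so every name and parameter here is an assumption.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v in squared error."""
    return min(range(len(codebook)),
               key=lambda i: sum((x - y) ** 2 for x, y in zip(codebook[i], v)))


def lloyd(vectors, codebook, iterations=10):
    """Refine an initial codebook by alternating partition and centroid steps."""
    codebook = [list(c) for c in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in vectors:
            cells[nearest(codebook, v)].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[i] = [sum(column) / len(cell) for column in zip(*cell)]
    return codebook


def index_bits(codebook_size):
    """Bits needed to transmit one codebook index."""
    return (codebook_size - 1).bit_length()
```

As an illustration of the bit accounting only, a hypothetical 1,024-entry codebook would cost `index_bits(1024)`, that is 10 bits, per spectral vector, regardless of how many coefficients the vector contains.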

Proceedings Article•DOI•

Mediumband speech processor with baseband residual spectrum encoding

G.S. Kang¹, L. Fransen, E. Kline•Institutions (1)

United States Naval Research Laboratory¹

01 Apr 1981

TL;DR: The Navy has developed a Multirate Processor (MRP), which generates digitized speech at 2.4, 9.6, and 16 kb/s by the linear predictive coding principle, and under various operational conditions, the Diagnostic Rhyme Test (DRT) scores for the 9.6 kb/s rate of the MRP compare favorably to the DRT scores of an existing 16 kb/s rate Continuously Variable Slope Delta (CVSD) encoder.

Abstract: The Navy has developed a Multirate Processor (MRP) which generates digitized speech at 2.4, 9.6, and 16 kb/s by the linear predictive coding principle. This multirate capability is achieved by embedding the 2.4 kb/s data in the 9.6 kb/s data stream and the 9.6 kb/s data in the 16 kb/s data stream. Conversion between the rates is accomplished by truncating a certain portion of the bits from the higher-data rate signal or appending extra bits to the lower-data rate signal. The MRP mediumband (9.6 kb/s or 16 kb/s) mode is a baseband residual excited LPC in which the baseband residual is transmitted in terms of Fourier spectral components. Under various operational conditions, the Diagnostic Rhyme Test (DRT) scores for the 9.6 kb/s rate of the MRP compare favorably to the DRT scores of an existing 16 kb/s rate Continuously Variable Slope Delta (CVSD) encoder.

Patent•DOI•

Clipped speech-linear predictive coding speech processor

James M. Avery, Elmer August Hoyer (Wichita, KS, US)

11 Dec 1981-Journal of the Acoustical Society of America

TL;DR: In this paper, an input signal representative of the spoken utterance is passed through a clipper to generate a clipped input signal, and a sampler generates a plurality of discrete binary values, each discrete binary value corresponding to a sample value of the clipped signal.

Abstract: The present invention relates to a speech recognition system and the method therefor, which analyzes a sampled clipped speech signal for identifying a spoken utterance. An input signal representative of the spoken utterance is passed through a clipper to generate a clipped input signal. A sampler generates a plurality of discrete binary values, each discrete binary value corresponding to a sample value of the clipped input signal. A processor then analyzes the plurality of sample values thereby identifying the spoken utterance. Analysis includes determining linear prediction coefficients of the autocorrelation function of speech utterances.

Journal Article•DOI•

On the effects of varying analysis parameters on an LPC-based isolated word recognizer

Lawrence R. Rabiner, J. G. Wilpon, J. Ackenhusen

08 Jul 1981-Bell System Technical Journal

TL;DR: The results showed that system performance was best with an analysis parameter set equivalent to what is currently being used in the computer simulations, and that variations in parameter values that reduced computation also degraded performance, whereas variations in parameters that increased computation did not lead to improved performance.

Abstract: For practical hardware implementations of isolated-word recognition systems, it is important to understand how the feature set chosen for recognition affects the overall performance of the recognizer. In particular, we would like to determine whether hardware implementations could be simplified by reducing computation and memory requirements without significantly degrading overall system performance. The effects of system bandwidth (both in training and testing the recognizer) on the performance must also be considered since the conditions under which the system is used may be different than those under which it was trained. Finally, we must take account of the effects of finite word-length implementations, on both the computation of features and of distances, for the system to properly operate. In this paper we present the results of a study to determine the effects on recognition error rate of varying the basic analysis parameters of a linear predictive coding (LPC) model of speech. The results showed that system performance was best with an analysis parameter set equivalent to what is currently being used in the computer simulations, and that variations in parameter values that reduced computation also degraded performance, whereas variations in parameter values that increased computation did not lead to improved performance.

Proceedings Article•DOI•

Speech coding using modulo-PCM with side information

V. Ramamoorthy

01 Apr 1981

TL;DR: The application of a new source coding scheme called Modulo-PCM with side information to speech coding is studied and the performance characteristics of an adaptive and a non-adaptive scheme are evaluated using two speech utterances.

Abstract: The application of a new source coding scheme called Modulo-PCM with side information to speech coding is studied. The performance characteristics of an adaptive and a non-adaptive scheme are evaluated using two speech utterances.

Proceedings Article•DOI•

Speech deconvolution by recursive ARMA lattice filters

Benjamin Friedlander, S. Maitra

01 Apr 1981

TL;DR: This paper presents a recursive pole-zero lattice form for speech analysis based on the recently developed square-root normalized lattice forms, and a comparison between the performance of AR and ARMA lattice filters is presented.

Abstract: All-zero filters in tapped-delay-line or lattice implementations are commonly used for speech deconvolution. The analysis techniques are mostly non-recursive, operating on a block of data at a time. In this paper we present a recursive pole-zero lattice form for speech analysis. The algorithm is based on the recently developed square-root normalized lattice forms. A comparison between the performance of AR and ARMA lattice filters is presented, using synthetic data. Preliminary results using speech data are also discussed.

Proceedings Article•DOI•

Split-band APC System for low bit-rate encoding of speech

B. Atal¹, J. Remde•Institutions (1)

Bell Labs¹

01 Apr 1981

TL;DR: A split-band adaptive predictive coding system for digital transmission of speech signals is described, in which division of the prediction residue signal into many frequency bands results in more accurate pitch prediction, particularly at low frequencies.

Abstract: We describe a split-band adaptive predictive coding system for digital transmission of speech signals. In this system, the prediction residue signal obtained after spectral prediction is filtered into 2 or more frequency bands. Each of the filtered signals is reduced further by pitch prediction and is quantized by a 15-level noise feedback quantizer. The input to the quantizer is severely center-clipped to produce a quantized signal with low entropy. The division of the prediction residue signal into many frequency bands results in more accurate pitch prediction - particularly, at low frequencies. The split-band system uses separate quantizers for each frequency band. The step size of the quantizer and the center-clipping threshold can thus be adjusted to optimize speech quality in each band.

Proceedings Article•DOI•

A model for short-time phase prediction of speech

Luís B. Almeida¹, José Tribolet•Institutions (1)

Instituto Superior Técnico¹

01 Apr 1981

TL;DR: This paper discusses a form of non-linear prediction, namely, the prediction of the phase of speech signals, based upon a new treatment of the classical speech production model within a short-time analysis/synthesis framework.

Abstract: Prediction plays a key role in many signal processing applications. Linear Prediction has, in particular, been extremely useful to the development of digital speech processing techniques and applications. There is however a growing need for improved forms of prediction. We discuss, in this paper, a form of non-linear prediction, namely, the prediction of the phase of speech signals. This study is conducted within a short-time analysis/synthesis framework and is based upon a new treatment of the classical speech production model. Experimental data are presented confirming the theoretical results. Finally the use of phase prediction to low-bit rate, high-quality coding applications is discussed.

Proceedings Article•DOI•

A technique for perceptually reducing periodically structured noise in speech

R. Cox¹, D. Malah•Institutions (1)

Bell Labs¹

01 Apr 1981

TL;DR: The recently developed time domain harmonic scaling (TDHS) algorithm has been found to be the basis for an effective enhancement technique and a class of windows for its implementation is established.

Abstract: Periodically structured noise is noise which occurs randomly but with a fixed or slowly varying period. The noise periodicity is usually due to some underlying process, such as block processing of the speech where discontinuities between successive blocks result. This type of noise permeates the entire speech spectrum and is not removable by standard filtering techniques. The recently developed time domain harmonic scaling (TDHS) algorithm has been found to be the basis for an effective enhancement technique. In this paper we discuss the underlying theory of this technique and establish a class of windows for its implementation. As an example the frame rate noise of adaptive transform coding was perceptually reduced using this technique. Results from a subjective testing experiment using ATC coded speech with bit rates of 7.2 to 16 Kb/s indicated an improvement in quality equivalent to an increase in code rate of 2.4 to 3 Kb/s for speech originally coded at 7.2 to 12 Kb/s.

Journal Article•DOI•

Quality improvement of synthesized speech in noisy speech analysis-synthesis processing

Hiromi Nagabuchi¹, Tsutomu Kobayashi¹•Institutions (1)

Nippon Telegraph and Telephone¹

01 Sep 1981-Electronics and Communications in Japan Part I-communications

TL;DR: In this paper, the authors propose a method of improving the quality of noise-degraded synthesized speech by using noise reduction methods which are based on the previously proposed comb filtering in the frequency domain for the case that the noisy speech signal is processed by the PARCOR analysis-synthesis method.

Abstract: Speech analysis-synthesis is an effective method for low bit-rate speech coding. However, it has been pointed out that system performance degrades as noise is added to the input speech. In this paper, we describe a method of improving the quality of noise-degraded synthesized speech by using noise reduction methods which are based on the previously proposed comb filtering in the frequency domain for the case that the noisy speech signal is processed by the PARCOR analysis-synthesis method. Quality degradation of synthesized speech due to additive noise is caused mainly by an increase in spectral distortion. In the speech analysis method proposed in this paper, the fundamental frequency (pitch) of the speech is stably extracted from noisy speech and spectral distortion is eliminated by obtaining the spectral envelope parameter after reducing the noise from the input speech based on this pitch information. Furthermore, the proposed method prevents degradation of the quality of the synthesized speech by using the extracted pitch as the pitch parameter of the analysis-synthesis system. As a result of auditory experiments, it is shown that the subjective speech quality and intelligibility of the synthesized speech are improved by the proposed method. In addition, we obtain some clues to the configuration of a speech analysis-synthesis system which is resistant to additive noise.
