Showing papers on "Speech coding" published in 1978

Book•

05 Sep 1978

TL;DR: This book covers digital speech processing, from the fundamentals and digital models of the speech signal through time-domain methods, waveform representation, short-time Fourier analysis, homomorphic processing, and linear predictive coding, to man-machine communication by voice.

Abstract: 1. Introduction. 2. Fundamentals of Digital Speech Processing. 3. Digital Models for the Speech Signal. 4. Time-Domain Models for Speech Processing. 5. Digital Representation of the Speech Waveform. 6. Short-Time Fourier Analysis. 7. Homomorphic Speech Processing. 8. Linear Predictive Coding of Speech. 9. Digital Speech Processing for Man-Machine Communication by Voice.

3,103 citations

Journal Article•DOI•

All-pole modeling of degraded speech

Jae Lim¹, Alan V. Oppenheim¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jun 1978-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise and develops a procedure based on maximum a posteriori (MAP) estimation techniques which is related to linear prediction analysis of speech.

Abstract: This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise. The procedure, based on maximum a posteriori (MAP) estimation techniques, is first developed in the absence of noise and related to linear prediction analysis of speech. The modification in the presence of background noise is shown to be nonlinear. Two suboptimal procedures are suggested which have linear iterative implementations. A preliminary illustration and discussion based both on a synthetic example and real speech data are given.
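
As background for the noise-free case, which the abstract relates to linear prediction analysis, here is a minimal sketch of the standard autocorrelation method solved with the Levinson-Durbin recursion. It does not implement the paper's MAP procedure or its iterative treatment of noisy speech. The function names `autocorrelation` and `levinson_durbin` are illustrative and do not come from the paper.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Fit an all-pole predictor x_hat[n] = sum_k a[k] * x[n - k].

    Returns (coefficients, error): coefficients[0] multiplies x[n - 1], and
    error is the remaining prediction-error energy. Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)   # a[1..order] are used; a[0] is a placeholder
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error       # reflection coefficient at this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error
```

For a first-order autoregressive autocorrelation sequence such as `[1.0, 0.5, 0.25]`, a second-order fit should recover a single non-zero coefficient, which makes a convenient sanity check.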

590 citations

Journal Article•DOI•

Optimizing digital speech coders by exploiting masking properties of the human ear

Manfred R. Schroeder, B. S. Atal, J. L. Hall

01 Nov 1978-Journal of the Acoustical Society of America

TL;DR: New results of masking and loudness reduction of noise are reported and the design principles of speech coding systems exploiting auditory masking are described.

Abstract: In any speech coding system that adds noise to the speech signal, the primary goal should not be to reduce the noise power as much as possible, but to make the noise inaudible or to minimize its subjective loudness. "Hiding" the noise under the signal spectrum is feasible because of human auditory masking: sounds whose spectrum falls near the masking threshold of another sound are either completely masked by the other sound or reduced in loudness. In speech coding applications, the "other sound" is, of course, the speech signal itself. In this paper we report new results of masking and loudness reduction of noise and describe the design principles of speech coding systems exploiting auditory masking.

434 citations

Journal Article•DOI•

Adaptive noise canceling for speech signals

M. Sambur

01 Oct 1978-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Preliminary tests indicate that the least mean-square adaptive filtering approach for removing the deleterious effects of additive noise on the speech signal improves the perceived speech quality and increases the signal-to-noise ratio (SNR) by 7 dB in a 0 dB environment.

Abstract: A least mean-square (LMS) adaptive filtering approach has been formulated for removing the deleterious effects of additive noise on the speech signal. Unlike the classical LMS adaptive filtering scheme, the proposed method is designed to cancel out the clean speech signal. This method takes advantage of the quasi-periodic nature of the speech signal to form an estimate of the clean speech signal at time t from the value of the signal at time t minus the estimated pitch period. For additive white noise distortion, preliminary tests indicate that the method improves the perceived speech quality and increases the signal-to-noise ratio (SNR) by 7 dB in a 0 dB environment. The method has also been shown to partially remove the perceived granularity of CVSD coded speech signals and to lead to an improvement in the linear prediction analysis/synthesis of noisy speech.
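
The pitch-delay idea in this abstract can be illustrated with a minimal sketch: an LMS filter whose reference input is the noisy signal delayed by one pitch period, so that the filter output estimates the periodic (speech-like) part. The pitch period is assumed known and constant here, and the tap count and step size `mu` are arbitrary illustrative choices. None of the names or values are taken from the paper.

```python
import math
import random


def lms_pitch_enhance(noisy, pitch_period, num_taps=8, mu=0.005):
    """Estimate the periodic component of `noisy` with an LMS filter.

    The reference vector holds noisy samples from one pitch period earlier;
    the noise there is uncorrelated with the current noise, so the filter
    can only learn to predict the quasi-periodic part of the signal.
    """
    weights = [0.0] * num_taps
    estimate = [0.0] * len(noisy)
    start = pitch_period + num_taps
    for n in range(start, len(noisy)):
        ref = [noisy[n - pitch_period - k] for k in range(num_taps)]
        y = sum(w * r for w, r in zip(weights, ref))   # filter output
        err = noisy[n] - y                             # adaptation error
        weights = [w + 2.0 * mu * err * r for w, r in zip(weights, ref)]
        estimate[n] = y
    return estimate
```

On a synthetic periodic signal in white noise, the output should sit closer to the clean signal than the noisy input does once the weights have converged. Real speech would additionally need a pitch tracker and handling of unvoiced segments.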

207 citations

Proceedings Article•DOI•

A study of complexity and quality of speech waveform coders

José Tribolet¹, P. Noll¹, B. McDermott¹, R. Crochiere¹•Institutions (1)

Bell Labs¹

10 Apr 1978

TL;DR: This paper presents the results of a pilot study comparing four different speech waveform coding techniques of varying complexity, and conclusions are drawn concerning the quality and complexity of different coding techniques.

Abstract: This paper presents the results of a pilot study comparing four different speech waveform coding techniques of varying complexity. Coder transmission rates of 24, 16, and 9.6 kb/s were used in the experiment. Subjective ratings and objective measurements of quality are obtained and compared. A number of conclusions are drawn concerning the quality and complexity of different coding techniques. By comparing the objective measurements to the subjective ratings, a number of conclusions are also drawn concerning the strengths and weaknesses of various (objective) quality measures of speech waveform coders.

143 citations

Proceedings Article•DOI•

Predictive coding of speech signals and subjective error criteria

Bishnu S. Atal¹, Manfred R. Schroeder²•Institutions (2)

Bell Labs¹, University of Göttingen²

10 Apr 1978

TL;DR: Improved speech quality is obtained a) by efficient removal of formant and pitch related redundant structure of speech before quantizing and b) by effective masking of the quantizer noise by the speech signal.

Abstract: Predictive coding methods attempt to minimize the r.m.s. error in the coded signal. However, the human ear does not perceive signal distortion on the basis of r.m.s. error regardless of its spectral shape relative to the signal spectrum. Specifically, for speech signals, the locations of the formant frequencies and their rates of change with time influence the audibility, and thus the subjective distortion of any quantizing noise. In this paper, methods for reducing the subjective distortion in predictive coders for speech signals are described and evaluated. Improved speech quality is obtained a) by efficient removal of formant and pitch related redundant structure of speech before quantizing and b) by effective masking of the quantizer noise by the speech signal.

94 citations

Patent•DOI•

System and method for speech recognition

John Marley

08 May 1978-Journal of the Acoustical Society of America

TL;DR: A system and method for speech recognition provides a means of printing phonemes in response to received speech signals utilizing inexpensive components and an algorithm for detecting major slope transitions of the analog speech signals.

Abstract: A system and method for speech recognition provides a means of printing phonemes in response to received speech signals utilizing inexpensive components. The speech signals are inputted into an amplifier which provides negative feedback to normalize the amplitude of the speech signals. The normalized speech signals are delta modulated at a first sampling rate to produce a corresponding first sequence of digital pulses. The negative feedback signal of the amplifier is delta modulated at a second sampling rate to produce a second sequence of digital pulses corresponding to amplitude information of the speech signals. The speech signals are filtered and utilized to produce a digital pulse corresponding to high frequency components of the speech signals having magnitudes in excess of a threshold voltage. A microprocessor contains an algorithm for detecting major slope transitions of the analog speech signals in response to the first sequence of digital signals by detecting information corresponding to presence and absence of predetermined numbers of successive slope reversals in the delta modulator producing the first sequence of digital pulses. The algorithm computes cues from the high frequency digital pulse and the second sequence of pulses. The algorithm computes a plurality of speech waveform characteristic ratios of time intervals between various slope transitions and compares the speech waveform characteristic ratios with a plurality of stored phoneme ratios representing a set of phonemes to detect matching therebetween. The order of comparing is determined on the basis of the cues and a configuration of a phoneme decision tree contained in the algorithm. When a matching occurs, a signal corresponding to the matched phoneme is produced and utilized to cause the phoneme to be printed. In one embodiment of the invention, the speech signals are produced by the earphone of a standard telephone headset.

60 citations

Proceedings Article•DOI•

A mixed-source model for speech compression and synthesis

John Makhoul¹, R. Viswanathan, Richard Schwartz, A. W. F. Huggins•Institutions (1)

BBN Technologies¹

10 Apr 1978

TL;DR: An excitation source model for speech compression and synthesis is presented, which allows for a degree of voicing by mixing voiced (pulse) and unvoiced (noise) excitations in a frequency-selective manner.

Abstract: This paper presents an excitation source model for speech compression and synthesis, which allows for a degree of voicing by mixing voiced (pulse) and unvoiced (noise) excitations in a frequency-selective manner. The mix is achieved by dividing the speech spectrum into two regions, with the pulse source exciting the low-frequency region and the noise source exciting the high-frequency region. A parameter F_c determines the degree of voicing by specifying the cut-off frequency between the voiced and unvoiced regions. For speech compression applications, F_c can be extracted automatically from the speech spectrum and transmitted. Experiments using the new model indicate its power in synthesizing natural sounding voiced fricatives, and in largely eliminating the "buzzy" quality of vocoded speech. A functional definition of buzziness and naturalness is given in terms of the model.
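
The two-region mix described in this abstract can be sketched as follows, assuming an ideal (brick-wall) split at the cut-off frequency performed in the DFT domain on a single frame. The paper's actual filters, source gains, and method of extracting the cut-off are not given in the abstract. The names `dft` and `mixed_excitation`, the unit source levels, and the constant pitch period are all assumptions made for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for a short frame."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def mixed_excitation(num_samples, pitch_period, cutoff_hz, sample_rate, seed=0):
    """Pulse-train excitation below cutoff_hz, noise excitation above it."""
    rng = random.Random(seed)
    pulses = [1.0 if m % pitch_period == 0 else 0.0 for m in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    pulse_spec = dft(pulses)
    noise_spec = dft(noise)
    mixed = []
    for k in range(num_samples):
        # Frequency of bin k, folded so negative frequencies mirror positive ones.
        freq = min(k, num_samples - k) * sample_rate / num_samples
        mixed.append(pulse_spec[k] if freq < cutoff_hz else noise_spec[k])
    return [v.real for v in dft(mixed, inverse=True)]
```

Setting the cut-off to zero yields pure noise (fully unvoiced), and setting it above half the sample rate yields the pure pulse train (fully voiced). Intermediate values give the partial voicing the model is designed for.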

58 citations

Patent•DOI•

Parameter interpolator for speech synthesis circuit

George L. Brantingham¹•Institutions (1)

Texas Instruments¹

28 Apr 1978-Journal of the Acoustical Society of America

TL;DR: Using a parameter interpolator permits the data rate to the speech synthesis circuit to be lowered inasmuch as the incoming speech data is used to slowly charge the data previously inputted to the values of the incoming data.

Abstract: Disclosed is a parameter interpolator for a speech synthesis circuit. Using a parameter interpolator permits the data rate to the speech synthesis circuit to be lowered inasmuch as the incoming speech data is used to slowly charge the data previously inputted to the values of the incoming data. The speech synthesis circuit includes an input circuit for receiving the target values of the speech data and a memory for stored interpolated values of the speech data. The interpolator includes a circuit coupled to the input circuit and the memory which calculates the difference between the target values and the stored values. Another circuit is used to add a portion of the difference to the values stored in the memory; the particular portion of the difference is equal to 1/2^N where N=0, 1, 2 . . . Further, the interpolator is arranged to inhibit the normal interpolation upon certain conditions, such as changes from voiced speech to unvoiced speech, and vice versa.
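
The update rule in this abstract (add a 1/2^N portion of the difference between target and stored value) can be written as a short sketch. Floating-point arithmetic is used for clarity, whereas the circuit presumably works on fixed-point values. The schedule of N values in the usage note is hypothetical, and the inhibit conditions are left out because the abstract does not say what the circuit does when interpolation is inhibited.

```python
def interpolate_step(stored, target, n):
    """Move `stored` toward `target` by 1 / 2**n of their difference."""
    return stored + (target - stored) / (2 ** n)


def interpolate_frame(stored, target, shifts):
    """Apply successive interpolation steps, one per entry in `shifts`."""
    values = []
    for n in shifts:
        stored = interpolate_step(stored, target, n)
        values.append(stored)
    return values
```

With a stored value of 0.0, a target of 8.0 and the hypothetical schedule `[3, 2, 1, 0]`, the stored value moves through 1.0, 2.75 and 5.375 before reaching the target exactly on the final step, since N = 0 adds the whole difference.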

45 citations

Journal Article•DOI•

Tree-Encoding of Speech Using the (M, L)-Algorithm and Adaptive Quantization

Nuggehally Sampath Jayant¹, S. Christensen•Institutions (1)

Bell Labs¹

01 Sep 1978-IEEE Transactions on Communications

TL;DR: This paper shows the utility of using adaptive quantizers in the tree-encoding of speech waveforms based on the (M, L) algorithm, which can provide useful speech outputs at bit rates in the order of 24 kbits/s.

Abstract: This paper shows the utility of using adaptive quantizers in the tree-encoding of speech waveforms based on the (M, L) algorithm [1]. Resulting adaptive differential PCM (ADPCM) and adaptive delta modulation (ADM) encoders, with time-invariant prediction networks, can provide useful speech outputs at bit rates in the order of 24 kbits/s; at 16 kbits/s, on the other hand, the encoders exhibit clearly perceptible amounts of quantization noise.

41 citations

Journal Article•DOI•

Digital Dynamic Speech Detectors

P. Drago, Alcide Molinari, F. Vagliani

01 Jan 1978-IEEE Transactions on Communications

TL;DR: This paper proposes two dynamic-type speech detectors based on the same operational principle: the presence of the speech signal is detected by analyzing the dynamic variations of the short-time-power of the channel signal.

Abstract: This paper proposes two dynamic-type speech detectors; their performance is also described by means of in-field experimental results. The two detectors are based on the same operational principle: the presence of the speech signal is detected by analyzing the dynamic variations of the short-time power of the channel signal.
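
The abstract does not say how the dynamic variations are analyzed, so what follows is only one plausible reading, sketched under stated assumptions: short-time power is computed per frame in dB, and activity is flagged when the spread of power over the most recent frames exceeds a threshold. The frame length, window size, threshold, and function names are all hypothetical.

```python
import math


def short_time_power_db(signal, frame_len):
    """Mean power of each non-overlapping frame, in dB."""
    powers = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        powers.append(10.0 * math.log10(power + 1e-12))  # avoid log of zero
    return powers


def dynamic_detector(signal, frame_len=80, window=5, threshold_db=6.0):
    """Flag frames where recent short-time power varies by more than threshold_db."""
    powers = short_time_power_db(signal, frame_len)
    flags = []
    for i in range(len(powers)):
        recent = powers[max(0, i - window + 1):i + 1]
        flags.append(max(recent) - min(recent) > threshold_db)
    return flags
```

A detector of this kind responds to changes in level rather than to level itself, so a steady tone or stationary line noise does not trigger it, while an onset does.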

Journal Article•DOI•

Time-encoded speech

R.A. King¹, W. Gosling²•Institutions (2)

United Kingdom Ministry of Defence¹, University of Bath²

20 Jul 1978-Electronics Letters

TL;DR: A new method of digitising speech waveforms is described, based on the comparison of successive segments of the waveform with a suitably stored catalogue of possible distinct shapes.

Abstract: A new method of digitising speech waveforms is described, based on the comparison of successive segments of the waveform with a suitably stored catalogue of possible distinct shapes.

Proceedings Article•DOI•

32 Kbps CCITT Compatible split band coding scheme

D. Esteban¹, C. Galand•Institutions (1)

IBM¹

01 Apr 1978

TL;DR: It has been observed that transparency and apparent channel performance are not affected and that the proposed 32 kbps coder is considerably less affected by transmission errors than the 64 kbps coder.

Abstract: This paper deals with the application of the SVCS (Split band Voice Coding Scheme) concept to the coding of a PCM channel at half the rate of the presently used 64 kbps CCITT standard. This 32 kbps coder is shown to meet the specifications as recommended by the CCITT for a PCM channel operating at 64 kbps (8 kHz sampling, 8 bits/sample A-law). In addition, channel performance has been evaluated with and without transmission errors for different types of signals ranging from tones and modem signals to voice signals. It has been observed that transparency and apparent channel performance are not affected and that the proposed 32 kbps coder is considerably less affected by transmission errors than the 64 kbps coder. Examples of taped results assuming different error rates on a voice signal with both the 32 kbps SVCS and the 64 kbps A-law will be played at the conference.

Proceedings Article•DOI•

Perceptual and objective evaluation of speech processed by adaptive differential PCM

B. McDermott¹, C. Scagliola, D. Goodman•Institutions (1)

Bell Labs¹

01 Apr 1978

TL;DR: Overall subjective quality of speech processed by adaptive differential PCM is well predicted by segmental signal-to-noise ratio and even better by a linear combination of measures of granular distortion and overload distortion.

Abstract: An experiment has been performed to study the perceptual characteristics of speech processed by ADPCM. We created 18 three-bit and four-bit coders spanning a wide range of quantizer adaptation parameters. Subjects judged the difference between each pair of coders and rated the quality of each coder individually. The difference data reveal three important perceptual dimensions (overall clarity, signal vs. background distortion, muffled vs. hoarse) which are related to various objective measures of coder performance. Overall subjective quality is well predicted by segmental SNR and even better by a linear combination of measures of granular distortion and overload distortion.
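
Segmental SNR, the objective measure named in this abstract, is commonly defined as the average of per-frame SNR values in dB. A minimal sketch under that definition follows. The frame length and the rule for skipping silent or error-free frames are assumptions, since the abstract does not give the paper's exact settings.

```python
import math


def segmental_snr_db(clean, coded, frame_len=128):
    """Average of per-frame SNRs (in dB) between a clean and a coded signal."""
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = coded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        # Skip frames where the ratio is undefined (silence or no error).
        if signal_energy > 0.0 and error_energy > 0.0:
            frame_snrs.append(10.0 * math.log10(signal_energy / error_energy))
    return sum(frame_snrs) / len(frame_snrs) if frame_snrs else float("nan")
```

Because the averaging is done on the dB values, quiet frames count as much as loud ones. That is the main difference from a conventional SNR computed over the whole utterance.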

Proceedings Article•DOI•

9.6/7.2 Kbps Voice excited predictive coder (VEPC)

Daniel Esteban¹, Claude Galand, D. Mauduit, J. Menez•Institutions (1)

IBM¹

01 Apr 1978

TL;DR: This coding scheme, in addition to the baseband excitation concepts, takes advantage of the association of recently published digital speech processing techniques such as transversal predictive coding, split-band coding by signal decimation/interpolation, and adaptive block quantization.

Abstract: This paper describes a common voice coding architecture based on a Voice Excited Predictive Coding (VEPC) scheme allowing operation at different bit rates (9600 bps, 7200 bps, or below) by simply modifying the bandwidth allocated to the coding of the baseband excitation signal. This coding scheme, in addition to the baseband excitation concepts, takes advantage of the association of recently published digital speech processing techniques such as transversal predictive coding, split-band coding by signal decimation/interpolation, and adaptive block quantization. Simulations have shown that the proposed architecture achieves 'standard telephone quality', assuming a 300-3400 Hz telephone bandwidth, at transmission rates below 9600 bps.

Proceedings Article•DOI•

Studies on pattern recognition approach to voiced-unvoiced-silence classification

V. Sarma¹, D. Venugopal¹•Institutions (1)

Indian Institute of Science¹

10 Apr 1978

TL;DR: It is demonstrated that it is possible to achieve pattern recognition classification with much less computational effort by adopting a scheme based on the concept of variable decision space, using only three features and by avoiding the time consuming linear prediction analysis.

Abstract: A pattern recognition approach for deciding whether a given segment of speech should be classified as voiced speech, unvoiced speech or silence based on a set of five measurements of the signal is given by Atal and Rabiner [1]. In this paper, we demonstrate that it is possible to achieve this classification with much less computational effort. These computational savings are mainly achieved by adopting a scheme based on the concept of variable decision space, using only three features and by avoiding the time consuming linear prediction analysis.

Journal Article•DOI•

Optimum Weighted PCM for Speech Signals

C.-E. Sundberg

01 Jun 1978-IEEE Transactions on Communications

TL;DR: Weighted digital modulation schemes which provide bit error probabilities matched to the PCM bits with respect to their sensitivity to digital errors are analyzed and a channel signal to noise ratio gain in threshold extension of 2 dB is obtained for standard 8 bit PCM.

Abstract: Weighted digital modulation schemes which provide bit error probabilities matched to the PCM bits with respect to their sensitivity to digital errors are analyzed. The channel is additive white Gaussian. The PCM system has arbitrary code, companding law and input signal density function. In particular, optimum weighted PSK/PCM and QAM/PCM schemes are given for speech signals. The average channel signal to noise ratio is kept constant when schemes are compared. We obtain a channel signal to noise ratio gain in threshold extension of 2 dB for standard 8 bit PCM. The performance of suboptimum schemes, where the number of different bit error probability levels is smaller than the number of PCM bits, is also studied. Two levels per 8 bit PCM word yield more than half of the achievable gain (in dB), and 4 levels are almost equal to optimum.

Fast Algorithms for Speech Modeling.

Martin Morf, D T Lee

15 Dec 1978

TL;DR: New speech modeling algorithms capable of rapidly tracking time-varying model parameters were developed and used as the basis of a low data rate speech transmission system; simulation results are reported as very promising.

Abstract: This constitutes our final report on a research program aimed at the development of a high quality low data rate speech transmission system based on new types of speech modeling algorithms. Several such algorithms were developed and tested on simulated and real speech data. These algorithms have many desirable features including the capability of rapidly tracking time-varying model parameters. The best algorithm was used as the basis of a speech transmission system in order to test the quality of the speech models. The model parameters (reflection coefficients) together with pitch information and speech energy form a speech parameter vector to be transmitted and used to reconstruct the original speech. Several parameter quantization methods were considered to achieve the desired low bit rates. The various algorithms as well as the complete transmission system were coded and tested. Simulation results are very promising and the usefulness of our new approach for speech modeling has been successfully established. (Author)

Journal Article•DOI•

An approach to secure voice communication based on the data encryption standard

M. Orceyre¹, R. Heller•Institutions (1)

IBM¹

01 Nov 1978-IEEE Communications Magazine

TL;DR: The matter of secure voice communication (enabling speakers to converse naturally over telephone media without fear that their conversation can be usefully intercepted) poses special problems and is receiving close attention within both the commercial and the Government sectors.

Abstract: Telephone communications have been understood from their beginnings to be vulnerable to interception (unauthorized reception). In recent years, with increasing public and private sector reliance upon electronic media for communicating sensitive technical, financial, military, political, economic, and personal information, and with the rapidly increasing use of microwave and satellite telephone carrier media, concern about these vulnerabilities has mounted dramatically. Starting in mid-1977 there has been considerable attention given in the news media to the matter of wholesale interception by foreign governments of American private and commercial voice and data communications. Publicly available documents note the ease with which such common carrier transmissions can be "captured" for subsequent analysis and use by unauthorized listeners. Fig. 1 illustrates the many vulnerabilities of a typical public switched telephone network. Within this broad framework, the matter of secure voice communication (enabling speakers to converse naturally over telephone media without fear that their conversation can be usefully intercepted) poses special problems and is receiving close attention within both the commercial and the Government sectors.

Journal Article•DOI•

Soft Decision Demodulation for PCM Encoded Speech Signals

C.-E. Sundberg¹•Institutions (1)

Lund University¹

01 Jun 1978-IEEE Transactions on Communications

TL;DR: This work has analyzed soft decision demodulation schemes for standard PCM encoded speech signals transmitted over the Gaussian channel with coherent PSK (phase shift keying) and obtained a signal to noise ratio gain in E_{b}/N_{0} of the order of 1-2 dB.

Abstract: The effect of digital errors in PCM encoded speech signals transmitted over a noisy channel is reduced by using soft decision demodulation at the receiver. The reliability information supplied by the soft decision demodulator is used to point out likely transmission errors, especially in the most significant PCM bits. When a likely transmission error is identified, the corresponding PCM word is rejected by the receiver and replaced by a predictor estimate or an interpolation estimate if delayed decisions are used. We have analyzed soft decision demodulation schemes for standard PCM encoded speech signals transmitted over the Gaussian channel with coherent PSK (phase shift keying). A signal to noise ratio gain in E_{b}/N_{0} of the order of 1-2 dB is obtained at low input signal levels. The gain depends on the performance of the predictor or, alternatively, the interpolator. No modifications of the transmitter are required to obtain this improvement. The suggested soft decision schemes are optional at the receiver. The comparisons are made with hard decision demodulation.
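
The reject-and-replace rule in this abstract can be sketched as follows. Several simplifying assumptions apply: each bit arrives as a signed soft value whose magnitude serves as the reliability, only the most significant bits are checked, and a rejected word is replaced by the previous output word, which is the simplest possible predictor. The threshold, the number of protected bits, and the plain binary word format are illustrative and are not the paper's parameters.

```python
def decode_word(soft_bits):
    """Hard-decide each bit (positive soft value means 1), MSB first."""
    value = 0
    for s in soft_bits:
        value = (value << 1) | (1 if s > 0 else 0)
    return value


def soft_decision_receiver(soft_words, num_protected=2, threshold=0.3):
    """Replace words whose most significant bits look unreliable."""
    output = []
    previous = None
    for soft_bits in soft_words:
        unreliable = any(abs(s) < threshold for s in soft_bits[:num_protected])
        if unreliable and previous is not None:
            value = previous              # zero-order predictor estimate
        else:
            value = decode_word(soft_bits)
        output.append(value)
        previous = value
    return output
```

The transmitter is untouched in this scheme, which matches the abstract's remark that the improvement requires changes at the receiver only.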

Journal Article•DOI•

Voice Encoding for the Space Shuttle Using Adaptive Delta Modulation

Donald L. Schilling¹, J. Garodnick, H. Vang•Institutions (1)

City College of New York¹

01 Nov 1978-IEEE Transactions on Communications

TL;DR: Waveforms of the response of the delta modulators to channel errors are given and performance data that shows the relationship between channel errors and word intelligibility are included.

Abstract: Two algorithms for voice encoding are described. One, the modified Abate, is a simplified version that is the type designed for the Space Shuttle. Waveforms of the response of the delta modulators to channel errors are given and performance data that shows the relationship between channel errors and word intelligibility are included. An analytic derivation yielding a comparison between PCM and adaptive delta modulation with respect to channel errors is also given.
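
For readers unfamiliar with adaptive delta modulation, here is a generic one-bit coder with multiplicative step-size adaptation. It is not the modified Abate algorithm or either of the two algorithms described in the paper, whose adaptation rules are not given in the abstract. The growth factor and step limits are arbitrary illustrative values.

```python
def adapt_step(step, bit, prev_bit, factor=1.5, min_step=0.01, max_step=0.5):
    """Grow the step when consecutive bits agree, shrink it when they differ."""
    if prev_bit is None:
        return step
    step = step * factor if bit == prev_bit else step / factor
    return min(max(step, min_step), max_step)


def adm_encode(signal, initial_step=0.01):
    """Encode samples as one bit each: 1 if the input is at or above the estimate."""
    bits = []
    estimate, step, prev_bit = 0.0, initial_step, None
    for x in signal:
        bit = 1 if x >= estimate else 0
        step = adapt_step(step, bit, prev_bit)
        estimate += step if bit else -step
        bits.append(bit)
        prev_bit = bit
    return bits


def adm_decode(bits, initial_step=0.01):
    """Rebuild the staircase estimate by mirroring the encoder's state updates."""
    output = []
    estimate, step, prev_bit = 0.0, initial_step, None
    for bit in bits:
        step = adapt_step(step, bit, prev_bit)
        estimate += step if bit else -step
        output.append(estimate)
        prev_bit = bit
    return output
```

Because the decoder derives its step size from the received bits alone, a single channel error perturbs both the estimate and the subsequent step sizes. This gives some intuition for why the paper examines the response to channel errors.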

Journal Article•DOI•

Multipurpose Hardware for Digital Coding of Audio Signals

James D. Johnston¹, D. Goodman•Institutions (1)

Bell Labs¹

01 Nov 1978-IEEE Transactions on Communications

TL;DR: An adaptive coding system capable of representing audio frequency signals in a wide variety of digital code formats is described, and sine wave signal-to-noise ratio measurements demonstrate important relationships among the main code classifications.

Abstract: We describe an adaptive coding system capable of representing audio frequency signals in a wide variety of digital code formats. To provide adaptive quantization, the step size is adjusted by voltage controlled amplifiers operating with analog control signals. Sine wave signal-to-noise ratio measurements demonstrate important relationships among the main code classifications: pcm, apcm, dpcm, adpcm. The multipurpose coder has proved a useful laboratory aid in the design of special purpose coders. After adjustment and evaluation of parameters using the multipurpose coder one proceeds to an economical, specialized design.

Patent•DOI•

Method of communicating digital speech data and a memory for storing such data

Richard H. Wiggins¹, George L. Brantingham¹•Institutions (1)

Texas Instruments¹

19 Jun 1978-Journal of the Acoustical Society of America

TL;DR: A method of communicating digital speech data to a speech synthesis circuit is described, in which the data is preferably stored in a memory coupled to the speech synthesis circuit.

Abstract: A method of communicating digital speech data to a speech synthesis circuit. The data is compressed to on the order of 1000-1200 bits per second for normal human speech. The speech synthesis circuit utilizes linear predictive coding techniques for producing high quality speech or other sounds. The data is preferably stored in a memory which is coupled to the speech synthesis circuit. The data has variable frame lengths; in the disclosed embodiment, four different frame lengths are described, ranging from four bits to forty-nine bits. The memory stores the variable frame length data and communicates the same to the speech synthesis circuit in response to certain control signals.

Proceedings Article•DOI•

Linear predictive coding of speech signals in a high ambient noise environment

H. Kobatake¹, J. Inari², S. Kakuta²•Institutions (2)

Tokyo University of Agriculture and Technology¹, University of Tokyo²

10 Apr 1978

TL;DR: This paper describes a method of speech coding in a high ambient noise environment and shows that the spectral envelope of the speech signal is highly reliable information when the noise reduction method proposed in this paper is used.

Abstract: Preservation of both the spectral distribution and the periodicity of speech signals is essential in speech processing. This paper describes a method of speech coding in a high ambient noise environment and shows that the spectral envelope of the speech signal is highly reliable information when the noise reduction method proposed in this paper is used. Also reported are comparisons of several pitch extraction methods with extensive experimental data, based on which a pitch extraction method suited for noisy speech signals is proposed.

Dissertation•

Enhancement and bandwidth compression of noisy speech by estimation of speech and its model parameters.

Jae Soo Lim

01 Aug 1978

TL;DR: This thesis addresses the enhancement and bandwidth compression of noisy speech by estimating the speech and its model parameters.

Abstract: Thesis. 1978. Sc.D.--Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.

Proceedings Article•DOI•

High quality adaptive predictive coding of speech

M. Berouti¹, J. Makhoul•Institutions (1)

BBN Technologies¹

01 Apr 1978

TL;DR: The pitch predictor is found to be not useful on balance and should be eliminated, the residual should be quantized with no clipping and encoded using a variable-length code, and a single coding scheme seems to be adequate for all speech and all conditions.

Abstract: We report on the results of research to code speech at 16 kbps under the condition that the quality of the transmitted speech be equal to that of the original. Some of the original speech had been corrupted by noise and distortions typical of long distance telephone lines. The rigorous requirements of this work led to a new outlook on adaptive predictive coding. We have found that the pitch predictor is not useful on balance and should be eliminated, and that the residual should be quantized with no clipping and encoded using a variable-length code. A single coding scheme seems to be adequate for all speech and all conditions. In addition, the adaptive predictive coding system has been modified to include a noise spectral shaping filter that effectively eliminates the perception of background granular noise.

Journal Article•DOI•

Bit Rate Per Channel Halving in PCM Multiplexes by Speech Interpolation and Adaptive Quantization

Alcide Molinari, F. Vagliani

01 May 1978-IEEE Transactions on Communications

TL;DR: An all-digital system, labeled PCM.RR, is presented, which enables the doubling of traffic capacity of PCM links by properly using "Adaptive Quantization" and "Speech Interpolation" performed by means of a "Speech Detector" that works directly on the A-law compressed digital signal.

Abstract: An all-digital system, labeled PCM.RR, is presented, which enables the doubling of traffic capacity of PCM links. This is obtained, while keeping the transmission quality impairment very close to the normal PCM standards, by properly using "Adaptive Quantization" and "Speech Interpolation" performed by means of a "Speech Detector" that works directly on the A-law compressed digital signal.

Patent•

Multiplexing speech signals

Cochrane P

14 Mar 1978

TL;DR: The frequency range of each speech channel is broken into sub-channels, each of which is considered separately for operational activity; composite speech signals are then formed from the active frequency sub-channels of the individual speech channels and transmitted with coding signals indicative of their composition.

Abstract: To transmit a number of individual speech channels over a smaller number of transmission channels, the frequency range of each speech channel is broken into sub-channels and each of these is considered separately for operational activity. Composite speech signals are then formed from the active frequency sub-channels of the individual speech channels and these are transmitted with coding signals indicative of their composition.

Proceedings Article•DOI•

LMS Adaptive filtering for enhancing the quality of noisy speech

M. Sambur

01 Apr 1978

TL;DR: Preliminary tests indicate that the proposed least mean square adaptive filtering approach improves the perceived speech quality and increases the signal to noise ratio (SNR) by 7 dB in a 0 dB environment.

Abstract: A least mean square (LMS) adaptive filtering approach has been formulated for removing the deleterious effects of additive noise on the speech signal. Unlike the classical LMS adaptive filtering scheme, the proposed method is designed to cancel out the clean speech signal. This method takes advantage of the quasi-periodic nature of the speech signal to form an estimate of the clean speech signal at time t from the value of the signal at time t minus the estimated pitch period. For additive white noise distortion, preliminary tests indicate that the method improves the perceived speech quality and increases the signal to noise ratio (SNR) by 7 dB in a 0 dB environment. The method has also been preliminarily shown to remove the perceived granularity of CVSD coded speech signals and to lead to an improvement in the linear prediction analysis/synthesis of noisy speech.

Journal Article•DOI•

Frequency domain techniques for speech coding

R. Crochiere, José Tribolet

01 Nov 1978-Journal of the Acoustical Society of America

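TL;DR: This paper reviews the basic aspects of the design of the four operations involved in frequency domain speech coders (transform or filter bank, adaptive quantizer design, bit allocation, and step-size control), particularly as they apply to low bit-rate adaptive transform coding.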

Abstract: Frequency domain techniques for speech coding have recently received considerable attention. The basic concept of these methods is to divide the speech into frequency components by a filter bank (subband coding) or by a suitable transform (transform coding) and then encode them using adaptive PCM. Four basic operations are involved in the design of these coders: (1) the type of transform or filter bank (analysis/synthesis), (2) the adaptive quantizer design (quantization theory), (3) the choice of bit allocation used by the quantizers (noise shaping and auditory masking), and (4) the control of the step‐size of the quantizers (spectral estimation). This paper briefly reviews the basic aspects of the design of these four operations particularly as they apply to low bit‐rate adaptive transform coding.
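
Operation (3), bit allocation, is often introduced through the classical log-variance rule: each band receives the average number of bits plus half the base-2 logarithm of the ratio between its variance and the geometric mean of all band variances. The sketch below applies that textbook rule as an illustration. It is not claimed to be the allocation used in the paper, which also accounts for noise shaping and auditory masking.

```python
import math


def allocate_bits(variances, total_bits):
    """Textbook log-variance bit allocation across frequency bands.

    All variances must be positive. The result may be fractional or negative;
    a practical coder would round and clamp it, which is omitted here.
    """
    n = len(variances)
    mean_log2 = sum(math.log2(v) for v in variances) / n   # log2 of geometric mean
    average = total_bits / n
    return [average + 0.5 * (math.log2(v) - mean_log2) for v in variances]
```

For band variances of 16, 4, 1 and 0.25 and a budget of 10 bits, the rule assigns 4, 3, 2 and 1 bits. Each fourfold drop in variance costs a band one bit, and the allocations always sum to the budget.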
