Showing papers on "Speech coding" published in 1969


Patent
Bishnu S. Atal
29 Oct 1969
TL;DR: A speech wave is represented by the output of a linear filter which simulates an acoustic tube and which is excited by a combination of a quasi-periodic pulse train and white noise.
Abstract: A short-time spectral analysis of a nonstationary signal, such as a speech signal, does not ordinarily yield control signal information sufficient for subsequent synthesis. However, more reliable control signals for a speech synthesizer can be obtained by making use of natural constraints, applicable to a speech wave, in the analysis procedure. For frequencies below 5 kHz, the human vocal tract can be modeled as an acoustic tube in which only plane waves propagate. Thus, for vowels and vowel-like sounds, the speech output of the vocal tract at any instant of time can be assumed to be a weighted sum of its past values and the input to the vocal tract at that instant of time. In the described invention, a speech wave is represented by the output of a linear filter which simulates an acoustic tube and which is excited by a combination of a quasi-periodic pulse train and white noise. The parameters of this filter are derived from the speech wave such that the mean-squared error between the synthetic speech samples at the output of the filter and the input speech samples is minimum.

45 citations
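
The "weighted sum of its past values" model in the Atal abstract above can be illustrated with a short sketch. This is not the patent's procedure: it assumes the autocorrelation formulation of linear prediction solved with the Levinson-Durbin recursion, and the function names (`autocorrelation`, `levinson_durbin`, `synthesize`) are hypothetical, chosen only for illustration.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[lag] = sum over n of x[n] * x[n + lag]."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor weights a[1..order].

    The predictor is x_hat[n] = sum_k a[k] * x[n - k]; the weights minimize
    the mean-squared prediction error. Returns (weights, residual_energy).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def synthesize(weights, excitation):
    """All-pole filter: each output is a weighted sum of past outputs plus input."""
    y = []
    for n, e in enumerate(excitation):
        past = sum(w * y[n - 1 - k]
                   for k, w in enumerate(weights) if n - 1 - k >= 0)
        y.append(past + e)
    return y


if __name__ == "__main__":
    # Assumed toy "vocal tract": a stable two-weight filter driven by one pulse.
    signal = synthesize([1.3, -0.6], [1.0] + [0.0] * 299)
    weights, residual = levinson_durbin(autocorrelation(signal, 2), 2)
    print(weights, residual)
```

For a signal that really is the pulse response of an all-pole filter, the estimated weights should come back close to the ones used to generate it; real speech would be analyzed frame by frame with a higher order.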


Journal ArticleDOI
TL;DR: An electronic speech processor is described which provides an analog voltage output based on the difference signal between the first speech formant and the second; machine recognition of the numbers zero through nine was very good when the speech processor output was sampled and compared with previously recorded memory-stored data in a small digital computer.
Abstract: In the present age of scientific discovery, man has become more and more dependent on the use of electronic computers. As this powerful tool becomes more universally important in man's day-to-day existence, it becomes increasingly annoying that he has to speak to it in its mode of communication, paper tape or punch cards, and not in his own, the spoken word. Even today in the very infancy of the computer age, the time required to do many computations is less than the time required to instruct the machine in how to do them. All this points to the need for a method of achieving machine recognition of speech. In this paper an electronic speech processor is described which provides an analog voltage output based on the difference signal between the first speech formant and the second. Machine recognition of numbers zero through nine was very good when the speech processor output was sampled and compared with previously recorded memory-stored data in a small digital computer.

20 citations


Journal ArticleDOI
Lawrence R. Rabiner
TL;DR: A general model for speech synthesis by rule is presented along with a discussion of one specific implementation of the model, and specific recommendations as to areas of speech synthesis and speech production requiring further study are made.
Abstract: A general model for speech synthesis by rule is presented along with a discussion of one specific implementation of the model. The conversion from discrete input signals to continuous synthesizer control signals is performed by the synthesis strategy. The details of the synthesis strategy, including linguistic preprocessing of the input and separate but interdependent segmental and suprasegmental models, are described. An experimental evaluation of the specific model is included, along with specific recommendations as to areas of speech synthesis and speech production requiring further study.

18 citations


Journal ArticleDOI
D. Ling
TL;DR: Results indicate that coded speech can be discriminated but that scores for linearly amplified speech are generally superior, and the implications are examined.
Abstract: Discrimination of conventional (linear) and frequency transposed (coded) forms of amplified speech by deaf children was studied in a series of twelve experiments. Findings relating to each are presented. Results indicate that coded speech can be discriminated but that scores for linearly amplified speech are generally superior. The implications of the results are examined.

5 citations


Journal ArticleDOI
L. O'Neill
TL;DR: It has been demonstrated by digital simulation that with a proper selection of parameters both temporal waveshape and the spectrum can be preserved by this method, and a detailed investigation of coding techniques will be necessary before its efficiency can be compared to other approaches.
Abstract: The efficient transmission or processing of speech requires that a compromise be made between quality and bandwidth. Systems for bandwidth reduction, such as the vocoder, are usually designed to preserve the spectral content of the signal. High-quality systems, on the other hand, generally preserve waveshape by using high digital sampling rates. The determination of an adequate compromise is seriously impeded by the basic differences in these two approaches. The objective here is to investigate an analysis-synthesis procedure, that has been used to represent other signals, as a vehicle for determining this compromise. The continuous speech is divided arbitrarily into time periods and each period is expressed as a set of coefficients of an exponential expansion. The distinctive nature of speech is reflected in the choice of basis and analysis period rather than by special processing operations such as the pitch extraction of a vocoder. It has been demonstrated by digital simulation that with a proper selection of parameters both temporal waveshape and the spectrum can be preserved by this method. The statistically selected basis consists of ten pairs of damped sines and cosines and the experimentally chosen analysis period is 5.2 milliseconds. The coefficients of this expansion were measured by digital filtering on the computer. The simulated system is capable of synthesizing high-quality speech for speakers whose average pitch varied from 80 to 245 Hz without changing either the basis or the period. Although the feasibility of such a system has been demonstrated, a detailed investigation of coding techniques will be necessary before its efficiency can be compared to other approaches.

2 citations
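
The frame-wise exponential expansion in the O'Neill abstract above can be sketched as a least-squares fit of one analysis period to damped sines and cosines. The abstract says the coefficients were measured by digital filtering; the least-squares solve below is a substitute chosen for brevity. The sampling rate, frequencies, damping and the two-pair basis are all assumptions (the abstract reports ten pairs and a 5.2 ms period), and every function name is hypothetical.

```python
import math


def damped_basis(freqs_hz, damping, n_samples, fs):
    """Build one damped cosine and one damped sine per frequency."""
    basis = []
    for f in freqs_hz:
        for trig in (math.cos, math.sin):
            basis.append([math.exp(-damping * n / fs)
                          * trig(2.0 * math.pi * f * n / fs)
                          for n in range(n_samples)])
    return basis


def solve(matrix, vector):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_frame(frame, basis):
    """Least-squares expansion coefficients of one frame (normal equations)."""
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis]
            for bi in basis]
    rhs = [sum(p * q for p, q in zip(bi, frame)) for bi in basis]
    return solve(gram, rhs)


def synthesize_frame(coeffs, basis):
    """Rebuild a frame as the weighted sum of the basis functions."""
    return [sum(c * b[n] for c, b in zip(coeffs, basis))
            for n in range(len(basis[0]))]


if __name__ == "__main__":
    fs = 10000                  # assumed sampling rate in Hz
    n = 52                      # 5.2 ms analysis period at the assumed rate
    basis = damped_basis([500.0, 1500.0], 100.0, n, fs)
    frame = synthesize_frame([1.0, -0.5, 0.25, 0.75], basis)
    print(fit_frame(frame, basis))
```

Only the coefficients per period would need to be coded and transmitted; the receiver holds the same fixed basis and rebuilds each frame with `synthesize_frame`.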


Journal ArticleDOI
TL;DR: The underlying theme of the work is to design a frequency quantized, continuous, short-term spectrum analysis technique capable of extracting statistically invariant properties of the human speech pattern to a binary quantized representation produced by the measurement apparatus.
Abstract: The paper deals with the design of an analyser-coder for construction of a digital code pattern from a human speech input suitable for the use of computers from which the intelligence content can be extracted without any human intervention. In the first part of this paper the major functions necessary in any pattern recognition system are noted, aspects of decision theory as pertinent to the present work are mentioned and a theoretical model for speech analysis by synthesis is described very briefly to introduce the work. The second part of the paper describes the experimental part of the system, human speech analyser and matrix coder. Some of the circuitry is conventional; the rest consists of special circuits designed for critical applications. The underlying theme of the work is to design a frequency quantized, continuous, short-term spectrum analysis technique capable of extracting statistically invariant properties of the human speech pattern to a binary quantized representation produced by the measurement appar...

2 citations


Journal ArticleDOI
L. Schweizer, H. Hoss
TL;DR: An encoder is described which features nonuniform quantizing by combining the processes of encoding and compression by utilizing the feedback weighing principle and achieves compression in accordance with a 13-segment characteristic.
Abstract: An encoder is described which features nonuniform quantizing by combining the processes of encoding and compression. The encoder utilizes the feedback weighing principle and achieves compression in accordance with a 13-segment characteristic. A distinguishing feature of the encoder is that it uses a certain redundancy in circuit elements instead of employing elements with close tolerances.

1 citation
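
Nonuniform quantizing by compression, as in the Schweizer and Hoss entry above, can be sketched in a few lines. A 13-segment characteristic is commonly associated with A-law companding, so this sketch assumes the smooth A-law curve with A = 87.6; it models neither the piecewise-linear segments nor the feedback encoder circuit, and the function names are hypothetical.

```python
import math

A = 87.6  # assumed A-law parameter; not stated in the abstract


def compress(x):
    """Map x in [-1, 1] through the smooth A-law compression curve."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))
    return math.copysign(y, x)


def expand(y):
    """Inverse of compress()."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(A)):
        x = ay * (1.0 + math.log(A)) / A
    else:
        x = math.exp(ay * (1.0 + math.log(A)) - 1.0) / A
    return math.copysign(x, y)


def quantize_nonuniform(x, bits=8):
    """Compress, quantize uniformly, then expand back to the signal domain."""
    levels = 2 ** (bits - 1) - 1  # steps available for the magnitude
    code = round(compress(x) * levels)
    return expand(code / levels)


def quantize_uniform(x, bits=8):
    """Plain uniform quantizer with the same number of steps, for comparison."""
    levels = 2 ** (bits - 1) - 1
    return round(x * levels) / levels


if __name__ == "__main__":
    for sample in (0.01, 0.1, 0.9):
        print(sample, quantize_nonuniform(sample), quantize_uniform(sample))
```

The compression spends more of the available steps on small amplitudes, so quiet samples are reproduced with a smaller absolute error than a uniform quantizer of the same word length gives, at the cost of coarser steps near full scale.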


Book ChapterDOI
01 Jan 1969
TL;DR: I should like to take advantage of the combined presence of neural theorists and experimental neurophysiologists to discuss two issues on which I think there has been some implicit disagreement.
Abstract: I should like to take advantage of the combined presence of neural theorists and experimental neurophysiologists to discuss two issues on which I think there has been some implicit disagreement that ought to be made explicit. The two issues are: (a) Does any aspect of the pattern of firing of neurons, beyond sheer frequency of firing, matter for the functioning of the nervous system? (b) Is the firing of individual neurons substantially less reliable than behavior?