Showing papers on "Voice activity detection" published in 1974


Journal ArticleDOI
E. Lyghounis, I. Poretti, G. Monti
TL;DR: The results obtained from measurement studies carried out with the ATIC equipment have enabled the determination of the criteria which must be followed in the selection of the most suitable interpolation procedures, and it has been possible to analyze the precise working characteristics under normal operating conditions.
Abstract: In a telephone conversation a channel is activated by the voice only 25 percent of the time. It is therefore possible to provide a number of m connections with a number of n channels, where n is less than m. When the number of connections is large, m/n (gain) tends to the inverse of the activity. The interpolation procedures used in overload conditions and the resulting degradations, as well as the typical circuits of a voice interpolation apparatus, are examined. Two developments are also illustrated: the ATIC equipment and a single-channel digital speech interpolation (DSI) developed only for measurement purposes. The results obtained from measurement studies carried out with the aforementioned apparatus have enabled the determination of the criteria which must be followed in the selection of the most suitable interpolation procedures. Besides examining the complexity and the reliability which the DSI equipment calls for, it has been possible to analyze the precise working characteristics under normal operating conditions.

26 citations
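
The gain claim in the abstract above (m/n tending to the inverse of the 25 percent activity, that is, toward 4) can be illustrated with a small calculation. This is only a sketch and is not taken from the paper: it assumes independent talkers (a binomial model) and a hypothetical 1 percent overload target, and the function names are made up for illustration.

```python
from math import comb

def overload_probability(m: int, n: int, activity: float = 0.25) -> float:
    """Probability that more than n of m talkers are active at once.

    Assumes talkers are independent and each is active with the same
    probability (binomial model); this is an illustrative assumption.
    """
    return sum(
        comb(m, k) * activity**k * (1 - activity) ** (m - k)
        for k in range(n + 1, m + 1)
    )

def channels_needed(m: int, activity: float = 0.25, max_overload: float = 0.01) -> int:
    """Smallest n whose overload probability is at or below max_overload."""
    for n in range(1, m + 1):
        if overload_probability(m, n, activity) <= max_overload:
            return n
    return m

# The gain m/n grows with the number of connections and approaches 1/activity.
for m in (10, 100, 400):
    n = channels_needed(m)
    print(f"m={m:4d}  n={n:4d}  gain={m / n:.2f}")
```

Under these assumptions the gain stays below 4 for any finite m and moves toward it as m grows, which is the behaviour the abstract describes.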


Journal ArticleDOI
J.-P. Haton
TL;DR: A real-time isolated-word recognition system is applied to the recognition of sentences of a language used in numerical command of machine tools; the system will then be adapted to the recognition of connected speech and used in the design of new languages for man-machine voice communication.
Abstract: In this paper we describe an application of a real-time isolated-word recognition system to the recognition of sentences of a language used in numerical command of machine tools. The acoustic level operates with a dynamic matching procedure, and knowledge about syntactics and semantics of the language is used to predict the incoming words. With such a syntax-directed system, real-time recognition of sentences pronounced word-by-word is very accurately achieved, even for several speakers. This system will then be adapted to the recognition of connected speech, as well as in the design of new languages for man-machine voice communication.

16 citations


01 Jan 1974
TL;DR: The intrinsic characteristics and associated attractive features and problem areas of speech as a man-computer communication channel are investigated and the requirements these place on the application of speech understanding systems are discussed.
Abstract: The report investigates the intrinsic characteristics and associated attractive features and problem areas of speech as a man-computer communication channel. Speech is independent of the manual and visual channels normally used for communicating with computers; it may permit simultaneous communication with both men and machines, and it can be used while the speaker is in motion, behind obstructions, and in total darkness. Any telephone instrument can become a computer terminal. Effective techniques for automatic speech recognition by computers are lacking. In a few years the results of current efforts in speech understanding research should make spoken communication with computers technically and economically feasible. The report discusses the general characteristics of man-computer tasks and interaction and the requirements these place on the application of speech understanding systems.

8 citations


Journal ArticleDOI
TL;DR: A speech intelligibility measurement system that uses a computer to administer the test, record the listener response, and automatically evaluate it on-line, and is being used in the development and evaluation of analysis-synthesis type of speech compression systems and for identifying perceptually important parameters from the linear prediction model of speech.
Abstract: A speech intelligibility measurement system is described that uses a computer to administer the test, record the listener response, and automatically evaluate it on-line. This makes the intelligibility testing conditions uniform at all times and the test more efficient compared to conventional methods. The test words are presented in random scramblings by using a shuffling algorithm. The listener's response is entered via a graphic tablet used as a "software keyboard." The response evaluation is done based on the similarity of sounds and not of spellings. The system is being used in the development and evaluation of analysis-synthesis type of speech compression systems and for identifying perceptually important parameters from the linear prediction model of speech. Adaption of this system to various speech perception experiments is also discussed.

7 citations
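
The abstract above says the test words are presented in random scramblings produced by a shuffling algorithm, without naming the algorithm. One common choice is the Fisher-Yates shuffle; the sketch below assumes that choice, and the function name and word list are hypothetical.

```python
import random

def scrambled_word_list(words, rng):
    """Return a uniformly random ordering of words (Fisher-Yates shuffle).

    The original list is left untouched so that each listener can be
    given an independent scrambling of the same test material.
    """
    order = list(words)
    # Walk from the last position down, swapping each slot with a
    # randomly chosen slot at or before it.
    for i in range(len(order) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        order[i], order[j] = order[j], order[i]
    return order

test_words = ["bat", "pat", "mat", "cat", "hat"]  # hypothetical test items
presentation = scrambled_word_list(test_words, random.Random(1974))
print(presentation)
```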


Journal ArticleDOI
TL;DR: In analysis/synthesis systems for the digital coding of speech, the synthesis control information is normally required in ‘frames’ arriving at a constant rate; at the expense of a small delay, the frame rate can be reduced considerably by transmitting appropriately selected frames and deriving intermediate frames from those transmitted.
Abstract: In analysis/synthesis systems for the digital coding of speech, the synthesis control information is normally required in ‘frames’ arriving at a constant rate. At the expense of a small delay, a considerable reduction of frame rate is possible by transmitting appropriately selected frames, and deriving intermediate frames from those transmitted.

7 citations
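
The idea in the abstract above, sending only selected frames and deriving the rest at the receiver, can be sketched as follows. The abstract does not say how frames are selected or how intermediate frames are derived; this sketch assumes a greedy selection with a per-parameter error tolerance and linear interpolation, and all names are illustrative.

```python
def interpolate(a, b, t):
    """Linear interpolation between parameter vectors a and b, 0 <= t <= 1."""
    return [x + (y - x) * t for x, y in zip(a, b)]

def fits(frames, start, end, tol):
    """True if every frame strictly between start and end is within tol
    of the straight line joining frames[start] and frames[end]."""
    span = end - start
    for j in range(start + 1, end):
        approx = interpolate(frames[start], frames[end], (j - start) / span)
        if max(abs(p - q) for p, q in zip(approx, frames[j])) > tol:
            return False
    return True

def select_frames(frames, tol):
    """Greedy choice of frame indices to transmit. Looking ahead past the
    current frame is what costs the small delay."""
    kept, anchor, n = [0], 0, len(frames)
    while anchor < n - 1:
        end = anchor + 1
        while end + 1 < n and fits(frames, anchor, end + 1, tol):
            end += 1
        kept.append(end)
        anchor = end
    return kept

def reconstruct(sent, kept, total):
    """Receiver side: rebuild all frames from the transmitted ones."""
    out = [None] * total
    out[kept[0]] = list(sent[0])
    for k in range(len(kept) - 1):
        i0, i1 = kept[k], kept[k + 1]
        for j in range(i0, i1 + 1):
            out[j] = interpolate(sent[k], sent[k + 1], (j - i0) / (i1 - i0))
    return out

# One-parameter frames: a ramp followed by a steady value.
frames = [[0.0], [1.0], [2.0], [3.0], [3.0], [3.0]]
kept = select_frames(frames, tol=1e-6)
rebuilt = reconstruct([frames[i] for i in kept], kept, len(frames))
print(kept, rebuilt)
```

In this toy input only the frames at the ends of each straight segment need to be sent; real synthesis parameters would need a tolerance chosen by listening tests.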


Patent
16 Sep 1974
TL;DR: In this article, a system for enabling control data to be inserted into a speech memory of a time division switching system in place of speech samples is described, where the control data may then be used for maintenance and test purposes.
Abstract: A system is disclosed for enabling control data to be inserted into a speech memory of a time division switching system in place of speech samples. The control data may then be used for maintenance and test purposes. In the system, a control memory is provided with cells associated with a speech memory row which normally stores a speech sample. Depending upon the condition of the appropriate control memory cells, either the speech sample or the control data is inserted into the speech memory row. Control circuits provide appropriate switching signals during half-time of the TDM elementary time.

5 citations


01 Dec 1974
TL;DR: The authors have developed several methods for reducing the redundancy in the speech signal without sacrificing speech quality, including preemphasis of the incoming speech signal, adaptive optimal selection of predictor order, optimal selection and quantization of transmission parameters, variable frame rate transmission, optimal encoding, and improved synthesis methodology.
Abstract: This report describes work in developing a linear predictive speech compression system that transmits high quality speech at low bit rates. The authors have developed several methods for reducing the redundancy in the speech signal without sacrificing speech quality. Included among these methods are preemphasis of the incoming speech signal, adaptive optimal selection of predictor order, optimal selection and quantization of transmission parameters, variable frame rate transmission, optimal encoding, and improved synthesis methodology. When all of these were incorporated in a floating point simulation of a pitch-excited linear predictive vocoder, synthesized speech with high quality at average transmission rates as low as 1500 bps was obtained.

4 citations


Journal ArticleDOI
TL;DR: Current scientific efforts in the field of digital processing of speech are focused on improving the efficiency of the present state of the art and on developing new digital speech communication systems.
Abstract: Current scientific efforts in the field of digital processing of speech are focused on improving the efficiency of the present state of the art and on developing new digital speech communication systems. Therefore, thorough studies on the statistical characteristics of speech signals, speech coding, speech recognition, and speech synthesis are necessary. Recent results and actual trends are reviewed in this paper.

3 citations



Journal ArticleDOI
TL;DR: An economical (< 600-dollar) hardware realization of a 4-kHz digital linear predictive speech synthesizer which requires, at most, a CPU overhead of about 40 percent real time and permits the utilization of formant concatenation techniques and reduces the coefficient storage required to specify vowels/voiced consonants by about 60 percent.
Abstract: Speech analysis/synthesis algorithms utilizing linear prediction coefficients have certain advantages over those employing formant-based techniques. For example, 4-kHz speech samples may be synthesized using a basic sequence of 10 multiply/adds followed by a single addition of the current sample of the excitation function. Real-time software synthesis of 4-kHz speech is possible (using this technique) on certain 16-b minicomputers, but the central processing unit (CPU) overhead may approach 100 percent. We describe an economical (< 600-dollar) hardware realization of a 4-kHz digital linear predictive speech synthesizer which requires, at most, a CPU overhead of about 40 percent real time. The device is constructed of standard TTL/MOS logic and consists (essentially) of a high speed 2's complement multiplier/adder capable of calculating a 26-b product (10-b speech samples, 16-b coefficients) in 0.33 μs, and a dual shift register. In addition, a procedure is discussed which enables the device to be used both as a formant synthesizer for vowels or voiced consonant production, and as a predictive synthesizer for other speech sounds. This procedure, hybrid synthesis, permits the utilization of formant concatenation techniques and reduces the coefficient storage required to specify vowels/voiced consonants by about 60 percent.

3 citations
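
The "10 multiply/adds followed by a single addition of the current sample of the excitation function" in the abstract above corresponds to the usual all-pole synthesis recursion. The sketch below assumes the sign convention s[n] = a1*s[n-1] + ... + a10*s[n-10] + e[n], which the abstract does not state, and it uses floating point rather than the fixed-point arithmetic of the hardware described.

```python
def lpc_synthesize(coeffs, excitation):
    """All-pole synthesis: each output sample is a weighted sum of the
    previous len(coeffs) outputs plus the current excitation sample.

    Assumed convention: s[n] = sum(coeffs[k-1] * s[n-k]) + e[n].
    """
    order = len(coeffs)
    history = [0.0] * order  # history[0] holds s[n-1], history[1] holds s[n-2], ...
    output = []
    for e in excitation:
        acc = 0.0
        for a, past in zip(coeffs, history):  # `order` multiply/adds
            acc += a * past
        sample = acc + e                      # single addition of the excitation
        output.append(sample)
        history = [sample] + history[:-1]     # shift the delay line
    return output

# Order-10 predictor with only the first coefficient non-zero, driven by an impulse.
coeffs = [0.5] + [0.0] * 9
impulse = [1.0] + [0.0] * 7
response = lpc_synthesize(coeffs, impulse)
print(response)
```

With these illustrative coefficients the impulse response simply halves at each step, which makes the recursion easy to check by hand.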


ReportDOI
01 Jul 1974
TL;DR: The objective is to conceive and demonstrate the feasibility of two or more improved strategies to estimate and encode the excitation parameters of human speech to excite a time-varying vocal tract 'filter' in the synthesizer.
Abstract: This study is aimed at the broad goal of the DoD Secure Voice Consortium to develop hardware models of improved narrow-band voice coders. The study is focused on the 'pitch and voicing' problem. The objective is to conceive and demonstrate the feasibility of two or more improved strategies to estimate and encode the excitation parameters of human speech. The decoded parameters will be used to excite a time-varying vocal tract 'filter' in the synthesizer.


01 Jan 1974
TL;DR: In this paper, a VOX technique for reducing noise in voice communication systems is described which is based on the separation of voice signals into contiguous frequency-band components with the aid of an adaptive VOX in each band.
Abstract: A VOX technique for reducing noise in voice communication systems is described which is based on the separation of voice signals into contiguous frequency-band components with the aid of an adaptive VOX in each band. It is shown that this processing scheme can effectively reduce both wideband and narrowband quasi-periodic noise since the threshold levels readjust themselves to suppress noise that exceeds speech components in each band. Results are reported for tests of the adaptive VOX, and it is noted that improvements can still be made in such areas as the elimination of noise pulses, phoneme reproduction at high-noise levels, and the elimination of distortion introduced by phase delay.
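
The per-band gating described in the abstract above can be sketched as follows. The abstract does not give the adaptation rule, so the one used here (a noise floor per band that follows drops immediately and creeps upward slowly, with the gate opening when the band energy exceeds a multiple of that floor) is an assumption, as are the parameter names and values. The filter bank itself is omitted; the input is taken to be band energies that have already been measured.

```python
def adaptive_vox_gates(band_energies, margin=2.0, rise=0.05):
    """Return, for every frame and band, whether that band's gate is open.

    band_energies: list of frames, each a list with one energy per band.
    margin, rise: hypothetical tuning constants for this sketch.
    """
    n_bands = len(band_energies[0])
    floor = list(band_energies[0])  # start the noise estimate from the first frame
    gates = []
    for frame in band_energies:
        row = []
        for b in range(n_bands):
            energy = frame[b]
            if energy < floor[b]:
                floor[b] = energy                        # follow drops at once
            else:
                floor[b] += rise * (energy - floor[b])   # creep upward slowly
            row.append(energy > margin * floor[b])       # open only well above the floor
        gates.append(row)
    return gates

# Band 0: quiet noise, then speech. Band 1: steady loud narrowband noise.
energies = [[1.0, 5.0]] * 5 + [[10.0, 5.0]] * 3
gates = adaptive_vox_gates(energies)
print(gates)
```

In this toy input the steady band never opens because its threshold has settled above the noise, while the other band opens when speech arrives. A floor that only rises slowly will eventually close the gate on sustained speech, so a practical design would need a more careful update rule.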