
Showing papers on "Voice activity detection published in 1979"


Journal ArticleDOI
S. Boll1
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital waveform. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.

4,862 citations
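The core subtraction step can be sketched in a few lines. The following is a minimal illustration (frame-wise magnitude subtraction with half-wave rectification, resynthesis using the noisy phase), not Boll's complete algorithm, which adds secondary residual-noise attenuation; the function name and parameter values are illustrative.

```python
import numpy as np

def spectral_subtraction(noisy, noise_only, frame_len=256):
    """Minimal spectral subtraction: subtract the average noise
    magnitude spectrum (estimated during non-speech activity) from
    each frame of the noisy signal, then resynthesize a waveform."""
    # Spectral noise bias, calculated from a noise-only segment.
    n = noise_only[: len(noise_only) // frame_len * frame_len]
    noise_mag = np.abs(np.fft.rfft(n.reshape(-1, frame_len), axis=1)).mean(axis=0)

    out = np.zeros(len(noisy) // frame_len * frame_len)
    for i in range(0, len(out), frame_len):
        spec = np.fft.rfft(noisy[i:i + frame_len])
        # Subtract the bias; half-wave rectify so magnitudes stay non-negative.
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        # Resynthesize using the noisy signal's phase.
        out[i:i + frame_len] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame_len)
    return out
```

Because the output is again a waveform, such a routine can sit in front of a recognizer or vocoder, as the abstract notes.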


Journal ArticleDOI
TL;DR: Improved speech quality is obtained by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and by effective masking of the quantizer noise by the speech signal.
Abstract: Predictive coding methods attempt to minimize the rms error in the coded signal. However, the human ear does not perceive signal distortion on the basis of rms error alone; perception also depends on the spectral shape of the error relative to the signal spectrum. In designing a coder for speech signals, it is necessary to consider the spectrum of the quantization noise and its relation to the speech spectrum. The theory of auditory masking suggests that noise in the formant regions would be partially or totally masked by the speech signal. Thus, a large part of the perceived noise in a coder comes from frequency regions where the signal level is low. In this paper, methods for reducing the subjective distortion in predictive coders for speech signals are described and evaluated. Improved speech quality is obtained: 1) by efficient removal of formant and pitch-related redundant structure of speech before quantizing, and 2) by effective masking of the quantizer noise by the speech signal.

376 citations


Journal ArticleDOI
TL;DR: Research to code speech at 16 kbit/s with the goal of having the quality of the coded speech be equal to that of the original is reported, finding that the pitch predictor is not cost-effective on balance and may be eliminated.
Abstract: We report on research to code speech at 16 kbit/s with the goal of having the quality of the coded speech be equal to that of the original. Some of the original speech had been corrupted by noise and distortions typical of long-distance telephone lines. The basic structure chosen for our system was adaptive predictive coding. However, the rigorous requirements of this work led to a new outlook on the different aspects of adaptive predictive coding. We have found that the pitch predictor is not cost-effective on balance and may be eliminated. Solutions are presented to deal with the two types of quantization noise: clipping and granular noise. The clipping problem is completely eliminated by allowing the number of quantizer levels to increase indefinitely. An appropriate self-synchronizing variable-length code is proposed to minimize the average data rate; the coding scheme seems to be adequate for all speech and all conditions tested. The granular noise problem is treated by modifying the predictive coding system in a novel manner to include an adaptive noise spectral shaping filter. A design for such a filter is proposed that effectively eliminates the perception of granular noise.

99 citations


Journal ArticleDOI
TL;DR: The speech synthesis from concept system converts an input concept into speech by using a transformational grammar to generate a well‐formed English sentence and a word concatenation synthesizer to generate the actual speech output.
Abstract: A synthesis method, called speech synthesis from concept, is described which has been designed specifically for providing speech output from information systems. It differs from conventional techniques in that data is passed from the information system to the speech synthesis system, not in the form of text or phonetic transcription, but in the form of an abstract structure called an input concept. The speech synthesis from concept system converts an input concept into speech by using a transformational grammar to generate a well-formed English sentence and a word concatenation synthesizer to generate the actual speech output. The "top down" nature of this process reduces the computation required within the information system and enables high-quality speech to be produced.

69 citations


PatentDOI
Bishnu S. Atal1
TL;DR: In this paper, a speech signal is partitioned into intervals, and a set of coded prediction parameter signals, pitch period and voicing signals, and signals corresponding to the spectrum of the prediction error signal are produced.
Abstract: In a speech processing arrangement for synthesizing more natural sounding speech, a speech signal is partitioned into intervals. For each interval, a set of coded prediction parameter signals, pitch period and voicing signals, and a set of signals corresponding to the spectrum of the prediction error signal are produced. A replica of the speech signal is generated responsive to the coded pitch period and voicing signals as modified by the coded prediction parameter signals. The pitch period and voicing signals are shaped responsive to the prediction error spectral signals to compensate for errors in the predictive parameter signals whereby the speech replica is natural sounding.

48 citations


Proceedings ArticleDOI
B. Atal1, N. David
01 Apr 1979
TL;DR: A modified analysis-synthesis procedure which, although relying on the basic LPC technique for analysis and synthesis, avoids spectral amplitude and phase distortions introduced by these techniques.
Abstract: In speech analysis and synthesis based on linear prediction, it is a common assumption that predictor coefficients contain all the necessary spectral and phase information for accurate synthesis of the speech signal. However, even under the best circumstances, the synthetic speech sounds unnatural to the critical listener. Subjective tests reveal that spectral errors introduced by the linear prediction analysis techniques are a major source of unnatural sound quality in synthetic speech. This paper describes a modified analysis-synthesis procedure which, although relying on the basic LPC technique for analysis and synthesis, avoids the spectral amplitude and phase distortions introduced by these techniques. In the new method, proper reproduction of the speech spectrum at the receiver is ensured by transmitting the short-time spectrum of the prediction residual to the receiver.

39 citations


Journal ArticleDOI
TL;DR: A mathematical analysis for the steady state performance of a system where voice calls and data packets are transmitted over the same channel is developed.
Abstract: We develop a mathematical analysis for the steady state performance of a system where voice calls and data packets are transmitted over the same channel. The voice calls have priority over the data packets, in that the data packets are transmitted only when there are no voice calls present in the system or the voice conversation is in a long silent period.

30 citations


Proceedings ArticleDOI
01 Apr 1979
TL;DR: This paper describes a unique design that attacks two problem areas of LPC: noise suppression/input level control and real-time simulation/test.
Abstract: This paper describes a unique design that attacks two problem areas of LPC: noise suppression/input level control and real-time simulation/test. The noise level design uses algorithms to digitally process speech data before input to the LPC algorithm processor. The LPC processor described in the paper is based on a microprocessor design conceived specifically for speech. The noise suppression and level control algorithms are performed in a separate front-end processor that detects noise patterns and deletes them from the normal voice input. The operational hardware system is shown to the block diagram level, as is the particular simulation/test scheme. Test results are also described in this paper.

30 citations


PatentDOI
TL;DR: In this article, the upper sideband of a sampled-speech signal with the original baseband signal for further signal processing is enhanced by shifting both bands to form a continuum from 0 Hz to the sampling frequency.
Abstract: Signal-to-noise ratio is enhanced by including the upper sideband of a sampled-speech signal with the original baseband signal for further signal processing. The invention features shifting both bands to form a continuum from 0 Hz to the sampling frequency. Application in a speech recognition system is shown.

25 citations


Journal ArticleDOI
Daniel Minoli1
01 Aug 1979
TL;DR: A queuing model for a link carrying packetised voice is introduced and solved, and results on optimal packet length, transient behaviour, and buffer length are presented.
Abstract: Because of perceived economic and technical benefits, digital voice techniques and corresponding packet network architectures are receiving considerable attention. In this paper we summarise speech traffic models, followed by a discussion of performance criteria. A queuing model for a link carrying packetised voice is introduced and solved. The results and network implications of this link model are addressed; results on optimal packet length, transient behaviour, and buffer length are presented.

25 citations


PatentDOI
TL;DR: In this paper, human voice sounds are modified to produce the effect of a different person speaking, where a signal representative of the original voice sounds is separated into a plurality of voice signal components each having a different frequency band.
Abstract: Human voice sounds are modified to produce the effect of a different person speaking. A signal representative of the original voice sounds is separated into a plurality of voice signal components each having a different frequency band. The frequency of at least one voice signal component is shifted and the voice signal components are recombined to produce a modified voice signal representative of the modified but intelligible voice sounds.

Journal ArticleDOI
TL;DR: A speech processing system has been developed which is capable of providing an accurate indication of whether a given speech segment is voiced or unvoiced, and offers a reliable voiced/unvoiced (V/UV) decision even in the presence of some competing speech and noise sources.
Abstract: A speech processing system has been developed which is capable of providing an accurate indication of whether a given speech segment is voiced or unvoiced. In comparison to other existing techniques, this one lends itself to easy implementation, and it offers a reliable voiced/unvoiced (V/UV) decision even in the presence of some competing speech and noise sources. In addition, the V/UV decision is achieved in real time, with time delays of ≤ 4 ms going from voiced to unvoiced speech and ≤ 2 ms going from unvoiced to voiced speech.
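The abstract does not spell out the decision features, but V/UV detectors of this era typically combined short-time energy with zero-crossing rate (voiced speech: high energy, few zero crossings; unvoiced speech: the reverse). The sketch below is a hypothetical illustration of that style of rule, with made-up thresholds, not the paper's actual system.

```python
import numpy as np

def classify_vuv(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Toy voiced/unvoiced decision for one short speech frame.
    Voiced speech tends to be high-energy with a low zero-crossing
    rate; anything else is labeled unvoiced."""
    energy = np.mean(frame ** 2)
    # Fraction of adjacent sample pairs whose signs differ.
    zcr = np.mean(np.abs(np.diff(np.signbit(frame).astype(int))))
    return "voiced" if energy > energy_thresh and zcr < zcr_thresh else "unvoiced"
```

Such a per-frame rule is cheap enough to run sample-synchronously, which is how detectors of this kind achieved millisecond-scale decision delays.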

Patent
30 Aug 1979
TL;DR: In this paper, a digital speech interpolation system is combined with an adaptive differential PCM (ADPCM), employing a speech detector for detecting speech signals and for discriminating voiced and unvoiced sounds.
Abstract: A digital speech interpolation system is combined with adaptive differential PCM (ADPCM), employing a speech detector for detecting speech signals and for discriminating voiced and unvoiced sounds. Adaptive quantization bit assignment to the speech is adopted to cope with any freeze-out condition. Further, PCM speech signals sampled at 8 kHz are shifted down by 250 Hz and converted to a 6 kHz sampling frequency before being applied to the ADPCM, thereby attaining a total gain of about 7 without degrading speech quality.

Journal ArticleDOI
TL;DR: The author expects such methods to be available to the business world before too long, including rapid speech synthesis from printed text inputs able to accommodate an unlimited vocabulary.
Abstract: Reviews the techniques for producing synthetic speech from a computer. The author expects such methods to be available to the business world before too long, including rapid speech synthesis from printed text inputs able to accommodate an unlimited vocabulary. Topics include: analogue recording; human speech; compressed digital speech; speech synthesis from text; software; conversion to sound; synthetic speech for business.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: A speaker dependent system for recognizing carefully articulated continuous speech that accepts English sentences composed from a 127 word vocabulary appropriate to an airline information reservation task and achieves 75% sentence recognition.
Abstract: A speaker dependent system for recognizing carefully articulated continuous speech is described. The system accepts English sentences composed from a 127 word vocabulary appropriate to an airline information reservation task. The system is controlled by a finite state parser which generates word candidates and establishes their temporal locations in hypothetical sentences. The word candidates are evaluated by an LPC distance measure and a dynamic programming algorithm which nonlinearly time aligns isolated word reference templates with the input speech stream. The input is recognized as the hypothetical sentence having the lowest distance according to a well-defined criterion. In a preliminary test based on 100 sentences spoken over dialed-up telephone lines by two male talkers, 90% word accuracy, resulting in 75% sentence recognition, was achieved.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: It is shown that by careful design the algorithm can be made to be as robust to channel errors as that of a fixed rate ADPCM coder.
Abstract: In this paper we examine a number of concepts and issues concerning variable rate coding of speech. We formulate the problem as a multistate coder (i.e. a coder that can operate at several bit rates) coupled with a time buffer. We first analyze the theoretical aspects of the problem by examining it in the context of a block processing formulation. We also allude to a multiple user configuration of variable rate coding for TASI type applications. A practical example of a variable rate ADPCM coder is presented and applied to speech coding. It is shown that by careful design the algorithm can be made to be as robust to channel errors as that of a fixed rate ADPCM coder.

Proceedings ArticleDOI
L. Nebbia1, P. Lucchini1
02 Apr 1979
TL;DR: An automatic vocal response system for the Italian language has been implemented at CSELT, consisting of a hardware speech synthesizer controlled by a programmed device (mini or micro computer) and two excitation generators for voiced and unvoiced sounds.
Abstract: An automatic vocal response system for the Italian language has been implemented at CSELT, consisting of a hardware speech synthesizer controlled by a programmed device (mini or micro computer). The synthesizer exploits a speech production model composed of a 10th order digital lattice filter and two excitation generators for voiced and unvoiced sounds. The hardware also includes a module which controls the updating and transfer of the parameters, and an output module which provides the analog speech signal. The synthesizer configuration is modular and expandable up to 8 channels. For each channel, the minicomputer supplies the synthesizer with the start-stop command plus 13 parameters: 10 filter coefficients, a gain factor, the pitch period and voiced-unvoiced information, and the updating interval. For each channel, every 125 µs, 20 multiplications, 9 additions and 10 subtractions are executed. The filter and the source generator are time-shared among the 8 channels. The complete digital equipment is implemented with TTL-LS integrated circuits.

Journal ArticleDOI
TL;DR: This paper describes a speech digitizer that is capable of transmitting and receiving at 2400 bits/s and typical applications of such digitizers are described.
Abstract: This paper describes a speech digitizer that is capable of transmitting and receiving at 2400 bits/s. Comparisons are made between this implementation and past approaches. Typical applications of such digitizers are also described.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: This paper describes continuing efforts which have concentrated on minimizing loss of synchronization between the receiver and the transmitter, and applies constraints which guarantee synchronization at a cost of some freedom in the selection of data for transmission.
Abstract: Recently we described a variable-frame-rate LPC vocoder designed to transmit good quality speech over 2400 bps fixed-rate noisy channels with bit-error probabilities ranging up to 5% [3]. The basic idea was to lower the data rate by transmitting LPC parameters only when speech characteristics have changed sufficiently since the last transmission, and to employ the resulting bit-rate savings for protecting important transmission data against channel noise. This paper describes our continuing efforts, which have concentrated on minimizing loss of synchronization between the receiver and the transmitter. In one approach, we emphasize heavy protection of the header and rapid resynchronization. Alternatively, we apply constraints which guarantee synchronization at a cost of some freedom in the selection of data for transmission. Results from the first approach are presented; results from both methods will be compared at the conference.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: The SIRENE system is an interactive computer-based system of speech-training aids for the deaf that features the use of automatic speech recognition algorithms in the training of sounds and words.
Abstract: This paper describes the SIRENE system which is being developed in our laboratory. SIRENE is an interactive computer-based system of speech-training aids for the deaf. It also includes a variety of procedures for analysis and classification of pathological voices. The basic idea of speech-training aids consists of compensating for the lack of auditory feedback in deaf children by use of visual displays. The system is intended to be used by speech teachers; several acoustic and phonetic parameters of speech can be displayed and trained: pitch, voicing, intensity, etc. SIRENE also features the use of automatic speech recognition algorithms in the training of sounds and words.

Proceedings ArticleDOI
E. Vivalda1, S. Sandri, C. Miotti
01 Apr 1979
TL;DR: The paper describes the software architecture of an Italian text-to-speech synthesis system based on the joining of LPC coded diphones, which is designed according to multichannel and real time criteria.
Abstract: The paper describes the software architecture of an Italian text-to-speech synthesis system based on the joining of LPC coded diphones. The automatic voice response system is designed according to multichannel and real time criteria. For each output channel, the following operations are performed: pre-processing of the input string of characters, translation into the proper sequence of diphones, generation of prosodic contours and real-time control of a hardware speech synthesizer.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: Under the present restriction to vowel spectra, adaptation methods based on spectral amplitude weighting and on spectral shifting are investigated; a special method makes it possible to adapt test spectra class-specifically.
Abstract: An automatic speech recognition system based on the reference set of a single speaker can be extended for use by several speakers by applying appropriate preprocessing transformations. These transformations adapt the incoming patterns of a new speaker to the patterns of the reference set. Under the present restriction to vowel spectra, adaptation methods based on spectral amplitude weighting and on spectral shifting are investigated. A special method makes it possible to adapt test spectra class-specifically.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: The quality of reproduction and storage requirements for the Microprogrammed Intoned Speech Synthesizer utilizing linear predictive coding techniques is evaluated in comparison with other systems.
Abstract: To provide speech output for a major demonstration project of sophisticated computer based instruction, the Microprogrammed Intoned Speech Synthesizer (MISS) utilizing linear predictive coding techniques was developed. In addition to simple resynthesis of preanalyzed recorded speech, the MISS system can apply procedures to modify the pitch, duration and amplitude of individual words so that they can be concatenated into natural sounding utterances. Text analysis programs provide the linguistic parameters that direct MISS to apply the appropriate manipulations. We evaluate the quality of reproduction and storage requirements for this system in comparison with other systems.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: It is proposed to characterize the speech short-term spectrum with a reduced number of parameters (4 to 7) computed from a rough spectral analysis that permits a correct classification of the steady-state French speech sounds pronounced by different speakers.
Abstract: Tracking and identifying the formants in order to perform speech recognition is a time-consuming, error-prone and speaker-dependent operation. It is proposed to characterize the speech short-term spectrum with a reduced number of parameters (4 to 7) computed from a rough spectral analysis. These parameters permit a correct classification of the steady-state French speech sounds (vowels, including nasals, and unvoiced fricatives) pronounced by different speakers. A word recognition experiment based on the same parameters gives good results with words differing from each other by one phoneme only (single speaker, one learning pass).

Proceedings ArticleDOI
02 Apr 1979
TL;DR: A prototype for the acoustic-phonetic processing level, which makes it possible to test various parameters and strategies for phonemic transcription of continuous speech, is realized in the framework of MYRTILLE II Speech Understanding System.
Abstract: In the framework of MYRTILLE II Speech Understanding System under development in our Laboratory we have realized a prototype for the acoustic-phonetic processing level. This prototype makes it possible to test various parameters and strategies for phonemic transcription of continuous speech. It can be considered as a metasystem in the sense that, given a hierarchy of recognition algorithms and a strategy, it can generate the optimal system for phoneme recognition. The system directly works on the digitized speech wave, which makes it possible to get the best accuracy on the parameters. The speech signal is segmented into phoneme-like units by a decision function which incorporates voicing, energy, zero-crossing rate and curve length. The segments thus obtained are then processed by the recognition system which can be viewed as a tree structure the nodes of which are algorithms. These algorithms take into account one or several features and their answers can be phoneme classes and/or other algorithms. Problems involved in the design of such a system are also presented in this paper together with a particular implementation.
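The segmentation step described here (a decision function over voicing, energy, zero-crossing rate and curve length) can be illustrated with a toy segmenter. The feature set follows the abstract, but the change-detection rule and every parameter value below are hypothetical stand-ins, not the MYRTILLE II implementation.

```python
import numpy as np

def frame_features(frame):
    """Per-frame cues of the kind the segmenter uses: short-time energy,
    zero-crossing rate, and curve length (sum of absolute sample
    differences, which grows with high-frequency content)."""
    energy = float(np.sum(frame ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.signbit(frame).astype(int)))))
    curve_length = float(np.sum(np.abs(np.diff(frame))))
    return energy, zcr, curve_length

def segment_boundaries(signal, frame_len=160, jump=2.0):
    """Mark a boundary wherever the feature vector changes sharply
    between consecutive frames (hypothetical threshold rule)."""
    feats = [frame_features(signal[i:i + frame_len])
             for i in range(0, len(signal) - frame_len + 1, frame_len)]
    bounds = []
    for k in range(1, len(feats)):
        prev, cur = np.array(feats[k - 1]), np.array(feats[k])
        # Relative change in the feature vector between adjacent frames.
        change = np.linalg.norm(cur - prev) / (np.linalg.norm(prev) + 1e-9)
        if change > jump:
            bounds.append(k * frame_len)
    return bounds
```

A sharp jump in the per-frame feature vector is taken as a phoneme-like boundary; the resulting segments would then be passed to the tree of recognition algorithms.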


Proceedings ArticleDOI
02 Apr 1979
TL;DR: A system is being developed which permits a fully synchronized presentation of recorded speech with its corresponding printed text, and any synchronization errors are corrected through operator intervention prior to the creation of the synchronized speech/text material for the classroom.
Abstract: A communications problem encountered by most hearing impaired people is their inability to understand spoken English. Since present technology appears unable to eliminate this speech perception problem, it is hoped that a better understanding of the relationship between printed and spoken English will permit the hearing impaired person to better use their residual hearing. To aid in such instruction, a system is being developed which permits a fully synchronized presentation of recorded speech with its corresponding printed text. A series of computer algorithms are employed to segment the speech signal into syllable-like units, and to separate the corresponding printed text into syllables. The resulting data are then combined on a syllable-by-syllable basis, and any synchronization errors are corrected through operator intervention prior to the creation of the synchronized speech/text material for the classroom.

Proceedings ArticleDOI
01 Apr 1979
TL;DR: A speech coding algorithm for digital transmission of speech at a rate of 9600 bits per second which can be implemented on a speech processing system is described and yielded a signal-to-noise ratio which is indicative of very high quality speech.
Abstract: A speech coding algorithm for digital transmission of speech at a rate of 9600 bits per second which can be implemented on a speech processing system is described. The algorithm combines the following: a pitch extraction loop, a pitch compensating adaptive quantizer, a sequentially adaptive linear predictor, adaptive source coding, and multipath tree searching to generate very high quality speech output. Although each of these elements has been previously applied to speech coding, the combination of all five of these elements has not been discussed before. Preliminary simulation studies of the algorithm have yielded a signal-to-noise ratio which is indicative of very high quality speech.

Journal ArticleDOI
TL;DR: The linguistic and contextual knowledge that must be supplied or programmed into a computer to accomplish speech interpretation is the subject of several research activities which are described.
Abstract: A major motivation is to achieve in man-machine interactions the efficiency of speech communication among humans. Continuous speech is more difficult to understand than are isolated words. Commercially available speech recognition systems of the latter type are highly successful despite their limited capability. To recognize continuous speech, more information is needed than is contained in acoustic waves alone. The linguistic and contextual knowledge that must be supplied or programmed into a computer to accomplish speech interpretation is the subject of several research activities which are described. Speech synthesis systems face similar problems but are further advanced.

Proceedings ArticleDOI
04 Sep 1979