
Showing papers on "Speech enhancement published in 1984"


Journal ArticleDOI
TL;DR: In this article, a system which utilizes a minimum mean square error (MMSE) estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm.
Abstract: This paper focuses on the class of speech enhancement systems which capitalize on the major importance of the short-time spectral amplitude (STSA) of the speech signal in its perception. A system which utilizes a minimum mean-square error (MMSE) STSA estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm. In this paper we derive the MMSE STSA estimator, based on modeling speech and noise spectral components as statistically independent Gaussian random variables. We analyze the performance of the proposed STSA estimator and compare it with a STSA estimator derived from the Wiener estimator. We also examine the MMSE STSA estimator under uncertainty of signal presence in the noisy observations. In constructing the enhanced signal, the MMSE STSA estimator is combined with the complex exponential of the noisy phase. It is shown here that the latter is the MMSE estimator of the complex exponential of the original phase, which does not affect the STSA estimation. The proposed approach results in a significant reduction of the noise, and provides enhanced speech with colorless residual noise. The complexity of the proposed algorithm is approximately that of other systems in the discussed class.

3,905 citations
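
The gain function at the heart of such an MMSE STSA system can be sketched numerically. The snippet below is an illustrative implementation of the standard closed-form MMSE STSA gain as a function of the a priori SNR xi and the a posteriori SNR gamma; the function name and the use of SciPy's scaled Bessel functions are my choices, not the paper's code.

```python
import numpy as np
from scipy.special import i0e, i1e  # exponentially scaled modified Bessel functions


def mmse_stsa_gain(xi, gamma):
    """Spectral gain of the MMSE short-time spectral amplitude estimator.

    xi    : a priori SNR of a spectral component
    gamma : a posteriori SNR of the same component
    """
    v = xi / (1.0 + xi) * gamma
    # i0e(x) = exp(-x) * I0(x), so the exp(-v/2) factor in the gain
    # formula is absorbed without numerical overflow for large v.
    return (np.sqrt(np.pi) / 2.0) * (np.sqrt(v) / gamma) * (
        (1.0 + v) * i0e(v / 2.0) + v * i1e(v / 2.0)
    )


# At high SNR the gain approaches the Wiener gain xi / (1 + xi).
print(mmse_stsa_gain(100.0, 100.0))  # close to 100/101
```

The enhanced amplitude is this gain times the noisy amplitude, combined with the noisy phase as the abstract describes.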



Journal ArticleDOI
TL;DR: An algorithm to estimate a signal from its modified short-time Fourier transform (STFT) by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT magnitude is presented.
Abstract: In this paper, we present an algorithm to estimate a signal from its modified short-time Fourier transform (STFT). This algorithm is computationally simple and is obtained by minimizing the mean squared error between the STFT of the estimated signal and the modified STFT. Using this algorithm, we also develop an iterative algorithm to estimate a signal from its modified STFT magnitude. The iterative algorithm is shown to decrease, in each iteration, the mean squared error between the STFT magnitude of the estimated signal and the modified STFT magnitude. The major computation involved in the iterative algorithm is the discrete Fourier transform (DFT) computation, and the algorithm appears to be real-time implementable with current hardware technology. The algorithm developed in this paper has been applied to the time-scale modification of speech. The resulting system generates very high-quality speech, and appears to be better in performance than any existing method.

1,899 citations
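
The iterative magnitude-only synthesis described above alternates between inverting the current complex STFT estimate and restoring the target magnitude with the re-analyzed phase. A minimal sketch using SciPy's stft/istft; the window length, overlap, and random phase initialization are illustrative assumptions, not the paper's exact analysis parameters.

```python
import numpy as np
from scipy.signal import stft, istft


def iterative_stft_magnitude(target_mag, n_iter=100, nperseg=256, noverlap=192):
    """Estimate a signal whose STFT magnitude matches target_mag.

    Each iteration inverts the current complex STFT estimate, re-analyzes the
    result, and keeps its phase while restoring the target magnitude; the
    STFT-magnitude squared error is non-increasing across iterations.
    Assumes istft followed by stft preserves the time-frequency grid shape.
    """
    rng = np.random.default_rng(0)
    phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, target_mag.shape))
    x = None
    for _ in range(n_iter):
        _, x = istft(target_mag * phase, nperseg=nperseg, noverlap=noverlap)
        _, _, X = stft(x, nperseg=nperseg, noverlap=noverlap)
        phase = np.exp(1j * np.angle(X))
    return x
```

The per-iteration cost is one forward and one inverse transform, consistent with the abstract's point that the dominant computation is the DFT.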


Proceedings ArticleDOI
19 Mar 1984
TL;DR: Results for a speaker dependent connected digit speech recognition task with a base error rate of 1.6%, show that preprocessing the noisy unknown speech with a 10 dB signal-to-noise ratio reduces the error rate from 42% to 10%.
Abstract: Acoustic noise suppression is treated as a problem of finding the minimum mean square error estimate of the speech spectrum from a noisy version. This estimate equals the conditional expectation of the speech spectrum given the noisy spectral value, the mean noise power, and the mean speech power. It is shown that speech is not Gaussian. This results in an optimal estimate which is a non-linear function of the spectral magnitude. This function differs from the Wiener filter, especially at high instantaneous signal-to-noise ratios. Since both speech and Gaussian noise have a uniform phase distribution, the optimal estimator of the phase equals the noisy phase. The paper describes how the estimator can be calculated directly from noise-free speech. It describes how to find the optimal estimator for the complex spectrum, the magnitude, the squared magnitude, the log magnitude, and the root-magnitude spectra. Results for a speaker-dependent connected digit speech recognition task with a base error rate of 1.6% show that preprocessing the noisy unknown speech at a 10 dB signal-to-noise ratio reduces the error rate from 42% to 10%. If the template data are also preprocessed in the same way, the error rate reduces to 2.1%, thus recovering 99% of the recognition performance lost due to noise.

138 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Algorithms based on spectral subtraction are developed for improving the intelligibility of speech that has been interfered by a second talker's voice, and significant gain in intelligibility for low signal-to-noise ratio conditions is achieved.
Abstract: Algorithms based on spectral subtraction are developed for improving the intelligibility of speech that has been interfered by a second talker's voice. A number of new properties of spectral subtraction are shown, including the effects of phase on the output speech intelligibility, and the choice of magnitude spectral differences for best results. A harmonic extraction algorithm is also developed. Results of formal testing on the final system show that significant gain in intelligibility for low signal-to-noise ratio conditions is achieved.

45 citations
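
Plain magnitude spectral subtraction, the operation these algorithms build on, can be sketched as follows. The fixed per-frequency interference estimate, the spectral floor, and the parameter values are illustrative assumptions, not the paper's harmonic-extraction system.

```python
import numpy as np
from scipy.signal import stft, istft


def spectral_subtraction(noisy, interferer_mag, fs, nperseg=256, floor=0.02):
    """Subtract an interference magnitude estimate from the noisy STFT.

    interferer_mag : per-frequency magnitude estimate of the interference
                     (e.g. averaged over frames known to contain only it)
    floor          : fraction of the noisy magnitude kept as a spectral
                     floor, which limits musical-noise artifacts
    """
    _, _, X = stft(noisy, fs=fs, nperseg=nperseg)
    mag = np.abs(X)
    clean_mag = np.maximum(mag - interferer_mag[:, None], floor * mag)
    # reuse the noisy phase, as in most subtraction-based systems
    _, x = istft(clean_mag * np.exp(1j * np.angle(X)), fs=fs, nperseg=nperseg)
    return x
```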


Proceedings ArticleDOI
19 Mar 1984
TL;DR: It is shown, on analyses of both synthetic and natural speech, that the averaged parabolic approximation between harmonic peaks of the voiced speech spectrum reduces the sensitivity of the LP analysis to changes in the fundamental frequency F0 and to noise.
Abstract: In spite of its extensive use, speech analysis based on linear prediction (LP) is liable to various causes of inaccuracy. This paper presents a novel approach to improving the accuracy of the voiced speech production model estimated by the LP method. The presented method uses interpolation between spectral points which are least influenced by artifacts in the spectral analysis and by noise in the signal. We show, on analyses of both synthetic and natural speech, that the averaged parabolic approximation between harmonic peaks of the voiced speech spectrum reduces the sensitivity of the LP analysis to changes in the fundamental frequency F0 and to noise. The method is well suited for combination with the Spectral Transform LP method, previously proposed by the authors [1].

34 citations
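
Parabolic interpolation of a spectral peak from three neighboring bins, the elementary operation behind an averaged parabolic approximation between harmonic peaks, can be sketched as follows; this is a generic three-point interpolator, not the authors' full method.

```python
def parabolic_peak(mags, k):
    """Refine the position and height of a spectral peak at bin k by
    fitting a parabola through bins k-1, k, k+1.

    Returns (refined position in fractional bins, refined peak height).
    """
    a, b, c = mags[k - 1], mags[k], mags[k + 1]
    delta = 0.5 * (a - c) / (a - 2.0 * b + c)  # offset in bins, in (-0.5, 0.5)
    height = b - 0.25 * (a - c) * delta
    return k + delta, height
```

On log-magnitude spectra this is exact for a Gaussian-shaped peak and a good approximation for windowed harmonic peaks.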



Proceedings ArticleDOI
K. Oh1, C. Un
19 Mar 1984
TL;DR: It has been found that for pitch detection of noisy speech, the algorithms that use an AMDF or an autocorrelation function yield better performance than the others.
Abstract: Results of a performance comparison study of eight pitch extraction algorithms for noisy as well as clean speech are presented. These algorithms are the autocorrelation method with center clipping, the autocorrelation method with modified center clipping, the simplified inverse filter tracking (SIFT) method, the average magnitude difference function (AMDF) method, the pitch detection method based on LPC inverse filtering and AMDF, the data reduction method, the parallel processing method, and the cepstrum method. It has been found that for pitch detection of noisy speech, the algorithms that use an AMDF or an autocorrelation function yield better performance than the others. A pitch detector that uses center-clipped speech as its input signal is effective for pitch extraction from noisy speech. In general, preprocessing such as LPC inverse filtering or center clipping of the input speech yields a remarkable improvement in pitch detection.

26 citations
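
The AMDF method that performs well in this comparison can be sketched as follows; a minimal illustrative implementation with arbitrary search limits, not any of the eight algorithms exactly as tested.

```python
import numpy as np


def amdf_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch as the lag of the deepest AMDF valley.

    Note: the global minimum can land on a multiple of the true period
    (octave error); practical detectors take the first valley below a
    threshold, often after center clipping or LPC inverse filtering.
    """
    lags = range(int(fs / fmax), int(fs / fmin) + 1)
    amdf = [np.mean(np.abs(frame[:-lag] - frame[lag:])) for lag in lags]
    best = lags[int(np.argmin(amdf))]
    return fs / best
```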


Proceedings ArticleDOI
01 Mar 1984
TL;DR: A system for speech analysis and enhancement which combines signal processing and symbolic processing in a closely coupled manner and attempts to reconstruct the original speech waveform using symbolic processing to help model the signal and to guide reconstruction.
Abstract: This paper describes a system for speech analysis and enhancement which combines signal processing and symbolic processing in a closely coupled manner. The system takes as input both a noisy speech signal and a symbolic description of the speech signal. The system attempts to reconstruct the original speech waveform using symbolic processing to help model the signal and to guide reconstruction. The system uses various signal processing algorithms for parameter estimation and reconstruction.

20 citations


Proceedings ArticleDOI
19 Mar 1984
TL;DR: For the applications of speech synthesis from speech model parameters, time-scale modification of clean speech, speech enhancement by spectral subtraction, and helium speech enhancement, significant improvement is not gained by using the LSEE-MSTFTM algorithm.
Abstract: In this paper, speech synthesis directly from the processed Short-Time Fourier Transform Magnitude (STFTM) using the LSEE-MSTFTM algorithm [6,7] is compared to more conventional algorithms for several speech processing applications. For the applications considered, the most improvement occurs for time-scale modification of multiple speaker speech and noisy speech since these input signals are not well modeled by the analysis/synthesis system used for comparison. However, for the applications of speech synthesis from speech model parameters, time-scale modification of clean speech, speech enhancement by spectral subtraction, and helium speech enhancement, significant improvement is not gained by using the LSEE-MSTFTM algorithm. Significantly better results are not obtained since a good STFT phase estimate is available and employed in the conventional approaches to these applications.

15 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: An end-point detector for LPC speech using squared prediction error look-ahead and automatic/manual threshold determination is described, which is relatively immune to transient pulses and various low-level noises, yet preserves low- level speech sounds such as weak fricatives to a significant extent under moderate noise conditions.
Abstract: An end-point detector for LPC speech using squared-prediction-error look-ahead and automatic/manual threshold determination is described. The detector is algorithmically simple, computationally efficient, and uses only one decision parameter. Preliminary tests indicate that it is relatively immune to transient pulses and various low-level noises, yet preserves low-level speech sounds such as weak fricatives to a significant extent under moderate noise conditions. Tests indicate that 93.8% of automatically determined endpoints agree to within two frames of manually determined endpoints. The detector is especially suitable for use in vector-quantization-based LPC systems, where the squared prediction error is easily available.
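
An endpoint decision driven by a single frame-level parameter with look-ahead, in the spirit of the detector described here, can be sketched as follows; the threshold rule and look-ahead length are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np


def detect_endpoints(frame_err, threshold, look_ahead=3):
    """Mark speech frames where the decision parameter (e.g. squared
    prediction error) exceeds threshold for at least `look_ahead`
    consecutive frames, which suppresses short transient pulses."""
    above = np.asarray(frame_err) > threshold
    speech = np.zeros_like(above)
    run = 0
    for i, a in enumerate(above):
        run = run + 1 if a else 0
        if run >= look_ahead:
            speech[i - look_ahead + 1 : i + 1] = True
    return speech
```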

Proceedings ArticleDOI
19 Mar 1984
TL;DR: It is demonstrated that good (>10 dB) cancellation of the additive noise and little speech distortion can be achieved by having the reference microphone attached to the outside of the facemask and by updating the filter coefficients only during silence intervals.
Abstract: In this paper we discuss some preliminary results on using Widrow's Adaptive Noise Cancelling (ANC) algorithm to reduce the background noise present in a fighter pilot's speech. With a dominant noise source present and with the pilot wearing an oxygen facemask, we demonstrate that good (>10 dB) cancellation of the additive noise and little speech distortion can be achieved by having the reference microphone attached to the outside of the facemask and by updating the filter coefficients only during silence intervals.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: A simple method is presented for extracting the amplitudes and locations of a multiple impulse excitation model which allows a more accurate recomputation of the autoregressive coefficients based upon incorporating the multipulse excitation.
Abstract: One of the sources of degradation in LPC-synthesized speech is the mechanical quality due to a single impulse excitation per pitch period. This paper presents a simple method for extracting the amplitudes and locations of a multiple impulse excitation model. These multipulse parameters are obtained very easily from the autoregressive (LPC) residual. Additionally, a method is developed which allows a more accurate recomputation of the autoregressive coefficients based upon incorporating the multipulse excitation.
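
Extracting multipulse amplitudes and locations directly from the LPC residual can be sketched as below; this illustrative version simply keeps the N largest residual samples, a stand-in for, not a reproduction of, the paper's procedure.

```python
import numpy as np
from scipy.signal import lfilter


def multipulse_excitation(x, a, n_pulses=8):
    """Pick pulse locations and amplitudes from the LPC residual.

    a : prediction polynomial [1, a1, ..., ap]; the residual is the
        output of the inverse filter A(z) applied to the speech x.
    """
    residual = lfilter(a, [1.0], x)
    locs = np.sort(np.argsort(np.abs(residual))[-n_pulses:])
    excitation = np.zeros_like(x)
    excitation[locs] = residual[locs]
    return locs, excitation
```

Synthesizing with 1/A(z) driven by this sparse excitation then replaces the single impulse per pitch period the abstract identifies as the source of the mechanical quality.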

Proceedings ArticleDOI
01 Mar 1984
TL;DR: Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show clearly that each of the two-sensor signals under test outperforms the signal from the gradient microphone alone and that the performance improvement generally increases with the noise level.
Abstract: The aim of this work is to develop multisensor configurations for transducing speech in order to achieve enhanced immunity to acoustic background noise. We performed detailed measurements of the sound field in the vicinity of the mouth and neck during speech using pressure and pressure-gradient microphones and an accelerometer. We investigated the properties of the measured signals from the various sensor types and positions through long-term and short-term spectral analyses and from articulation index scores computed assuming ambient noise typical of that in a fighter aircraft cockpit. From the results of this investigation, we developed a two-sensor configuration involving an accelerometer and a gradient microphone. Results from formal speech intelligibility and quality tests in simulated fighter aircraft cockpit noise show clearly that each of the two-sensor signals under test outperforms the signal from the gradient microphone alone and that the performance improvement generally increases with the noise level.

Proceedings ArticleDOI
T. Eger1, J. Su, L. Varner
01 Mar 1984
TL;DR: Although the algorithm is an extension of the spectral subtraction concept, the soft nonlinearity provides less distortion of low-amplitude speech components and less sensitivity to the subtraction level than previously reported techniques.
Abstract: This paper discusses a nonlinear spectrum processing approach to speech enhancement. The technique incorporates a "soft" nonlinearity which suppresses low-level noise components while passing higher-level speech components. Although the algorithm is an extension of the spectral subtraction concept, the soft nonlinearity provides less distortion of low-amplitude speech components and less sensitivity to the subtraction level than previously reported techniques. In addition to noticeable improvement in perceptual quality, the algorithm offers a substantial increase in SNR.
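
A "soft" suppression curve of the general kind described, one that attenuates low-level spectral components smoothly instead of subtracting a fixed amount, can be illustrated as below. The particular sigmoid-like gain is my own stand-in, not the authors' nonlinearity.

```python
import numpy as np


def soft_suppress(mag, noise_level, p=2.0):
    """Soft spectral gain: approaches 1 well above the noise level and
    rolls off smoothly toward 0 well below it, avoiding the hard
    rectification edge of plain subtraction."""
    r = (mag / noise_level) ** p
    return mag * r / (1.0 + r)
```

Because the gain varies smoothly near the noise level, small errors in the subtraction level shift the knee slightly instead of gating components on and off.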

Proceedings ArticleDOI
01 Mar 1984
TL;DR: A simplified multi-pulse analysis is proposed here with particular emphasis on the speech model developed as a result of a two-pass method, incorporating knowledge of the estimated pulse locations and amplitudes along with perceptual synthesis error weighting.
Abstract: The introduction of multi-pulse excitation for LPC coders has increased the quality achievable for digitally coded speech at bit rates in the 9.6 Kbps range. A simplified multi-pulse analysis is proposed here with particular emphasis on the speech model developed as a result of a two-pass method. In the first pass, estimated LPC parameters generated by conventional covariance analysis are used to generate the forward prediction error; the multi-pulse sequence is then detected by thresholding the residual. The second pass generates the final LPC parameters by a covariance analysis incorporating knowledge of the estimated pulse locations and amplitudes along with perceptual synthesis error weighting. Experimental results are presented to demonstrate the method.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: The results indicate that smoothing in addition to twicing provides significant performance improvement in high noise background, and the twicing algorithm reduces the error rate in the noisy environments by a significant amount.
Abstract: In this paper we present a study on the performance of a speaker-dependent continuous speech recognition algorithm at various background noise levels, including the mismatch tolerance of the algorithm. This mismatch exists in most applications, where the system is trained at one noise level and recognition is performed at different, highly variable noise levels. Finally, we introduce a pre-processing technique called 'twicing' and a simple 3-point moving-average post-processor. The twicing algorithm, while still maintaining high performance in a quiet background, reduces the error rate in noisy environments by a significant amount. The results indicate that smoothing in addition to twicing provides a significant performance improvement in high-noise backgrounds.
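
The abstract does not spell out its 'twicing' pre-processor, but Tukey's twicing (re-smoothing the residual and adding it back) combined with the 3-point moving average mentioned above can be sketched as follows; treating their 'twicing' as Tukey's construction is an assumption on my part.

```python
import numpy as np


def moving_average_3(x):
    """Simple 3-point moving average (edges are zero-padded)."""
    return np.convolve(x, np.ones(3) / 3.0, mode="same")


def twice(x):
    """Tukey's 'twicing': smooth, then add the smoothed residual back,
    recovering detail that the first smoothing pass removed."""
    s = moving_average_3(x)
    return s + moving_average_3(x - s)
```

On slowly varying data the residual is near zero and twicing reduces to the plain smoother; on data with sharp structure it restores part of what smoothing alone would blur.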

Proceedings ArticleDOI
01 Mar 1984
TL;DR: In this paper, an automatic gain control (AGC) is used to self-adjust the front-end gain of the LPC analyzer in such a way that the speech waveform is more accurately quantized by the analog-to-digital converter.
Abstract: The purpose of an automatic gain control (AGC) is to self-adjust the front-end gain of the LPC analyzer in such a way that the speech waveform is more accurately quantized by the analog-to-digital converter. Tests in the past have indicated that properly amplified speech produces higher intelligibility scores at the narrowband LPC output because both filter and excitation parameters are more accurately estimated. In addition, properly amplified input speech results in properly amplified speech at the receiver, which is highly desirable for listening in a noisy environment.
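
A front-end AGC of the sort described, which scales the input so the A/D converter's range is well used, can be sketched as follows; the frame-based structure, target level, and gain limit are illustrative assumptions.

```python
import numpy as np


def agc(x, target_rms=0.25, frame=160, max_gain=20.0):
    """Frame-by-frame gain so each frame's RMS approaches target_rms
    (a fraction of full scale), keeping the waveform well inside the
    quantizer's range without clipping; gain is capped for silence."""
    out = np.copy(x).astype(float)
    for start in range(0, len(x), frame):
        seg = out[start : start + frame]
        rms = np.sqrt(np.mean(seg ** 2)) + 1e-12
        out[start : start + frame] = seg * min(target_rms / rms, max_gain)
    return out
```

A production AGC would smooth the gain across frames to avoid audible pumping; this sketch applies it per frame for clarity.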

Proceedings ArticleDOI
01 Mar 1984
TL;DR: Modifications are presented which improve the quality of the synthesized speech without requiring the transmission of any additional data and show an increase of up to 5 points in overall speech quality with the implementation of each of these improvements.
Abstract: The major weakness of the current narrowband LPC synthesizer lies in the use of a "canned" invariant excitation signal. The use of such an excitation signal is based on three primary assumptions, namely, (1) that the amplitude spectrum of the excitation signal is flat and time-invariant, (2) that the phase spectrum of the voiced excitation signal is a time-invariant function of frequency, and (3) that the probability density function (PDF) of the phase spectrum of the unvoiced excitation signal is also time-invariant. This paper critically examines these assumptions and presents modifications which improve the quality of the synthesized speech without requiring the transmission of any additional data. Diagnostic Acceptability Measure (DAM) tests show an increase of up to 5 points in overall speech quality with the implementation of each of these improvements.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: This paper proposes a recognition scheme which adapts itself to mild degradations in speech, and suggests techniques which adaptively discriminate between noisy and noise-free parameters by using a selective weighting procedure in the final distance calculations.
Abstract: The performance of an isolated word speech recognition (IWSR) system is known to drop rapidly as the degradation of the input speech increases. In this paper we propose a recognition scheme which adapts itself to mild degradations in speech. The scheme does not need a priori information regarding the nature and extent of the noise. We suggest techniques which adaptively discriminate between noisy and noise-free parameters by using a selective weighting procedure in the final distance calculations. A suitable index is used to study the performance of the recognition system for small data sets. Our scheme lends itself to greater flexibility in handling degradations in the speech input than do existing recognition schemes. We illustrate our scheme by simulating adaptive differential pulse code modulated (ADPCM) speech, where the main distortion is contributed by quantization noise.

Journal ArticleDOI
J. Asmuth1, Jerry D. Gibson
TL;DR: Subjective listening tests and spectrograms show the Kalman algorithm to be a viable alternative to the block-adaptive algorithms.
Abstract: The output speech from a fixed-tap differential pulse code modulation system with adaptive quantization and adaptive noise spectral shaping (NSS) is compared for block-adaptive and sequentially adaptive NSS filters. Block-adaptive systems incorporate a delay which can build up in analog communications systems and cause echoing problems. The buildup of delay can be eliminated by implementing the sequentially adaptive Kalman algorithm in the NSS filter. Simulations are performed for 4-, 8-, and 16-level quantizers with fourth- and ninth-order Kalman adaptation of the NSS filter. A block-adaptive system is implemented as a reference. Subjective listening tests and spectrograms show the Kalman algorithm to be a viable alternative to the block-adaptive algorithms. Signal-to-noise ratios are also given.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: The lossy electric transmission-line analog model of the vocal tract is used to study the acoustics of speech produced in a hyperbaric helium-oxygen atmosphere and shows that the classic Fant and Lindquist formula somewhat overstates the formant frequency shift when glottal and radiation effects are included.
Abstract: We use the lossy electric transmission-line analog model of the vocal tract to study the acoustics of speech produced in a hyperbaric helium-oxygen atmosphere. The analysis extends previous work by including more completely the effects of the wall vibration, glottal, and radiation impedances, and by analyzing the formant bandwidths and amplitudes in addition to the formant frequencies. It shows that (1) the classic Fant and Lindquist formula somewhat overstates the formant frequency shift when glottal and radiation effects are included; (2) the lower formant bandwidths increase by much more than commonly assumed; and (3) the upper formant amplitudes are higher relative to the lower formants in helium speech than in normal speech. These results are useful in developing advanced helium speech enhancement algorithms.

15 Nov 1984
TL;DR: In this article, a new application of Widrow's adaptive noise cancellation (ANC) algorithm is presented, where the ambient environment is generalized to include the case where an acoustic barrier exists between the primary and reference microphones.
Abstract: A new application of Widrow's Adaptive Noise Cancelling (ANC) algorithm is presented. Specifically, the ambient environment is generalized to include the case where an acoustic barrier exists between the primary and reference microphones. By updating the coefficients of the noise estimation filter only during silence, it is shown that the ANC technique can provide substantial noise reduction with little speech distortion even when the acoustic barrier provides only moderate attenuation of acoustic signals. The use of the modified ANC method is evaluated using an oxygen facemask worn by fighter aircraft pilots. Experiments demonstrate that if a noise field is created using a single source, an 11 dB signal-to-noise ratio improvement can be achieved by attaching the reference microphone to the exterior of the facemask. The ANC filter required for this particular environment is only about 50 taps long.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: This paper presents the results of a two-year effort to develop an adaptive predictive coder for transmission of high-quality speech over 16 Kbps channels with up to a 5% bit error rate, maximizing intelligibility and maintaining bandwidth compatibility with existing speech compression systems.
Abstract: This paper presents the results of a two-year effort to develop an adaptive predictive coder for transmission of high-quality speech over 16 Kbps channels with up to a 5% bit error rate. To maximize intelligibility and maintain bandwidth compatibility with existing speech compression systems, an 8 KHz A-D sampling rate constraint was imposed. Our algorithm, called APC-HQ for "hybrid quantization," concentrates on improving the residual coding and uses both segmental quantization and center-clipping quantization in a perceptually optimum manner. The result is an algorithm running in real time on an AP-120B [4] which yields high quality under the input bandwidth and noisy channel constraints, and whose speech quality exceeds that attainable with either technique alone.


Proceedings ArticleDOI
01 Mar 1984
TL;DR: A new approach to the recognition of consonants is presented based on the modelling of the vocal tract by two acoustic cavities with the voicing source at the input of the first cavity and the fricative noise or pulse sources at the junction of the two cavities.
Abstract: In this paper we present a new approach to the recognition of consonants. This approach is based on the modelling of the vocal tract by two acoustic cavities with the voicing source at the input of the first cavity and the fricative noise or pulse source at the junction of the two cavities. The separation of the system into these dual cavities results in an ARMA structure for the acoustic signal.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: The performance results of the adaptive DPCM speech codecs which have been used in conjunction with an Adaptive Frequency Mapping system (AFMAP) significantly enhances the recovered speech compared to other bandwidth compression systems.
Abstract: The performance results of adaptive DPCM speech codecs used in conjunction with an Adaptive Frequency Mapping system (AFMAP) are presented. The AFMAP preprocessor operates on wideband speech (0.3 - 7.6 KHz) and compresses the speech signal into a 0.3 - 3.3 KHz telephone channel. This signal is subsequently digitized by DPCM codecs employing simple but efficient prediction algorithms in order to reproduce wideband speech from the AFMAP postprocessor output. Informal listening tests and SNRSEG measures indicate that the AFMAP system in tandem with DPCM codecs significantly enhances the recovered speech compared to other bandwidth compression systems. The reproduced AFMAP speech is also preferable to 0.3 - 3.3 KHz telephone speech digitized by the same codecs at the same transmission bit rates.