Showing papers on "Code-excited linear prediction" published in 1996

Proceedings Article•DOI•

A 2.4 kbit/s MELP coder candidate for the new U.S. Federal Standard

Alan V. McCree¹, Kwan Truong¹, E.B. George¹, Thomas P. Barnwell², V. Viswanathan²•Institutions (2)

07 May 1996

TL;DR: The enhanced MELP speech coder is described, which is a candidate for the new U.S. Federal Standard at 2.4 kbits/s and has been optimized for performance in acoustic background noise and in channel errors, as well as for efficient real-time implementation.

Abstract: This paper describes our enhanced mixed excitation linear prediction (MELP) speech coder which is a candidate for the new U.S. Federal Standard at 2.4 kbits/s. The new coder is based on the MELP model, and it uses a number of enhancements as well as efficient quantization algorithms to improve performance while maintaining a low bit rate. In addition, the coder has been optimized for performance in acoustic background noise and in channel errors, as well as for efficient real-time implementation. Listening tests confirm that the enhanced 2.4 kbit/s MELP coder performs as well as the higher bit rate 4.8 kbit/s FS1016 CELP standard.

169 citations

Journal Article•DOI•

Channel codes that exploit the residual redundancy in CELP-encoded speech

Fady Alajaji¹, N. Phamdo², Thomas E. Fuja³•Institutions (3)

Queen's University¹, State University of New York System², University of Maryland, College Park³

01 Sep 1996-IEEE Transactions on Speech and Audio Processing

TL;DR: The objective is to design efficient coding/decoding schemes for the transmission of the CELP line spectral parameters (LSPs) over very noisy channels, starting by quantifying the amount of "residual redundancy" inherent in the LSPs of Federal Standard 1016 CELP.

Abstract: We consider the problem of reliably transmitting CELP-encoded speech over noisy communication channels. Our objective is to design efficient coding/decoding schemes for the transmission of the CELP line spectral parameters (LSPs) over very noisy channels. We begin by quantifying the amount of "residual redundancy" inherent in the LSPs of Federal Standard 1016 CELP. This is done by modeling the LSPs as first- and second-order Markov chains. Two models for LSP generation are proposed; the first model characterizes the intraframe correlation exhibited by the LSPs, while the second model captures both intraframe and interframe correlation. By comparing the entropy rates of the models thus constructed with the CELP rates, it is shown that as many as one-third of the LSP bits in every frame of speech are redundant. We next consider methods by which this residual redundancy can be exploited by an appropriately designed channel decoder. Before transmission, the LSPs are encoded with a forward error control (FEC) code; we consider both block (Reed-Solomon) codes and convolutional codes. Soft-decision decoders that exploit the residual redundancy in the LSPs are implemented assuming additive white Gaussian noise (AWGN) and independent Rayleigh fading environments. Simulation results employing binary phase-shift keying (BPSK) indicate coding gains of 2-5 dB over soft-decision decoders that do not exploit the residual redundancy.
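
The redundancy estimate described in this abstract compares the entropy rate of a Markov model with the number of bits actually spent. A minimal sketch of that comparison for a first-order chain follows; the two-state transition matrix `P_toy` and the helper names are illustrative assumptions, not the LSP models or figures from the paper, and the power iteration assumes the chain is ergodic.

```python
import math

def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of transition matrix P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def entropy_rate_bits(P):
    """Entropy rate in bits per symbol: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0.0
    )

def redundancy_bits(P, bits_per_symbol):
    """Bits per symbol spent beyond the entropy rate of the model."""
    return bits_per_symbol - entropy_rate_bits(P)

# Hypothetical 1-bit quantizer whose output tends to repeat from frame to frame.
P_toy = [[0.9, 0.1],
         [0.1, 0.9]]
print(entropy_rate_bits(P_toy), redundancy_bits(P_toy, 1))
```

In this toy case roughly half of each transmitted bit is predictable from the previous symbol, which is the kind of slack a redundancy-aware channel decoder can use.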

139 citations

Patent•DOI•

Method and apparatus for efficient multiband celp wideband speech and music coding and decoding

Anil Wamanrao Ubale¹, Allen Gersho¹•Institutions (1)

University of California¹

26 Feb 1996-Journal of the Acoustical Society of America

TL;DR: In this article, a method of digitally compressing speech and music by use of multiple band fixed excitations stored in codebooks was proposed, along with a coupling method for interconnecting the excitation codebooks and adaptive codebooks, and for generating the composite excitation signal.

Abstract: A method of digitally compressing speech and music by use of multiple band ("multiband") fixed excitations stored in codebooks. The use of multiband fixed excitations, along with a coupling method for interconnecting the excitation codebooks and adaptive codebooks and for generating the composite excitation signal, improves the long-term and short-term prediction, and the use of voice-music classification allows the coding structure to be adapted to the statistical character of the audio signal.

90 citations

Patent•

Apparatus and method for encoding/decoding a speech signal using adaptively changing codebook vectors

Kazuyuki Iijima¹, Masayuki Nishiguchi¹, Jun Matsumoto¹, Shiro Omori¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

25 Oct 1996

TL;DR: An encoding unit performs CELP encoding with a noise codebook memory that contains codebook vectors generated by clipping Gaussian noise, together with codebook vectors obtained by learning with the clipped-noise vectors as initial values.

Abstract: An encoding apparatus in which an input speech signal is divided into blocks and encoded in units of blocks. The encoding apparatus includes an encoding unit for performing CELP encoding, having a noise codebook memory containing codebook vectors generated by clipping Gaussian noise and codebook vectors obtained by learning using the code vectors generated by clipping the Gaussian noise as initial values. The encoding apparatus enables optimum encoding for a variety of speech configurations.

43 citations

Patent•

Algebraic code-excited linear prediction speech coding method

Claude Lamblin¹•Institutions (1)

Orange S.A.¹

04 Jan 1996

TL;DR: The method uses CELP coding with an algebraic codebook; the excitation search computes, and stores in memory, only selected components of the covariance matrix formed from the impulse response of a compound filter made up of synthesis filters and a perceptual weighting filter.

Abstract: The method uses the technique of CELP coding with an algebraic codebook. The search for the CELP excitation includes a calculation of certain components of the covariance matrix $U = H^{T} \cdot H$, where $H$ denotes a lower triangular Toeplitz matrix formed on the basis of the impulse response of a compound filter made up of synthesis filters and of a perceptual weighting filter. The memory-stored components of the covariance matrix are only those of the form $U(\mathrm{pos}_{i,p}, \mathrm{pos}_{i,p})$ and those of the form $U(\mathrm{pos}_{i,p}, \mathrm{pos}_{j,q})$, with $\mathrm{pos}_{i,p}$ and $\mathrm{pos}_{j,q}$ respectively denoting position $i$ and position $j$ for the pulses $p$ and $q$ in the codes of the algebraic codebook.
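
To make the storage idea concrete, the following sketch builds a lower triangular Toeplitz matrix from an impulse response, forms the covariance matrix as its transpose times itself, and keeps only the diagonal entries at candidate pulse positions plus the cross terms between positions of different pulses. The impulse response, the `tracks` layout (one list of allowed positions per pulse), and all function names are hypothetical; the patent's actual position sets and memory layout are not given in the abstract.

```python
def toeplitz_lower(h, n):
    """n x n lower triangular Toeplitz matrix with H[i][j] = h[i - j] for i >= j."""
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
            for i in range(n)]

def covariance(H):
    """U = H^T * H for a square matrix H given as a list of rows."""
    n = len(H)
    return [[sum(H[k][i] * H[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def stored_components(U, tracks):
    """Keep U(a, a) for every allowed position a, and U(a, b) for positions
    a and b that belong to two different pulses."""
    stored = {}
    for p, track_p in enumerate(tracks):
        for a in track_p:
            stored[(a, a)] = U[a][a]
        for track_q in tracks[p + 1:]:
            for a in track_p:
                for b in track_q:
                    stored[(a, b)] = U[a][b]
    return stored

# Hypothetical 4-sample subframe with two pulses on interleaved position tracks.
U = covariance(toeplitz_lower([1.0, 0.5], 4))
kept = stored_components(U, [[0, 2], [1, 3]])
print(len(kept), "of", 4 * 4, "entries kept")
```

Entries that pair two positions of the same pulse are never needed, because a pulse occupies only one position at a time; that is where the saving comes from in this sketch.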

42 citations

Proceedings Article•DOI•

Low-delay CELP with multi-pulse VQ and fast search for GSM EFR

Shin-Ichi Taumi¹, K. Ozawa², T. Nomura², M. Serizawa²•Institutions (2)

NEC¹, Carnegie Mellon University²

07 May 1996

TL;DR: A novel multi-pulse excitation signal quantization method is proposed, where the pulse amplitudes are vector-quantized (VQ), which remarkably enhances the performance and drastically reduces the position search complexity.

Abstract: This paper proposes a speech codec, named MP-CELP (multi-pulse-based CELP), with a 10 msec frame length, which has been developed for the GSM EFR (enhanced full-rate) codec standardization. A novel multi-pulse excitation signal quantization method is proposed, where the pulse amplitudes are vector-quantized (VQ). The combination search of the pulse position and the amplitude VQ remarkably enhances the performance. By restricting the pulse positions based on the algebraic-type structure, the search complexity and the bits are reduced. The divided pulse position search drastically reduces the position search complexity. The speech quality for MP-CELP is higher than that for G.728 LD-CELP. MP-CELP also satisfies all the speech quality requirements of the GSM EFR standardization except for the background noise condition.

36 citations

Proceedings Article•DOI•

A multi-mode variable rate speech coder for CDMA cellular systems

N. Tanaka, T. Morii, K. Yoshida, K. Homma

28 Apr 1996

TL;DR: This paper presents a multi-mode variable rate speech coder based on the CELP algorithm, proposed as a candidate for a new speech service standard of the North American CDMA digital cellular system IS-95; in the first evaluation the coder got the highest overall score on the tests in speech quality and average data rate.

Abstract: This paper presents a multi-mode variable rate speech coder based on the CELP algorithm. The coder operates at a rate of 8.5 kbps, 4 kbps or 0.8 kbps with a 20 ms frame. The coder consists of five coding modes applied to distinct speech features. One out of the five coding modes is selected for each frame by using a mode selector which comprises a neural network and a speech power variation detector. To improve the coding performance, an inter-frame predictive LSP quantizer and a coding strategy for speech onsets are utilized. In low bit-rate speech coding, decoded speech quality is severely degraded in high background noise. A noise suppressor based on the spectral subtraction algorithm is also introduced in order to reduce background noises. We proposed this coder as a candidate for a new speech service standard of the North American CDMA digital cellular system IS-95. As a result of the first evaluation conducted by the Telecommunications Industry Association, the coder got the highest overall score on the tests in speech quality and average data rate.
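
The abstract mentions a noise suppressor based on spectral subtraction without giving its details. As a rough illustration only, the basic magnitude-domain form subtracts a noise magnitude estimate from each frequency bin and floors the result; the `over_subtraction` and `floor` parameters and their defaults are assumptions, not values from the paper.

```python
def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Subtract an estimated noise magnitude from each bin, never going below
    a small fraction of the noisy magnitude (the spectral floor)."""
    return [max(y - over_subtraction * n, floor * y)
            for y, n in zip(noisy_mag, noise_mag)]

# Three hypothetical frequency bins: mostly speech, equal parts, mostly noise.
cleaned = spectral_subtract([1.0, 0.5, 0.2], [0.2, 0.5, 0.4])
print(cleaned)
```

The floor keeps bins from going to zero or negative, which in practice limits the "musical noise" that plain subtraction tends to produce.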

33 citations

Patent•

Multistage low bit-rate CELP speech coder with switching code books depending on degree of pitch periodicity

Kazunori Ozawa¹•Institutions (1)

NEC¹

29 Feb 1996

TL;DR: In this paper, a quantization unit quantizes the spectral parameters of at least one subframe by switching between a plurality of quantization code books to obtain quantized spectral parameters.

Abstract: A voice coder system is capable of coding speech at low bit rates with high speech quality. Speech signals are divided into frames and further divided into subframes. A spectral parameter calculator calculates spectral parameters representing a spectral characteristic of the speech signals in at least one subframe. A quantization unit quantizes the spectral parameters of at least one subframe by switching between a plurality of quantization code books to obtain quantized spectral parameters. A mode classifier includes means for calculating a degree of pitch periodicity based on pitch prediction distortions and determines one of a plurality of modes for each frame using the degree of pitch periodicity. A weighting part weights perceptual weights to the speech signals depending on the spectral parameters obtained in the spectral parameter calculator to obtain weighted signals. An adaptive code book obtains a set of pitch parameters representing pitch periods of the speech signals in a predetermined mode by using the determined mode, the spectral parameters, the quantized spectral parameters, and the weighted signals. An excitation quantization unit searches a plurality of stages of excitation code books and gain code books by using the spectral parameters, the quantized spectral parameters, the weighted signals and the pitch parameters to obtain quantized excitation signals of the speech signals and is able to switch between a plurality of excitation code books and a plurality of gain code books based on the mode determined by the mode classifier.

33 citations

Patent•

Method and apparatus for CELP coding an audio signal while distinguishing speech periods and non-speech periods

Katsutoshi Itoh¹•Institutions (1)

Oki Electric Industry¹

22 Aug 1996

TL;DR: An adjusting section computes a new autocorrelation matrix based on the combination of the autocorrelation matrix of the current frame and that of a past period determined to be noise.

Abstract: For the CELP (Code Excited Linear Prediction) coding of an input audio signal, an autocorrelation matrix, a speech/noise decision signal and a vocal tract prediction coefficient are fed to an adjusting section. In response, the adjusting section computes a new autocorrelation matrix based on the combination of the autocorrelation matrix of the current frame and that of a past period determined to be a noise. The new autocorrelation matrix is fed to an LPC (Linear Prediction Coding) analyzing section. The analyzing section computes a vocal tract prediction coefficient based on the autocorrelation matrix and delivers it to a prediction gain computing section. At the same time, in response to the above new autocorrelation matrix, the analyzing section computes an optimal vocal tract prediction coefficient by correcting the vocal tract prediction coefficient. The optimal vocal tract prediction coefficient is fed to a synthesis filter.

32 citations

Proceedings Article•DOI•

16 kbit/s wideband speech coding based on unequal subbands

J.W. Paulus, J. Schnitzler

07 May 1996

TL;DR: A split-band encoding scheme for 16 kbit/s wideband speech coding (50-7000 Hz), using 2 unequal subbands from 0-6 kHz and from 6-7 kHz, which was motivated by an experimental evaluation of the signal bandwidth of speech frames.

Abstract: We propose a split-band encoding scheme for 16 kbit/s wideband speech coding (50-7000 Hz), using 2 unequal subbands from 0-6 kHz and from 6-7 kHz. This approach was motivated by an experimental evaluation of the signal bandwidth of speech frames. The higher subband is simply represented by white noise with adjustment of the short term energy. For the lower subband, code-excited linear prediction (CELP) is used. The analysis filter bank, which performs the unequal band splitting combined with critical subsampling of the subbands, is described. A bit error concealment technique and the bit allocation are also presented. In informal listening tests, the speech quality was rated higher than that of the CCITT G.722 wideband codec operating at 48 kbit/s.
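
The high-band treatment described here (white noise with adjusted short-term energy) can be sketched on the decoder side as follows. The Gaussian noise source, the 80-sample frame, and the function names are assumptions made for illustration; the paper's actual energy quantization and filter bank are not reproduced.

```python
import math
import random

def short_term_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def noise_with_energy(target_energy, length, rng):
    """White Gaussian noise scaled so its short-term energy equals target_energy."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(length)]
    energy = short_term_energy(noise)
    gain = math.sqrt(target_energy / energy) if energy > 0.0 else 0.0
    return [gain * s for s in noise]

# Hypothetical decoded energy value for one 80-sample high-band frame.
rng = random.Random(0)
high_band = noise_with_energy(0.25, 80, rng)
print(short_term_energy(high_band))
```

Only the energy value has to be transmitted for the high band in such a scheme, which leaves nearly the whole bit budget for the CELP-coded low band.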

31 citations

Patent•DOI•

Method and apparatus for speech encoding and decoding by sinusoidal analysis and waveform encoding with phase reproducibility

Masayuki Nishiguchi¹, Kazuyuki Iijima¹, Jun Matsumoto¹, Omori Shiro¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

24 Oct 1996-Journal of the Acoustical Society of America

TL;DR: A speech encoding method and apparatus in which an input speech signal is divided in terms of blocks or frames as encoding units and encoded in terms of the encoding units, whereby explosive and fricative consonants can be impeccably reproduced.

Abstract: A speech encoding method and apparatus in which an input speech signal is divided in terms of blocks or frames as encoding units and encoded in terms of the encoding units, whereby explosive and fricative consonants can be impeccably reproduced, while there is an attenuation of the occurrence of foreign sounds being generated at a transient portion between voiced (V) and unvoiced (UV) portions, so that the speech with high clarity devoid of “stuffed” feeling may be produced. The encoding apparatus includes a first encoding unit for finding residuals of linear predictive coding (LPC) of an input speech signal for performing harmonic coding and a second encoding unit for encoding the input speech signal by waveform coding. The first encoding unit and the second encoding unit are used for encoding a voiced (V) portion and an unvoiced (UV) portion of the input signal, respectively. Code excited linear prediction (CELP) encoding employing vector quantization by a closed loop search of an optimum vector using an analysis-by-synthesis method is used for the second encoding unit. A corresponding decoding method and apparatus is also provided.

Journal Article•DOI•

An 8-kb/s conjugate structure CELP (CS-CELP) speech coder

A. Kataoka, Takehiro Moriya, S. Hayashi

01 Nov 1996-IEEE Transactions on Speech and Audio Processing

TL;DR: Subjective testing indicates that the quality of this coder is equivalent to that of 32-kb/s adaptive differential pulse code modulation (ADPCM) under error-free conditions, and testing has further demonstrated that the coder is robust against random bit errors.

Abstract: This paper describes a high-quality 8-kb/s speech coder called conjugate structure code-excited linear prediction (CS-CELP) with a 10-ms frame length. To provide a short delay and high quality under both error-free and channel error conditions, it uses three new schemes: line spectrum pair (LSP) quantization using interframe prediction, preselection in the codebook search, and gain vector quantization (VQ) with backward prediction. The LSP parameters are quantized by using multistage VQ with moving-average (MA) prediction. This scheme can operate efficiently with various frequency responses of speech. The preselection of the codebook reduces the computational complexity and improves the robustness to channel errors. The gain VQ with backward prediction can provide a high quality and robustness without transmission of input speech power information. A conjugate structure for both random codebook and gain codebook is introduced to improve the ability to handle random bit errors and to reduce codebook storage memory requirements. Subjective testing indicates that the quality of this coder is equivalent to that of 32-kb/s adaptive differential pulse code modulation (ADPCM) under error-free conditions. Testing has further demonstrated that the coder is robust against random bit errors.
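
The abstract does not spell out what a conjugate structure is. One common reading, assumed here rather than taken from the abstract, is that each output vector is the sum of one entry from each of two small sub-codebooks, so a bit error in one index corrupts only one of the two components and storage drops from 2^B entries to 2 × 2^(B/2). The toy codebooks and function names below are hypothetical.

```python
def conjugate_search(target, book_a, book_b):
    """Exhaustively pick indices (i, j) minimizing the squared error between
    the target and the sum book_a[i] + book_b[j]."""
    best = None
    for i, a in enumerate(book_a):
        for j, b in enumerate(book_b):
            err = sum((t - (x + y)) ** 2 for t, x, y in zip(target, a, b))
            if best is None or err < best[0]:
                best = (err, i, j)
    return best[1], best[2]

def storage_entries(bits, conjugate):
    """Codebook entries to store for an index of `bits` bits (bits assumed even)."""
    return 2 * 2 ** (bits // 2) if conjugate else 2 ** bits

# Toy two-dimensional sub-codebooks.
book_a = [[0.0, 0.0], [1.0, 0.0]]
book_b = [[0.0, 0.0], [0.0, 1.0]]
print(conjugate_search([1.0, 1.0], book_a, book_b))
print(storage_entries(8, True), storage_entries(8, False))
```

The paper's preselection step would prune this exhaustive pair search; that part is not sketched here.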

Patent•

Pitch-synchronous speech coding by applying multiple analysis to select and align a plurality of types of code vectors

Mitsuo Fujimoto¹•Institutions (1)

Sanyo¹

20 May 1996

TL;DR: A speech coder using a pitch synchronous innovation code excited linear prediction (PSI-CELP) speech coding system is described in this paper, which is capable of representing a portion which is not sufficiently represented by an adaptive codebook in a periodic portion of input speech and capable of improving the quality of reproduced speech.

Abstract: A speech coder using a pitch synchronous innovation code excited linear prediction (PSI-CELP) speech coding system. The speech coder is capable of representing a portion which is not sufficiently represented by an adaptive codebook in a periodic portion of input speech and capable of improving the quality of reproduced speech. The periodicity corresponds to the pitch cycle of input speech by preliminarily reproducing speech from simple impulse trains. The speech coder depending on the particular embodiment includes an adaptive code book, a fixed code book, a noise code book, and a pulse codebook. A pulse code book stores a plurality of types of codevectors corresponding to pitch waveforms of voiced sounds. At the time of coding input speech, the pulse code book is searched.

Proceedings Article•DOI•

Reconstruction of missing speech frames using sub-band excitation

K. Cluver, P. Noll

18 Jun 1996

TL;DR: The combination of the reconstruction method with adaptive speech coders showed virtually the same good results for forward adaptation, whereas a higher degradation is caused by backward-adaptive coders.

Abstract: A new reconstruction method for frame erasures in speech transmission is presented which is based on parameterization of the speech signal by means of linear prediction (LPC) and voicing analysis. The problem of generating partially voiced substitute speech signals is solved by performing separate voicing decisions in sub-bands. The method yields considerable improvements compared with silence substitution for frame erasure ratios of up to 10% or even 20%. The combination of the reconstruction method with adaptive speech coders showed virtually the same good results for forward adaptation, whereas a higher degradation is caused by backward-adaptive coders.

Patent•DOI•

Speech compressor using trellis encoding and linear prediction

Victor D. Kolesnik, Victor Yu Krachkovsky, Boris D. Kudrjashov, Eugene P. Ovsjannikov, Boris K. Trojanovsky, Vladimir Egorov

18 Jun 1996-Journal of the Acoustical Society of America

TL;DR: A speech compressor utilizing Trellis Encoding and Linear Prediction (TELP), which provides improved signal generation and search technique for a code-excited linear prediction (CELP) speech encoder.

Abstract: A speech compressor utilizing Trellis Encoding and Linear Prediction (TELP). A TELP speech compressor provides improved signal generation and search technique for a code-excited linear prediction (CELP) speech encoder. TELP is a frame oriented coding that breaks the quantized speech signals into frames of prescribed length N and each frame into subframes of prescribed length L, which are processed as dependent units utilizing an analysis-by-synthesis approach. The approach is based on constructing the best mean square linear predicting filter and searching the best exciting sequence for the filter in order to produce synthesized speech. A trellis encoder is used instead of a stochastic code book. The Q-ary analysis of a given subframe and previous excitations is proposed for a fast vector search in an adaptive code book. It simplifies the implementation of digital speech compression.

Patent•

Voice coder for coding voice signal with code-excited linear prediction coding

Keiichi Funaki¹•Institutions (1)

NEC¹

01 Apr 1996

TL;DR: A voice coder has an LPC (linear prediction coding) analyzer, a parameter quantizer for quantizing the LPC coefficients to output a quantized code CL, an adaptive codebook, a long-term predicting circuit for searching the adaptive codebook to determine a delay code CD and an adaptive code vector, an excitation codebook with a searching circuit for determining an optimum quantized code CS, and a gain codebook searching circuit for outputting a gain code CG.

Abstract: A voice coder for coding a speech signal at a low bit rate with high speech quality and improved efficiency for gain quantization according to code-excited linear prediction (CELP) coding. The voice coder has an LPC (linear prediction coding) analyzer for calculating LPC coefficients, a parameter quantizer for quantizing the LPC coefficients to output a quantized code CL, an adaptive codebook, a long-term predicting circuit for searching the adaptive codebook to determine a delay code CD and an adaptive code vector, an excitation codebook, an excitation codebook searching circuit for determining an optimum quantized code CS and an excitation vector, and a gain codebook searching circuit for outputting a gain code CG by determining quantized gains representing quantized vectors of gains of the adaptive code vector and the excitation vector. The gain codebook searching circuit has a plurality of gain codebooks each for storing quantized gains corresponding to one of searching ranges divided by predetermined ranges with respect to the value of a searching parameter, and gain codebook selector for selecting one of the gain codebooks depending on the value of the searching parameter. The gain code CG is determined by using the gain codebook selected by the gain codebook selector.

Proceedings Article•DOI•

CELP coding system based on mel-generalized cepstral analysis

Kazuhito Koishida¹, Keiichi Tokuda, Takao Kobayashi, S. Imai•Institutions (1)

Tokyo Institute of Technology¹

03 Oct 1996

TL;DR: The subjective performance test indicates that the quality of the proposed CELP coder is about 2 dB higher than that of the conventional one, and the spectrum represented by the mel-generalized cepstrum has frequency resolution similar to that of the human ear.

Abstract: This paper presents a CELP speech coding system based on mel-generalized cepstral analysis. In the mel-generalized cepstral analysis, we can vary the model spectrum continuously from AR to cepstral modeling by changing the value of a parameter γ and we can choose an appropriate model spectrum. Furthermore, the spectrum represented by the mel-generalized cepstrum has frequency resolution similar to that of the human ear. Since the perceptual weighting and postfiltering are carried out through the mel-generalized cepstrum, we expect the perceptual performance of the proposed coder to be improved. The subjective performance test indicates that the quality of the proposed CELP coder is about 2 dB higher than that of the conventional one.

Proceedings Article•DOI•

Wideband re-synthesis of narrowband CELP-coded speech using multiband excitation model

Cheung-Fat Chan¹, Wai-Kwong Hui•Institutions (1)

City University of Hong Kong¹

03 Oct 1996

TL;DR: The approach is to reduce the hoarse voice in CELP-coded speech by enhancing the pitch periodicity in the reproduction signal and also to reduce the muffling characteristics of narrowband speech by regenerating the highband components of speech spectra from the reproduction signal.

Abstract: In this paper, a method for improving the quality of narrowband CELP-coded speech is presented. The approach is to reduce the hoarse voice in CELP-coded speech by enhancing the pitch periodicity in the reproduction signal and also to reduce the muffling characteristics of narrowband speech by regenerating the highband components of speech spectra from the reproduction signal. In the proposed method, multiband excitation (MBE) analysis is performed on the reproduction speech signal from a CELP decoder and the pitch periodicity is enhanced by resynthesizing the speech signal using a harmonic synthesizer according to the MBE model. The highband magnitude spectra are regenerated by matching to lowband spectra using a trained wideband spectral codebook. Information about the voiced/unvoiced (V/UV) excitation in the highband is derived from a training procedure and then stored alongside the wideband spectral codebook so that it can be recovered by indexing into the codebook using the matched lowband index. Simulation results indicate that the quality of the wideband resynthesized speech is significantly improved over the narrowband CELP-coded speech.

Proceedings Article•DOI•

Robust classification of speech based on the dyadic wavelet transform with application to CELP coding

Joachim Stegmann¹, G. Schroder², K.A. Fischer²•Institutions (2)

Deutsche Telekom¹, Carnegie Mellon University²

07 May 1996

TL;DR: A new algorithm for the classification of telephone-bandwidth speech, based on the dyadic wavelet transform (DyWT), is designed for efficient control of bit allocation in low bit-rate speech coders; it proves superior to a classifier based on the long-term autocorrelation function.

Abstract: This paper describes a new algorithm for the classification of telephone-bandwidth speech that is designed for efficient control of bit allocation in low bit-rate speech coders. The algorithm is based on the dyadic wavelet transform (DyWT) and classifies each unit subframe into one of three categories: background noise/unvoiced, transients/voicing onsets, and periodic/voiced. A set of three parameters is derived from the DyWT coefficients, each giving a decision score that the associated class is active. Taking the history into account, a finite-state model controlled by these parameters computes the classifier's decision. The proposed algorithm is robust to various types of background noise. In comparison with a classifier based on the long-term autocorrelation function, the DyWT classifier proves to be superior. To evaluate its performance in CELP-type speech coders, a variety of excitation coding schemes with bit rates between 2200 and 4800 bit/s is investigated.

Patent•DOI•

Method for searching an excitation codebook in a code excited linear prediction (CELP) coder

Andrew P. Dejaco¹, Bi Ning¹•Institutions (1)

Qualcomm¹

31 Jul 1996-Journal of the Acoustical Society of America

TL;DR: In this method for selecting a code vector in an algebraic codebook, the analysis window for the coder is extended beyond the length of the target speech frame so that the two-dimensional impulse response matrix can be stored as a one-dimensional autocorrelation matrix, reducing the computational complexity and memory required for the search.

Abstract: A method for selecting a code vector in an algebraic codebook wherein the analysis window for the coder is extended beyond the length of the target speech frame. By extending the analysis window, the two-dimensional impulse response matrix can be stored as a one-dimensional autocorrelation matrix, greatly saving on the computational complexity and memory required for the search.
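
One way to see why a longer window collapses the matrix to one dimension: if the convolution matrix H keeps the full tail of the impulse response instead of truncating it at the frame boundary, every entry of H^T·H depends only on the distance between its row and column, so a single autocorrelation sequence is enough. The sketch below checks this numerically. The impulse response is a made-up example, and reading the patent's "extended analysis window" as this full-convolution matrix is an interpretation, not something the abstract states.

```python
def autocorrelation(h, n):
    """r[k] = sum_m h[m] * h[m + k] for k = 0 .. n-1 (zero once k >= len(h))."""
    return [sum(h[m] * h[m + k] for m in range(len(h) - k)) if k < len(h) else 0.0
            for k in range(n)]

def gram_extended(h, n):
    """H^T * H where H is the full (n + len(h) - 1) x n convolution matrix,
    i.e. the impulse response tail is not cut off at the frame boundary."""
    rows = n + len(h) - 1
    H = [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
         for i in range(rows)]
    return [[sum(H[k][i] * H[k][j] for k in range(rows)) for j in range(n)]
            for i in range(n)]

h_example = [1.0, 0.5, 0.25]   # hypothetical impulse response
r = autocorrelation(h_example, 4)
U_ext = gram_extended(h_example, 4)
print(r)
print(all(abs(U_ext[i][j] - r[abs(i - j)]) < 1e-12
          for i in range(4) for j in range(4)))
```

Storing `r` takes n values instead of the n × n entries of the full matrix.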

Proceedings Article•DOI•

Linked split-vector quantizer of LPC parameters

Moo Young Kim¹, Nam Ha, Sang Ryong Kim•Institutions (1)

Samsung¹

07 May 1996

TL;DR: The linked split-vector quantizer (LSVQ) where the lower and the upper codebook are selected according to the preselected middle codevector, using the ordering property of LSFs, links three codebooks for the efficient use of the codebook space.

Abstract: In speech coding, several vector quantization (VQ) methods for the LPC (linear predictive coding) parameters have been developed. Because LPC parameters are too dynamic to quantize directly, the LSFs (line spectrum frequencies) are used instead. In this study, we propose the linked split-vector quantizer (LSVQ) where the lower and the upper codebook are selected according to the preselected middle codevector. Using the ordering property of LSFs, LSVQ links three codebooks for the efficient use of the codebook space. Compared with the conventional split-vector quantizer (SVQ), LSVQ increases the usage of codebook space by 10.84%, and shows lower spectral distortion at 23 bits/frame than the SVQ at 24 bits/frame.

Patent•

Method for reducing pitch search time for vocoder

Kyung-Jin Byun¹, Ha-Young Yoo, Kim Jong-Jae, Han Ki-Chun, Kim Jae-Suk, Myung-Jin Bae•Institutions (1)

Electronics and Telecommunications Research Institute¹

24 Jun 1996

TL;DR: In this paper, the authors proposed a method to receive a speech signal, perform a recognition weighting process on it, synthesize a synthetic speech signal and calculate an autocorrelation of the synthesized speech signal whose delay is a predetermined value, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak with skipping over the negative peak.

Abstract: The present invention relates to a method to receive a speech signal, to perform a recognition weighting process on it, to synthesize a synthetic speech signal, to calculate an autocorrelation of the synthetic speech signal whose delay is a predetermined value and an autocorrelation whose delay is 0, to divide the square of the former by the latter, to calculate a pitch lag and a pitch filter coefficient by calculating only the part of a positive peak while skipping over the part of a negative peak by using the results from the dividing operation, and to calculate and output the pitch lag and the pitch filter coefficient by repeating the above process. Thus, real-time implementation of a CELP vocoder can be achieved.

Proceedings Article•DOI•

A new fast pitch search algorithm using the abbreviated correlation function in CELP vocoder

Joohun Lee¹, Myung-Jin Bae, Hah-Young Yoo•Institutions (1)

Soongsil University¹

21 Oct 1996

TL;DR: To find the optimum pitch lag in a CELP vocoder with less computation, a new pitch search algorithm is proposed that uses the sign of an abbreviated correlation function to confine the candidate lags to the positively correlated ones.

Abstract: To find the optimum pitch lag in a CELP vocoder with reduced computation requirement, a new pitch search algorithm is proposed. The algorithm relies on the sign of the abbreviated correlation function, which agrees with that of the original correlation function. The abbreviated correlation function makes it possible to confine the candidates for the optimum pitch to the positively correlated lags, which reduces the computation load considerably. Because the optimum pitch is still found without omission, the segmental SNR (SEGSNR) does not degrade with the proposed algorithm. Experimental results show that the proposed algorithm achieves a 40% reduction in computation time compared to the conventional full search method.
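For reference, segmental SNR is commonly computed as the mean of per-frame SNR values in dB. The sketch below assumes non-overlapping frames and simply skips frames with zero signal or zero error; practical implementations often clamp per-frame values instead, and the abstract does not state which variant the paper uses. The function name and the toy data are hypothetical.

```python
import math

def segsnr(reference, decoded, frame_len):
    """Mean per-frame SNR in dB over complete, non-overlapping frames."""
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        sig = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if sig == 0.0 or noise == 0.0:
            continue  # SNR is undefined for this frame, so leave it out
        snrs.append(10.0 * math.log10(sig / noise))
    return sum(snrs) / len(snrs) if snrs else float("inf")

# Two frames with per-frame SNRs of about 20 dB and 40 dB.
value = segsnr([1.0] * 8, [0.9] * 4 + [0.99] * 4, frame_len=4)
```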

Journal Article•DOI•

A software tool for introducing speech coding fundamentals in a DSP course

Andreas Spanias¹, E. Painter¹•Institutions (1)

Arizona State University¹

01 May 1996-IEEE Transactions on Education

TL;DR: An educational software tool on speech coding is presented that is used in a senior-level DSP class at Arizona State University, USA, to expose undergraduate students to speech coding and present speech analysis/synthesis as an application paradigm for many DSP fundamental concepts.

Abstract: An educational software tool on speech coding is presented. Portions of this program are used in a senior-level DSP (digital signal processing) class at Arizona State University, USA, to expose undergraduate students to speech coding and present speech analysis/synthesis as an application paradigm for many DSP fundamental concepts. The simulation software provides an interactive environment that allows users to investigate and understand speech coding algorithms for a variety of input speech records. Time- and frequency-domain representations of input and reconstructed speech can be graphically displayed and played back on a PC equipped with a standard 16-bit sound card. The program has been developed for use in the MATLAB environment and includes implementations of the FS-1015 LPC-10e, the FS-1016 CELP, the ETSI GSM, the IS-54 VSELP, the G.721 ADPCM, and the G.728 LD-CELP speech coding algorithms, integrated under a common graphical interface.

Proceedings Article•DOI•

Wideband enhancement of narrowband coded speech using MBE re-synthesis

Cheung-Fat Chan¹, Wai-Kwong Hui•Institutions (1)

City University of Hong Kong¹

14 Oct 1996

TL;DR: Simulation results indicate that the quality of the wideband resynthesized speech is significantly improved over the narrowband CELP-coded speech.

Abstract: A method for improving the quality of narrowband CELP-coded speech is presented. The approach is to reduce the hoarse quality of CELP-coded speech by enhancing the pitch periodicity in the reproduction signal and also to reduce the muffled character of narrowband speech by regenerating the highband components of speech spectra from the reproduction signal. In the proposed method, multiband excitation (MBE) analysis is performed on the reproduction speech signal from a CELP decoder and the pitch periodicity is enhanced by re-synthesizing the speech signal using a harmonic synthesizer according to the MBE model. The highband magnitude spectra are regenerated by matching to lowband spectra using a trained wideband spectral codebook. Information about the voiced/unvoiced (V/UV) excitation in the highband is derived from a training procedure and then stored alongside the wideband spectral codebook so that it can be recovered by indexing the codebook with the matched lowband index. Simulation results indicate that the quality of the wideband resynthesized speech is significantly improved over the narrowband CELP-coded speech.
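The codebook lookup in the abstract can be sketched as a nearest-neighbour match on the lowband part followed by retrieval of the stored highband data at the same index. The codebook entries, the squared-error distance, and the names `WIDEBAND_CODEBOOK` and `regenerate_highband` are assumptions made for illustration; the paper's trained codebook and matching measure are not given here.

```python
# Each entry: (lowband magnitudes, highband magnitudes, highband V/UV flags).
# Values are toy numbers, not trained data.
WIDEBAND_CODEBOOK = [
    ((1.0, 0.8, 0.5), (0.3, 0.1), (True, False)),
    ((0.4, 0.6, 0.9), (0.7, 0.5), (True, True)),
    ((0.2, 0.2, 0.1), (0.1, 0.1), (False, False)),
]

def regenerate_highband(lowband):
    """Match the lowband spectrum, then read highband data at the same index."""
    def distance(index):
        stored_low = WIDEBAND_CODEBOOK[index][0]
        return sum((a - b) ** 2 for a, b in zip(stored_low, lowband))

    best = min(range(len(WIDEBAND_CODEBOOK)), key=distance)
    _, highband, voiced_flags = WIDEBAND_CODEBOOK[best]
    return best, highband, voiced_flags

result = regenerate_highband((0.45, 0.55, 0.85))
```

Storing the V/UV flags in the same entry as the spectra means no extra search is needed for them: the matched lowband index recovers both.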

Patent•

Speech encoding method and apparatus and speech decoding method and apparatus

Kazuyuki Iijima¹, Jun Matsumoto¹, Masayuki Nishiguchi¹, Shiro Omori¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

25 Oct 1996

TL;DR: A speech encoding method and apparatus that encodes voiced (V) portions by harmonic coding of LPC residuals and unvoiced (UV) portions by CELP waveform coding, so that explosive and fricative consonants are reproduced cleanly and no foreign sound is generated at transitions between V and UV portions.

Abstract: A speech encoding method and apparatus in which an input speech signal is divided in terms of blocks or frames as encoding units and encoded in terms of the encoding units, in which explosive and fricative consonants can be impeccably reproduced, while there is no risk of foreign sound being generated at a transient portion between voiced (V) and unvoiced (UV) portions, so that speech with high clarity, devoid of a "stuffed" feeling, may be produced. The encoding apparatus includes a first encoding unit 110 for finding residuals of linear predictive coding (LPC) of an input speech signal for performing harmonic coding and a second encoding unit 120 for encoding the input speech signal by waveform coding. The first encoding unit 110 and the second encoding unit 120 are used for encoding a voiced (V) portion and an unvoiced (UV) portion of the input signal, respectively. Code excited linear prediction (CELP) encoding, which employs vector quantization by a closed-loop search for an optimum vector using an analysis-by-synthesis method, is used for the second encoding unit 120.

Journal Article•DOI•

Artificial neural networks for nonlinear time-domain filtering of speech

T.T. Le¹, J.S. Mason¹•Institutions (1)

Swansea University¹

01 Jun 1996

TL;DR: Direct comparisons of MLPs and linear filters show that with CELP degradation the SNR improvement achieved by the MLP is measurably better than with an equivalent linear structure, but when the degradation is additive noise the two structures perform equally well.

Abstract: A multilayer perceptron (MLP) is applied as a time-domain nonlinear filter to two classes of degraded speech, namely speech with added Gaussian white noise and speech with the nonlinear system degradation introduced by a low bit-rate CELP coder. The goal of the study is to examine the influence of the inherent nonlinearity within the MLP, and this is achieved by varying the levels of nonlinearity within the structure. Direct comparisons of MLPs and linear filters show that with CELP degradation the SNR improvement achieved by the MLP is measurably better than with an equivalent linear structure (3 dB cf. 1.5 dB), but when the degradation is additive noise the two structures perform equally well. The study highlights the importance of scaling to achieve optimum performance, and of matching the enhancer to the degradation.

Proceedings Article•DOI•

Image quality prediction for bitrate allocation

P. Fleury¹, J. Reichel, T. Ebrahimi•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

16 Sep 1996

TL;DR: This paper investigates a neural-network-based way to predict coding quality from image content, so that the best coding algorithm can be chosen from the predicted qualities without running every coding algorithm.

Abstract: In image coding, the choice of a good image coding algorithm is very dependent on the image content. Based on this fact, dynamic coding algorithms have been designed. They try to find an optimal coding scheme for each image segment. They rely on an exhaustive search of the best coding algorithm. Evaluation of all algorithms is computationally very intensive and strongly limits the number of considered algorithms for a given application. Therefore, current standards rely on a single coding algorithm. This paper investigates a way to predict the coding quality from the image content. This prediction is based on a neural network. The coding quality is computed from image region features. Those features are easy and fast to compute, and are common to the whole set of considered coding algorithms. Therefore, the choice of the best algorithm can be based on those predicted coding qualities, and does not require the computation of all coding algorithms. The system is also fast enough to be used for dynamic bitrate allocation, and a simple algorithm to do this is proposed.

Proceedings Article•DOI•

Speech synthesis using the CELP algorithm

G. Lino de Campos¹, Evandro B. Gouvêa•Institutions (1)

University of São Paulo¹

03 Oct 1996

TL;DR: The paper presents a phoneme/diphone based speech synthesis system for the (Brazilian) Portuguese language and details the process of building the phoneme library and the interpolation techniques used.

Abstract: The paper presents a phoneme/diphone based speech synthesis system for the (Brazilian) Portuguese language. The basic idea of this system is the construction of a library of phonetic units, and processing of those basic units to build an utterance. The system is complemented by a text to phoneme translator described previously. The phonate's representation in the library is based on a linear prediction model; the filter which models the vocal tract is represented by line spectrum pairs, and the excitation by code excited linear prediction (CELP) parameters. The paper is organized as follows. After a brief introduction, CELP coding is briefly presented and the relevant points to be applied in speech synthesis are presented. The main contribution of the paper is detailing the process of building the phoneme library and the interpolation techniques used.

Patent•

Pitch searching time reducing method for code excited linear prediction vocoder using line spectral pair

Kyung-Jin Byun¹, Hah-Yong Yoo¹, Han Ki-Chun¹, Kim Jong-Jae¹, Myung-Jin Bae¹•Institutions (1)

Electronics and Telecommunications Research Institute¹

19 Sep 1996

TL;DR: A method for a CELP vocoder in a digital personal communication system that significantly reduces pitch search time by using the first-formant line spectral pair (LSP) frequency to set a decimation interval and obtain a preparatory pitch before the pitch search.

Abstract: An improved pitch searching time reducing method for a CELP vocoder using a Line Spectral Pair (LSP) frequency which is capable of significantly reducing the pitch search time by separating the speech signal using a first formant frequency of the line spectral pair of the digital type personal communication system, which includes the steps of computing a decimation interval of a pitch search interval using an LSP frequency of a first formant computed by a formant filter so as to compute a preparatory pitch of a given speech; determining a preparatory pitch to be used when searching a pitch by detecting a peak and a valley within each decimation interval; and computing a preparatory pitch by adapting a first formant frequency of an LSP computed by a formant filter with a decimation rate and performing a pitch search with respect to the obtained preparatory pitch.
