
Showing papers on "Code-excited linear prediction" published in 2007


Proceedings ArticleDOI
15 Apr 2007
TL;DR: This paper describes the scalable coder G.729.1, recently standardized by ITU-T for wideband telephony and voice over IP (VoIP) applications, which can operate at 12 different bit rates from 32 down to 8 kbit/s, with wideband quality starting at 14 kbit/s.
Abstract: This paper describes the scalable coder - G.729.1 - which has been recently standardized by ITU-T for wideband telephony and voice over IP (VoIP) applications. G.729.1 can operate at 12 different bit rates from 32 down to 8 kbit/s with wideband quality starting at 14 kbit/s. This coder is a bitstream interoperable extension of ITU-T G.729 based on three embedded stages: narrowband cascaded CELP coding at 8 and 12 kbit/s, time-domain bandwidth extension (TDBWE) at 14 kbit/s, and split-band MDCT coding with spherical vector quantization (VQ) and pre-echo reduction from 16 to 32 kbit/s. Side information - consisting of signal class, phase, and energy - is transmitted at 12, 14 and 16 kbit/s to improve the resilience and recovery of the decoder in case of frame erasures. The quality, delay, and complexity of G.729.1 are summarized based on ITU-T results.

108 citations
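
The layer structure described in the G.729.1 abstract above can be sketched as a small lookup. This is only an illustration: the abstract gives the stage boundaries (CELP at 8 and 12 kbit/s, TDBWE at 14 kbit/s, MDCT from 16 to 32 kbit/s) and the total of 12 rates, while the 2 kbit/s spacing above 14 kbit/s is an assumption chosen so that the count comes out to 12. The function names are hypothetical.

```python
# Sketch: enumerate the G.729.1 bit rates and the embedded stage that adds
# each layer, following the abstract above.
# Assumption: layers from 14 to 32 kbit/s are spaced 2 kbit/s apart.

def g7291_bit_rates():
    """Return the operating bit rates in kbit/s, lowest first."""
    return [8, 12] + list(range(14, 33, 2))


def stage_for(rate_kbps):
    """Name the embedded stage that contributes the layer at this rate."""
    if rate_kbps in (8, 12):
        return "narrowband cascaded CELP"
    if rate_kbps == 14:
        return "TDBWE"
    if rate_kbps in range(16, 33, 2):
        return "split-band MDCT"
    raise ValueError(f"{rate_kbps} kbit/s is not a rate in this sketch")


rates = g7291_bit_rates()
for r in rates:
    # Per the abstract, wideband quality starts at 14 kbit/s.
    print(f"{r:2d} kbit/s  {stage_for(r):26s} wideband={r >= 14}")
```

Because the bitstream is embedded, a decoder that receives only the first layers can still decode at the corresponding lower rate.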


Book
24 Sep 2007
TL;DR: This book covers speech signals and waveform coding, predictive coding, CELP codecs, wideband and low-rate speech coding, and wavelets and pitch detection, and closes with a comparison of speech transceivers; an appendix constructs the quadratic spline wavelets.
Abstract: About the Authors. Other Wiley and IEEE Press Books on Related Topics. Preface and Motivation. Acknowledgements. 1 Speech Signals and Waveform Coding. 2 Predictive Coding. 3 Analysis-by-synthesis Principles. 4 Speech Spectral Quantization. 5 RPE Coding. 6 Forward-Adaptive CELP Coding. 7 Standard CELP Codecs. 8 Backward-Adaptive CELP Coding. 9 Wideband Speech Coding. 10 MPEG-4 Audio Compression and Transmission. 11 Overview of Low-rate Speech Coding. 12 Linear Predictive Vocoder. 13 Wavelets and Pitch Detection. 14 Zinc Function Excitation. 15 Mixed-Multiband Excitation. 16 Sinusoidal Transform Coding Below 4 kbps. 17 Conclusions on Low Rate Coding. 18 Comparison of Speech Transceivers. 19 Voice Over the Internet Protocol. A Constructing the Quadratic Spline Wavelets. B Zinc Function Excitation. C Probability Density Function for Amplitudes. Bibliography. Index. Author Index.

74 citations


Proceedings ArticleDOI
15 Apr 2007
TL;DR: This work proposes an improved low bit rate bandwidth extension algorithm along with a robust watermarking scheme for CELP-type speech codecs which is especially tailored to state-of-the-art narrowband speech communication networks such as GSM or UMTS.
Abstract: We consider the problem of transmitting a wideband speech signal with a cut-off frequency of fc = 7 kHz over a standardized narrowband (fc = 3.4 kHz) communication link in a backwards compatible manner. In a previous contribution we have shown that backwards compatibility can be achieved by using digital watermarking: we embedded compact side information about the missing high frequency band (3.4 - 7 kHz) into the narrowband speech signal. Here, we present a related system which is especially tailored to state-of-the-art narrowband speech communication networks such as GSM or UMTS. Therefore, we propose an improved low bit rate bandwidth extension algorithm along with a robust watermarking scheme for CELP-type speech codecs. The practical relevance of our system is shown by speech quality evaluations and by link-level simulations for the "enhanced full rate traffic channel" (TCH/EFS) of the GSM cellular communication system.

52 citations


Journal ArticleDOI
TL;DR: This article investigates a new type of hybrid vector quantiser, called the switched split vector quantiser (SSVQ), that addresses the memory and shape suboptimality of SVQ, leading to better quantisation performance.

47 citations


Patent
29 Aug 2007
TL;DR: A seamless switching method for voice/music dual-mode encoder-decoders is proposed: when switching from CELP voice mode to MDCT music mode, the tail of the final CELP frame is windowed and folded in the time domain, and the overlapping nature of the MDCT ensures the continuity of switching.
Abstract: The invention relates to a seamless switching method for voice/music dual-mode encoding and decoding. When a dual-mode encoder-decoder switches from CELP voice mode to MDCT music mode, the tail of the audio signal in the final CELP frame before switching is windowed and folded in the time domain, and the overlapping nature of the MDCT ensures the continuity of switching. When a dual-mode encoder-decoder switches from MDCT music mode to CELP voice mode, the final MDCT frame before switching adopts a new window type to ensure that it does not overlap in the time domain with the first CELP frame, and the pre-coding technology ensures the continuity of switching.

34 citations


Journal ArticleDOI
TL;DR: The objective of this paper is to detect speech forgery using digital audio watermarking and pattern recovery techniques, which uses the cyclic pattern embedding to overcome synchronizing problems of previous detection techniques.

30 citations


Journal ArticleDOI
TL;DR: A technique to improve recovery after a frame erasure is proposed, consisting of a constrained excitation search at the encoder and a resynchronization procedure at the decoder; it needs no additional delay.
Abstract: The adaptive codebook used in code-excited linear prediction (CELP)-like speech codecs is very effective for modeling the quasi-periodic component of the excitation signal but, unfortunately, introduces a strong interframe dependency that renders the decoder vulnerable to frame erasures. For voiced speech, the error affects not only the erased frame but also all the subsequent frames. In this paper, a technique to improve the recovery after a frame erasure is proposed. The technique consists in a constrained excitation search at the encoder and a resynchronization procedure at the decoder. The constraint aims at reducing the contribution of the adaptive codebook by making the innovation codebook partially model the pitch excitation. Further, for highly voiced frames, the pitch-related information contained in the innovation excitation is exploited at the decoder to speed up the resynchronization of the adaptive codebook after a frame erasure. When applied to the adaptive multirate wideband (AMR-WB) codec, the method brings a significant improvement in the case of frame erasures, at the cost of a minor quality loss compared to the standard codec at the same bit rate. The method does not need additional delay and has the advantage of maintaining full interoperability between the standard codec and its modified version.

26 citations


Journal ArticleDOI
TL;DR: A novel method is presented to construct reduced complexity algorithms based on the classification of the signal domain and efficient approximation of the residual redundancy or the a priori transition probabilities, which provide high quality error concealment solutions for code excited linear prediction (CELP) coders.
Abstract: Exploiting the residual redundancy in a source coder output stream during the decoding process has been proven to be a bandwidth efficient way to combat the noisy channel degradations. In this paper, we consider soft reconstruction of speech spectrum, in GSM adaptive multirate and IS-641 vocoders, transmitted over a channel disturbed with noise and/or packet loss. Several schemes are presented which exploit different levels of intraframe and interframe residual redundancy for improved source decoding at the receiver. A packetization strategy is proposed which is matched to the presented error concealment units. For decoders that exploit the residual redundancy, extensive complexity has been a serious concern, especially as the quantizer bitrate increases. In this paper, a novel method is presented to construct reduced complexity algorithms. The proposed methodology is based on the classification of the signal domain and efficient approximation of the residual redundancy or the a priori transition probabilities. The presented schemes provide high quality error concealment solutions for code excited linear prediction (CELP) coders.

25 citations


Journal ArticleDOI
TL;DR: Both the theoretical analyses and the speech coding experiments show that with packet overheads, the simple PD methods may be preferable to MD coding.
Abstract: A key feature of wireless mesh networks is that multiple independent paths through the network are available. Multiple descriptions coding is often suggested as a source coding scheme to take advantage of this path diversity. We compare multiple description (MD) coding with path diversity (PD) against a full-rate single description (SD) coder without PD, and two simple PD methods of 1) repeating a half-rate SD coder over both paths and 2) repeating the full-rate parent SD coder over the two paths. We first present a theoretical analysis comparing the average distortion per symbol in packetized communication using the above mentioned MD and PD methods to transmit a memoryless Gaussian source over additive white Gaussian noise channels. Next, using two new MD speech coders with balanced side descriptions derived from the AMR-WB and G.729 standards, we evaluate delivered voice quality using PESQ-MOS and compare MD coding against the PD methods for random and bursty packet losses. Both the theoretical analyses and the speech coding experiments show that with packet overheads, the simple PD methods may be preferable to MD coding. A new performance measure that incorporates both quality and bit rate is shown to account for the tradeoffs more explicitly.

25 citations


Patent
13 Feb 2007
TL;DR: Improved dictionaries of CELP excitation vectors for coding/decoding digital audio signals are constructed with a particular structure by providing a common sequence of pulses forming a base pattern and assigning the base pattern to each excitation vector of the dictionary, based on one or more occurrences at one or more respective positions among the N valid positions.
Abstract: The invention aims at constructing improved dictionaries of CELP excitation vectors for coding/decoding digital audio signals. Usually, each vector of dimension N comprises pulses capable of occupying N valid positions. The invention concerns the construction of dictionaries with particular structure by: providing a common sequence of pulses forming a base pattern; and assigning the base pattern to each excitation vector of the dictionary, based on one or more occurrences at one or more respective positions among said N valid positions. The invention also concerns a combination of dictionaries thus constructed with optionally standard multipulse dictionaries, by union or summation or cascading.

19 citations


Proceedings ArticleDOI
01 Nov 2007
TL;DR: Experimental evaluations show that the speech-in-speech hiding framework is capable of hiding one speech message inside another host speech segment to produce a stego speech segment that is indistinguishable from the original host speech, while being able to extract the hidden speech message without any degradations in quality.
Abstract: This paper presents a speech-in-speech hiding framework for the purpose of reducing the storage and transmission overhead in electronic voice mail applications, as well as for steganography applications of hiding secret speech messages for voice mail security. The technique used exploits the low-pass spectral properties of the Fourier magnitude of a host speech signal to embed another speech signal in the low-amplitude-high-frequency region of the host speech signal's spectral magnitude. Experimental evaluations on real male and female voice segments show that our technique is capable of hiding one speech message inside another host speech segment to produce a stego speech segment that is indistinguishable from the original host speech, while being able to extract the hidden speech message without any degradations in quality.

Patent
26 Feb 2007
TL;DR: A method for transcoding a CELP-based compressed voice bitstream from a source codec to a destination codec is proposed, which includes processing the source codec input CELP bitstream to unpack one or more CELP parameters and interpolating the unpacked CELP parameters from the source codec format to the destination codec format.
Abstract: A method for transcoding a CELP based compressed voice bitstream from source codec to destination codec. The method includes processing a source codec input CELP bitstream to unpack at least one or more CELP parameters from the input CELP bitstream and interpolating one or more of the plurality of unpacked CELP parameters from a source codec format to a destination codec format if a difference of one or more of a plurality of destination codec parameters including a frame size, a subframe size, and/or sampling rate of the destination codec format and one or more of a plurality of source codec parameters including a frame size, a subframe size, or sampling rate of the source codec format exist. The method includes encoding the one or more CELP parameters for the destination codec and processing a destination CELP bitstream by at least packing the one or more CELP parameters for the destination codec.

Patent
03 Apr 2007
TL;DR: A system and method are presented for data communication over a cellular communications network that allow the transmission of digital data over a voice channel using a vocoder that monitors parameters of a Levinson-Durbin recursion and then uses full rate CELP if the monitored prediction error falls below a predetermined threshold within a pre-selected number of iterations of the recursion.
Abstract: A system and method for data communication over a cellular communications network that allows the transmission of digital data over a voice channel using a vocoder that monitors parameters of a Levinson-Durbin recursion and then uses full rate CELP if the monitored prediction error falls below a predetermined threshold within a pre-selected number of iterations of the recursion. The system and method encode digital data to be transmitted using a continuous signal modulation technique at a selected bit rate and one or more frequencies that are selected such that the resulting modulated carrier signal is processed by the vocoder using full rate CELP as a result of the monitored prediction error.
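
The decision rule in this abstract (track the prediction error of a Levinson-Durbin recursion and check whether it falls below a threshold within a fixed number of iterations) can be illustrated with a short sketch. This is not the patent's vocoder logic: the threshold of 5% of the frame energy, the limit of 4 iterations, the 160-sample frame, and all function names are assumptions made for the example.

```python
import math
import random


def autocorr(x, order):
    """Autocorrelation r[0..order] of one frame (no window, for brevity)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(order + 1)]


def levinson_errors(r, order):
    """Levinson-Durbin recursion for A(z) = 1 + a1 z^-1 + ... + ap z^-p.

    Returns (a, errors), where errors[m] is the prediction error energy
    after m iterations (errors[0] is the frame energy r[0]).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    errors = [err]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k                  # error shrinks at every step
        errors.append(err)
    return a, errors


def error_falls_quickly(frame, threshold=0.05, max_iters=4):
    """True if the normalized error drops below threshold within max_iters."""
    r = autocorr(frame, max_iters)
    _, errors = levinson_errors(r, max_iters)
    return any(e < threshold * errors[0] for e in errors[1:])


# A pure tone is highly predictable; white noise is not.
tone = [math.sin(2 * math.pi * 0.1 * n) for n in range(160)]
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(160)]
print(error_falls_quickly(tone), error_falls_quickly(noise))
```

The contrast between the two test signals is the point of the abstract: a carrier chosen so that the error collapses early is one the vocoder would treat as a full rate CELP frame.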

Patent
29 Oct 2007
TL;DR: High-quality speech is reproduced with a small amount of data in code-excited linear prediction (CELP) speech coding by evaluating the noise level of the speech in each coding period from spectrum, power, or pitch information and using different excitation codebooks depending on the result.
Abstract: A high quality speech is reproduced with a small data amount in speech coding and decoding for performing compression coding and decoding of a speech signal to a digital signal. In a speech coding method according to code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period concerned is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and various excitation codebooks are used based on the evaluation result.

Proceedings ArticleDOI
01 Nov 2007
TL;DR: Experiments indicate that, compared with a non-scalable conventional fixed-rate code-excited linear predictive (CELP) coding scheme, the real time scalable coder with scalar quantization performs at least as well in the constrained entropy case, and has nearly identical performance for the constrained resolution case.
Abstract: We describe a coding scheme based on audio and speech quantization with an adaptive quantizer derived from the autoregressive model under high-rate assumptions. The main advantage of this scheme compared to state-of-the-art training-based coders is its flexibility. The scheme can adapt in real time to any particular rate and has a computational complexity independent of the rate. Experiments indicate that, compared with a non-scalable conventional fixed-rate code-excited linear predictive (CELP) coding scheme, our real time scalable coder with scalar quantization performs at least as well in the constrained entropy case, and has nearly identical performance for the constrained resolution case.

Proceedings ArticleDOI
27 Aug 2007
TL;DR: Subjective evaluation test results demonstrate that the proposed coder outperforms G.729.1 for music signals at 16 and 24 kbit/s in particular with competitive or even better performance in other conditions like clean speech, background noise, and frame erasure.
Abstract: In this paper, we present a 6.8-32 kbit/s scalable speech and audio coder using a modified-discrete-cosine-transform (MDCT)-based bandwidth extension on top of a 6.8 kbit/s code-excited-linear-prediction (CELP) coder. The proposed coder comprises a 6.8 kbit/s narrowband CELP as its core layer and eight enhancement layers with bitrates of 0.8, 1.2, 3.2, or 4.0 kbit/s. After the core layer encodes the narrowband signal, the first enhancement layer extends the bandwidth of the decoded narrowband signal, and the other enhancement layers increase the fidelity of the extended wideband signal or its robustness against frame erasure conditions. Subjective evaluation test results demonstrate that the proposed coder outperforms G.729.1 for music signals at 16 and 24 kbit/s in particular, with competitive or even better performance in other conditions like clean speech, background noise, and frame erasure.

Patent
23 Apr 2007
TL;DR: An apparatus for transcoding an audio signal between a CELP-based coder and a hybrid coder includes a source bitstream unwrapper, a frame interpolator, and a compression parameter converter coupled to the frame interpolator.
Abstract: An apparatus for transcoding an audio signal between a CELP-based coder and a hybrid coder includes a source bitstream unwrapper configured to receive a source bitstream, extract one or more CELP compression parameters from the source bitstream, and construct an audio signal vector from the source bitstream while maintaining the one or more extracted CELP compression parameters. The apparatus also includes a frame interpolator coupled to the source bitstream unwrapper and a compression parameter converter coupled to the frame interpolator. The compression parameter converter is configured to calculate output compression parameters from at least one of the interpolated compression parameters or the one or more extracted CELP compression parameters. Additionally, the apparatus includes a destination bitstream wrapper coupled to the compression parameter converter and a mapping parameter tuner coupled to the frame interpolator. The mapping parameter tuner is configured to select one or more parameters for use by the compression parameter converter.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a speech coder using mel-frequency cepstral coefficients (MFCCs) instead of linear predictive coefficients (LPCs) to improve the performance of a server-based speech recognition system in network environments.
Abstract: Existing standard speech coders can provide high quality speech communication. However, they tend to degrade the performance of automatic speech recognition (ASR) systems that use the reconstructed speech. The main cause of the degradation is that the linear predictive coefficients (LPCs), which are typical spectral envelope parameters in speech coding, are optimized for speech quality rather than for speech recognition performance. In this paper, we propose a speech coder using mel-frequency cepstral coefficients (MFCCs) instead of LPCs to improve the performance of a server-based speech recognition system in network environments. To develop the proposed speech coder at a low bit rate, we first explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. It is shown that the proposed speech coder has speech quality comparable to 8 kbps G.729, and that the ASR system using the proposed speech coder gives a relative word error rate reduction of 6.8% compared to the ASR system using G.729 on a large vocabulary task (AURORA4).

Patent
30 Oct 2007
TL;DR: A band divider splits the input signal into a high-band signal and a low-band signal; the low band is encoded with a Code Excited Linear Prediction (CELP)-based narrowband speech codec, while the high band is converted to Modified Discrete Cosine Transform (MDCT) coefficients whose subbands are gain-quantized, assigned bits according to the magnitude of the gain quantization, and shape-quantized in an algebraic method.
Abstract: Provided is a speech coding apparatus and method. A band divider divides an input signal into a high-band signal and a low-band signal, a narrowband encoder encodes the low-band signal using a Code Excited Linear Prediction (CELP)-based narrowband speech codec, a frequency characteristic collector converts the high-band signal to a signal in a frequency domain and obtains Modified Discrete Cosine Transform (MDCT) coefficients, a subband determiner determines subbands in a final stage based on the MDCT coefficients and determines subbands for quantization based on the subbands in a final stage, a gain quantizer performs gain quantization of the subbands, a bit assignment unit assigns bits to the subbands according to the magnitude of the gain quantization, and a shape quantizer performs shape quantization of the subbands in an algebraic method. Accordingly, algorithm consistency can be maintained and a complexity can be reduced by extending a bandwidth with a small number of bits in a speech codec.

Book ChapterDOI
13 Aug 2007
TL;DR: This chapter covers sub-band ADPCM wideband coding at 64 kbps, wideband transform coding at 32 kbps, sub-band-split wideband CELP codecs, fullband wideband ACELP coding, a turbo-coded burst-by-burst adaptive wideband speech transceiver, and turbo-detected unequal error protection irregular convolutional coded AMR-WB transceivers.
Abstract: This chapter contains sections titled: Sub-band-ADPCM Wideband Coding at 64 kbps; Wideband Transform-coding at 32 kbps; Sub-band-split Wideband CELP Codecs; Fullband Wideband ACELP Coding; A Turbo-coded Burst-by-burst Adaptive Wideband Speech Transceiver; Turbo-detected Unequal Error Protection Irregular Convolutional Coded AMR-WB Transceivers; The AMR-WB+ Audio Codec; Chapter Summary.

Journal ArticleDOI
TL;DR: Simulation results show that an average log spectral distortion of about 1.5 dB is achievable at an event rate of 20 events/s, and subjective test results indicate that the performance of the proposed speech coding method is comparable to that of the 4.8 kbps FS-1016 CELP coder.

Patent
11 Apr 2007
TL;DR: In this article, a fixed codebook search method based on iteration-free global pulse replacement in a speech codec, and a Code-Excited Linear-Prediction (CELP)-based speech codec using the method are presented.
Abstract: Provided are a fixed codebook search method based on iteration-free global pulse replacement in a speech codec, and a Code-Excited Linear-Prediction (CELP)-based speech codec using the method. The fixed codebook search method based on iteration-free global pulse replacement in a speech codec includes the steps of: (a) determining an initial codevector using a pulse-position likelihood vector or a correlation vector; (b) calculating a fixed-codebook search criterion value for the initial codevector; (c) calculating fixed-codebook search criterion values for respective codevectors obtained by replacing a pulse of the initial codevector each time for respective tracks, and determining a pulse position generating the largest fixed-codebook search criterion value as a candidate pulse position for the respective tracks, respectively; (d) calculating fixed-codebook search criterion values for respective codevectors of all combinations obtained by replacing at least one pulse position of the initial codevector with the candidate pulse positions of the respective tracks, and determining the largest value of the fixed-codebook search criterion values; and (e) comparing the fixed-codebook search criterion value for the initial codevector obtained in step (b) with the largest value determined in step (d) to determine an optimum fixed codevector.
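
Steps (a) to (e) of this abstract can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: one pulse per track, interleaved tracks, pulse signs taken from the sign of the correlation vector, and the usual ACELP criterion Q = (dᵀc)² / (cᵀΦc) with d = Hᵀx and Φ = HᵀH. All function names and the toy impulse response are hypothetical.

```python
from itertools import product


def build_d_phi(h, x):
    """Correlation vector d = H^T x and matrix phi = H^T H for a
    lower-triangular Toeplitz H built from the impulse response h."""
    n = len(x)
    H = [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
         for i in range(n)]
    d = [sum(H[i][j] * x[i] for i in range(n)) for j in range(n)]
    phi = [[sum(H[i][a] * H[i][b] for i in range(n)) for b in range(n)]
           for a in range(n)]
    return d, phi


def criterion(positions, d, phi):
    """Search criterion (d^T c)^2 / (c^T phi c) for unit pulses at positions,
    each signed like d at its position (assumed sign rule)."""
    signs = [1.0 if d[p] >= 0 else -1.0 for p in positions]
    num = sum(s * d[p] for s, p in zip(signs, positions)) ** 2
    den = sum(si * sj * phi[pi][pj]
              for si, pi in zip(signs, positions)
              for sj, pj in zip(signs, positions))
    return num / den


def pulse_replacement_search(d, phi, n_tracks):
    n = len(d)
    tracks = [list(range(t, n, n_tracks)) for t in range(n_tracks)]
    # (a) initial codevector: strongest correlation in each track
    initial = [max(tr, key=lambda p: abs(d[p])) for tr in tracks]
    # (b) criterion value of the initial codevector
    q_init = criterion(initial, d, phi)
    # (c) best single replacement per track gives one candidate per track
    candidates = []
    for t, tr in enumerate(tracks):
        others = [p for p in tr if p != initial[t]]
        candidates.append(max(
            others,
            key=lambda p: criterion(initial[:t] + [p] + initial[t + 1:], d, phi)))
    # (d) try every combination that replaces at least one pulse
    best, q_best = None, -1.0
    for mask in product((False, True), repeat=n_tracks):
        if not any(mask):
            continue
        trial = [candidates[t] if mask[t] else initial[t]
                 for t in range(n_tracks)]
        q = criterion(trial, d, phi)
        if q > q_best:
            best, q_best = trial, q
    # (e) keep whichever of the initial and best replaced codevector wins
    if q_best > q_init:
        return best, q_best, q_init
    return initial, q_init, q_init


h = [1.0, 0.6, 0.3, 0.1]                             # toy impulse response
x = [0.2, -1.0, 0.5, 0.9, -0.3, 0.1, 0.7, -0.6]      # toy target vector
d, phi = build_d_phi(h, x)
positions, q_final, q_init = pulse_replacement_search(d, phi, n_tracks=2)
print(positions, q_final >= q_init)
```

With T tracks, step (d) evaluates 2^T − 1 combinations once, with no outer iteration, which is the "iteration-free" property named in the abstract.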

Proceedings ArticleDOI
15 Apr 2007
TL;DR: This paper describes the first stage of G.729.1, a narrowband embedded CELP coder at 8 and 12 kbit/s: the 8 kbit/s layer ensures interoperability with the ITU-T G.729 standard at reduced complexity and with quality better than G.729 Annex A, and the 12 kbit/s layer reaches the quality of the 11.8 kbit/s G.729 Annex E in spite of the embedded structure.
Abstract: ITU-T G.729.1 is a scalable coder recently standardized in ITU-T for wideband telephony and voice over IP (VoIP) applications. Composed of three stages, this codec provides a scalable bitstream between 8 and 32 kbit/s both in narrowband and wideband. This paper describes the first stage which is a narrowband embedded CELP coder at 8 and 12 kbit/s. The 8 kbit/s layer ensures interoperability with ITU-T G.729 standard with a reduced complexity, and with a quality better than G.729 Annex A. At 12 kbit/s, G.729.1 reaches the quality level of the 11.8 kbit/s G.729 Annex E in spite of the embedded structure. The modifications brought to the original G.729 scheme to achieve this performance are explained and formal test results provided.

Proceedings ArticleDOI
01 Nov 2007
TL;DR: Techniques which can be used at the encoder and at the decoder of CELP coders to improve their robustness to frame erasures are described.
Abstract: This paper describes techniques which can be used at the encoder and at the decoder of CELP coders to improve their robustness to frame erasures. The techniques address specific problems in different speech classes, in particular stationary voiced and voiced onsets.

Patent
Jes Thyssen, Juin-Hwey Chen
03 Jul 2007
TL;DR: In this article, a system and method for encoding and decoding speech signals that includes a specially-designed Code Excited Linear Prediction (CELP) encoder and a vector quantization (VQ) based Noise Feedback Coding (NFC) decoder was described.
Abstract: A system and method for encoding and decoding speech signals that includes a specially-designed Code Excited Linear Prediction (CELP) encoder and a vector quantization (VQ) based Noise Feedback Coding (NFC) decoder or that includes a specially-designed VQ-based NFC encoder and a CELP decoder. The VQ based NFC decoder may be a VQ based two-stage NFC (TSNFC) decoder. The specially-designed VQ-based NFC encoder may be a specially-designed VQ based TSNFC encoder. In each system, the encoder receives an input speech signal and encodes it to generate an encoded bit stream. The decoder receives the encoded bit stream and decodes it to generate an output speech signal. A system and method is also described in which a single decoder receives and decodes both CELP-encoded audio signals as well as VQ-based NFC-encoded audio signals.

Patent
Kaoru Sato, Toshiyuki Morii
14 Dec 2007
TL;DR: An adaptive sound source vector quantization device is disclosed that improves the quantization accuracy of adaptive sound source vector quantization while suppressing the increase in calculation amount in CELP sound encoding, which performs encoding in sub-frame units.
Abstract: Disclosed is an adaptive sound source vector quantization device capable of improving the quantization accuracy of adaptive sound source vector quantization while suppressing the increase in calculation amount in CELP sound encoding, which performs encoding in sub-frame units. In the device, a search adaptive sound source vector generation unit (103) cuts out an adaptive sound source vector of frame length (n) from an adaptive sound source codebook (102); a search impulse response matrix generation unit (105) generates an n × n search impulse response matrix by using the impulse response matrix for each sub-frame inputted from a synthesis filter (104); a search target vector generation unit (106) adds the target vector of each sub-frame so as to generate a search target vector of frame length (n); and an evaluation scale calculation unit (107) calculates the evaluation scale of the adaptive sound source vector quantization by using the search adaptive sound source vector, the search impulse response matrix, and the search target vector.

Proceedings ArticleDOI
01 Nov 2007
TL;DR: The quality of the pre-weighted approach is comparable to the quality achieved by the standard AMR codec, and requires an additional bit-rate of 1.35 kbps to communicate the linear prediction coefficients of the original speech input to the decoder.
Abstract: We investigate the effect on voice quality of perceptual pre-weighting of the input speech to a codec, and post-inverse weighting the output of the codec. The G.726 adaptive differential pulse code modulation (ADPCM) codec and the AMR narrowband (AMR-NB) code excited linear prediction (CELP) codec are employed in our experiments. The weighting function used has the same form as that of the perceptual weighting function for the analysis-by-synthesis codebook search in AMR-NB. We observe a significant improvement in voice quality at rates of 16 and 24 kbps in the case of G.726 when perceptual weighting is used. When we use pre-weighting with the AMR codec, the unweighted squared error is used within the analysis-by-synthesis codebook search loop, and we find that the quality of the pre-weighted approach is comparable to the quality achieved by the standard AMR codec. The proposed pre-weighting method requires an additional bit-rate of 1.35 kbps to communicate the linear prediction (LP) coefficients of the original speech input to the decoder.
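
The pre-weighting and inverse post-weighting pair can be sketched as follows, assuming the weighting filter has the common CELP form W(z) = A(z/γ1) / A(z/γ2). The values γ1 = 0.9 and γ2 = 0.6, the first-order LP polynomial, and the function names are illustrative assumptions, and the codec between the two filters is replaced by a pass-through.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): the i-th coefficient is scaled by gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]


def iir_filter(b, a, x):
    """Direct-form filter with zero initial state:
    y[n] = (sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k]) / a[0]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y


def pre_weight(a_lpc, x, g1=0.9, g2=0.6):
    """Apply W(z) = A(z/g1) / A(z/g2) before the codec."""
    return iir_filter(bandwidth_expand(a_lpc, g1), bandwidth_expand(a_lpc, g2), x)


def post_inverse_weight(a_lpc, y, g1=0.9, g2=0.6):
    """Apply 1/W(z) = A(z/g2) / A(z/g1) after the codec."""
    return iir_filter(bandwidth_expand(a_lpc, g2), bandwidth_expand(a_lpc, g1), y)


a_lpc = [1.0, -0.9]                       # toy LP polynomial A(z) = 1 - 0.9 z^-1
speech = [0.0, 1.0, 0.5, -0.25, 0.75, -1.0, 0.3, 0.1]
weighted = pre_weight(a_lpc, speech)
decoded = weighted                        # stand-in for encode + decode
restored = post_inverse_weight(a_lpc, decoded)
print(max(abs(u - v) for u, v in zip(speech, restored)))
```

The inverse filter needs the same LP coefficients that shaped W(z), which is why the abstract budgets extra bit rate to send the LP coefficients of the original speech to the decoder.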

Journal ArticleDOI
TL;DR: The results demonstrate in analytic form the relation between the noise floor level and the stability radius of the LPC model.
Abstract: White-noise correction is a technique used in speech coders using linear predictive coding (LPC). This technique generates an artificial noise-floor in order to avoid stability problems caused by numerical round-off errors. In this letter, we study the effect of white-noise correction on the roots of the LPC model. The results demonstrate in analytic form the relation between the noise floor level and the stability radius of the LPC model.
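
The effect on the roots can be seen in a toy case. The sketch below assumes white-noise correction is applied by scaling the zero-lag autocorrelation, r[0] ← (1 + ε)·r[0], and uses a second-order predictor on the ideal autocorrelation of a sinusoid; the value ε = 1e-4 (a noise floor about 40 dB down) and the function names are illustrative assumptions, not results from the letter.

```python
import cmath
import math


def lpc_order2(r0, r1, r2):
    """Solve the order-2 normal equations for A(z) = 1 + a1 z^-1 + a2 z^-2."""
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r2 - r0) / det
    a2 = (r1 * r1 - r0 * r2) / det
    return a1, a2


def max_root_radius(a1, a2):
    """Largest magnitude among the roots of z^2 + a1 z + a2."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return max(abs((-a1 + disc) / 2.0), abs((-a1 - disc) / 2.0))


def radius_with_noise_floor(omega, eps):
    """Root radius of the order-2 LPC model of a sinusoid at frequency omega
    after scaling r[0] by (1 + eps)."""
    r0, r1, r2 = 1.0, math.cos(omega), math.cos(2.0 * omega)
    a1, a2 = lpc_order2(r0 * (1.0 + eps), r1, r2)
    return max_root_radius(a1, a2)


omega = 0.2 * math.pi
plain = radius_with_noise_floor(omega, 0.0)       # roots on the unit circle
corrected = radius_with_noise_floor(omega, 1e-4)  # roots pulled inside
print(plain, corrected)
```

Without the correction the model of a pure tone sits exactly on the stability boundary, so any round-off can push it outside; the noise floor moves the roots strictly inside the unit circle, and a larger ε moves them further in.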

Patent
02 Aug 2007
TL;DR: A method and apparatus are proposed for a voice transcoder that converts a bitstream of frames encoded according to a first voice compression standard to a bitstream according to a second voice compression standard, using perceptual weighting with tuned weighting factors to produce a higher quality decoded voice signal than a comparable tandem transcoding solution.
Abstract: A method and apparatus for a voice transcoder that converts a bitstream representing frames of data encoded according to a first voice compression standard to a bitstream representing frames of data according to a second voice compression standard using perceptual weighting with tuned weighting factors, such that the bitstream of the second voice compression standard produces a higher quality decoded voice signal than a comparable tandem transcoding solution. The method includes pre-computing weighting factors for a perceptual weighting filter optimized to a specific source and destination codec pair, pre-configuring the transcoding strategies, mapping CELP parameters in the CELP parameter space according to the selected coding strategy, performing Linear Prediction analysis if specified by the transcoding strategy, perceptually weighting the speech using the tuned weighting factors, and searching for adaptive codebook and fixed-codebook parameters to obtain a quantized set of destination codec parameters.

Patent
25 Jan 2007
TL;DR: A scalable speech coding/decoding method of a mixed structure and an apparatus thereof use a CELP (Code Excited Linear Prediction) structure as the low-band coding method, so that excellent speech quality can be provided at a low bit rate, and add the signal output from a high-band coder to the low-band signal, so that a speech signal can be output with high sound quality at a low transmission rate.
Abstract: A scalable speech coding/decoding method of a mixed structure and an apparatus thereof are provided to use a CELP (Code Excited Linear Prediction) structure as a low-band coding method so that excellent speech quality can be provided at a low bit rate of a speech signal, and a signal output from a high-band coder is added to a low-band signal so that a speech signal can be output with high sound quality at a low transmission rate. A scalable speech coding/decoding apparatus of a mixed structure comprises the following: a band divider (100) which divides a speech input signal into a low-band signal and a high-band signal according to a specific frequency and outputs the low-band signal and the high-band signal; a low-band coder (200) which outputs the first index corresponding to a low bandwidth by coding the low-band signal, transmits information required for coding the high-band signal to a high-band coder (300), and transmits the uncoded first error signal to a wide-band coder (400); the high-band coder (300) which outputs the high-band second index acquired when the high-band signal is coded by using information received from the low-band coder (200), and transmits the uncoded second error signal to the wide-band coder (400); the wide-band coder (400) which quantizes coefficients of the first and second error signals using an MDCT (Modified Discrete Cosine Transform) method through time-frequency mapping, and outputs the third index corresponding to a wide bandwidth; and a bit-stream generator (500).