Showing papers on "Code-excited linear prediction" published in 2009


Journal ArticleDOI
TL;DR: Analytical expressions for performance of code-tracking loops using early-late discriminators, under small-error conditions are provided, and numerical results are provided to examine the effect of different modulation designs and interference conditions.
Abstract: Code tracking is an important attribute of receivers for Global Positioning System (GPS) and other global navigation satellite systems (GNSS). This paper and its antecedent provide analytical expressions for performance of code-tracking loops using early-late discriminators, under small-error conditions. Expressions are provided for output signal-to-noise-plus-interference ratio (SNIR) and code-tracking error, for arbitrary signal spectra, and Gaussian noise and interference having arbitrary spectral shapes. This second paper addresses noncoherent early-late processing (NELP) for given receiver precorrelation bandwidth and given early-late spacing, comparing the results to results for coherent early-late processing (CELP) and to a lower bound (LB) on code-tracking error. Theoretical expressions are derived and compared, and numerical results are provided to examine the effect of different modulation designs and interference conditions.

98 citations


Patent
15 Sep 2009
TL;DR: A method of receiving a decoded audio signal that has a transmitted pitch lag is disclosed, which includes estimating pitch correlations of possible short pitch lags that are smaller than a minimum pitch limitation and have an approximated multiple relationship with the transmitted pitch lag, and selecting a short pitch lag as a corrected pitch lag if a corresponding pitch correlation is large enough.
Abstract: In one embodiment, a method of receiving a decoded audio signal that has a transmitted pitch lag is disclosed. The method includes estimating pitch correlations of possible short pitch lags that are smaller than a minimum pitch limitation and have an approximated multiple relationship with the transmitted pitch lag, checking if one of the pitch correlations of the possible short pitch lags is large enough compared to a pitch correlation estimated with the transmitted pitch lag, and selecting a short pitch lag as a corrected pitch lag if a corresponding pitch correlation is large enough. The postprocessing is performed using the corrected pitch lag. In another embodiment, when the existence of irregular harmonics or wrong pitch lag is detected, a code-excited linear prediction (CELP) postfilter is made more aggressive.
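
The lag check described in this abstract can be sketched as follows. This is a minimal illustration only, not the patented method: the function names, the normalized-correlation measure, the search of one sample either side of each sub-multiple, and the parameters `min_lag`, `max_divisor` and `ratio` are all assumptions chosen for the example.

```python
import math


def normalized_correlation(x, lag):
    """Normalized correlation between x[i] and x[i - lag] (in [-1, 1])."""
    num = sum(x[i] * x[i - lag] for i in range(lag, len(x)))
    e_now = sum(x[i] * x[i] for i in range(lag, len(x)))
    e_past = sum(x[i - lag] * x[i - lag] for i in range(lag, len(x)))
    den = math.sqrt(e_now * e_past)
    return num / den if den > 0.0 else 0.0


def correct_pitch_lag(x, transmitted_lag, min_lag=20, max_divisor=4, ratio=0.85):
    """Return a short lag below min_lag if it correlates well enough.

    Candidates lie near transmitted_lag / m for m = 2..max_divisor
    (hypothetical search range). A candidate is accepted only if its
    correlation exceeds ratio times the correlation at the transmitted lag.
    """
    reference = normalized_correlation(x, transmitted_lag)
    best_lag = transmitted_lag
    best_corr = max(ratio * reference, 0.0)  # acceptance threshold
    for m in range(2, max_divisor + 1):
        centre = round(transmitted_lag / m)
        for cand in (centre - 1, centre, centre + 1):
            if cand < 1 or cand >= min_lag or cand >= len(x):
                continue  # only lags shorter than the codec minimum
            corr = normalized_correlation(x, cand)
            if corr > best_corr:
                best_lag, best_corr = cand, corr
    return best_lag
```

For a test tone with a true period of 10 samples and a transmitted lag of 30, the sketch returns 10; for a tone whose period really is 30 samples it keeps the transmitted lag.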

47 citations


Patent
12 Mar 2009
TL;DR: In this paper, the authors describe methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters.
Abstract: The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.

34 citations


Proceedings ArticleDOI
Jianle Chen, Woo-Jin Han
07 Nov 2009
TL;DR: This paper investigates the linear prediction method for block-based lossy image coding and proposes a method that merges linear prediction technique into H.264/AVC video coding framework and shows that the proposed technique improves coding efficiency.
Abstract: The linear prediction model has been well investigated and applied in lossless image and video coding. In this paper, we investigate the linear prediction method for block-based lossy image coding and propose a method that merges the linear prediction technique into the H.264/AVC video coding framework. A block-based linear prediction method is designed instead of a pixel-based one in order to cooperate with the transform module. Furthermore, line-based linear prediction with a 1D transform is developed by considering the coding gain tradeoff between prediction and transform. Linear prediction model coefficients are derived from neighboring reconstructed data with the least-squares error method. The model coefficients implicitly embed the local texture characteristics, and no bit overhead is needed for signaling the coefficients since they can be derived by the same process at the decoder side. We insert block-based and line-based linear prediction modes into H.264/AVC as additional intra prediction modes and select the best mode in the minimum rate-distortion sense. Experimental results show that the proposed technique improves the coding efficiency of H.264/AVC intra pictures, with an average bit saving of 4.3% and a maximum of 7.0%.

23 citations


Proceedings ArticleDOI
01 Dec 2009
TL;DR: The main purpose of this work is showing that better statistical modeling in the context of speech analysis creates an output that offers better coding properties, and leads to a convex optimization problem, which can be solved efficiently using interior-point methods.
Abstract: This paper describes a novel speech coding concept created by introducing sparsity constraints in a linear prediction scheme both on the residual and on the prediction vector. The residual is efficiently encoded using well known multi-pulse excitation procedures due to its sparsity. A robust statistical method for the joint estimation of the short-term and long-term predictors is also provided by exploiting the sparse characteristics of the predictor. Thus, the main purpose of this work is showing that better statistical modeling in the context of speech analysis creates an output that offers better coding properties. The proposed estimation method leads to a convex optimization problem, which can be solved efficiently using interior-point methods. Its simplicity makes it an attractive alternative to common speech coders based on minimum variance linear prediction.

21 citations


Journal ArticleDOI
TL;DR: This study conducted two investigations into a financial company's CELP and found a gap between INT and UGE and identified potential factors that contributed to the differences in UGE between two groups of high-INT learners.
Abstract: While numerous previous studies have focused on the use of some corporate e-learning programmes (CELP), little is known about the difference between users' pre-installation reactions to CELP and users' post-installation reactions to CELP. This study narrows the above gap with two investigations into a financial company's CELP. In the pre-installation phase, we surveyed users' need for cognition, attitudes towards corporate e-learning and intentions for the use of corporate e-learning (INT) in relation to the CELP. Ten months after the installation, we conducted a second investigation to examine learning outcomes on the basis of users' perceptions of CELP utility, CELP satisfaction, affective reaction to CELP and the actual CELP usage (UGE). We examined the proposed model with multiple regression analyses and found a gap between INT and UGE. The second investigation identified potential factors that contributed to the differences in UGE between two groups of high-INT learners. Results indicated that time management and technical problems were the two critical factors that led to UGE differences. Findings of this study can illuminate relationships among learners' attitudes, usage and reaction towards the CELP.

19 citations


01 Jan 2009
TL;DR: A packet loss concealment scheme based on interleaving is presented to reduce the speech quality deterioration caused by packet losses in code-excited linear prediction (CELP) based coders; the results show that the interleaving method performs better at the expense of extra delay.
Abstract: In VoIP applications, packet loss is a major source of speech impairment. In this paper, a packet loss concealment scheme based on interleaving is presented to reduce the speech quality deterioration caused by packet losses in code-excited linear prediction (CELP) based coders. We applied the proposed scheme to the ITU-T G.729 8 kb/s speech coding standard to evaluate its performance. The perceptual evaluation of speech quality (PESQ) and enhanced modified bark spectral distortion (EMBSD) tests under various packet loss conditions confirm that the proposed algorithm is superior to the concealment algorithm embedded in G.729. Spectral distortion is also used as an objective distortion measure; the results show that the interleaving method performs better, at the expense of extra delay.
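
The abstract does not give the interleaving pattern, so the sketch below uses a generic block interleaver purely as an illustration of the idea: frames are reordered so that the frames carried by one packet are not adjacent in time, and a single lost packet leaves isolated gaps that a CELP concealment routine can bridge more easily. The function names and the `depth` parameter are hypothetical.

```python
def interleave(frames, depth):
    """Block interleaver: write frames row by row into `depth` rows,
    then read them out column by column."""
    if depth < 1 or len(frames) % depth != 0:
        raise ValueError("number of frames must be a multiple of depth")
    cols = len(frames) // depth
    return [frames[r * cols + c] for c in range(cols) for r in range(depth)]


def deinterleave(frames, depth):
    """Inverse of interleave(): restore the original frame order."""
    if depth < 1 or len(frames) % depth != 0:
        raise ValueError("number of frames must be a multiple of depth")
    cols = len(frames) // depth
    restored = [None] * len(frames)
    index = 0
    for c in range(cols):
        for r in range(depth):
            restored[r * cols + c] = frames[index]
            index += 1
    return restored
```

With six frames and a depth of 2, `interleave([0, 1, 2, 3, 4, 5], 2)` gives `[0, 3, 1, 4, 2, 5]`. If each packet carries two consecutive entries of that list, losing the first packet drops frames 0 and 3 rather than the adjacent frames 0 and 1. The receiver has to buffer a full block before it can restore the order, which is where the extra delay comes from.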

11 citations


Proceedings ArticleDOI
07 Nov 2009
TL;DR: A fast algorithm that selects the temporal prediction type for the dyadic hierarchical prediction structure in scalable video coding (SVC) and makes use of the strong correlations in the large block partitions to eliminate the unnecessary computations for bi-directional prediction.
Abstract: In this paper, we propose a fast algorithm that selects the temporal prediction type for the dyadic hierarchical prediction structure in scalable video coding (SVC). We make use of the strong correlations in the large block partitions to eliminate the unnecessary computations for bi-directional prediction. Moreover, based upon the information of an 8×8 partition, either forward or backward prediction is skipped for its smaller block partitions. Compared with JSVM 9.11, our method reduces the encoding time by 50% to 60% for a number of test videos over a typical range of coding bit-rates, with a negligible coding penalty.

10 citations


Journal ArticleDOI
TL;DR: Two well-known algorithms, LPC (Linear Predictive Coding) and subband coding, are combined to reduce the data transmission rate from 128 kbps to as low as 12 kbps at minimum complexity and implemented on a digital signal processor.
Abstract: Speech compression is a mature technology with many applications. Over the past decade, huge advances have been made in the area of speech coding for reduced bit-rate transmission. With perceptual audio coding, the signal is coded efficiently using a psychoacoustic model, as in MPEG standards. In this paper, two well-known algorithms, LPC (Linear Predictive Coding) and subband coding, are combined to reduce the data transmission rate from 128 kbps to as low as 12 kbps at minimum complexity and implemented on a digital signal processor. The performance of the proposed algorithm is almost the same as that of the MPEG algorithm, without its complexity.

9 citations


01 Jan 2009
TL;DR: Empirical results show that gain prediction by Elman and MLP neural networks improves the mean opinion score and segmental signal-to-noise ratio compared to the traditional implementation of the encoder, while fuzzy ARTMAP reduces the computational complexity noticeably without significant degradation in MOS and SNRseg.
Abstract: Reducing computational complexity is desirable in speech coding algorithms. In this paper, three neural gain predictors are proposed which can function as the backward gain adaptation module of the low-delay code excited linear prediction (LD-CELP) G.728 encoder, recommended by the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T, formerly CCITT). Elman, multilayer perceptron (MLP) and fuzzy ARTMAP are the candidate neural models in this work. Empirical results show that gain prediction by Elman and MLP neural networks improves the mean opinion score (MOS) and segmental signal-to-noise ratio (SNRseg) compared to the traditional implementation of the encoder. However, fuzzy ARTMAP reduces the computational complexity noticeably, without significant degradation in MOS and SNRseg.

9 citations


01 Jan 2009
TL;DR: This work proposes a multi-self organizing map (SOM) neural model for codebook search in the low-delay code excited linear prediction (LD-CELP) G.728 coder, which has an average classification rate of 97.8% and leads to a 28% reduction in execution time.
Abstract: In the family of CELP coders, codebook search has high computational complexity. In this paper, the codebook search in the low-delay code excited linear prediction (LD-CELP) G.728 coder is performed by a multi-self organizing map (SOM) neural model. A modified supervised SOM training algorithm is also used in this work. In this algorithm, the codebook vectors are assigned to a class during training and a rejection term for codebook entries is used. The proposed neural codebook search module consists of 48 SOMs, which determine the optimum index values of the shape codebook. Empirical results show that the proposed model, which has an average classification rate of 97.8%, leads to a 28% reduction in execution time compared to a traditional implementation of the G.728 encoder. However, the degradations in mean opinion score (MOS) and segmental signal-to-noise ratio (SNRseg) are 0.16 and 0.17 dB, respectively.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: The underlying additive noise model is accurate enough to enhance speech which is recorded in an enclosed space where the resulting early reflections are usually modeled as a convolutive distortion.
Abstract: In this paper we investigate the application of adaptive postfiltering for the enhancement of reverberant speech. The considered method is commonly used in Code Excited Linear Prediction (CELP) speech coding to lower the impact of quantization noise in the excitation signal and the spectral envelope. We show that the underlying additive noise model is accurate enough to enhance speech which is recorded in an enclosed space where the resulting early reflections are usually modeled as a convolutive distortion. By means of adaptive filtering, the amplitudes of the unwanted peaks in the excitation signal are attenuated and the signal components at the harmonic peaks are emphasized. Both single- and multi-channel dereverberation algorithms with moderate computational complexity are proposed. Experiments have shown that this approach is capable of reducing early reverberation and attenuating the "distance effect" arising from room reflections.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: An embedded CELP coder is described in which an adaptive codebook is included in every enhancement layer and the lower-layer codebook gains are re-optimized in the higher layers to further improve speech quality.
Abstract: The paper describes an embedded CELP coder in which an adaptive codebook is included in every enhancement layer and the lower-layer codebook gains are re-optimized in the higher layers to further improve speech quality. Each layer maintains its own filter memories to generate required target vectors, adds adaptive and fixed codebook contributions, and re-optimizes all codebook gains to improve coder performance (multi-layer gain optimization). The common elements across the embedded layers include the lower-layer adaptive and fixed codebook entries. The pitch-lag used in the core layer is also re-used in the enhancement layers to maintain time synchronization between layers. Estimation and encoding of selected lower-layer parameters may take into account their estimated impact on the higher layers. The described Embedded CELP coder has been implemented in the Embedded Variable Bit-Rate (EV-VBR) codec standardized by ITU-T as Recommendation G.718. The characterization test results of the G.718 Embedded CELP are summarized.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: The experimental results showed a quality bias toward the English language: its scores were higher and the performance was more stable. The quality of the encoded and decoded speech signals was estimated using two quality estimation algorithms, 3SQM and PESQ.
Abstract: This paper investigates the performance of CELP speech codecs over different languages. The English language has had a dominating influence on the advance of telecommunications. With many of the major developments coming from primarily English-speaking areas, there is a risk that these advances may not be linguistically robust. The quality of speech produced by voice codecs is mainly assessed using English-language samples, yet some investigations show that language has a noticeable influence on codec performance. In order to judge the performance of the most popular CELP voice codecs (Speex and AMR), we encoded and decoded speech samples from three different languages: English, Arabic and Lithuanian. The quality of the resulting speech signals was estimated using two quality estimation algorithms, 3SQM (ITU-T Recommendation P.563) and PESQ (ITU-T Recommendation P.862). The experimental results showed a quality bias toward the English language: the scores were higher and the performance was more stable.

Proceedings ArticleDOI
18 Aug 2009
TL;DR: Experimental results show that this speech information-hiding scheme can be effective and the distortion of the speech signal is trivial and imperceptible.
Abstract: This paper presents a speech information-hiding scheme that is integrated with the CELP (Code Excited Linear Prediction) speech coding method. The G.729 codec (CS-ACELP) is used to test the efficiency of this scheme. The index-constrained method is applied for embedding the secret information. The selected bits of the first-stage and residual vector indices in the Predictive Two-stage Vector Quantization procedure are modulated by the secret information bits. For imperceptibility, the quality of the watermarked speech signal can be controlled by adaptive embedding that refers to the original and watermarked speech. Experimental results show that this scheme can be effective and that the distortion of the speech signal is trivial and imperceptible.

Proceedings ArticleDOI
28 Oct 2009
TL;DR: This paper studies various methods and proposes a fast codebook search method that largely reduces the complexity while speech quality remains almost the same.
Abstract: G.729 is called Conjugate-Structure Algebraic Code Excited Linear Prediction (CS-ACELP); its fixed codebook is an algebraic codebook. The algebraic structure of the codebook allows for a fast search procedure, but the nested-loop search method still requires a large computational load. The focused search used in G.729 and the depth-first tree search used in G.729A reduce the search complexity efficiently. However, the search still accounts for a large proportion of the whole encoding complexity. In this paper, we study various methods and propose a fast codebook search method that largely reduces the complexity while speech quality remains almost the same.

Proceedings ArticleDOI
30 Oct 2009
TL;DR: An effective transcoding scheme is presented which makes full use of the interoperability between AMR-WB and VMR-WB, and also of the encoding similarity among the AMR-WB bit-rates from 12.65 kb/s to 23.85 kb/s.
Abstract: The Adaptive Multirate Wideband (AMR-WB) speech codec and the Variable-Rate Multimode Wideband (VMR-WB) speech codec are two coding standards based on the CELP model for processing wideband input speech. When communication occurs between them, transcoding must be performed to translate the encoding format from one standard to the other. In this paper, an effective transcoding scheme is presented which makes full use of the interoperability between AMR-WB and VMR-WB, and also of the encoding similarity among the bit-rates from 12.65 kb/s to 23.85 kb/s in AMR-WB. When transcoding from the non-interoperable modes in AMR-WB to VMR-WB, the speech frames of the non-interoperable modes are first translated to the interoperable 12.65 kb/s mode, then the AMR-WB interoperable 12.65 kb/s mode is mapped into the corresponding mode in VMR-WB. Algorithm evaluation indicates that the proposed transcoding scheme has less computational complexity and similar or higher quality compared to the traditional tandem transcoding method.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: A novel technique to enhance music signals encoded using a low bit rate CELP coder based on reduction of inter-tone quantization noise for decoded music signals without affecting the quality for speech signals is presented.
Abstract: In this paper we present a novel technique to enhance music signals encoded using a low bit rate CELP coder. The method is based on reduction of inter-tone quantization noise for decoded music signals without affecting the quality for speech signals. The proposed technique consists of two modules. The first module is used to discriminate between stable tonal sounds and other sounds and the second module is used to reduce the inter-tone quantization noise in the stable tonal segments. The inter-tone noise is reduced by means of spectral subtraction. The proposed method is a part of the newly standardised ITU-T G.718 codec.

Journal Article
TL;DR: A frame erasure concealment algorithm based on nonlinear regression analysis is presented to minimize speech quality deterioration in code-excited linear prediction (CELP) based coders and obtained improved perceptual evaluation of speech quality (PESQ) scores compared to the conventional methods.
Abstract: Frame erasure is one of the most difficult problems in voice over IP (VoIP) networks and is a major source of speech quality degradation. In this paper, a frame erasure concealment algorithm based on nonlinear regression analysis is presented to minimize speech quality deterioration in code-excited linear prediction (CELP) based coders. We applied the proposed scheme to the ITU-T G.729 standard and obtained improved perceptual evaluation of speech quality (PESQ) scores compared to the conventional methods.

Proceedings ArticleDOI
16 Mar 2009
TL;DR: A novel hybrid harmonic/CELP scheme for bandwidth scalable wideband codec is proposed that utilizes a band-split technique, where the low-band (0-4 kHz) is critically subsampled and coded using 11.8 kbps G.729E.
Abstract: A novel hybrid harmonic/CELP scheme for bandwidth scalable wideband codec is proposed that utilizes a band-split technique, where the low-band (0-4 kHz) is critically subsampled and coded using 11.8 kbps G.729E. The high-band signal is divided into stationary mode (SM) and non-stationary mode (NSM) components based on its unique characteristics. In the SM portion, the high-band signal is compressed using a multi-stage coding combined sinusoidal model based on the matching pursuit (MP) algorithm and CELP with the circular codebook. In the NSM portion, the high-band signals are coded by CELP with both pulse and circular codebooks. For efficient bit allocation and enhanced performance, the pitch of the high-band codec is estimated using the quantized pitch parameter in low-band codec. In an informal listening test, the subjective speech quality was rated as comparable to that obtainable with 48 kbps G.722 and 12.85 kbps G.722.2.

Journal Article
TL;DR: This paper explains a Matlab implementation of the Linear Predictive Coding (LPC) algorithm and derives the linear prediction equation from the all-pole model that is commonly used for speech signals.
Abstract: The principle of LPC and its encoding algorithm are discussed. The linear prediction equation is derived from the all-pole model that is commonly used for speech signals, followed by a description of the linear prediction analysis in G.729. The main contribution is an explanation of a Matlab implementation of the Linear Predictive Coding (LPC) algorithm. First, the windowing and the autocorrelation computation are described, with the Matlab program and a plot of the windowed result. Finally, the classic Levinson-Durbin algorithm for computing the linear prediction coefficients is listed together with its program. The analysis results obtained with Matlab linear prediction lay a foundation for a DSP implementation.
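
The window, autocorrelation and Levinson-Durbin steps named in this abstract can be sketched as below. The paper's own programs are in Matlab and are not reproduced here; this is an independent illustration in Python. The symmetric Hamming window is an assumption made for brevity and is not the analysis window specified by G.729, and the function names are hypothetical.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the (already windowed) frame x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1 and the inverse filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("autocorrelation sequence is not positive definite")
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err
```

As a check, the autocorrelation sequence `[1.0, 0.5, 0.25]` of a first-order process yields coefficients close to `[1, -0.5, 0]` and a prediction error of 0.75 for `order=2`. A frame would normally be multiplied sample by sample with the window before `autocorrelation` is called.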

Proceedings ArticleDOI
08 Dec 2009
TL;DR: A golden model of the codec that is well suited as a reference for its hardware implementation is developed using Simulink bit-true modeling of the conjugate structure-algebraic CELP (CS-ACELP) speech coder.
Abstract: In this work, we present Simulink bit-true modeling of the conjugate structure-algebraic CELP (CS-ACELP) speech coder, which has been chosen as the core layer of the speech coder standard ITU-T G.729.1. The optimum bit numbers of the computational blocks are defined as the minimum word-widths that maintain the quality of the output with minimum chip area and power. These optimum bit-widths of the coefficients and the internal computations are extracted. As a result, a golden model of the codec that is well suited as a reference for its hardware implementation is developed. The power and area improvements are estimated in two blocks of the CS-ACELP speech coder.

Proceedings ArticleDOI
30 Oct 2009
TL;DR: Experimental results show that the speech quality of the proposed vocoder is intelligible with good naturalness and the dynamic bit allocation scheme is developed to improve speech quality.
Abstract: A 450 bps speech coder based on multi-frame structure and multi-mode matrix quantization is presented. The multi-frame structure consisting of four frames is adopted to reduce the algorithm delay. The parameter matrices are classified into different modes based on the voicing vector information of the superframe. To improve speech quality, a dynamic bit allocation scheme is developed. Experimental results show that the speech quality of the proposed vocoder is intelligible with good naturalness.

01 Dec 2009
TL;DR: By incorporating basic hybrid video coding techniques into the encoder, the proposed residual prediction algorithm can significantly improve the coding efficiency.
Abstract: This paper proposes a novel residual prediction algorithm to improve the coding efficiency. It extends the scope of prediction to the residual data. The proposed algorithm is adopted for intra prediction, which accounts for a major portion of the bit stream. The simulation results show that the proposed algorithm can reduce the bit-rate by up to 434% compared with the JM reference software without any degradation of the peak signal to noise ratio. By incorporating basic hybrid video coding techniques into the encoder, the proposed residual prediction algorithm can significantly improve the coding efficiency.

Proceedings ArticleDOI
30 Oct 2009
TL;DR: Several high-level and low-level optimization techniques have been concurrently employed to reduce the MIPS count and complexity of the SPEEX decoder in order to achieve good speech quality on an embedded DSP receiver.
Abstract: SPEEX is a flexible codec based on the Code Excited Linear Prediction (CELP) algorithm and supports a wide range of speech quality. In this paper, several high-level and low-level optimization techniques have been concurrently employed to reduce the MIPS count and complexity of the SPEEX decoder in order to achieve good speech quality on an embedded DSP receiver. We reassign the decoder’s stack and the memory, reconstruct the C code and reduce the redundancies. According to the way CodeWarrior compiles, we optimize C code and the algorithm of decoding bits from the bit stream, and then the assembly code of key modules is rewritten by using the special DSP instructions. The results show that the complexity of SPEEX decoder has been decreased from 31.64 MIPS to 12.40 MIPS. It can be used in voice over IP (VoIP) applications.

Proceedings ArticleDOI
Yanyan Cao, Dayi Zhao, Lijuan Zou, Zhen Ma, Chunxia Liu
01 Dec 2009
TL;DR: An order-variable all-pole model, chosen according to the instability of track complexity, is proposed and applied in code-excited linear predictive speech coding to keep better speech quality while decreasing coding rates.
Abstract: On the basis of the all-pole model, this paper provides an order-variable all-pole model according to the instability of track complexity and applies this model in code-excited linear predictive speech coding. The method is simulated in Matlab and the quality of the synthesized speech is evaluated. The order-variable model is found to keep better speech quality while decreasing coding rates.

Proceedings ArticleDOI
01 Jul 2009
TL;DR: LPC is used to characterize the vocal tract, and an inverse filter is used to describe the vocal source, which then serves as the input for the coding.
Abstract: Speech coding has been and still is a major issue in digital speech processing: storing digital voice requires a fixed amount of available memory, and compression makes it possible to store longer messages. Several speech coding techniques exist, such as Linear Predictive Coding (LPC), waveform coding and subband coding. In this paper we apply LPC. It is used to characterize the vocal tract, and an inverse filter is used to describe the vocal source, which then serves as the input for the coding. The speech coder to be developed will be analyzed using both subjective and objective analysis. Subjective analysis will consist of listening to the encoded speech signal and making judgments on its quality. The quality of the played-back speech will be based solely on the opinion of the listener, who can rate the speech as impossible to understand, intelligible or natural sounding. Even though this is a valid measure of quality, an objective analysis will be introduced to technically assess the speech quality and to minimize human bias. The objective analysis will be performed by computing the Segmental Signal to Noise Ratio (SEGSNR) between the original and the coded speech signal.
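
The objective measure mentioned in this abstract, segmental SNR, averages per-frame SNR values in dB instead of computing one ratio over the whole utterance. The sketch below shows one common way to compute it. The frame length of 160 samples, the clamping of each frame to the range -10 dB to 35 dB and the skipping of silent frames are conventions assumed for this example; the abstract does not specify them.

```python
import math


def segmental_snr(reference, decoded, frame_len=160, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNR values (dB) between two aligned signals."""
    n = min(len(reference), len(decoded))
    frame_values = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(s * s for s in ref)
        noise_energy = sum((s - d) ** 2 for s, d in zip(ref, dec))
        if signal_energy == 0.0:
            continue  # skip silent frames (assumed convention)
        if noise_energy == 0.0:
            snr_db = ceil_db  # perfect reconstruction of this frame
        else:
            snr_db = 10.0 * math.log10(signal_energy / noise_energy)
        frame_values.append(max(floor_db, min(ceil_db, snr_db)))
    if not frame_values:
        raise ValueError("no non-silent frames to evaluate")
    return sum(frame_values) / len(frame_values)
```

For example, a constant reference of 1.0 decoded as 0.9 gives a per-frame energy ratio of 100, so the result is about 20 dB. Both signals must be time-aligned before the comparison, since any codec delay would otherwise be counted as noise.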

01 Oct 2009
TL;DR: In this article, a speech synthesizer that does not convert text to speech (TTS), but rather uses a touch-screen Thin Film Transistor (TFT) panel as user input to create and control voice-like audio sound synthesis is presented.
Abstract: Numerous voice compression methods are available today for communications over low bandwidth channels. Worthy of note in particular are Linear Predictive Coding (LPC), Mixed Excitation LPC (MELP), and Code Excited LPC (CELP). The channel in these coding schemes is typically a digital transmission line or radio link, such as in cellular telephone communications, but may be other media such as files on a computer hard disk. Linear Predictive Coding is explored in some detail as a basis for creating a new speech synthesizer that does not convert text to speech (TTS), but rather uses a touch-screen Thin Film Transistor (TFT) panel as user input to create and control voice-like audio sound synthesis. Research has been carried out to conceptually try different methods for mapping TFT touch panel input (or any 2-dimensional input) to LPC synthesis coefficient vectors for artificial speech reproduction. To achieve this, various LPC coefficient quantization algorithms have been explored and evaluated using Octave v.3 scripts, resulting in selection and comparison in the final hardware and software implementation. The hardware and software development platform used for the final implementation is the Altium Nanoboard 3000 Xilinx Edition, along with the Altium Designer EDA package. The Nanoboard 3000 was chosen as it provided a convenient FPGA platform and all the necessary IP, IP Synthesis, and C compilers needed to prototype the design and perform further research.

Journal ArticleDOI
Moo Young Kim
TL;DR: For efficient variable-rate speech coding, Karhunen-Loeve transform based adaptive entropy-constrained vector quantization (KLT-AECVQ) is proposed, which produces perceptual quality superior to KLT-based classified vector quantization, which in turn yielded better quality than a conventional code excited linear predictive (CELP) codec.
Abstract: For efficient variable-rate speech coding, Karhunen-Loeve transform based adaptive entropy-constrained vector quantization (KLT-AECVQ) is proposed. The proposed method consists of backward-adaptive linear predictive coding (LPC) analysis, KLT estimation based on LPC coefficients, and lattice vector quantization followed by Huffman coding according to KLT statistics. As different statistics in the original-signal domain can be mapped into identical statistics in the KLT domain, only a few classified Huffman codebooks are sufficient to represent KLT-domain source statistics. KLT-AECVQ with 32 Huffman codebooks has rate-distortion performance comparable to that of theoretically optimal AECVQ with an infinite number of Huffman codebooks. KLT-AECVQ also produces perceptual quality superior to KLT-based classified vector quantization (KLTCVQ), which in turn yielded better quality than a conventional code excited linear predictive (CELP) codec. Under five-sample delay constraints, KLT-AECVQ also has three times lower complexity than the CELP codec.

Proceedings Article
Amr Nabil, M. Hesham
14 Dec 2009
TL;DR: It was found that the coded pitch period did not suffer any significant change through the coding/decoding process and the spectral distortion is large compared to other studies and is largest for MELP1200.
Abstract: In this work, we present results on the effect of well-known mixed excitation linear prediction (MELP) and code-excited linear prediction (CELP) codecs (coder/decoder) on voicing and vocal tract parameters of Arabic sounds. The study shows that the spectral distortion is large compared to other studies and is largest for MELP1200. Vowel formants have a shift which may exceed one critical band below or above its reference value. Finally, it was found that the coded pitch period did not suffer any significant change through the coding/decoding process.