Showing papers on "Code-excited linear prediction" published in 2014


Journal ArticleDOI
TL;DR: Results show that the proposed coding scheme achieves an average Mean Opinion Score of 3.083 for the synthesized speech at an appropriate bit rate (4.2 kbps), a quality that exceeds that of Code Excited Linear Prediction (CELP).
Abstract: In this paper, we propose a novel speech coding scheme based on compressed sensing and sparse representation. Compressed sensing (CS) attracts great interest for its ability to recover original signals from a few measurements. The measurements preserve part of the speech features when projected by a row echelon matrix. A dictionary is learned in order to contain redundant information about speech measurements. The synthesized speech is recovered from a sparse approximation of the corresponding measurement. A rear low-pass filter is adopted to improve the subjective quality of the synthesized speech. Results show that the proposed coding scheme achieves an average Mean Opinion Score (MOS) of 3.083 for the synthesized speech at an appropriate bit rate (4.2 kbps), a quality that exceeds that of Code Excited Linear Prediction (CELP).

20 citations


Journal ArticleDOI
TL;DR: Two new methods are introduced to obtain intrinsically stable predictors with the 1-norm minimization, based on constraining the roots of the predictor to lie within the unit circle by reducing the numerical range of the shift operator associated with the particular prediction problem considered.
Abstract: In linear prediction of speech, the 1-norm error minimization criterion has been shown to provide a valid alternative to the 2-norm minimization criterion. However, unlike 2-norm minimization, 1-norm minimization does not guarantee the stability of the corresponding all-pole filter and can generate saturations when this is used to synthesize speech. In this paper, we introduce two new methods to obtain intrinsically stable predictors with the 1-norm minimization. The first method is based on constraining the roots of the predictor to lie within the unit circle by reducing the numerical range of the shift operator associated with the particular prediction problem considered. The second method uses the alternative Cauchy bound to impose a convex constraint on the predictor in the 1-norm error minimization. These methods are compared with two existing methods: the Burg method, based on the 1-norm minimization of the forward and backward prediction error, and the iteratively reweighted 2-norm minimization known to converge to the 1-norm minimization with an appropriate selection of weights. The evaluation gives proof of the effectiveness of the new methods, performing as well as unconstrained 1-norm based linear prediction for modeling and coding of speech.
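
The stability property that this abstract attributes to 2-norm minimization can be illustrated with a minimal sketch. It assumes the autocorrelation method solved by the Levinson-Durbin recursion, which is one standard 2-norm formulation; it is not the authors' 1-norm method, and the function names are hypothetical. In this formulation every reflection coefficient has magnitude below 1, which is equivalent to the all-pole synthesis filter being stable.

```python
def autocorr(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the 2-norm normal equations for a linear predictor.

    Returns (a, refl, err), where x_hat[n] = sum_k a[k-1] * x[n-k],
    refl holds the reflection coefficients, and err is the final
    prediction-error energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused padding
    refl = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[j] * r[m - j] for j in range(1, m))
        k = acc / err
        refl.append(k)
        new_a = a[:]
        new_a[m] = k
        for j in range(1, m):
            new_a[j] = a[j] - k * a[m - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], refl, err


def is_stable(refl):
    """All |k| < 1 means the synthesis filter poles lie inside the unit circle."""
    return all(abs(k) < 1.0 for k in refl)
```

For a decaying exponential frame such as `[0.9 ** n for n in range(200)]`, the first predictor coefficient comes out very close to 0.9 and `is_stable` returns `True`. A 1-norm solver gives no such guarantee, which is the gap the constrained methods in the abstract address.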

17 citations


Proceedings ArticleDOI
19 Apr 2014
TL;DR: The results showed that the modified design of the algorithm continues to offer the same level of security as the original Blowfish cipher with less computational overhead in key generation.
Abstract: This paper applies the high-quality speech coding algorithm G.729, which has been standardized by ITU-T at a low bit rate of 8 kb/s. This algorithm is based on a conjugate-structure algebraic CELP (CS-ACELP) coding technique with 10 ms speech frames. The output of the encoder is encrypted by the symmetric-key Blowfish algorithm, which has a 64-bit block size and a variable key length from 32 up to 448 bits. One of the main disadvantages of the Blowfish algorithm is the time required to initialize the algorithm with the key. This paper proposes a new method for generating the S-boxes and P-arrays, which are considered the main building elements of the Blowfish algorithm. This new generating method leads to a reduction in the time complexity of generating S-boxes and P-arrays. The proposed speech encryption system has been implemented using Matlab and the output is analyzed using the avalanche effect. The results showed that the modified design of the algorithm continues to offer the same level of security as the original Blowfish cipher with less computational overhead in key generation.
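
The figures in this abstract imply some simple arithmetic, sketched below. At 8 kb/s with 10 ms frames each encoded frame carries 80 bits, which does not divide evenly into 64-bit cipher blocks. How the paper actually packs frames into blocks is not stated, so the grouping function is an assumption, and the function names are hypothetical. The avalanche measure is the usual definition: the fraction of output bits that differ between two ciphertexts.

```python
from math import gcd

BIT_RATE_BPS = 8000   # G.729 bit rate given in the abstract
FRAME_MS = 10         # frame length given in the abstract
BLOCK_BITS = 64       # Blowfish block size given in the abstract


def bits_per_frame(bit_rate_bps, frame_ms):
    """Number of encoded bits produced for one speech frame."""
    return bit_rate_bps * frame_ms // 1000


def aligned_group(frame_bits, block_bits):
    """Smallest (frames, blocks) pair in which whole frames fill whole blocks.

    Assumption: frames are simply concatenated before encryption.
    """
    lcm = frame_bits * block_bits // gcd(frame_bits, block_bits)
    return lcm // frame_bits, lcm // block_bits


def avalanche(a: bytes, b: bytes) -> float:
    """Fraction of bit positions that differ between two equal-length outputs."""
    if len(a) != len(b):
        raise ValueError("outputs must have equal length")
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))
```

Under the concatenation assumption, four 80-bit frames fill exactly five 64-bit blocks. An avalanche value near 0.5 after flipping a single key or plaintext bit is the behaviour usually expected of a well-mixed block cipher.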

14 citations


13 Nov 2014
TL;DR: It is shown that this method performs the same as the weighting in the linear prediction coding domain but with lower complexity in a low-bit-rate situation.
Abstract: We have devised a direct and simple scheme for linear conversion of line spectrum pairs (LSP) with low computational complexity aiming at weighting or inverse weighting spectral envelopes for noise control in speech and audio coders. Using optimally prepared coefficients, we can perform the conversion directly in the LSP domain, which ensures low computational costs and also simplifies the check or the modification of unstable parameters. We show that this method performs the same as the weighting in the linear prediction coding domain but with lower complexity in a low-bit-rate situation. The devised method is therefore expected to be useful for low-bit-rate speech and audio coders for mobile communications.

7 citations


Proceedings ArticleDOI
08 May 2014
TL;DR: The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.
Abstract: This paper describes the analysis of the audio and speech output of the 16 kb/s Low Delay CELP algorithm (LD-CELP). For various speech file conversions, the SoX (Sound eXchange) tool is used. Praat software is used to extract details regarding the quality of the sound synthesized by the coder. The C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility. File compression ratio of 5 was achieved. The bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate. The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.

5 citations


01 Jan 2014
TL;DR: This paper focuses on the analysis and synthesis of the speech signal, drawing on the speech coding technique of Conjugate-Structure Algebraic Code Excited Linear Prediction and on the concept of vector quantization.
Abstract: This paper focuses on the analysis and synthesis of the speech signal, drawing on the speech coding technique of Conjugate-Structure Algebraic Code Excited Linear Prediction (CS-ACELP) and on the concept of vector quantization. The proposed technique analyses the variation in weighted speech caused by changes in the weight factor, both with and without vector quantization. The algorithm is based on the CELP coding technique, which works on the analysis-by-synthesis principle with a frame size of 80 samples (10 ms). Steganography is implemented in the LSB of the input cover speech, which contributes to the high-frequency component of the cover speech. The weighted speech signal is generated using both quantized and unquantized LP parameters and is tested on various grounds to observe and analyze its overall performance. Finally, the original and the weighted speech signals are compared both subjectively and objectively. The values of the two weight factors produce a considerable change in the weighted speech quality.
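
A minimal sketch of a perceptual weighting filter with two weight factors follows. It assumes the common CELP form W(z) = A(z/γ1) / A(z/γ2), where A(z) = 1 + Σ a_k z^-k is the LP analysis filter. The abstract does not spell out its filter, so this form, the function names, and the parameter names `gamma1` and `gamma2` are assumptions for illustration. The 80-sample, 10 ms frame from the abstract corresponds to an 8 kHz sampling rate.

```python
FRAME_SAMPLES = 80
FRAME_SECONDS = 0.010
SAMPLE_RATE_HZ = FRAME_SAMPLES / FRAME_SECONDS  # 8000.0


def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma), given a = [1, a_1, ..., a_p]."""
    return [coef * gamma ** k for k, coef in enumerate(a)]


def weighting_filter(x, a, gamma1, gamma2):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2), zero initial state."""
    num = bandwidth_expand(a, gamma1)
    den = bandwidth_expand(a, gamma2)
    y = []
    for n in range(len(x)):
        # Feed-forward part: A(z/gamma1) applied to the input.
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        # Feedback part: 1 / A(z/gamma2) applied through past outputs.
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y
```

When the two factors are equal the numerator and denominator cancel and the signal passes through unchanged; moving them apart changes how strongly the formant regions are de-emphasized, which is consistent with the abstract's observation that the two weight factors noticeably change the weighted speech.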

5 citations


Journal ArticleDOI
01 Jul 2014
TL;DR: The paper introduces the principle of an embedded speech coding algorithm with dual rates of 300 bps and 400 bps based on the enhanced mixed excitation linear prediction model, and the results show that this embedded ultra-low-bit-rate speech coding algorithm has satisfactory quality under both DRT and MOS tests.
Abstract: Ultra-low-bit-rate speech coding algorithms are in great demand in many fields, such as underwater speech communications. Underwater speech communication over medium to long distances is characterized by narrow bandwidth and a low transmission rate, which makes it much more difficult. Ultra-low-bit-rate speech coding plays an important role in this setting. Moreover, the underwater speech communication system is more flexible if the speech coding algorithm has an embedded structure. The paper introduces the principle of an embedded speech coding algorithm with dual rates of 300 bps and 400 bps based on the enhanced mixed excitation linear prediction model. The results show that this embedded ultra-low-bit-rate speech coding algorithm has satisfactory quality under both DRT and MOS tests.

4 citations


Journal ArticleDOI
TL;DR: The application of a Code Excited Linear Prediction source codec to speech followed by an AMR codec is studied, along with why AMR was proposed for GSM, how bit rates are reduced in AMR, the operation of AMR, and other applications of AMR.
Abstract: In wireless communication systems, limited bandwidth and power are the primary restrictions. Existing wireless systems for speech transmission call for efficient and effective methods to transmit and receive speech while maintaining its quality, especially at the receiving end. Speech coding has been a subject of research for the scientific and academic community since the start of the digital era. Among all elements of a communication system (transmitter, channel and receiver), the transmission channel is the most critical and plays a key role in the transmission and reception of information. The quality of speech at the receiver end is determined by the channel conditions. Modelling a channel is a multifarious task. A number of techniques are adopted to alleviate the effect of the channel. Adaptive Multi-Rate (AMR) is one of the techniques that neutralize the deleterious effect of the channel on speech. This technique uses a variable bit rate that dynamically switches to specific modes of operation depending on the channel conditions. For example, a low-bit-rate mode of operation is selected in adverse channel conditions, which leaves more error protection bits for channel coding, and vice versa. Therefore, in this paper, the application of a Code Excited Linear Prediction (CELP) source codec to speech followed by an AMR codec is studied. Further, the higher the bit rate used, the better the quality of speech. Apart from the speech codec, the paper also studies why AMR was proposed for GSM, how bit rates are reduced in AMR, the operation of AMR, and other applications of AMR.

4 citations


Proceedings ArticleDOI
16 Apr 2014
TL;DR: The implementation details of a Code Excited Linear Prediction speech coder at different bit rates and an analytical evaluation of its performance in terms of bit rate and quality using PRAAT software are discussed.
Abstract: Notable improvements have been made recently in coding speech with high quality at low bit rates and low delay. The need for low-bit-rate speech coding algorithms continues, driven by the ever-increasing number of users of wireless communication networks. This paper discusses the implementation details of a Code Excited Linear Prediction speech coder at different bit rates (16 kbps, 9.6 kbps, 7 kbps, 6.8 kbps, 4.9 kbps and 4.8 kbps) and an analytical evaluation of its performance in terms of bit rate and quality using PRAAT software.

4 citations


Proceedings ArticleDOI
04 May 2014
TL;DR: A new noise production and propagation model for open- and closed-loop linear predictive coding (LPC) is proposed that accurately predicts the overall SNR even at lower bit rates, where the conventional high-rate theory fails.
Abstract: A new noise production and propagation model for open- and closed-loop linear predictive coding (LPC) is proposed in this paper. The model accurately predicts the overall SNR even at lower bit rates, where the conventional high-rate theory fails. Moreover, a source of LPC encoder instabilities is pointed out, which is due to the interaction between the quantizer and the (filtered) feedback of the quantization error. The new model is verified by measurements.

3 citations


Journal ArticleDOI
TL;DR: The results show that MELP and CELP produce comparable quality while the quality of LD-CELP coder is much higher, at the expense of higher bit rate.
Abstract: Linear predictive coders form an important class of speech coders. This paper describes the software level implementation of linear prediction based vocoders, viz. Code Excited Linear Prediction (CELP), Low-Delay CELP (LD-CELP) and Mixed Excitation Linear Prediction (MELP) at bit rates of 4.8 kb/s, 16 kb/s and 2.4 kb/s respectively. The C programs of the vocoders have been compiled and executed in Linux platform. Subjective testing with the help of Mean Opinion Score test has been performed. Waveform analysis has been done using Praat and Adobe Audition software. The results show that MELP and CELP produce comparable quality while the quality of LD-CELP coder is much higher, at the expense of higher bit rate.

Proceedings ArticleDOI
17 Mar 2014
TL;DR: This study uses a data hiding technique to embed samples of the previously sent excitation frame in the excitation samples of the frame currently being processed, adding important information about the excitation for use in case of a loss without any additional transmission bandwidth, so as to avoid increasing packet congestion in the network.
Abstract: The effectiveness of Code Excited Linear Prediction speech codecs depends on a good reconstruction of the excitation, which is a mixture of the adaptive codebook and fixed codebook excitations. In this study we therefore focus on reconstructing the excitation of lost frames by using data hiding. The goal is to add important information about the excitation for use in case of a loss, without any additional transmission bandwidth, so as to avoid increasing packet congestion in the network. We have to find a good trade-off between the amount of embedded data and preserving the intelligibility of the speech after decoding as well as possible. In our study, we use a data hiding technique to embed samples of the previously sent excitation frame in the excitation samples of the frame currently being processed. The hidden data related to any frame can therefore be extracted once the frame that immediately follows it is received, subject to the delay between two successive frames. The test results show that this method is a promising direction for future work on frame loss recovery using data hiding.

Proceedings ArticleDOI
Yuhong Yang1, Dong Shaolong1, Ruimin Hu1, Wang Yanye1, Li Gao1, Maosheng Zhang1 
20 Aug 2014
TL;DR: Objective and subjective evaluation results for the proposed approach, in comparison with existing technique of AVS-P10, provide strong evidence for gains across a variety of speech and audio signals.
Abstract: This paper proposes an inter-frame correlation based error concealment approach for hybrid CELP (Code Excited Linear Prediction) and transform codecs with both good speech and audio quality at moderate bit rates. The proposed scheme is designed to overcome the main challenge posed by the diversified characteristics of input signals. The underlying idea is to employ the inter-frame correlation of previous neighboring frames to circumvent the pitfalls of referring to unrelated frames, and to enable effective prediction of the ISF (Immittance Spectral Frequencies) spectrum coefficients of missing frames from the immediate relative history using a linear regression approach. Objective and subjective evaluation results for the proposed approach, in comparison with the existing technique of AVS-P10 (Audio Video coding of China Standard Part 10 -- Mobile Speech and Audio Codec), provide strong evidence for gains across a variety of speech and audio signals.

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A new post-filtering method is described in which the bass frequency band and the gain are adaptively controlled frame-by-frame depending on the pitch frequency of the decoded signal to improve bass post-filter performance.
Abstract: Most speech codecs utilize a post-filter that emphasizes pitch structures to enhance perceptual quality at the decoder. In particular, the bass post-filter used in ITU-T G.718 performs an adaptive pitch enhancement technique for a lower fixed frequency band. This paper describes a new post-filtering method in which the bass frequency band and the gain are adaptively controlled frame-by-frame depending on the pitch frequency of the decoded signal to improve bass post-filter performance. We have confirmed the improvement of the speech quality with the developed method through objective and subjective evaluations.

Book ChapterDOI
01 Jan 2014
TL;DR: New results on the stability and sensitivity of LPC based on changes in speech input pitch length, sign bit, and LPC values during transmission (or for any other reason) consecutively and simultaneously are presented.
Abstract: The speech codec analyzes the speech using A(z) (the analysis filter) and synthesizes it back at the decoder side using linear prediction coefficients (LPC). These LP coefficients are sensitive and cannot be sent directly over a transmission channel. A small corruption in LPC values during transmission destroys the synthesized speech at the decoder side. We present new results on the stability and sensitivity of LPC based on changes in speech input pitch length, sign bit, and LPC values during transmission (or for any other reason), consecutively and simultaneously. The present analysis will help to add a varying dynamic range to LSF coding. For this, each individual LPC needs to be related to each LSF. All the speech inputs considered in this study are voiced speech, which has been separated manually. For a specific order, we analyzed which LPC are most responsible for the increase in prediction error at the decoder side when they are corrupted by noise. The present analysis provides a reference for the number of bits required for quantization of LPC or line spectral frequencies (LSF).

Journal ArticleDOI
TL;DR: An adaptive normalisation method is formulated as a preprocessor to the entropy coder to mitigate the poor coding performance in the random access frames and confirm the superiority of the proposed method over existing solutions in terms of coding efficiency performance.
Abstract: Linear prediction serves as a mathematical operation to estimate the future values of a discrete-time signal based on a linear function of previous samples. When applied to predictive coding of waveform such as speech and audio, a common issue that plagues compression performance is the non-stationary characteristics of prediction residuals around the starting point of the random access frames. This is because dependencies between prediction residuals and the historical waveform are interrupted to satisfy the random access requirement. In such cases, the dynamic range of the prediction residuals will fluctuate dramatically in such frames, leading to substantially poor coding performance in the subsequent entropy coder. In this study, the authors developed a solution to this long-standing issue by establishing a theoretical relationship between the energy envelope of linear prediction residuals in the random access frames and the prediction coefficients. Using the established relationship, an adaptive normalisation method is formulated as a preprocessor to the entropy coder to mitigate the poor coding performance in the random access frames. Simulation results confirm the superiority of the proposed method over existing solutions in terms of coding efficiency performance.
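
The random access problem described in this abstract can be illustrated with a minimal sketch. It computes the linear prediction residual e[n] = x[n] - Σ a_k x[n-k] for one frame, once with the true past samples and once with an all-zero history, which is how a frame that must be decodable on its own is modelled here. This is only an illustration of the issue; it is not the authors' normalisation method, and the function name is hypothetical.

```python
def lp_residual(frame, coeffs, history=None):
    """Prediction residual of `frame` for the predictor x_hat[n] = sum_k coeffs[k] * x[n-1-k].

    `history` holds the samples preceding the frame (at least len(coeffs)
    of them). Passing None models a random access frame, in which no past
    samples may be used and the history is taken as zero.
    """
    p = len(coeffs)
    past = [0.0] * p if history is None else list(history)[-p:]
    buf = past + list(frame)
    return [
        buf[n + p] - sum(coeffs[k] * buf[n + p - 1 - k] for k in range(p))
        for n in range(len(frame))
    ]
```

For a first-order signal such as x[n] = 0.9^n with coefficient 0.9, the residual is essentially zero everywhere when the history is available, but the first residual sample of a random access frame equals the full signal value. This jump in dynamic range at the frame start is what degrades the subsequent entropy coder.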

Proceedings ArticleDOI
10 Jul 2014
TL;DR: Results show that CELP which operates at a higher bit rate of 4.8 kb/s gives better quality output and the need for better codebook tuning in Indian dialects is highlighted.
Abstract: Speech coders based on the linear prediction model are widely in use today. This paper describes the algorithms of low bit-rate vocoders, viz. Code-Excited Linear Prediction (CELP) and Mixed Excitation Linear Prediction (MELP) and their performance for Indian dialects. A Linux platform has been used for execution of the vocoders. Mean Opinion Score testing has been performed with speech samples of Indian dialects and Indian-accented English. Waveform analysis has been done using Praat software. Results show that CELP which operates at a higher bit rate of 4.8 kb/s gives better quality output. MELP speech output is intelligible and suited for very low bit-rate (2.4 kb/s) applications. The results also highlight the need for better codebook tuning in Indian dialects.

Proceedings ArticleDOI
14 Apr 2014
TL;DR: With these proposals, the computational load is reduced as the number of participants increases, freeing resources to extend the capacity of the conference service; the reduction in complexity also decreases processing delay, which is important for real-time applications.
Abstract: This paper presents alternative approaches to selecting the mixed channels during teleconferencing involving CELP codecs. The proposals address the problems of complexity and delay that arise when classical solutions based on PCM samples are used. The principle consists of avoiding full speech decoding and extrapolating the speech audio level from CELP parameters before channel selection. Only the selected speakers' bit streams are completely decoded for mixing, which leads to very basic processing. With these proposals, the computational load is reduced as the number of participants increases, freeing resources to extend the capacity of the conference service. The reduction in complexity also decreases processing delay, which is important for real-time applications.

Patent
15 Jan 2014
TL;DR: In this paper, a codebook arrangement for use in coding an input sound signal comprises first and second codebook stages, where the first codebook stage includes one of a time-domain CELP codebook and a transform-domain codebook.
Abstract: A codebook arrangement for use in coding an input sound signal comprises first and second codebook stages. The first codebook stage includes one of a time-domain CELP codebook and a transform-domain codebook. The second codebook stage follows the first codebook stage and includes the other of the time-domain CELP codebook and the transform-domain codebook. A third codebook stage comprising an adaptive codebook may be provided before the first codebook stage. A selector may be provided to select an order of the time-domain CELP codebook and the transform-domain codebook in the first and second codebook stages, respectively, as a function of characteristics of the input sound signal. The selector may also be responsive to both the characteristics of the input sound signal and a bit rate of the codec using the codebook arrangement to bypass the second codebook stage. The codebook arrangement can be used in a coder of an input sound signal.

Patent
03 Sep 2014
TL;DR: In this paper, a codebook excited linear prediction encoder, a decoder, and methods for encoding and decoding are presented, where bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of the global gain values of the frames results in an adjustment of an output level of the decoded representation of the audio content.
Abstract: The invention provides a codebook excited linear prediction encoder, a decoder, and methods for encoding and decoding. In accordance with a first aspect of the present invention, bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of the global gain value of the frames results in an adjustment of an output level of the decoded representation of the audio content. In accordance with another aspect, a global gain control across CELP coded frames and transform coded frames is achieved by co-controlling the gain of the codebook excitation of the CELP codec, along with a level of the transform or inverse transform of the transform coded frames. According to yet another aspect, a variation of the loudness of a CELP coded bitstream upon changing the respective gain value is rendered better adapted to the behavior of transform coded level adjustments, by performing the gain value determination in CELP coding in the weighted domain of the excitation signal.

Proceedings ArticleDOI
01 Sep 2014
TL;DR: A novel scheme is proposed in which the speech coding module based on Algebraic Code Excited Linear Prediction (ACELP) is removed completely and speech waveforms are reconstructed from MFCCs in decoding, which greatly simplifies the structure of SVAC.
Abstract: In the audio encoder of Surveillance Video and Audio Coding (SVAC), both audio signals and Mel-frequency cepstral coefficients (MFCCs) are coded, and this leads to high computational complexity. This paper proposes a novel scheme for SVAC in which the speech coding module based on Algebraic Code Excited Linear Prediction (ACELP) is removed completely and speech waveforms are reconstructed from MFCCs in decoding. The novel scheme greatly simplifies the structure of SVAC and also performs well in quality evaluation of the decoded speech signals.

Book ChapterDOI
01 Jan 2014
TL;DR: This chapter extends the Pulse Code Modulation (PCM) scheme to DPCM, prepending the word “Differential,” as briefly introduced in Chap.
Abstract: In this chapter, compression of audio information is reviewed, with special consideration paid to speech compression. To begin with, we recall some of the issues covered in Chap. 6 on digital audio in multimedia. Here, this is combined with techniques that exploit the temporal redundancy present in audio signals. We extend the Pulse Code Modulation (PCM) scheme to DPCM, prepending the word “Differential,” as briefly introduced in Chap. 6 but fleshed out here. Specifically, in this chapter, we look at ADPCM, Vocoders, and more general Speech Compression: LPC, CELP, MBE, and MELP. Adaptive DPCM is ADPCM. In speech coding, a number of standards have evolved and we set these out here, including some of their fundamental strategies. We then go on to study coders (encoding/decoding algorithms) specifically aimed at speech compression. The properties of Vocoders are examined, including the notion of phase insensitivity, channels, and formants. Next, LPC (Linear Predictive Coding) vocoders are discussed, followed by CELP (Code Excited Linear Prediction), a more complex family of coders. Hybrid Excitation Vocoders are another large class of speech coders, and we round the discussion off by having a look at MBE (Multi-Band Excitation) and MELP (Mixed Excitation Linear Prediction) vocoders.
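
The step from PCM to DPCM mentioned in this abstract can be sketched as follows. This is a minimal first-order example that assumes the previous reconstructed sample is the predictor and that the difference is quantized uniformly with a fixed step; the chapter's own formulation may differ, and the function names are hypothetical.

```python
def dpcm_encode(samples, step):
    """Quantize the difference between each sample and the decoder's prediction."""
    codes = []
    prediction = 0
    for sample in samples:
        code = round((sample - prediction) / step)
        codes.append(code)
        # Track what the decoder will reconstruct so that quantization
        # errors do not accumulate from sample to sample.
        prediction += code * step
    return codes


def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized differences."""
    output = []
    prediction = 0
    for code in codes:
        prediction += code * step
        output.append(prediction)
    return output
```

With `step=1` and integer input the round trip is lossless, and the codes are small differences rather than full sample values, which is where the temporal redundancy is exploited. With a larger step the reconstruction error stays within half a step because the encoder predicts from reconstructed values. ADPCM additionally adapts the step size and predictor over time.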

01 Jan 2014
TL;DR: In this paper, the C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility, and the bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate.
Abstract: used to extract details regarding the quality of the sound synthesized by the coder. The C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility. File compression ratio of 5 was achieved. The bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate. The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.