Showing papers on "Code-excited linear prediction" published in 2014

Open Access

Journal Article•DOI•

Speech Coding Based on Compressed Sensing and Sparse Representation

Shang Jing Li¹, Qi Zhu¹•Institutions (1)

Nanjing University of Posts and Telecommunications¹

07 May 2014-Applied Mechanics and Materials

TL;DR: Results show that the proposed coding scheme achieves an average Mean Opinion Score of 3.083 for the synthesized speech at a bit rate of 4.2 kbps, which outperforms Code Excited Linear Prediction (CELP).

Abstract: In this paper, we propose a novel speech coding scheme based on compressed sensing and sparse representation. Compressed sensing (CS) attracts great interest for its ability to recover original signals from a few measurements. The measurements preserve part of the speech features when projected by a row echelon matrix. A dictionary is learned to contain redundant information about the speech measurements. The synthesized speech is recovered from a sparse approximation of the corresponding measurement. A rear low-pass filter is adopted to improve the subjective quality of the synthesized speech. Results show that the proposed coding scheme achieves an average Mean Opinion Score (MOS) of 3.083 for the synthesized speech at a bit rate of 4.2 kbps, which outperforms Code Excited Linear Prediction (CELP).

20 citations

Journal Article•DOI•

Stable 1-norm error minimization based linear predictors for speech modeling

Daniele Giacobello, Mads Græsbøll Christensen¹, Tobias Lindstrøm Jensen¹, Manohar N. Murthi², Søren Holdt Jensen¹, Marc Moonen³•Institutions (3)

Aalborg University¹, University of Miami², Katholieke Universiteit Leuven³

01 May 2014-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: Two new methods are introduced to obtain intrinsically stable predictors with the 1-norm minimization, based on constraining the roots of the predictor to lie within the unit circle by reducing the numerical range of the shift operator associated with the particular prediction problem considered.

Abstract: In linear prediction of speech, the 1-norm error minimization criterion has been shown to provide a valid alternative to the 2-norm minimization criterion. However, unlike 2-norm minimization, 1-norm minimization does not guarantee the stability of the corresponding all-pole filter and can generate saturations when this is used to synthesize speech. In this paper, we introduce two new methods to obtain intrinsically stable predictors with the 1-norm minimization. The first method is based on constraining the roots of the predictor to lie within the unit circle by reducing the numerical range of the shift operator associated with the particular prediction problem considered. The second method uses the alternative Cauchy bound to impose a convex constraint on the predictor in the 1-norm error minimization. These methods are compared with two existing methods: the Burg method, based on the 1-norm minimization of the forward and backward prediction error, and the iteratively reweighted 2-norm minimization known to converge to the 1-norm minimization with an appropriate selection of weights. The evaluation gives proof of the effectiveness of the new methods, performing as well as unconstrained 1-norm based linear prediction for modeling and coding of speech.

17 citations
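
The stability concern in the abstract above (the all-pole synthesis filter 1/A(z) is stable only if every root of A(z) lies inside the unit circle) can be checked without explicit root finding. The sketch below is not one of the methods proposed in the listed paper; it is the standard step-down (backward Levinson) test, which converts predictor coefficients to reflection coefficients and requires each to have magnitude below 1. The function names and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions made for illustration.

```python
def reflection_coefficients(a):
    """Step-down recursion.

    `a` holds [a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p
    (assumed sign convention). Returns the reflection coefficients found
    so far; the recursion stops early once one has magnitude >= 1.
    """
    coeffs = [float(c) for c in a]
    ks = []
    while coeffs:
        k = coeffs[-1]          # highest-order coefficient is k_m
        ks.append(k)
        if abs(k) >= 1.0:       # filter cannot be stable; stop here
            break
        m = len(coeffs)
        denom = 1.0 - k * k
        # Reduce the order-m polynomial to order m-1.
        coeffs = [(coeffs[i] - k * coeffs[m - 2 - i]) / denom
                  for i in range(m - 1)]
    return ks


def is_stable(a):
    """True if 1/A(z) has all poles strictly inside the unit circle."""
    return all(abs(k) < 1.0 for k in reflection_coefficients(a))


if __name__ == "__main__":
    # A(z) = (1 - 0.5 z^-1)(1 - 0.8 z^-1): both roots inside the unit circle.
    print(is_stable([-1.3, 0.4]))
    # A(z) = (1 - 1.2 z^-1)(1 - 0.5 z^-1): one root outside the unit circle.
    print(is_stable([-1.7, 0.6]))
```

A check of this kind is useful after an unconstrained 1-norm fit, since, as the abstract notes, that criterion does not guarantee stability on its own.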

Proceedings Article•DOI•

Speech encryption applying a modified Blowfish algorithm

Amaal A. Abd El-Sadek¹, Talaat A. Elgarf¹, Mohammed M. Fouad²•Institutions (2)

Higher Technological Institute¹, Zagazig University²

19 Apr 2014

TL;DR: The results showed that the modified design of the algorithm continues to offer the same level of security as the original Blowfish cipher with a less computational overhead in key generation.

Abstract: This paper applies the high-quality speech coding algorithm G.729, which has been standardized by ITU-T at a low bit rate of 8 kb/s. This algorithm is based on the conjugate-structure algebraic CELP (CS-ACELP) coding technique with 10 ms speech frames. The output of the encoder is encrypted by the symmetric-key Blowfish algorithm, which has a 64-bit block size and a variable key length from 32 up to 448 bits. One of the main disadvantages of the Blowfish algorithm is the time required to initialize the algorithm with the key. This paper proposes a new method for generating the S-boxes and P-arrays, which are the main building elements of the Blowfish algorithm. This new generating method reduces the time complexity of generating the S-boxes and P-arrays. The proposed speech encryption system has been implemented in Matlab and the output is analyzed using the avalanche effect. The results show that the modified design of the algorithm continues to offer the same level of security as the original Blowfish cipher with less computational overhead in key generation.

14 citations

Direct linear conversion of LSP parameters for perceptual control in speech and audio coding

Sugiura Ryosuke¹, Yutaka Kamamoto², Noboru Harada², Hirokazu Kameoka², Takehiro Moriya²•Institutions (2)

University of Tokyo¹, Nippon Telegraph and Telephone²

13 Nov 2014

TL;DR: It is shown that this method performs the same as the weighting in the linear prediction coding domain but with lower complexity in a low-bit-rate situation.

Abstract: We have devised a direct and simple scheme for linear conversion of line spectrum pairs (LSP) with low computational complexity aiming at weighting or inverse weighting spectral envelopes for noise control in speech and audio coders. Using optimally prepared coefficients, we can perform the conversion directly in the LSP domain, which ensures low computational costs and also simplifies the check or the modification of unstable parameters. We show that this method performs the same as the weighting in the linear prediction coding domain but with lower complexity in a low-bit-rate situation. The devised method is therefore expected to be useful for low-bit-rate speech and audio coders for mobile communications.

7 citations

Proceedings Article•DOI•

Analysis of LD-CELP coder output with Sound eXchange and Praat software

Lani Rachel Mathew, Ancy S. Anselam, Sakuntala S. Pillai

08 May 2014

TL;DR: The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.

Abstract: This paper describes the analysis of the audio and speech output of the 16 kb/s Low Delay CELP algorithm (LD-CELP). For various speech file conversions, the SoX (Sound eXchange) tool is used. Praat software is used to extract details regarding the quality of the sound synthesized by the coder. The C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility. File compression ratio of 5 was achieved. The bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate. The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.

5 citations

Steganography Approach of Weighted Speech Analysis with and without Vector Quantization using Variation in Weight Factor

Nikita Atul Malhotra, Nikunj Tahilramani

01 Jan 2014

TL;DR: This Paper basically focuses on the analysis and synthesis of the speech signal which takes reference from the speech coding technique of Conjugate-Structure Algebraic Code Excited Linear Prediction and also the concept of vector quantization.

Abstract: This Paper basically focuses on the analysis and synthesis of the speech signal which takes reference from the speech coding technique of Conjugate-Structure Algebraic Code Excited Linear Prediction (CS-ACELP) and also the concept of vector quantization. The proposed technique analyses the variation in weighted speech by the changes in the weight factor and also by implementing it with and without vector quantization. This algorithm is based on CELP coding technique which works on analysis by synthesis principle with the frame size of 80 samples (10 ms). Steganography is implemented in the LSB of the input cover speech which contributes to the high frequency component of the cover speech. The weighted speech signal is generated using both quantized and unquantized LP parameters, which is being tested on various grounds to basically observe and analyze its overall performance. At the end an analytic comparison between the original and the weighted speech signal both subjectively as well as objectively is done. The values of the two weight factors produce a considerable change in the weighted speech quality.

5 citations
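
Two details in the abstract above lend themselves to a short illustration: the 80-sample, 10 ms frame (which implies an 8 kHz sampling rate) and embedding in the least significant bit (LSB) of the cover speech. The sketch below is a generic LSB scheme on integer PCM samples, not the authors' implementation; `FRAME_SIZE`, `SAMPLE_RATE_HZ`, `split_frames`, `embed_lsb` and `extract_lsb` are hypothetical names chosen for this example.

```python
FRAME_SIZE = 80          # samples per frame, as stated in the abstract
SAMPLE_RATE_HZ = 8000    # assumption implied by 80 samples per 10 ms


def split_frames(samples, frame_size=FRAME_SIZE):
    """Split samples into complete frames; a trailing partial frame is dropped."""
    count = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(count)]


def embed_lsb(samples, bits):
    """Return a copy of `samples` with `bits` written into the LSB of the
    first len(bits) samples. Samples are plain Python ints (e.g. 16-bit PCM)."""
    if len(bits) > len(samples):
        raise ValueError("not enough cover samples for the payload")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the LSB, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | (bit & 1)
    return stego


def extract_lsb(samples, n_bits):
    """Read back the first `n_bits` payload bits from the sample LSBs."""
    return [s & 1 for s in samples[:n_bits]]


if __name__ == "__main__":
    cover = [100, -3, 0, 7, -8, 15]
    payload = [1, 0, 1, 1]
    stego = embed_lsb(cover, payload)
    print(extract_lsb(stego, len(payload)) == payload)
```

Each sample changes by at most one quantization step, which is why LSB embedding is usually described as affecting only low-amplitude, noise-like detail in the cover signal.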

Journal Article•DOI•

Research on an Embedded Ultra-Low-Bit-Rate Speech Coding Algorithm

Ye Li, Yan Hong Fan¹, Fei Yuan¹, Xiaomei Xu¹•Institutions (1)

Xiamen University¹

01 Jul 2014

TL;DR: The paper introduces the principle of an embedded speech coding algorithm with dual rates of 300 bps and 400 bps based on the enhanced mixed excitation linear prediction model, and the results show that this embedded ultra-low-bit-rate speech coding algorithm has satisfactory quality under both DRT and MOS tests.

Abstract: Ultra-low-bit-rate speech coding algorithms are in great demand in many fields such as underwater speech communications. Underwater speech communication over middle to long distances is characterized by narrow bandwidth and a low transmission rate, which makes it much more difficult. Ultra-low-bit-rate speech coding algorithms play an important role in this setting. Moreover, the underwater speech communication system becomes more flexible if the speech coding algorithm has an embedded structure. The paper introduces the principle of an embedded speech coding algorithm with dual rates of 300 bps and 400 bps based on the enhanced mixed excitation linear prediction model. The results show that this embedded ultra-low-bit-rate speech coding algorithm has satisfactory quality under both DRT and MOS tests.

4 citations

Journal Article•DOI•

Study and Performance of AMR Codecs for GSM

Divya Choudhary, Abhinav Kumar

30 Oct 2014-International Journal of Advanced Research in Computer and Communication Engineering

TL;DR: The application of a Code Excited Linear Prediction source codec on speech followed by an AMR codec is studied, along with why AMR is proposed for GSM, how the bit rates are reduced in AMR, the operation of AMR, and other applications of AMR.

Abstract: In wireless communication systems, limited bandwidth and power are the primary restrictions. Existing wireless systems for speech transmission require efficient and effective methods to transmit and receive speech while maintaining its quality, especially at the receiving end. Speech coding has been a subject of research for the scientific and academic community since the era of digitization. Among all elements of a communication system (transmitter, channel and receiver), the transmission channel is the most critical and plays a key role in the transmission and reception of information. The quality of speech at the receiver end is determined by channel conditions. Modelling a channel is a multifarious task. A number of techniques are adopted to alleviate the effect of the channel. Adaptive Multi Rate (AMR) is one of the techniques that neutralize the deleterious effect of the channel on speech. This technique uses a variable bit rate that dynamically switches to specific modes of operation depending upon the channel conditions. For example, a low-bit-rate mode of operation is selected in adverse channel conditions, which leaves more error protection bits for channel coding, and vice versa. Therefore, in this paper, the application of a Code Excited Linear Prediction (CELP) source codec on speech followed by an AMR codec is studied. Further, the higher the bit rate used, the better the quality of speech. Apart from the speech codec, the paper also studies why AMR is proposed for GSM, how the bit rates are reduced in AMR, the operation of AMR, and other applications of AMR.

4 citations

Proceedings Article•DOI•

Performance evaluation of Code Excited Linear Prediction Speech Coders at various bit rates

Ancy S. Anselam, Sakuntala S. Pillai

16 Apr 2014

TL;DR: The implementation details of Code Excited Linear Prediction Speech Coder at different bit rates and analytical evaluation of performance in terms of bit rate and quality using PRAAT software are discussed.

Abstract: Attractive improvements have been made during these days in coding speech with high quality at low bit rates and low delay. The need for low bit rate speech coding algorithms continues, supported by the ever increasing number of users to the wireless communication networks. This paper discusses the implementation details of Code Excited Linear Prediction Speech Coder at different bit rates (16Kbps, 9.6Kbps, 7Kbps, 6.8Kbps, 4.9Kbps and 4.8Kbps) and analytical evaluation of performance in terms of bit rate and quality using PRAAT software.

4 citations

Proceedings Article•DOI•

On noise propagation in closed-loop linear predictive coding

Hauke Krüger¹, Bernd Geiser¹, Peter Vary¹•Institutions (1)

RWTH Aachen University¹

04 May 2014

TL;DR: A new noise production and propagation model for open- and closed loop linear predictive coding (LPC) is proposed and allows to accurately predict the overall SNR even at lower bit rates where the conventional high rate theory fails.

Abstract: A new noise production and propagation model for open- and closed loop linear predictive coding (LPC) is proposed in this paper. The model allows to accurately predict the overall SNR even at lower bit rates where the conventional high rate theory fails. Moreover, a source of LPC encoder instabilities is pointed out which is due to the interaction between the quantizer and the (filtered) feedback of the quantization error. The new model is verified by measurements.

3 citations

Journal Article•DOI•

Performance Comparison of Linear Prediction based Vocoders in Linux Platform

Lani Rachel Mathew, Ancy S. Anselam, Sakuntala S. Pillai

25 Jun 2014-arXiv: Multimedia

TL;DR: The results show that MELP and CELP produce comparable quality while the quality of LD-CELP coder is much higher, at the expense of higher bit rate.

Abstract: Linear predictive coders form an important class of speech coders. This paper describes the software level implementation of linear prediction based vocoders, viz. Code Excited Linear Prediction (CELP), Low-Delay CELP (LD-CELP) and Mixed Excitation Linear Prediction (MELP) at bit rates of 4.8 kb/s, 16 kb/s and 2.4 kb/s respectively. The C programs of the vocoders have been compiled and executed in Linux platform. Subjective testing with the help of Mean Opinion Score test has been performed. Waveform analysis has been done using Praat and Adobe Audition software. The results show that MELP and CELP produce comparable quality while the quality of LD-CELP coder is much higher, at the expense of higher bit rate.

Proceedings Article•DOI•

Improved packets loss concealment in speech coding by data hiding

Nadir Benamirouche¹, Bachir Boudraa, Tetsuya Shimamura², Pranab Kumar Dhar²•Institutions (2)

University of Béjaïa¹, Saitama University²

17 Mar 2014

TL;DR: This study appeals to data hiding technique to hide samples about previous excitation sent frame in the recent excitation samples frame in process to add important information about excitation for use in case of an eventual loss, but without any overload of the transmission bandwidth to avoid increasing the packets congestion in the network.

Abstract: Effectiveness of Code Excited Linear Prediction speech codecs dwells in a good reconstruction of excitation related to the mixture of both: adaptive codebook and fixed codebook excitations. Therefore, in this study we focused our attention on the lost frame excitation reconstruction by using data hiding. Hence, the goal behind this idea is to add important information about excitation for use in case of an eventual loss, but without any overload of the transmission bandwidth to avoid increasing the packets congestion in the network. Indeed, we have to find a good trade-off between the embedded data and keeping the intelligibility of speech after decoding as well as possible. In our study, we appeal to data hiding technique to hide samples about previous excitation sent frame in the recent excitation samples frame in process. Therefore, the hidden data related to any frame can be extracted once the following frame, coming just after, will be received; and this with respect to the considered delay between two successive frames. The test results show that this method is a promising area for future works on frame loss recovery using data hiding.

Proceedings Article•DOI•

An Inter-frame Correlation Based Error Concealment of Immittance Spectral Coefficients for Mobile Speech and Audio Codecs

Yuhong Yang¹, Dong Shaolong¹, Ruimin Hu¹, Wang Yanye¹, Li Gao¹, Maosheng Zhang¹•Institutions (1)

Wuhan University¹

20 Aug 2014

TL;DR: Objective and subjective evaluation results for the proposed approach, in comparison with existing technique of AVS-P10, provide strong evidence for gains across a variety of speech and audio signals.

Abstract: This paper proposes an inter-frame correlation based error concealment approach for hybrid CELP (Code Excited Linear Prediction) and transform codecs with both good speech and audio quality at moderate bit rates. The proposed scheme is designed to overcome the main challenge due to the diversified characteristics of input signals. The underlying idea is to employ the inter-frame correlation of previous neighborhood frames to circumvent the pitfalls of referring to unrelated frames, and to enable effective prediction of the ISF (Immittance Spectral Frequencies) spectrum coefficients of missing frames from the immediate relative history using a linear regression approach. Objective and subjective evaluation results for the proposed approach, in comparison with the existing technique of AVS-P10 (Audio Video coding of China Standard Part 10 -- Mobile Speech and Audio Codec), provide strong evidence for gains across a variety of speech and audio signals.

Proceedings Article•DOI•

Adaptive post-filtering controlled by pitch frequency for CELP-based speech coder

Hironobu Chiba¹, Yutaka Kamamoto², Takehiro Moriya², Noboru Harada², Shigeki Miyabe¹, Takeshi Yamada¹, Shoji Makino¹•Institutions (2)

University of Tsukuba¹, Nippon Telegraph and Telephone²

01 Nov 2014

TL;DR: A new post-filtering method in which the bass the frequency band and the gain are adaptively controlled frame-by-frame depending on the pitch frequency of decoded signal to improve bass post-filter performance is described.

Abstract: Most speech codecs utilize a post-filter that emphasizes pitch structures to enhance perceptual quality at the decoder. Particularly, the bass post-filter used in ITU-T G.718 performs an adaptive pitch enhancement technique for a lower fixed frequency band. This paper describes a new post-filtering method in which the bass the frequency band and the gain are adaptively controlled frame-by-frame depending on the pitch frequency of decoded signal to improve bass post-filter performance. We have confirmed the improvement of the speech quality with the developed method through objective and subjective evaluations.

Book Chapter•DOI•

Stability Analysis of Speech Synthesis Filter of CELP-Based AMR-WB Codec

D. Jatin¹, T. S. Sheshadri¹, N. Ramesh, H. K. Muttanna¹•Institutions (1)

Indian Institute of Science¹

01 Jan 2014

TL;DR: New results on the stability and sensitivity of LPC based on changes in speech input pitch length, sign bit, and LPC values during transmission (or for any other reason) consecutively and simultaneously are presented.

Abstract: The speech codec analyzes the speech using A(z) (analysis filter) and synthesizes back at decoder side using linear prediction coefficients (LPC). These LP coefficients are sensitive and cannot be sent directly in a transmission channel. A small corruption in LPC values during transmission destroys the synthesized speech at the decoder side. We have presented new results on the stability and sensitivity of LPC based on changes in speech input pitch length, sign bit, and LPC values during transmission (or for any other reason) consecutively and simultaneously. Present analysis will help to add varying dynamic range to LSF coding. For this each individual LPC need to be related to each LSF. All the speech inputs considered in this study are voiced speech, which has been separated manually. For a specific order, we analyzed the numbers of LPC which are more responsible for increase in prediction error at decoder side when they are corrupted by noise. Present analysis provides the reference for number of bits required for quantization of LPC or line spectral pairs (LSF).

Journal Article•DOI•

Optimal normalisation of prediction residual for predictive coding with random access

Haiyan Shu, Rongshan Yu, Haibin Huang

01 Sep 2014-IET Signal Processing

TL;DR: An adaptive normalisation method is formulated as a preprocessor to the entropy coder to mitigate the poor coding performance in the random access frames and confirm the superiority of the proposed method over existing solutions in terms of coding efficiency performance.

Abstract: Linear prediction serves as a mathematical operation to estimate the future values of a discrete-time signal based on a linear function of previous samples. When applied to predictive coding of waveform such as speech and audio, a common issue that plagues compression performance is the non-stationary characteristics of prediction residuals around the starting point of the random access frames. This is because dependencies between prediction residuals and the historical waveform are interrupted to satisfy the random access requirement. In such cases, the dynamic range of the prediction residuals will fluctuate dramatically in such frames, leading to substantially poor coding performance in the subsequent entropy coder. In this study, the authors developed a solution to this long-standing issue by establishing a theoretical relationship between the energy envelope of linear prediction residuals in the random access frames and the prediction coefficients. Using the established relationship, an adaptive normalisation method is formulated as a preprocessor to the entropy coder to mitigate the poor coding performance in the random access frames. Simulation results confirm the superiority of the proposed method over existing solutions in terms of coding efficiency performance.
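
The opening definition in the abstract above (estimating a sample as a linear function of previous samples) can be made concrete with a minimal sketch. It uses the classical autocorrelation method with the Levinson-Durbin recursion, which is a standard textbook approach and not the normalisation method of the listed paper; the function names `autocorrelation`, `lpc` and `residual` are assumptions for illustration.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n - lag], for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion on the autocorrelation of `x`.

    Returns (c, err) where the prediction is
    x_hat[n] = c[0]*x[n-1] + ... + c[order-1]*x[n-order]
    and err is the final prediction error energy.
    """
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order      # A(z) = 1 + a1 z^-1 + ... (internal form)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:             # degenerate (e.g. all-zero) input
            break
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err             # reflection coefficient of order m
        updated = a[:]
        for i in range(1, m):
            updated[i] = a[i] + k * a[m - i]
        updated[m] = k
        a = updated
        err *= 1.0 - k * k
    return [-coef for coef in a[1:]], err


def residual(x, c):
    """Prediction residual e[n] = x[n] - x_hat[n] for n >= len(c)."""
    p = len(c)
    return [x[n] - sum(c[k] * x[n - 1 - k] for k in range(p))
            for n in range(p, len(x))]


if __name__ == "__main__":
    # A decaying exponential is predicted almost exactly by one coefficient.
    signal = [0.9 ** n for n in range(200)]
    coeffs, energy = lpc(signal, 1)
    print(coeffs, max(abs(e) for e in residual(signal, coeffs)))
```

Note that `residual` skips the first `len(c)` samples because they have no complete history; this is the same situation the abstract describes at the start of a random access frame, where the residual cannot rely on earlier samples.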

Proceedings Article•DOI•

A study of low bit-rate linear prediction based speech coders for Indian dialects

Lani Rachel Mathew, Ancy S. Anselam, Sakuntala S. Pillai

10 Jul 2014

TL;DR: Results show that CELP which operates at a higher bit rate of 4.8 kb/s gives better quality output and the need for better codebook tuning in Indian dialects is highlighted.

Abstract: Speech coders based on the linear prediction model are widely in use today. This paper describes the algorithms of low bit-rate vocoders, viz. Code-Excited Linear Prediction (CELP) and Mixed Excitation Linear Prediction (MELP) and their performance for Indian dialects. A Linux platform has been used for execution of the vocoders. Mean Opinion Score testing has been performed with speech samples of Indian dialects and Indian-accented English. Waveform analysis has been done using Praat software. Results show that CELP which operates at a higher bit rate of 4.8 kb/s gives better quality output. MELP speech output is intelligible and suited for very low bit-rate (2.4 kb/s) applications. The results also highlight the need for better codebook tuning in Indian dialects.

Proceedings Article•DOI•

Selection of active speaker(s) in VoIP conference bridges: From linear domain to CELP parameters domain

Emmanuel Rossignol Thepie Fapi, Eric Poulin

14 Apr 2014

TL;DR: With these proposals, the computational load is reduced as the number of participants increases, enabling more resources to extend the capacity of the conference service and reduction of complexity also leads to processing delay decreases which is important for real time applications.

Abstract: This paper presents alternative approaches to select the mixed channels during teleconferencing involving CELP CoDecs. The proposals address the problems related to complexity and delay when classical solutions based on PCM samples are used. The principle consists of avoiding total speech decoding and to extrapolate the speech audio level based on CELP parameters, before channels selection. Only the selected speaker's bit streams are completely decoded for mixing and thus leading to very basic processing. With these proposals, the computational load is reduced as the number of participants increases, enabling more resources to extend the capacity of the conference service. Reduction of complexity also leads to processing delay decreases which is important for real time applications.

Journal Article•DOI•

Performance Comparison of Linear Prediction based Vocoders in Linux Platform

Lani Rachel Mathew, Ancy S. Anselam, Sakuntala S. Pillai

25 Apr 2014-International Journal of Engineering Trends and Technology

TL;DR: In this paper, the authors describe the software level implementation of linear prediction based vocoders, viz. Code Excited Linear Prediction (CELP), Low-Delay CELP, and Mixed Excitation Linear Prediction at bit rates of 4.8, 16, and 2.4 kb/s respectively.

Patent•

Code excited linear prediction coder and transform-domain codebook in decoder

Eksler Vaclav

15 Jan 2014

TL;DR: In this paper, a codebook arrangement for use in coding an input sound signal comprises first and second codebook stages, where the first codebook stage includes one of a time-domain CELP codebook and a transform-domain codebook.

Abstract: A codebook arrangement for use in coding an input sound signal comprises first and second codebook stages. The first codebook stage includes one of a time-domain CELP codebook and a transform-domain codebook. The second codebook stage follows the first codebook stage and includes the other of the time-domain CELP codebook and the transform-domain codebook. A third codebook stage comprising an adaptive codebook may be provided before the first codebook stage. A selector may be provided to select an order of the time-domain CELP codebook and the transform-domain codebook in the first and second codebook stages, respectively, as a function of characteristics of the input sound signal. The selector may also be responsive to both the characteristics of the input sound signal and a bit rate of the codec using the codebook arrangement to bypass the second codebook stage. The codebook arrangement can be used in a coder of an input sound signal.

Patent•

Codebook excited linear prediction encoder, decoder, and methods for encoding and decoding

Geiger Ralf, Guillaume Fuchs, Markus Multrus, Bernhard Grill

03 Sep 2014

TL;DR: In this paper, a codebook excited linear prediction encoder, a decoder, and methods for encoding and decoding are presented, where bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of the global gain values of the frames results in an adjustment of an output level of the decoded representation of the audio content.

Abstract: The invention provides a codebook excited linear prediction encoder, a decoder, and methods for encoding and decoding. In accordance with a first aspect of the present invention, bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of the global gain value of the frames results in an adjustment of an output level of the decoded representation of the audio content. In accordance with another aspect, a global gain control across CELP coded frames and transform coded frames is achieved by co-controlling the gain of the codebook excitation of the CELP codec, along with a level of the transform or inverse transform of the transform coded frames. According to yet another aspect, a variation of the loudness of a CELP coded bitstream upon changing the respective gain value is better adapted to the behavior of transform coded level adjustments, by performing the gain value determination in CELP coding in the weighted domain of the excitation signal.

Proceedings Article•DOI•

A novel scheme for SVAC audio encoder

Ruo Shu¹, Shuyan Ding¹, Shibao Li¹, Jianhang Liu¹•Institutions (1)

China University of Petroleum¹

01 Sep 2014

TL;DR: A novel scheme is proposed in which speech coding module based on Algebraic Code Excited Linear Prediction (ACELP) is removed completely and speech waveforms can be reconstructed from MFCCs in decoding and this greatly simplifies the structure of SVAC.

Abstract: In the audio encoder of Surveillance Video and Audio Coding (SVAC), both audio signals and MEL-frequency cepstral coefficients (MFCCs) are coded and this leads to high computational complexity. This paper proposes a novel scheme for SVAC in which speech coding module based on Algebraic Code Excited Linear Prediction (ACELP) is removed completely and speech waveforms can be reconstructed from MFCCs in decoding. The novel scheme greatly simplifies the structure of SVAC and also has a high performance for decoded speech signals in quality evaluation.

Book Chapter•DOI•

Basic Audio Compression Techniques

Ze-Nian Li¹, Mark S. Drew¹, Jiangchuan Liu¹•Institutions (1)

Simon Fraser University¹

01 Jan 2014

TL;DR: This chapter extends the Pulse Code Modulation (PCM) scheme to DPCM, prepending the word “Differential,” as briefly introduced in Chap.

Abstract: In this chapter, compression of audio information is reviewed, with special consideration paid to speech compression. To begin with, we recall some of the issues covered in Chap. 6 on digital audio in multimedia. Here, this is combined with techniques that exploit the temporal redundancy present in audio signals. We extend the Pulse Code Modulation (PCM) scheme to DPCM, prepending the word “Differential,” as briefly introduced in Chap. 6 but fleshed out here. Specifically, in this chapter, we look at ADPCM, Vocoders, and more general Speech Compression: LPC, CELP, MBE, and MELP. Adaptive DPCM is ADPCM. In speech coding, a number of standards have evolved and we set these out here, including some of their fundamental strategies. We then go on to study coders (encoding/decoding algorithms) specifically aimed at speech compression. The properties of Vocoders are examined, including the notion of phase insensitivity, channels, and formants. Next, LPC (Linear Predictive Coding) vocoders are discussed, followed by CELP (Code Excited Linear Prediction), a more complex family of coders. Hybrid Excitation Vocoders are another large class of speech coders, and we round the discussion off by having a look at MBE (Multi-Band Excitation) and MELP (Mixed Excitation Linear Prediction) vocoders.

Analysis of LD-CELP coder output with Sound eXchange and Praat software

Lani Rachel Mathew, Ancy S. Anselam, Sakuntala S. Pillai

01 Jan 2014

TL;DR: In this paper, the C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility, and the bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate.

Abstract: This paper describes the analysis of the audio and speech output of the 16 kb/s Low Delay CELP algorithm (LD-CELP). For various speech file conversions, the SoX (Sound eXchange) tool is used. Praat software is used to extract details regarding the quality of the sound synthesized by the coder. The C program of the coder was compiled using the GCC compiler in the Linux environment and executed using the GNU Make utility. File compression ratio of 5 was achieved. The bit rate obtained is 16 kb/s, i.e. a reduction by 4 times of the original bitrate. The perceptual quality of the reconstructed audio output is very good, and this is evident from the spectrogram, pitch, intensity and formant waveforms obtained.