Showing papers on "Code-excited linear prediction" published in 2015

Proceedings Article

Arithmetic coding of speech and audio spectra using TCX based on linear predictive spectral envelopes

Tom Bäckström¹, Christian Helmrich¹•Institutions (1)

19 Apr 2015

TL;DR: Subjective measurements show that the proposed methods give a statistically significant improvement in perceptual quality when the bit-rate is held constant, and the proposed method has been adopted to the 3GPP Enhanced Voice Services speech coding standard.

Abstract: Unified speech and audio codecs often use a frequency domain coding technique of the transform coded excitation (TCX) type. It is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictor to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side info. Subjective measurements show that the proposed methods give a statistically significant improvement in perceptual quality when the bit-rate is held constant. Consequently, the proposed method has been adopted to the 3GPP Enhanced Voice Services speech coding standard.
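The idea of deriving the entropy coder's probability model from the linear predictor can be illustrated with a minimal sketch. It is not the authors' implementation: the Laplacian model per bin, the scale factor, and the function names `lp_envelope`, `laplace_cdf` and `ideal_bits` are assumptions made here for illustration. The sketch evaluates the all-pole envelope on a frequency grid and uses it as a per-bin scale when estimating the ideal code length of quantized spectral values.

```python
import cmath
import math


def lp_envelope(lpc, num_bins):
    """Magnitude |1/A(e^{jw})| of an all-pole model at num_bins frequencies in (0, pi).

    lpc holds the predictor polynomial A(z) = sum_n lpc[n] * z^-n with lpc[0] = 1.
    """
    env = []
    for k in range(num_bins):
        w = math.pi * (k + 0.5) / num_bins
        a = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(lpc))
        env.append(1.0 / abs(a))
    return env


def laplace_cdf(x, scale):
    """CDF of a zero-mean Laplacian distribution (an assumed model, not from the paper)."""
    if x < 0:
        return 0.5 * math.exp(x / scale)
    return 1.0 - 0.5 * math.exp(-x / scale)


def ideal_bits(quantized, scales):
    """Ideal code length in bits of integer spectral values, one Laplacian scale per bin."""
    total = 0.0
    for q, s in zip(quantized, scales):
        p = laplace_cdf(q + 0.5, s) - laplace_cdf(q - 0.5, s)
        total += -math.log2(p)
    return total
```

Because the decoder receives the same predictor coefficients, it could rebuild the same per-bin scales without side information, which is the property the abstract highlights. Comparing `ideal_bits` with envelope-shaped scales against a flat scale gives a rough feel for the potential saving on spectra that follow the envelope.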

23 citations

Journal Article

Optimal coding of generalized-Gaussian-distributed frequency spectra for low-delay audio coder with powered all-pole spectrum estimation

Sugiura Ryosuke¹, Yutaka Kamamoto¹, Noboru Harada¹, Hirokazu Kameoka¹, Takehiro Moriya¹•Institutions (1)

Nippon Telegraph and Telephone¹

01 Aug 2015-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: Experimental results show that incorporating the coding scheme in a state-of-the-art wide-band audio coder enhances its objective and subjective quality in a low-bit-rate and low-delay situation by increasing the compression efficiency.

Abstract: We present an optimal coding scheme that parameterizes the maximum-likelihood estimate of variance for frequency spectra belonging to the generalized Gaussian distribution, the distribution covering the Laplacian and the Gaussian. By slightly modifying the all-pole model of the conventional linear prediction (LP), we can estimate the variance with the same method as in LP, which has low computational costs. Experimental results show that incorporating the coding scheme in a state-of-the-art wide-band audio coder enhances its objective and subjective quality in a low-bit-rate and low-delay situation by increasing the compression efficiency. Thus, this coding scheme will be useful in applications like mobile communications, which requires highly efficient compression.

13 citations

Proceedings Article

A robust speech/music discriminator for switched audio coding

Guillaume Fuchs

28 Dec 2015

TL;DR: Objective measures show that a more reliable switching decision is achievable and a reliable speech and music discriminator (SMD) for such an application is designed.

Abstract: Switching between speech coding and generic audio coding schemes was recently proven to be very efficient for coding a large range of audio materials at low bit-rates. However, it strongly relies on a robust classification of the input signal. The aim of the paper is to design a reliable speech and music discriminator (SMD) for such an application. Main attention was laid on getting a good tradeoff between accuracy, reactivity and stability of the decision while keeping the delay and complexity reasonably low. To this end, short-term and long-term features are dissociated before being conveyed to two different classifiers. The two classifier outputs are combined in a final decision using a hysteresis. Objective measures show that a more reliable switching decision is achievable. The SMD was successfully implemented in MPEG Unified Speech and Audio Coding (USAC). It allows the codec to show unprecedented audio quality.
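The final decision stage described above, combining two classifier outputs with a hysteresis, might look roughly like the following sketch. The equal-weight combination, the thresholds `on` and `off`, the initial state, and the function name `smd_decisions` are all hypothetical; the abstract does not specify them.

```python
def smd_decisions(short_term_scores, long_term_scores, on=0.6, off=0.4):
    """Return one 'speech' or 'music' label per frame.

    Both score lists are assumed to hold speech probabilities in [0, 1].
    The state only flips when the combined score crosses the far threshold,
    which suppresses rapid toggling around the decision boundary.
    """
    state = "music"  # assumed initial state
    labels = []
    for s, l in zip(short_term_scores, long_term_scores):
        combined = 0.5 * (s + l)  # assumed equal weighting of the two classifiers
        if state == "music" and combined > on:
            state = "speech"
        elif state == "speech" and combined < off:
            state = "music"
        labels.append(state)
    return labels
```

With scores that hover between the two thresholds, the label stays at whatever was decided last, which is one way to trade reactivity for stability of the switching decision.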

12 citations

Proceedings Article

Speech coding and enhancement using quantized compressive sensing measurements

Vinitha Ramdas, Deepak Mishra, Subrahmanyam Gorthi

01 Feb 2015

TL;DR: This work characterizes a speech codec in a Compressive Sensing (CS) framework, demonstrating simultaneous compression and de-noising of speech by CS and appropriate quantization of CS measurements to design a medium bit-rate codec.

Abstract: Medium bit rate hybrid speech coding schemes have gained much interest in recent years, and many of them have been standardized for various applications. This work characterizes a speech codec in a Compressive Sensing (CS) framework. We mainly demonstrate two aspects: 1) simultaneous compression and de-noising of speech by CS, and 2) appropriate quantization of CS measurements to design a medium bit-rate codec. The proposed scheme renders better quality speech compared to CELP, the widely used hybrid coding scheme, at the same bit rates. The CS speech codec has the added advantage of inherent noise suppression and easy scalability, without complex parameter extraction or voice activity detection.

11 citations

Proceedings Article

Enhanced inter prediction with localized weighted prediction in HEVC

Na Zhang¹, Yiran Lu¹, Xiaopeng Fan¹, Ruiqin Xiong², Debin Zhao¹, Wen Gao¹•Institutions (2)

Harbin Institute of Technology¹, Peking University²

01 Dec 2015

TL;DR: A linear regression model is employed to modify the prediction pixel values; because the weighting parameters are estimated in both the encoder and the decoder, no additional bits need to be transmitted.

Abstract: Inter prediction plays an important role in most video encoding systems since it can significantly improve coding performance. The more accurate the prediction of the current block, the smaller the residual and the higher the coding efficiency that can be achieved. In this paper, a localized weighted prediction method is proposed to improve inter prediction accuracy. A linear regression model is employed to modify the prediction pixel values. Because the weighting parameters are estimated in both the encoder and the decoder, no additional bits need to be transmitted. The proposed method shows better coding performance than previous methods, including the explicit weighted prediction method in High Efficiency Video Coding (HEVC). Experimental results show that the BD bit rate saving of the proposed method is up to 7.4% compared to HM12.0, while the decoding complexity is almost the same.
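A rough sketch of how a weight and offset could be estimated identically at encoder and decoder is given below. It assumes, beyond what the abstract states, that the regression is fitted on already reconstructed pixels next to the current block and their counterparts next to the reference block, and that samples are 8-bit. The names `fit_weight_offset` and `refine_prediction` are hypothetical.

```python
def fit_weight_offset(ref_template, cur_template):
    """Least-squares fit of cur ~ w * ref + o over two equally long pixel lists."""
    n = len(ref_template)
    mean_x = sum(ref_template) / n
    mean_y = sum(cur_template) / n
    var = sum((x - mean_x) ** 2 for x in ref_template)
    if var == 0:
        # Flat template: fall back to a pure offset correction.
        return 1.0, mean_y - mean_x
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ref_template, cur_template))
    w = cov / var
    return w, mean_y - w * mean_x


def refine_prediction(pred_block, w, o):
    """Apply the fitted model to a motion-compensated block, clipping to 8 bits (assumed)."""
    return [[min(255, max(0, round(w * p + o))) for p in row] for row in pred_block]
```

Since both sides only use data that the decoder already has, the fitted `w` and `o` never need to be written to the bitstream, which matches the claim that no additional bits are transmitted.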

10 citations

Patent

Apparatus and method for improved concealment of the adaptive codebook in a CELP-like concealment employing improved pulse resynchronization

Jeremie Lecomte, Michael Schnabel, Markovic Goran, Dietz Martin, Bernhard Neugebauer

21 Dec 2015

TL;DR: In this paper, an apparatus for reconstructing a frame including a speech signal as a reconstructed frame is provided, the apparatus including a determination unit and a frame reconstructor being configured to reconstruct the reconstructed frame.

Abstract: An apparatus for reconstructing a frame including a speech signal as a reconstructed frame is provided, the apparatus including a determination unit and a frame reconstructor being configured to reconstruct the reconstructed frame, such that the reconstructed frame completely or partially includes the first reconstructed pitch cycle, such that the reconstructed frame completely or partially includes a second reconstructed pitch cycle, and such that the number of samples of the first reconstructed pitch cycle differs from a number of samples of the second reconstructed pitch cycle.

9 citations

Proceedings Article

Sparse least-squares prediction for intra image coding

Luis F. R. Lucas, Nuno M. M. Rodrigues, Carla L. Pagliari¹, Eduardo A. B. da Silva, Sergio M. M. de Faria•Institutions (1)

Instituto Militar de Engenharia¹

10 Dec 2015

TL;DR: Experiments using an implementation of the proposed method in the state-of-the-art H.265/HEVC algorithm have shown that SLSP is able to improve the coding performance, especially in the presence of complex textures, achieving higher coding gains than other existing intra linear prediction methods.

Abstract: This paper presents a new intra prediction method for efficient image coding, based on linear prediction and sparse representation concepts, denominated sparse least-squares prediction (SLSP). The proposed method uses a low order linear approximation model which may be built inside a predefined large causal region. The high flexibility of the SLSP filter context allows the inclusion of more significant image features into the model for better prediction results. Experiments using an implementation of the proposed method in the state-of-the-art H.265/HEVC algorithm have shown that SLSP is able to improve the coding performance, especially in the presence of complex textures, achieving higher coding gains than other existing intra linear prediction methods.

7 citations

Proceedings Article

Envelope modeling for speech and audio processing using distribution quantization

Tobias Jahnel¹, Tom Bäckström¹, Benjamin Schubert•Institutions (1)

University of Erlangen-Nuremberg¹

28 Dec 2015

TL;DR: An envelope model called distribution quantizer (DQ) is introduced, with the objective of combining the accuracy of linear prediction and the flexibility of scale factor bands, and the coefficients of distribution quantization are independent and thus more flexible and easier to quantize than linear predictive coefficients.

Abstract: Envelope models are common in speech and audio processing: for example, linear prediction is used for modeling the spectral envelope of speech, whereas audio coders use scale factor bands for perceptual masking models. In this work we introduce an envelope model called distribution quantizer (DQ), with the objective of combining the accuracy of linear prediction and the flexibility of scale factor bands. We evaluate the performance of envelope models with respect to their ability to reduce entropy as well as their correlation to the original signal magnitude. The experiments show that in terms of entropy, distribution quantization and linear prediction are comparable, whereas for correlation, distribution quantization is better. Furthermore the coefficients of distribution quantization are independent and thus more flexible and easier to quantize than linear predictive coefficients.

6 citations

Proceedings Article

Advances in low bitrate time-frequency coding

Tommy Vaillancourt, Vladimir Malenovsky, Redwan Salami, Zexin Liu¹, Lei Miao¹, Jon Gibbs¹, Milan Jelinek•Institutions (1)

Huawei¹

19 Apr 2015

TL;DR: A novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay.

Abstract: In this paper a novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay. The paper discusses how to integrate parts of a traditional Algebraic Code Excited Linear Prediction (ACELP) speech codec to create a time-domain contribution which coexists with a frequency based coding model. A mechanism to determine the value of the time-domain contribution is proposed and a method is described how the frequency-domain contribution might be added without increasing the overall delay of the codec. The proposed method forms part of the recently standardised 3GPP EVS codec.

5 citations

Proceedings Article

Comparison of windowing schemes for speech coding

Johannes Fischer¹, Tom Bäckström¹•Institutions (1)

University of Erlangen-Nuremberg¹

28 Dec 2015

TL;DR: This paper introduces three alternative windowing schemes, as alternatives to the one already used in CELP codecs, and shows that omitting the error feedback loop yields an increase in perceptual quality at scenarios with high quantization noise.

Abstract: The majority of speech coding algorithms are based on the code excited linear prediction (CELP) paradigm, modelling the speech signal by linear prediction. This coding approach offers the advantage of a very short algorithmic delay, due to the windowing scheme based on rectangular windowing of the residual of the linear predictor. Although widely used, the performance and structural choices of this windowing scheme have not been extensively documented. In this paper we introduce three windowing schemes as alternatives to the one already used in CELP codecs. These windowing schemes differ in their handling of transitions between frames. Our subjective evaluation shows that omitting the error feedback loop yields an increase in perceptual quality in scenarios with high quantization noise. In addition, objective measures show that while error feedback improves the accuracy slightly at high bitrates, at low bitrates it causes a degradation in quality, resulting in a lower SNR.

5 citations

Book Chapter

Algorithms for Low Bit-Rate Coding with Adaptation to Statistical Characteristics of Speech Signal

Anton Saveliev, Oleg Basov, Andrey Ronzhin, Alexander L. Ronzhin

20 Sep 2015

TL;DR: The article establishes the general trends of speech coding algorithms based on linear prediction and the main procedures of their forming and results of experimental studies of the developed adaptive low bit-rate coding algorithms are presented.

Abstract: The article establishes the general trends of speech coding algorithms based on linear prediction. The task of adaptation of speech codec to the statistical characteristics of the coding parameters is set and accomplished. The main procedures of their forming are examined. The results of experimental studies of the developed adaptive low bit-rate coding algorithms are presented. The benefits of the quality of remade speech in comparison with algorithms on FS1015, FS1017 and FS1016 standards and Full-rate GSM are displayed.

Proceedings Article

Implementation and Overall Performance Evaluation of CELP Based GSM AMR NB Coder over ABE

Ninad Bhatt

04 Apr 2015

TL;DR: Development of ABE algorithm for Code Excited Linear Prediction based GSM AMR NB coder is discussed, and for the same, MATLAB based e-test bench is created for simulation to discover and gain insight about the overall performance of proposed ABE coder.

Abstract: In today's wireless communication systems, decoded speech at the receiving end sounds muffled and thin, mainly because of the inherent band limitation (300-3400 Hz) and power constraints. To obtain toll-quality recovered speech in terms of intelligibility and naturalness, Narrowband (NB) speech coders should be upgraded to their Wideband (WB) counterparts (50-7000 Hz). In the meantime, a novel and backward compatible solution, known as Artificial Bandwidth Extension (ABE), is proposed that artificially extends the bandwidth of NB speech to WB at the receiving end. Among the many techniques that aim to mitigate the effect of unpredictable channel conditions, the Adaptive Multi Rate (AMR) NB coder is considered one of the potential candidates. Selection of a particular bit rate mode (out of eight modes between 4.75 kbps and 12.2 kbps) depends solely on the channel condition. This paper discusses the development of an ABE algorithm for the Code Excited Linear Prediction (CELP) based GSM AMR NB coder, for which a MATLAB based e-test bench is created for simulation. A series of simulations is conducted to gain insight into the overall performance of the proposed ABE coder, including subjective (Mean Opinion Score - MOS) and objective (Perceptual Evaluation of Speech Quality - PESQ) analyses. The results of both analyses indicate that the proposed ABE coder outperforms the legacy GSM AMR 06.90 NB coder.

Scalable Speech Coding for IP Networks

Koji Seto

01 Jan 2015

TL;DR: This dissertation presents scalable narrowband and wideband speech codecs for IP networks using a frame-independent coding scheme based on the iLBC, adding rate flexibility, bit-rate scalability and wideband support while retaining high robustness to packet loss.

Abstract: The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue of transporting real-time voice packets over IP networks is the lack of guarantee for reasonable speech quality due to packet delay or loss. Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes long-term prediction across frame boundaries and therefore causes error propagation in the case of packet loss and needs to transmit redundant information in order to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks some of the key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and the scalable algebraic vector quantization (AVQ) and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition.

Proceedings Article

New post-processing techniques for low bit rate CELP codecs

Tommy Vaillancourt, Redwan Salami, Milan Jelinek¹•Institutions (1)

Université de Sherbrooke¹

19 Apr 2015

TL;DR: Two new post-processing techniques are presented to address limitations of deployed low bit rate speech codecs for unvoiced speech and background noise, and for generic audio signals coded by low bit rate ACELP codecs.

Abstract: This paper presents two new post-processing techniques to address limitations of the deployed low bit rate speech codecs in case of unvoiced speech and background noise, and in case of music. Both post-processing techniques enhance the spectrum of the decoded excitation signal without increasing the codec algorithmic delay. The paper discusses how to integrate the enhancement procedure of unvoiced speech and background noise and of generic audio signals coded by low bit rate ACELP codecs. The proposed post-processing procedures are part of the AMR-WB interoperable modes of the recently standardized 3GPP EVS codec [1].

Proceedings Article

A 2400 bps vocoder based on mixed excitation linear prediction and channel coding

Qiuyun Hao, Ye Li, Peng Zhang, Xiaofeng Ma, Yanhong Fan

01 Oct 2015

TL;DR: An unequal error protection channel coding scheme and a parameter substitution method for erroneous frames are proposed to improve robustness over random error channels; test results show that the proposed speech coding algorithm provides satisfactory speech quality and strong robustness against channel errors.

Abstract: To obtain high quality synthetic speech at 2400 bps, this paper presents a vocoder based on the Mixed Excitation Linear Prediction (MELP) model. The differences of the vocoder parameters are analyzed, and an unequal error protection channel coding and a parameter substitution method for error frame are proposed to improve the robustness over random error channel. Several channel coding schemes are compared and the optimal one is then selected. Test results show that the proposed speech coding algorithm could provide satisfactory speech quality and strong robustness against channel errors.

Proceedings Article

Efficient handling of mode switching and speech transitions in the EVS codec

Vaclav Eksler¹, Milan Jelinek¹, Redwan Salami•Institutions (1)

Université de Sherbrooke¹

19 Apr 2015

TL;DR: This paper focuses on techniques that enable a seamless switching between two linear prediction based modes running at different sampling rates within this codec, which is based on a constrained-memory ACELP called transition coding (TC).

Abstract: The recently standardized codec for Enhanced Voice Services (EVS) consists of a number of modes to achieve its high coding flexibility. In this paper we focus on techniques that enable a seamless switching between two linear prediction based modes running at different sampling rates within this codec. The first one deals with an efficient conversion of the linear prediction filter coefficients. The other one is based on a constrained-memory ACELP called transition coding (TC) that significantly limits the inter-frame long-term dependency. We show that the use of TC can be successfully extended to improve quality also in coding other transitions, e.g. strong onsets of voiced speech.

Proceedings Article

Graph linear prediction results in smaller error than standard linear prediction

Aran Venkitaraman¹, Saikat Chatterjee¹, Peter Händel¹•Institutions (1)

Royal Institute of Technology¹

28 Dec 2015

TL;DR: It is proved theoretically that the graph based linear prediction approach results in an equal or better performance compared with the SLP in terms of the prediction gain.

Abstract: Linear prediction is a popular strategy employed in the analysis and representation of signals. In this paper, we propose a new linear prediction approach by considering the standard linear prediction in the context of graph signal processing, which has gained significant attention recently. We view the signal to be defined on the nodes of a graph with an adjacency matrix constructed using the coefficients of the standard linear predictor (SLP). We prove theoretically that the graph based linear prediction approach results in an equal or better performance compared with the SLP in terms of the prediction gain. We illustrate the proposed concepts by application to real speech signals.

Journal Article

Speech Coding Techniques

S.K. Jagtap¹, M.S. Mulye¹, M. D. Uplane¹•Institutions (1)

College of Engineering, Pune¹

01 Jan 2015-Procedia Computer Science

TL;DR: Speech coding techniques discussed here are Linear Predictive Coding, waveform coding, Code excited linear predictive coding, etc, which are studied with the help of MATLAB to check their performance measures like compression ratio and speech audible quality.

Book Chapter

Comparative Study of PCM, LPC, and CELP Speech Coders Used for VoIP Applications

Mahesh Chandra¹, Manas Ray¹•Institutions (1)

Birla Institute of Technology, Mesra¹

01 Jan 2015

TL;DR: This paper analyzes the performance of the above standard coders by comparing the coding capabilities of the coders on two Hindi and two English language sentences and evaluating the performance in terms of compression ratio, peak signal-to-noise ratio, and normalized root mean square error.

Abstract: The quality of the speech signal in a voice over internet protocol (VoIP) is governed by the speech coding technique employed. Currently, various standard coders such as FS-1015 (LPC-10), ITU-G.711, and FS-1016 (ITU-G.728) are used to digitize the speech signal. This paper analyzes the performance of the above coders by comparing the coding capabilities of the coders on two Hindi and two English language sentences. The performance is then evaluated in terms of compression ratio (CR), peak signal-to-noise ratio (PSNR), and normalized root mean square error (NRMSE).

Proceedings Article

A 1.8kbps vocoder based on Mixed Excitation Linear Prediction

Ye Li, Xiaofeng Ma, Qiuyun Hao, Peng Zhang, Yanhong Fan, Jingsai Jiang

01 Dec 2015

TL;DR: A low bit rate speech coding algorithm based on Mixed Excitation Linear Prediction (MELP) and unequal error channel coding is proposed, which has strong robustness against channel errors.

Abstract: With the rapid development of communication technology, there is an urgent need for high quality speech coding algorithm at very low bit rate. In the paper, based on the Mixed Excitation Linear Prediction (MELP) and unequal error channel coding, a low bit rate speech coding algorithm is proposed. According to the different importance for each parameter, the relatively significant code stream information is protected by using channel coding with strong error correcting ability to obtain high-quality synthetic speech at 1.8 kbps with 1/3 redundancy for channel coding. Test results show that the proposed algorithm could provide better speech quality and also has strong robustness for channel error.

Journal Article

Analysis-by-synthesis compression of range-focused SAR raw data

Mort Naraghi-Pour¹, Ricardo Cortez, Takeshi Ikuma¹•Institutions (1)

Louisiana State University¹

22 Jun 2015-IEEE Transactions on Aerospace and Electronic Systems

TL;DR: A predictive compression scheme for synthetic aperture radar (SAR) raw data based on the analysis-by-synthesis encoding method that exploits the correlation across azimuth of the range-focused SAR data.

Abstract: We propose a predictive compression scheme for synthetic aperture radar (SAR) raw data based on the analysis-by-synthesis encoding method. The algorithm is inspired by the code excited linear prediction (CELP) algorithm used in speech compression and exploits the correlation across azimuth of the range-focused SAR data. We also extend this approach to include multiple codebooks and obtain the performance of the proposed methods on recorded data. Numerical results show that these algorithms outperform the open-loop predictive quantization schemes.

Proceedings Article

Adaptive selection of lag-window shape for linear predictive analysis in the 3GPP EVS codec

Yutaka Kamamoto¹, Takehiro Moriya¹, Noboru Harada¹•Institutions (1)

Nippon Telegraph and Telephone¹

01 Dec 2015

TL;DR: An adaptive selection scheme is devised in which the window shape selected depends on the periodicity of the signal, which has proven to be effective for LP analysis and to enhance the coding efficiency in both time and frequency domains in general.

Abstract: Lag windowing has long been used for the autocorrelation method of linear predictive (LP) analysis to prevent possible instability of the synthesis filter with the obtained coefficients. We have investigated the lag-window shape in terms of the trade-offs between stability and the coding efficiency. On the basis of these investigations, we have devised an adaptive selection scheme in which the window shape selected depends on the periodicity of the signal. This scheme has proven to be effective for LP analysis to enhance the coding efficiency in both time and frequency domains in general. This scheme has thus been included in the speech and audio coding schemes of the newly established 3GPP EVS codec standard.
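Lag windowing in the autocorrelation method can be sketched as follows: the autocorrelation sequence is multiplied by a smooth window before the Levinson-Durbin recursion. This is a generic illustration, not the EVS procedure. The Gaussian window form is a common choice, while the selection rule in `select_lag_window`, its threshold, the bandwidth values and the sampling rate are hypothetical placeholders for the periodicity-dependent choice the abstract mentions.

```python
import math


def autocorrelation(x, order):
    """Biased autocorrelation r[0..order] of a finite signal."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def gaussian_lag_window(order, f0_hz, fs_hz):
    """w[k] = exp(-0.5 * (2*pi*f0*k/fs)^2); a larger f0 smooths the spectrum more."""
    return [math.exp(-0.5 * (2.0 * math.pi * f0_hz * k / fs_hz) ** 2)
            for k in range(order + 1)]


def select_lag_window(order, periodicity, fs_hz=12800.0):
    """Hypothetical rule: pick a narrower window bandwidth for strongly periodic frames."""
    f0_hz = 20.0 if periodicity > 0.7 else 60.0  # assumed values, not from the paper
    return gaussian_lag_window(order, f0_hz, fs_hz)


def levinson_durbin(r):
    """Return (lpc, reflection) for autocorrelation r, with lpc[0] == 1."""
    a = [1.0]
    err = r[0]
    reflection = []
    for i in range(1, len(r)):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err
        reflection.append(k)
        a = [1.0] + [a[j] + k * a[i - j] for j in range(1, i)] + [k]
        err *= 1.0 - k * k
    return a, reflection
```

A typical use is `rw = [ri * wi for ri, wi in zip(r, w)]` followed by `levinson_durbin(rw)`. Reflection coefficients with magnitude below one indicate a stable synthesis filter, which is the property lag windowing is meant to protect.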

Proceedings Article

An Automatic Watermarking in CELP Speech Codec Based on Formant Tuning

Erick Christian Garcia Alvarez, Shengbei Wang¹, Masashi Unoki²•Institutions (2)

Tianjin Polytechnic University¹, Japan Advanced Institute of Science and Technology²

01 Sep 2015

TL;DR: The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate.

Abstract: This paper proposes the unification of the code-excited linear prediction (CELP) codec process with watermarking based on formant tuning. The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate. We took advantage of two key properties: I) humans do not perceive alterations applied to formants and II) CELP and watermarking based on formant tuning both utilize linear prediction coefficients. We investigated the inaudibility and robustness of the proposed method by carrying out three different experiments using log-spectrum distance (LSD), the perceptual evaluation of speech quality (PESQ) and the bit detection rate (BDR). The results indicated that the proposed method satisfied the inaudibility requirement when watermarking was applied to the CELP codec, which increased the watermarking detection rate.

Journal Article

Speaker verification method for operation system of consumer electronic devices

Masatsugu Ichino¹, Yasushi Yamazaki², Hiroshi Yoshiura¹•Institutions (2)

University of Electro-Communications¹, University of Kitakyushu²

23 Mar 2015-IEEE Transactions on Consumer Electronics

TL;DR: A system that can remotely operate consumer electronic devices by voice using the mobile phone as a controller and a CELP-based speaker verification method to match the audio stream by comparing the trajectories of continuous phonemes is proposed.

Abstract: A system is proposed that can remotely operate consumer electronic devices by voice. It uses the mobile phone as a controller, and it uses the CELP (code excited linear prediction) parameters that are used for speech coding in mobile phones. A speaker verification function protects private information and separates the user's voice from that of people nearby who are also speaking. A CELP-based speaker verification method is used to match the audio stream by comparing the trajectories of continuous phonemes. Experimental evaluation of the speaker verification method demonstrated the effectiveness of the proposed verification method.

Patent

Voice signal decoding device and voice signal decoding method

Hiroyuki Ebara (江原宏幸)

05 Aug 2015

TL;DR: In this paper, an adaptive code book decoding part is proposed to avoid reproduction of loud abnormal sound caused by an error of a coded signal, without affecting reproduction of a normal decoded signal.

Abstract: PROBLEM TO BE SOLVED: To provide a voice signal decoding device and a voice signal decoding method that can avoid reproduction of loud abnormal sound caused by an error of a coded signal, without affecting reproduction of a normal decoded signal. SOLUTION: A sound signal decoding device includes: an adaptive code book decoding part 102 that generates an adaptive code book vector using an adaptive code book code of a coded signal coded by a CELP system; a fixed code book decoding part 103 that generates a fixed code book vector using a fixed code book code of the coded signal; a ratio calculation part 107 that calculates an amplitude ratio or an energy ratio between the adaptive code book vector and the fixed code book vector; a determination part 109 that determines whether the amplitude ratio or the energy ratio exceeds a prescribed threshold; and an attenuator 110 that attenuates an excitation signal with the adaptive code book vector and the fixed code book vector added, when the amplitude ratio or the energy ratio is determined to exceed the prescribed threshold. SELECTED DRAWING: Figure 1
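
The ratio test described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `decode_excitation`, the direction of the ratio (adaptive over fixed), the threshold of 4.0 and the attenuation factor of 0.5 are all assumptions chosen for illustration, and codebook gains are omitted.

```python
import numpy as np

def decode_excitation(adaptive_vec, fixed_vec, threshold=4.0, attenuation=0.5):
    """Add the two codebook vectors and attenuate the sum if their energy ratio is abnormal.

    threshold and attenuation are illustrative values, not taken from the patent.
    """
    adaptive_vec = np.asarray(adaptive_vec, dtype=float)
    fixed_vec = np.asarray(fixed_vec, dtype=float)

    # Excitation signal: adaptive and fixed codebook contributions added together.
    excitation = adaptive_vec + fixed_vec

    # Energy ratio between the two vectors (assumed direction: adaptive / fixed).
    eps = 1e-12  # guards against division by zero for an all-zero fixed vector
    ratio = np.sum(adaptive_vec ** 2) / (np.sum(fixed_vec ** 2) + eps)

    # Attenuate only when the ratio exceeds the prescribed threshold.
    if ratio > threshold:
        excitation = attenuation * excitation
    return excitation, ratio
```

With vectors of comparable energy the excitation passes through unchanged, so normal decoding is unaffected; only frames whose ratio exceeds the threshold are scaled down.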

Proceedings Article•DOI•

Qualitative Spectral Parameter Coding for Speech Signals

Charu¹, Sukriti Sharma¹•Institutions (1)

Manav Rachna College of Engineering¹

03 Jun 2015

TL;DR: Voice-excited LPC is the technique proposed in this paper that results in a low bit rate and a better signal to noise ratio and also provides accurate estimation of speech parameters and is computationally effective.

Abstract: The speech signal is a unique and special signal in communication systems, so it must be analyzed in order to extract its important parameters and compressed to make maximum use of the available bandwidth. For this, various speech analysis and synthesis techniques have been used effectively. Among these techniques, Linear Predictive Coding (LPC) is the most powerful for representing the speech signal at reduced bit rates while preserving signal quality; it also provides accurate estimation of speech parameters and is computationally efficient. Voice-excited LPC is the technique proposed in this paper. It has been implemented using both male and female voices, and the trade-offs between bit rate, delay, power signal-to-noise ratio and complexity are analyzed. It results in a low bit rate and a better signal-to-noise ratio.

Proceedings Article•DOI•

Implementation of G.729E Speech Coding Algorithm based on TMS320VC5416

Xiaojin Yang, Jinjin Pan

30 Aug 2015

TL;DR: The base algorithm, G.729E, is introduced, and the hardware and software implementation on the TMS320VC5416 DSP is emphasized.

Abstract: The G.729E algorithm is an excellent speech coding algorithm. G.729E is designed for speech with background noise and even music, and it will be widely used in multimedia communication. This paper introduces the base algorithm and emphasizes the hardware and software implementation on the TMS320VC5416 DSP. Test results on hardware are presented.

Proceedings Article•DOI•

The Design and Implementation of Speech Engine based on Speex

Fan Zhang

18 Jul 2015

TL;DR: Taking the foundations of speech coding as the starting point, the design of a speech engine and its route of implementation are discussed through an interpretation of Speex, so as to reduce tedious work in voice testing.

Abstract: This paper takes the foundations of speech coding as its starting point and, through an interpretation of Speex, discusses the design of a speech engine and its route of implementation so as to reduce tedious work in voice testing.

Introduction. Speech coding is the basic technology of digital speech transmission and storage: it represents speech signals in compressed digital form using the minimum number of bits required. Compared with analog voice, digital voice transmission and storage systems that use speech coding offer high reliability, strong anti-interference ability, ease of rapid exchange, ease of implementing confidentiality, multiplexing and packaging, and low cost. Compressed voice used for transmission reduces the bandwidth required by each channel, so more voice channels can be carried in the same bandwidth; used for storage, it saves space, lengthens the speech that can be stored and reduces cost.

The Interpretation of Speex. Speex is a multi-mode, multi-rate speech codec based on the CELP algorithm. It provides three codec modes, narrow band, wide band and ultra-wide band, corresponding to speech signals with bandwidths of 4 kHz (sampling rate 8000 Hz), 8 kHz (sampling rate 16000 Hz) and 16 kHz (sampling rate 32000 Hz), respectively. Narrow-band speech coding uses only narrow-band sub-mode coding. Wide-band speech is divided into two sub-bands: the high band uses wide-band sub-mode coding and the low band uses narrow-band sub-mode coding. Ultra-wide-band speech repeats the decomposition twice, applying wide-band sub-mode coding twice and narrow-band sub-mode coding once. The whole Speex algorithm is thus composed of two types of sub-mode encoding: narrow band and wide band.

The Structure of the Functional Module of the System. In this paper, the VoIP system adopts Speex coding. The whole system consists of two parts, the server side and the client side. The server's database stores the data of all registered users. Each user must first log on to the server through the client and obtain the list of online friends, and can then make a voice call to an online friend. Calls between clients are point to point, which avoids the excessive voice-packet delay that relaying through the server would cause.

The Design of the Server Side. Besides maintaining a database that stores user information, the server also commands and manages each client. Once the server starts its service, it begins listening for user requests. When the server receives a message from a client, it first sends back a confirmation and then creates a separate thread to handle the received data. In this thread, the data is processed according to its category.

The Design of the Client Side. The client's functions fall into two parts: one interacts with the server to obtain the relevant information; the other carries out point-to-point communication between clients. The second part is the core of the VoIP system.
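
The relationship between the three Speex modes, their sampling rates and the layered coding described in this abstract can be sketched as follows. This is an illustration of the description only, not the Speex library API: `SPEEX_MODES`, `audio_bandwidth_hz` and `sub_mode_layers` are hypothetical names, and the layering rule (one extra wide-band layer per doubling of the sampling rate) is inferred from the abstract.

```python
# Sampling rates (Hz) of the three modes named in the abstract.
SPEEX_MODES = {
    "narrowband": 8000,
    "wideband": 16000,
    "ultra-wideband": 32000,
}

def audio_bandwidth_hz(sampling_rate_hz):
    """Nyquist limit: the representable bandwidth is half the sampling rate."""
    return sampling_rate_hz // 2

def sub_mode_layers(mode):
    """List the coders applied to a mode, lowest band first.

    The base layer is always narrow band; each doubling of the
    sampling rate adds one wide-band layer on top of it.
    """
    layers = ["narrowband"]
    rate = SPEEX_MODES["narrowband"]
    while rate < SPEEX_MODES[mode]:
        layers.append("wideband")
        rate *= 2
    return layers
```

Under these assumptions, `sub_mode_layers("ultra-wideband")` yields one narrow-band layer followed by two wide-band layers, matching the "wide band twice, narrow band once" description.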

Technology linear predictive coding

Sanjay M. Gulhane

01 Jan 2015

TL;DR: The most important aspect of LPC is the linear predictive filter which allows the value of the next sample to be determined by a linear combination of previous samples.

Abstract: Linear predictive coding (LPC) is defined as a digital method for encoding an analog signal in which a particular value is predicted by a linear function of the past values of the signal. It was first proposed as a method for encoding human speech by the United States Department of Defense in federal standard 1015, published in 1984. Human speech is produced in the vocal tract, which can be approximated as a variable-diameter tube. The LPC model is based on a mathematical approximation of the vocal tract represented by this tube of varying diameter. At a particular time, t, the speech sample s(t) is represented as a linear sum of the p previous samples. The most important aspect of LPC is the linear predictive filter, which allows the value of the next sample to be determined by a linear combination of previous samples. Under normal circumstances, speech is sampled at 8000 samples/second with 8 bits used to represent each sample. This provides a rate of 64000 bits/second. Linear predictive coding reduces this to 2400 bits/second. At this reduced rate the speech has a distinctive synthetic sound and there is a noticeable loss of quality. However, the speech is still audible and it can still be easily understood. Since there is information loss in linear predictive coding, it is a lossy form of compression.
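
The prediction of s(t) from the p previous samples, and the bit-rate arithmetic quoted above, can be illustrated with a short sketch. It assumes the autocorrelation method for estimating the predictor coefficients, which the abstract does not specify; the function names `lpc_coefficients`, `predict_next` and `pcm_bit_rate` are hypothetical.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate a[1..p] such that s[n] is approximated by sum_k a[k] * s[n-k].

    Uses the autocorrelation method (an assumption, not stated in the abstract).
    """
    s = np.asarray(signal, dtype=float)
    # Autocorrelation values r[0..order].
    r = np.array([np.dot(s[: len(s) - k], s[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], solved directly for clarity.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def predict_next(history, coeffs):
    """Predict the next sample as a linear combination of the last p samples."""
    p = len(coeffs)
    recent = np.asarray(history[-p:], dtype=float)[::-1]  # s[n-1], s[n-2], ...
    return float(np.dot(coeffs, recent))

def pcm_bit_rate(samples_per_second=8000, bits_per_sample=8):
    """Uncompressed rate quoted in the abstract: 8000 * 8 = 64000 bits/second."""
    return samples_per_second * bits_per_sample
```

For a signal that decays by a factor of 0.9 per sample, a first-order predictor recovers a coefficient close to 0.9. Dividing `pcm_bit_rate()` by the 2400 bits/second figure gives a compression factor of roughly 27.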

Patent•

Multi-mode audio codec and celp coding to be adapted therefor

Geiger Ralf, Guillaume Fuchs, Markus Multrus, Bernhard Grill

05 Mar 2015

TL;DR: In this paper, a multi-mode audio codec and a CELP codec that enable a global gain adjustment with a proper penalty without detouring decryption and re-coding is proposed.

Abstract: PROBLEM TO BE SOLVED: To provide a multi-mode audio codec and a CELP codec that enable a global gain adjustment with a proper penalty without detouring decryption and re-coding. SOLUTION: Bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of a global gain value of a frame results in an adjustment of an output level of a decoded representation of audio content. Concurrently, the differential coding saves or generates a bit when introducing a new syntax element into the coded bitstream. Further, regarding a differential coding, when the global gain value lower than a time resolution adjusting a gain of each sub-frame is set for the bitstream elements encoded differentially to a global gain value, the time resolution is enabled to thereby enable reduction in a burden of globally adjusting the gain of the encoded bitstream.
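
The idea of coding sub-frame gains differentially to a global gain can be illustrated with a minimal sketch. It assumes integer gains in dB and ignores the actual bitstream syntax and quantization, which this abstract does not give; `encode_gains` and `decode_gains` are hypothetical names.

```python
def encode_gains(subframe_gains_db, global_gain_db):
    """Store each sub-frame gain as a difference from the frame's global gain."""
    return [g - global_gain_db for g in subframe_gains_db]

def decode_gains(deltas_db, global_gain_db):
    """Recover absolute sub-frame gains from the global gain and the stored differences."""
    return [global_gain_db + d for d in deltas_db]

# Example: lowering only the global gain by 6 dB lowers every decoded
# sub-frame gain by 6 dB, without touching the stored differences.
deltas = encode_gains([60, 62, 58, 61], global_gain_db=60)
quieter = decode_gains(deltas, global_gain_db=54)
```

Because only the single global value changes, the output level can be adjusted in the encoded domain without decoding and re-encoding the sub-frame elements.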
