
Showing papers on "Code-excited linear prediction" published in 2015


Proceedings ArticleDOI
19 Apr 2015
TL;DR: Subjective measurements show that the proposed methods give a statistically significant improvement in perceptual quality when the bit-rate is held constant, and the proposed method has been adopted to the 3GPP Enhanced Voice Services speech coding standard.
Abstract: Unified speech and audio codecs often use a frequency domain coding technique of the transform coded excitation (TCX) type. It is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictor to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side info. Subjective measurements show that the proposed methods give a statistically significant improvement in perceptual quality when the bit-rate is held constant. Consequently, the proposed method has been adopted to the 3GPP Enhanced Voice Services speech coding standard.

23 citations
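
The idea in the entry above, using the magnitude of the linear predictor to estimate the variance of spectral components, can be sketched as follows. This is a minimal illustration under stated assumptions, not the EVS implementation: the function names `lp_envelope` and `variance_estimate`, the uniform frequency grid over [0, pi], and the coefficient convention A(z) = 1 + a1*z^-1 + ... + ap*z^-p are all chosen for the example.

```python
import cmath
import math


def lp_envelope(a, num_bins):
    """Return |1 / A(e^{jw})| at num_bins frequencies spread over [0, pi].

    `a` holds the LP polynomial coefficients [1, a1, ..., ap] of
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p (assumed convention).
    """
    envelope = []
    for k in range(num_bins):
        # Frequencies from 0 to pi inclusive (hypothetical grid choice).
        w = math.pi * k / (num_bins - 1) if num_bins > 1 else 0.0
        A = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(a))
        envelope.append(1.0 / abs(A))
    return envelope


def variance_estimate(a, num_bins):
    """Squared envelope, usable as a per-bin variance for an entropy coder."""
    return [m * m for m in lp_envelope(a, num_bins)]
```

Because the decoder receives the same LP coefficients, it could evaluate the same function and arrive at the same probability model, which is why no additional side information would be needed.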


Journal ArticleDOI
TL;DR: Experimental results show that incorporating the coding scheme in a state-of-the-art wide-band audio coder enhances its objective and subjective quality in a low-bit-rate and low-delay situation by increasing the compression efficiency.
Abstract: We present an optimal coding scheme that parameterizes the maximum-likelihood estimate of variance for frequency spectra belonging to the generalized Gaussian distribution, the distribution covering the Laplacian and the Gaussian. By slightly modifying the all-pole model of the conventional linear prediction (LP), we can estimate the variance with the same method as in LP, which has low computational costs. Experimental results show that incorporating the coding scheme in a state-of-the-art wide-band audio coder enhances its objective and subjective quality in a low-bit-rate and low-delay situation by increasing the compression efficiency. Thus, this coding scheme will be useful in applications like mobile communications, which requires highly efficient compression.

13 citations


Proceedings ArticleDOI
28 Dec 2015
TL;DR: Objective measures show that a more reliable switching decision is achievable and a reliable speech and music discriminator (SMD) for such an application is designed.
Abstract: Switching between speech coding and generic audio coding schemes was recently proven to be very efficient for coding a large range of audio materials at low bit-rates. However, it strongly relies on a robust classification of the input signal. The aim of the paper is to design a reliable speech and music discriminator (SMD) for such an application. Main attention was laid on getting a good tradeoff between accuracy, reactivity and stability of the decision while keeping the delay and complexity reasonably low. To this end, short-term and long-term features are dissociated before being conveyed to two different classifiers. The two classifier outputs are combined in a final decision using a hysteresis. Objective measures show that a more reliable switching decision is achievable. The SMD was successfully implemented in MPEG Unified Speech and Audio Coding (USAC). It allows the codec to show unprecedented audio quality.

12 citations
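
The final step described in the entry above, combining two classifier outputs into one decision using a hysteresis, might look like the sketch below. The score range, the equal-weight average, the two thresholds, and the name `smd_decisions` are assumptions for illustration; the abstract does not specify them.

```python
def smd_decisions(short_scores, long_scores, low=0.4, high=0.6, start="speech"):
    """Combine two classifier scores per frame and apply hysteresis.

    Scores are assumed to lie in [0, 1], with 1 meaning "music".
    The equal-weight average and the thresholds are illustrative choices.
    """
    state = start
    decisions = []
    for s, l in zip(short_scores, long_scores):
        combined = 0.5 * (s + l)
        # Switch only when the combined score leaves the dead zone [low, high].
        if state == "speech" and combined > high:
            state = "music"
        elif state == "music" and combined < low:
            state = "speech"
        decisions.append(state)
    return decisions
```

The dead zone between `low` and `high` is what trades reactivity for stability: a wider zone gives fewer spurious switches but a slower response to genuine changes.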


Proceedings ArticleDOI
01 Feb 2015
TL;DR: This work characterizes a speech codec in a Compressive Sensing (CS) framework, demonstrating simultaneous compression and de-noising of speech by CS and appropriate quantization of CS measurements to design a medium bit-rate codec.
Abstract: Medium bit rate hybrid speech coding schemes have gained much interest in recent years, and many of them have been standardized for various applications. This work characterizes a speech codec in a Compressive Sensing (CS) framework. We mainly demonstrate two aspects: 1) simultaneous compression and de-noising of speech by CS, and 2) appropriate quantization of CS measurements to design a medium bit-rate codec. The proposed scheme renders better quality speech compared to CELP, the widely used hybrid coding scheme, at the same bit rates. The CS speech codec has the added advantage of inherent noise suppression and easy scalability, without complex parameter extractions and voice activity detections.

11 citations


Proceedings ArticleDOI
01 Dec 2015
TL;DR: A linear regression improvement model is employed to modify the prediction pixel values; because the weighting parameters are estimated in both encoder and decoder, no additional bits need to be transmitted.
Abstract: Inter prediction plays an important role in most video encoding systems since it can significantly improve coding performance. The more accurate the prediction of the current block, the smaller the residual, and the higher the coding efficiency that can be achieved. In this paper, a localized weighted prediction method is proposed to improve inter prediction accuracy. A linear regression improvement model is employed to modify the prediction pixel values. Because the weighting parameters are estimated in both encoder and decoder, no additional bits need to be transmitted. The proposed method shows better coding performance than previous methods, including the explicit weighted prediction method in High Efficiency Video Coding (HEVC). Experimental results show that the BD bit rate saving of the proposed method is up to 7.4% compared to HM12.0, while the decoding complexity is almost the same.

10 citations
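
The linear-regression step in the entry above can be illustrated with an ordinary least-squares fit of a weight and an offset. This is a sketch only: the assumption that the fit uses already-reconstructed pixels neighbouring the current and reference blocks, and the names `fit_weight_offset` and `refine_prediction`, are not taken from the abstract.

```python
def fit_weight_offset(ref_neighbors, cur_neighbors):
    """Least-squares fit of cur ~ w * ref + o over neighbouring pixels."""
    n = len(ref_neighbors)
    mean_r = sum(ref_neighbors) / n
    mean_c = sum(cur_neighbors) / n
    var_r = sum((r - mean_r) ** 2 for r in ref_neighbors)
    if var_r == 0:
        # Flat template: fall back to a pure offset correction.
        return 1.0, mean_c - mean_r
    cov = sum((r - mean_r) * (c - mean_c)
              for r, c in zip(ref_neighbors, cur_neighbors))
    w = cov / var_r
    return w, mean_c - w * mean_r


def refine_prediction(pred_block, w, o):
    """Apply the fitted model to every motion-compensated prediction pixel."""
    return [[w * p + o for p in row] for row in pred_block]
```

If both encoder and decoder run the same fit on data they both already have, they obtain identical parameters, which is consistent with the abstract's statement that no additional bits are required.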


Patent
21 Dec 2015
TL;DR: In this paper, an apparatus for reconstructing a frame including a speech signal as a reconstructed frame is provided, the apparatus including a determination unit and a frame reconstructor being configured to reconstruct the reconstructed frame.
Abstract: An apparatus for reconstructing a frame including a speech signal as a reconstructed frame is provided, the apparatus including a determination unit and a frame reconstructor being configured to reconstruct the reconstructed frame, such that the reconstructed frame completely or partially includes the first reconstructed pitch cycle, such that the reconstructed frame completely or partially includes a second reconstructed pitch cycle, and such that the number of samples of the first reconstructed pitch cycle differs from a number of samples of the second reconstructed pitch cycle.

9 citations


Proceedings ArticleDOI
10 Dec 2015
TL;DR: Experiments using an implementation of the proposed method in the state-of-the-art H.265/HEVC algorithm have shown that SLSP is able to improve the coding performance, especially in the presence of complex textures, achieving higher coding gains than other existing intra linear prediction methods.
Abstract: This paper presents a new intra prediction method for efficient image coding, based on linear prediction and sparse representation concepts, denominated sparse least-squares prediction (SLSP). The proposed method uses a low order linear approximation model which may be built inside a predefined large causal region. The high flexibility of the SLSP filter context allows the inclusion of more significant image features into the model for better prediction results. Experiments using an implementation of the proposed method in the state-of-the-art H.265/HEVC algorithm have shown that SLSP is able to improve the coding performance, especially in the presence of complex textures, achieving higher coding gains than other existing intra linear prediction methods.

7 citations


Proceedings ArticleDOI
28 Dec 2015
TL;DR: An envelope model called distribution quantizer (DQ) is introduced, with the objective of combining the accuracy of linear prediction and the flexibility of scale factor bands, and the coefficients of distribution quantization are independent and thus more flexible and easier to quantize than linear predictive coefficients.
Abstract: Envelope models are common in speech and audio processing: for example, linear prediction is used for modeling the spectral envelope of speech, whereas audio coders use scale factor bands for perceptual masking models. In this work we introduce an envelope model called distribution quantizer (DQ), with the objective of combining the accuracy of linear prediction and the flexibility of scale factor bands. We evaluate the performance of envelope models with respect to their ability to reduce entropy as well as their correlation to the original signal magnitude. The experiments show that in terms of entropy, distribution quantization and linear prediction are comparable, whereas for correlation, distribution quantization is better. Furthermore the coefficients of distribution quantization are independent and thus more flexible and easier to quantize than linear predictive coefficients.

6 citations


Proceedings ArticleDOI
19 Apr 2015
TL;DR: A novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay.
Abstract: In this paper a novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay. The paper discusses how to integrate parts of a traditional Algebraic Code Excited Linear Prediction (ACELP) speech codec to create a time-domain contribution which coexists with a frequency based coding model. A mechanism to determine the value of the time-domain contribution is proposed and a method is described how the frequency-domain contribution might be added without increasing the overall delay of the codec. The proposed method forms part of the recently standardised 3GPP EVS codec.

5 citations


Proceedings ArticleDOI
28 Dec 2015
TL;DR: This paper introduces three alternatives to the windowing scheme already used in CELP codecs, and shows that omitting the error feedback loop yields an increase in perceptual quality in scenarios with high quantization noise.
Abstract: The majority of speech coding algorithms are based on the code excited linear prediction (CELP) paradigm, modelling the speech signal by linear prediction. This coding approach offers the advantage of a very short algorithmic delay, due to the windowing scheme based on rectangular windowing of the residual of the linear predictor. Although widely used, the performance and structural choices of this windowing scheme have not been extensively documented. In this paper we introduce three alternatives to the windowing scheme already used in CELP codecs. These windowing schemes differ in their handling of transitions between frames. Our subjective evaluation shows that omitting the error feedback loop yields an increase in perceptual quality in scenarios with high quantization noise. In addition, objective measures show that while error feedback improves the accuracy slightly at high bitrates, at low bitrates it causes a degradation in quality, resulting in a lower SNR.

5 citations


Book ChapterDOI
20 Sep 2015
TL;DR: The article establishes the general trends of speech coding algorithms based on linear prediction and the main procedures of their forming and results of experimental studies of the developed adaptive low bit-rate coding algorithms are presented.
Abstract: The article establishes the general trends of speech coding algorithms based on linear prediction. The task of adapting the speech codec to the statistical characteristics of the coding parameters is set and accomplished. The main procedures for forming them are examined. The results of experimental studies of the developed adaptive low bit-rate coding algorithms are presented. The quality gains of the reconstructed speech in comparison with algorithms based on the FS1015, FS1017 and FS1016 standards and full-rate GSM are shown.

Proceedings ArticleDOI
04 Apr 2015
TL;DR: Development of ABE algorithm for Code Excited Linear Prediction based GSM AMR NB coder is discussed, and for the same, MATLAB based e-test bench is created for simulation to discover and gain insight about the overall performance of proposed ABE coder.
Abstract: In today's wireless communication systems, the quality of decoded speech at the receiving end is muffled and thin, mainly because of the inherent band limitation (300-3400 Hz) and power constraints. To obtain toll-quality recovered speech in terms of intelligibility and naturalness in wireless systems, Narrowband (NB) speech coders should be upgraded to their Wideband (WB) counterparts (50-7000 Hz). In the meantime, a novel and backward compatible solution is proposed that artificially extends the bandwidth of NB speech to WB at the receiving end, known as Artificial Bandwidth Extension (ABE). Out of many techniques which aim to mitigate the effect of the ever unpredictable channel conditions, the Adaptive Multi Rate (AMR) NB coder is considered to be one of the potential candidates. Selection of a particular bit rate mode (out of all eight bit rate modes between 4.75 kbps and 12.2 kbps) depends solely upon the channel condition. This paper discusses the development of an ABE algorithm for the Code Excited Linear Prediction (CELP) based GSM AMR NB coder, for which a MATLAB based e-test bench is created for simulation. A series of simulations is conducted to gain insight into the overall performance of the proposed ABE coder, including subjective (Mean Opinion Score - MOS) and objective (Perceptual Evaluation of Speech Quality - PESQ) analyses. The results of both analyses indicate that the proposed ABE coder outperforms the legacy GSM AMR 06.90 NB coder.

01 Jan 2015
TL;DR: This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using a frame-independent coding scheme based on the iLBC, adding rate flexibility, scalability, and wideband support.
Abstract: The emergence of Voice over Internet Protocol (VoIP) has posed new challenges to the development of speech codecs. The key issue of transporting real-time voice packets over IP networks is the lack of guarantee for reasonable speech quality due to packet delay or loss. Most of the widely used narrowband codecs depend on the Code Excited Linear Prediction (CELP) coding technique. The CELP technique utilizes long-term prediction across the frame boundaries and therefore causes error propagation in the case of packet loss and needs to transmit redundant information in order to mitigate the problem. The internet Low Bit-rate Codec (iLBC) employs frame-independent coding and therefore inherently possesses high robustness to packet loss. However, the original iLBC lacks some of the key features of speech codecs for IP networks: rate flexibility, scalability, and wideband support. This dissertation presents novel scalable narrowband and wideband speech codecs for IP networks using the frame-independent coding scheme based on the iLBC. Rate flexibility is added to the iLBC by employing the discrete cosine transform (DCT) and the scalable algebraic vector quantization (AVQ) and by allocating different numbers of bits to the AVQ. Bit-rate scalability is obtained by adding an enhancement layer to the core layer of the multi-rate iLBC. The enhancement layer encodes the weighted iLBC coding error in the modified DCT (MDCT) domain. The proposed wideband codec employs the bandwidth extension technique to extend the capabilities of existing narrowband codecs to provide wideband coding functionality. The wavelet transform is also used to further enhance the performance of the proposed codec. The performance evaluation results show that the proposed codec provides high robustness to packet loss and achieves equivalent or higher speech quality than state-of-the-art codecs under the clean channel condition.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: Two new post-processing techniques are presented to address limitations of deployed low bit rate speech codecs in case of unvoiced speech and background noise, and in case of generic audio signals coded by low bit rate ACELP codecs.
Abstract: This paper presents two new post-processing techniques to address limitations of the deployed low bit rate speech codecs in case of unvoiced speech and background noise, and in case of music. Both post-processing techniques enhance the spectrum of the decoded excitation signal without increasing the codec algorithmic delay. The paper discusses how to integrate the enhancement procedure of unvoiced speech and background noise and of generic audio signals coded by low bit rate ACELP codecs. The proposed post-processing procedures are part of the AMR-WB interoperable modes of the recently standardized 3GPP EVS codec [1].

Proceedings ArticleDOI
01 Oct 2015
TL;DR: Test results show that the proposed speech coding algorithm could provide satisfactory speech quality and strong robustness against channel errors and an unequal error protection channel coding and a parameter substitution method for error frame are proposed to improve the robustness over random error channel.
Abstract: To obtain high quality synthetic speech at 2400 bps, this paper presents a vocoder based on the Mixed Excitation Linear Prediction (MELP) model. The differences of the vocoder parameters are analyzed, and an unequal error protection channel coding and a parameter substitution method for error frame are proposed to improve the robustness over random error channel. Several channel coding schemes are compared and the optimal one is then selected. Test results show that the proposed speech coding algorithm could provide satisfactory speech quality and strong robustness against channel errors.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This paper focuses on techniques that enable a seamless switching between two linear prediction based modes running at different sampling rates within this codec, which is based on a constrained-memory ACELP called transition coding (TC).
Abstract: The recently standardized codec for Enhanced Voice Services (EVS) consists of a number of modes to achieve its high coding flexibility. In this paper we focus on techniques that enable a seamless switching between two linear prediction based modes running at different sampling rates within this codec. The first one deals with an efficient conversion of the linear prediction filter coefficients. The other one is based on a constrained-memory ACELP called transition coding (TC) that significantly limits the inter-frame long-term dependency. We show that the use of TC can be successfully extended to improve quality also in coding other transitions, e.g. strong onsets of voiced speech.

Proceedings ArticleDOI
28 Dec 2015
TL;DR: It is proved theoretically that the graph based linear prediction approach results in an equal or better performance compared with the SLP in terms of the prediction gain.
Abstract: Linear prediction is a popular strategy employed in the analysis and representation of signals. In this paper, we propose a new linear prediction approach by considering the standard linear prediction in the context of graph signal processing, which has gained significant attention recently. We view the signal to be defined on the nodes of a graph with an adjacency matrix constructed using the coefficients of the standard linear predictor (SLP). We prove theoretically that the graph based linear prediction approach results in an equal or better performance compared with the SLP in terms of the prediction gain. We illustrate the proposed concepts by application to real speech signals.

Journal ArticleDOI
TL;DR: Speech coding techniques discussed here are Linear Predictive Coding, waveform coding, Code excited linear predictive coding, etc, which are studied with the help of MATLAB to check their performance measures like compression ratio and speech audible quality.

Book ChapterDOI
01 Jan 2015
TL;DR: This paper analyzes the performance of the above standard coders by comparing the coding capabilities of the coders on two Hindi and two English language sentences and evaluating the performance in terms of compression ratio, peak signal-to-noise ratio, and normalized root mean square error.
Abstract: The quality of the speech signal in a voice over internet protocol (VoIP) is governed by the speech coding technique employed. Currently, various standard coders such as FS-1015 (LPC-10), ITU-G.711, and FS-1016 (ITU-G.728) are used to digitize the speech signal. This paper analyzes the performance of the above coders by comparing the coding capabilities of the coders on two Hindi and two English language sentences. The performance is then evaluated in terms of compression ratio (CR), peak signal-to-noise ratio (PSNR), and normalized root mean square error (NRMSE).

Proceedings ArticleDOI
01 Dec 2015
TL;DR: A low bit rate speech coding algorithm based on Mixed Excitation Linear Prediction (MELP) and unequal error channel coding is proposed, and it shows strong robustness to channel errors.
Abstract: With the rapid development of communication technology, there is an urgent need for high quality speech coding algorithm at very low bit rate. In the paper, based on the Mixed Excitation Linear Prediction (MELP) and unequal error channel coding, a low bit rate speech coding algorithm is proposed. According to the different importance for each parameter, the relatively significant code stream information is protected by using channel coding with strong error correcting ability to obtain high-quality synthetic speech at 1.8 kbps with 1/3 redundancy for channel coding. Test results show that the proposed algorithm could provide better speech quality and also has strong robustness for channel error.

Journal ArticleDOI
TL;DR: A predictive compression scheme for synthetic aperture radar (SAR) raw data based on the analysis-by-synthesis encoding method that exploits the correlation across azimuth of the range-focused SAR data.
Abstract: We propose a predictive compression scheme for synthetic aperture radar (SAR) raw data based on the analysis-by-synthesis encoding method. The algorithm is inspired by the code excited linear prediction (CELP) algorithm used in speech compression and exploits the correlation across azimuth of the range-focused SAR data. We also extend this approach to include multiple codebooks and obtain the performance of the proposed methods on recorded data. Numerical results show that these algorithms outperform the open-loop predictive quantization schemes.
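
The analysis-by-synthesis principle named in the abstract above can be sketched as a closed-loop codebook search: each candidate excitation is passed through the synthesis filter and the one whose output best matches the target is kept. The sketch is generic CELP-style code, not the SAR scheme itself; the all-pole filter convention and the names `synthesize` and `search_codebook` are assumptions.

```python
def synthesize(excitation, a):
    """All-pole synthesis y[n] = e[n] - sum_k a[k] * y[n-k], zero initial state."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry whose synthesized output
    best matches the target in the squared-error sense."""
    best = None
    for idx, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0:
            continue  # an all-zero output cannot match anything
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy  # optimal gain for this entry
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (idx, gain, err)
    return best
```

An open-loop scheme would quantize the prediction residual directly; the closed-loop search instead measures the error after synthesis, which is the distinction the abstract draws when it compares against open-loop predictive quantization.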

Proceedings ArticleDOI
01 Dec 2015
TL;DR: An adaptive selection scheme is devised in which the window shape selected depends on the periodicity of the signal, which has proven to be effective for LP analysis and to enhance the coding efficiency in both time and frequency domains in general.
Abstract: Lag windowing has long been used for the autocorrelation method of linear predictive (LP) analysis to prevent possible instability of the synthesis filter with the obtained coefficients. We have investigated the lag-window shape in terms of the trade-offs between stability and the coding efficiency. On the basis of these investigations, we have devised an adaptive selection scheme in which the window shape selected depends on the periodicity of the signal. This scheme has proven to be effective for LP analysis to enhance the coding efficiency in both time and frequency domains in general. This scheme has thus been included in the speech and audio coding schemes of the newly established 3GPP EVS codec standard.
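
Lag windowing as described above multiplies the autocorrelation sequence by a window before the LP coefficients are solved for. The sketch below uses a fixed Gaussian lag window, a common choice; the 60 Hz bandwidth and 8 kHz sampling rate are illustrative defaults, and the periodicity-dependent selection rule of the paper is not reproduced because the abstract does not give it.

```python
import math


def autocorrelation(x, order):
    """Unnormalized autocorrelation r[0..order] of a frame x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def gaussian_lag_window(order, f0_hz=60.0, fs_hz=8000.0):
    """w[k] = exp(-0.5 * (2*pi*f0*k/fs)^2); f0 sets the bandwidth expansion."""
    return [math.exp(-0.5 * (2.0 * math.pi * f0_hz * k / fs_hz) ** 2)
            for k in range(order + 1)]


def lag_windowed_autocorrelation(x, order, f0_hz=60.0, fs_hz=8000.0):
    """Autocorrelation of x with the Gaussian lag window applied."""
    r = autocorrelation(x, order)
    w = gaussian_lag_window(order, f0_hz, fs_hz)
    return [rk * wk for rk, wk in zip(r, w)]
```

A larger `f0_hz` tapers the higher lags more strongly, which smooths the spectral envelope and favours stability at some cost in prediction gain; that is the trade-off the abstract refers to.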

Proceedings ArticleDOI
01 Sep 2015
TL;DR: The serial problem in watermarking and then encoding with the CELP codec was reduced by using the proposed method, which also increased the bit detection rate.
Abstract: This paper proposes the unification of the code-excited linear prediction (CELP) codec process with watermarking based on formant tuning. The serial problem in watermarking and then encoding with the CELP codec was thereby reduced by using the proposed method, which also increased the bit detection rate. We took advantage of two key properties: I) humans do not perceive alterations applied to formants and II) CELP and watermarking based on formant tuning methods utilize linear prediction coefficients. We investigated the inaudibility and robustness of the proposed method by carrying out three different experiments using log-spectrum distance (LSD), the perceptual evaluation of speech quality (PESQ) and the bit detection rate (BDR). The results indicated that the proposed method satisfied the inaudibility requirement when watermarking was applied to the CELP codec, which increased the watermarking detection rate.

Journal ArticleDOI
TL;DR: A system that can remotely operate consumer electronic devices by voice using the mobile phone as a controller and a CELP-based speaker verification method to match the audio stream by comparing the trajectories of continuous phonemes is proposed.
Abstract: A system is proposed that can remotely operate consumer electronic devices by voice. It uses the mobile phone as a controller, together with the CELP (code excited linear prediction) parameters that are used for speech coding in mobile phones. A speaker verification function protects private information and separates the user's voice from that of people nearby who are also speaking. A CELP-based speaker verification method is used to match the audio stream by comparing the trajectories of continuous phonemes. Experimental evaluation of the speaker verification method demonstrated the effectiveness of the proposed verification method.

Patent
05 Aug 2015
TL;DR: In this paper, an adaptive code book decoding part is proposed to avoid reproduction of loud abnormal sound caused by an error of a coded signal, without affecting reproduction of a normal decoded signal.
Abstract: PROBLEM TO BE SOLVED: To provide a voice signal decoding device and a voice signal decoding method that can avoid reproduction of loud abnormal sound caused by an error of a coded signal, without affecting reproduction of a normal decoded signal. SOLUTION: A sound signal decoding device includes: an adaptive code book decoding part 102 that generates an adaptive code book vector using an adaptive code book code of a coded signal coded by a CELP system; a fixed code book decoding part 103 that generates a fixed code book vector using a fixed code book code of the coded signal; a ratio calculation part 107 that calculates an amplitude ratio or an energy ratio between the adaptive code book vector and the fixed code book vector; a determination part 109 that determines whether the amplitude ratio or the energy ratio exceeds a prescribed threshold; and an attenuator 110 that attenuates an excitation signal with the adaptive code book vector and the fixed code book vector added, when the amplitude ratio or the energy ratio is determined to exceed the prescribed threshold. SELECTED DRAWING: Figure 1
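
The chain of parts listed in the abstract above (ratio calculation, threshold decision, attenuation) can be sketched in a few lines. The direction of the ratio (adaptive over fixed), the threshold value, the attenuation factor, and the name `decode_excitation` are assumptions for illustration; the abstract does not state them.

```python
def decode_excitation(adaptive_vec, fixed_vec, threshold=100.0, attenuation=0.5):
    """Sum the two codebook vectors; attenuate when their energy ratio is abnormal.

    The ratio direction (adaptive over fixed), threshold and attenuation
    factor are assumptions for illustration; the abstract does not give them.
    """
    e_adaptive = sum(v * v for v in adaptive_vec)
    e_fixed = sum(v * v for v in fixed_vec)
    excitation = [a + f for a, f in zip(adaptive_vec, fixed_vec)]
    # Treat a zero-energy fixed vector with non-zero adaptive energy as "exceeds".
    if e_fixed == 0:
        exceeds = e_adaptive > 0
    else:
        exceeds = e_adaptive / e_fixed > threshold
    if exceeds:
        excitation = [attenuation * v for v in excitation]
    return excitation
```

Frames whose ratio stays below the threshold pass through unchanged, which matches the stated goal of not affecting reproduction of a normal decoded signal.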

Proceedings ArticleDOI
03 Jun 2015
TL;DR: Voice-excited LPC is the technique proposed in this paper that results in a low bit rate and a better signal to noise ratio and also provides accurate estimation of speech parameters and is computationally effective.
Abstract: Speech signal is the unique and special signal in communication system so it must be analyzed in order to extract its important parameters and to compress it for maximum utilization of available bandwidth. For this, there are various kinds of speech analysis and synthesis techniques that have been effectively used. Among all these techniques, Linear Predictive Coding (LPC) is the most powerful one to represent the speech signal at reduced bit rates while preserving the quality of the signal and also provides accurate estimation of speech parameters and is computationally effective. Voice-excited LPC is the technique proposed in this paper. This technique has been implemented using both male and female voices and trade-offs between bit rates, delay, power signal to noise ratio and complexity are analyzed. It results in a low bit rate and a better signal to noise ratio.

Proceedings ArticleDOI
30 Aug 2015
TL;DR: The base algorithm, G.729E, is introduced, and the hardware and software implementation on the TMS320VC5416 DSP is emphasized.
Abstract: Implementation of G.729E Speech Coding Algorithm based on TMS320VC5416. The G.729E algorithm is an excellent speech coding algorithm. It is designed for speech with background noise and even music, and it will be widely used in multimedia communication. This paper introduces the base algorithm and emphasizes the hardware and software implementation on the TMS320VC5416 DSP. Test results on hardware are provided.

Proceedings ArticleDOI
18 Jul 2015
TL;DR: Taking the foundations of speech coding as its starting point and working from an explanation of Speex, the paper discusses the design of a speech engine and its route of implementation so as to reduce tedious work in voice testing.
Abstract: This paper takes the foundations of speech coding as its starting point and, through an explanation of Speex, discusses the design of the speech engine and its route of implementation so as to reduce tedious work in voice testing.
Introduction. Speech coding is the basic technology of digital speech transmission and storage: it represents speech signals in compressed digital form, using the minimum number of bits required. Compared with analog voice, a digital voice transmission and storage system that uses speech coding has the advantages of high reliability, strong anti-interference ability, ease of quick exchange, ease of realizing confidentiality, multiplexing and packaging, and low price. When compressed voice is used for transmission, it reduces the bandwidth required for each route, so more voice traffic can be carried in the same bandwidth; when it is used for storage, it saves space, increases the length of speech that can be stored, and reduces cost.
The Interpretation of Speex. Speex is a multi-mode, multi-rate speech codec based on the CELP algorithm. It provides three speech codec modes, narrow band, wide band and ultra-wide band, which correspond to speech signals with bandwidths of 4 kHz (sampling rate 8000), 8 kHz (sampling rate 16000) and 16 kHz (sampling rate 32000), respectively. Narrow band speech coding uses only narrow band sub-pattern coding. Wide band speech is divided into two sub-bands: the high band uses wide band sub-pattern coding and the low band uses narrow band sub-pattern coding. Ultra-wide band repeats the decomposition twice, using wide band sub-pattern coding twice and narrow band sub-pattern coding once. Thus, throughout the Speex algorithm there are two types of sub-pattern encoding: narrow band and wide band.
The Structure of the Functional Module of the System. In this paper, the VoIP system adopts Speex coding. The whole system consists of two parts, the server side and the client side. The server database stores the data of all registered users. Each user must first log on to the server through the client side and obtain the list of online friends, and can then make a voice call to those friends. Calls between clients use a point-to-point mode, which avoids the excessive voice packet delay that relaying through the server would cause.
The Design of the Server Side. Besides maintaining a database that stores user information, the server side also commands and manages each client. Once the server starts its service, it begins listening for user requests. When the server receives a message sent by a client, it first sends back a confirmation and then creates a separate thread to deal with the received data. In this thread, the data is handled according to its category.
The Design of the Client Side. The function of the client side can be divided into two parts: one part interacts with the server side to obtain the relevant information; the other part carries out point-to-point communication between different clients. The second part is the core of the VoIP system.

01 Jan 2015
TL;DR: The most important aspect of LPC is the linear predictive filter which allows the value of the next sample to be determined by a linear combination of previous samples.
Abstract: Linear predictive coding (LPC) is defined as a digital method for encoding an analog signal in which a particular value is predicted by a linear function of the past values of the signal. It was first proposed as a method for encoding human speech by the United States Department of Defense in Federal Standard 1015, published in 1984. Human speech is produced in the vocal tract, which can be approximated as a variable diameter tube. The LPC model is based on a mathematical approximation of the vocal tract represented by this tube of varying diameter. At a particular time, t, the speech sample s(t) is represented as a linear sum of the p previous samples. The most important aspect of LPC is the linear predictive filter, which allows the value of the next sample to be determined by a linear combination of previous samples. Under normal circumstances, speech is sampled at 8000 samples/second with 8 bits used to represent each sample. This provides a rate of 64000 bits/second. Linear predictive coding reduces this to 2400 bits/second. At this reduced rate the speech has a distinctive synthetic sound and there is a noticeable loss of quality. However, the speech is still audible and it can still be easily understood. Since there is information loss in linear predictive coding, it is a lossy form of compression.
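
The predictor described above, in which s(t) is a linear sum of the p previous samples, is commonly obtained from the autocorrelation sequence with the Levinson-Durbin recursion. The sketch below is one standard way to do this, not necessarily the method of the source; the sign convention A(z) = 1 + a1*z^-1 + ... and the function names are assumptions. The last two lines reproduce the bit-rate arithmetic from the abstract.

```python
def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a1*z^-1 + ... from autocorrelation r[0..order].

    Returns (a, error), where the predictor is s_hat[n] = -sum_k a[k] * s[n-k].
    Assumes r[0] > 0 and a positive definite autocorrelation sequence.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a, error


def predict_next(samples, a):
    """Predict the next sample from the last p samples (needs len(samples) >= p)."""
    p = len(a) - 1
    return -sum(a[k] * samples[-k] for k in range(1, p + 1))


pcm_rate = 8000 * 8               # bits/second for 8-bit samples at 8000 samples/second
compression = pcm_rate / 2400     # ratio relative to the 2400 bits/second LPC rate
```

With the figures quoted in the abstract, the uncompressed rate is 64000 bits/second, so a 2400 bits/second LPC stream corresponds to a reduction by a factor of roughly 27.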

Patent
05 Mar 2015
TL;DR: In this paper, a multi-mode audio codec and a CELP codec that enable a global gain adjustment with a proper penalty without detouring decryption and re-coding is proposed.
Abstract: PROBLEM TO BE SOLVED: To provide a multi-mode audio codec and a CELP codec that enable a global gain adjustment with a proper penalty without detouring decryption and re-coding. SOLUTION: Bitstream elements of sub-frames are encoded differentially to a global gain value so that a change of a global gain value of a frame results in an adjustment of an output level of a decoded representation of audio content. Concurrently, the differential coding saves or generates a bit when introducing a new syntax element into the coded bitstream. Further, regarding a differential coding, when the global gain value lower than a time resolution adjusting a gain of each sub-frame is set for the bitstream elements encoded differentially to a global gain value, the time resolution is enabled to thereby enable reduction in a burden of globally adjusting the gain of the encoded bitstream.