
Showing papers on "Adaptive Multi-Rate audio codec published in 2010"


Journal ArticleDOI
TL;DR: This work presents a coder (RCELP) that uses a generalization of the analysis-by-synthesis paradigm, which relaxes the waveform-matching constraints without affecting speech quality.
Abstract: At bit rates between 4 and 16 kbit/s, many state-of-the-art speech coding algorithms fall into the class of linear-prediction based analysis-by-synthesis (LPAS) speech coders. At the lower bit rates, the waveform matching on which LPAS coders rely constrains the speech quality. To overcome this drawback, we present a coder (RCELP) that uses a generalization of the analysis-by-synthesis paradigm. This generalization relaxes the waveform-matching constraints without affecting speech quality. We describe several implementations at bit rates between 4 and 6 kbit/s. MOS tests show that a 6 kbit/s RCELP has quality similar to or better than that of the 13 kbit/s GSM full-rate coder, and that a 4.4 kbit/s RCELP has speech quality significantly better than that of the 4.8 kbit/s FS1016 standard.

90 citations


Journal ArticleDOI
TL;DR: This work proposes a codec that simultaneously addresses both high quality and low delay, with a delay of only 8.7 ms at 44.1 kHz, and uses gain-shape algebraic vector quantization in the frequency domain with time-domain pitch prediction.
Abstract: With increasing quality requirements for multimedia communications, audio codecs must maintain both high quality and low delay. Typically, audio codecs offer either low delay or high quality, but rarely both. We propose a codec that simultaneously addresses both of these requirements, with a delay of only 8.7 ms at 44.1 kHz. It uses gain-shape algebraic vector quantization in the frequency domain with time-domain pitch prediction. We demonstrate that the proposed codec operating at 48 kb/s and 64 kb/s outperforms both G.722.1C and MP3 and has quality comparable to AAC-LD, despite having less than one fourth of the algorithmic delay of these codecs.

78 citations
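
The gain-shape quantization idea used in this codec (each frequency band is coded as an overall gain plus a unit-norm "shape" vector) can be sketched in a few lines of Python. The random codebook and exhaustive nearest-neighbour search below are stand-ins for illustration only; the actual codec uses algebraic codebooks with a structured fast search.

    import numpy as np

    def gain_shape_quantize(band, shape_codebook):
        """Split a band of transform coefficients into a scalar gain and a
        unit-norm shape, then pick the closest codebook shape (illustrative
        exhaustive search; the real codec uses an algebraic codebook)."""
        gain = float(np.linalg.norm(band))
        if gain == 0.0:
            return 0.0, 0
        shape = band / gain
        # for unit-norm codewords, maximizing the inner product minimizes distance
        index = int(np.argmax(shape_codebook @ shape))
        return gain, index

    def gain_shape_dequantize(gain, index, shape_codebook):
        return gain * shape_codebook[index]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # hypothetical 16-dimensional band and 256-entry random unit-norm codebook
        codebook = rng.standard_normal((256, 16))
        codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)
        band = rng.standard_normal(16)
        g, i = gain_shape_quantize(band, codebook)
        rec = gain_shape_dequantize(g, i, codebook)
        snr = 10 * np.log10(np.sum(band ** 2) / np.sum((band - rec) ** 2))
        print("index:", i, " SNR (dB):", round(float(snr), 2))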


Proceedings Article
01 Aug 2010
TL;DR: This paper introduces the VISNET II DVC codec, which achieves very high RD performance thanks to the efficient combination of many state-of-the-art coding tools into a fully practical video codec.
Abstract: This paper introduces the VISNET II DVC codec. This codec achieves very high RD performance thanks to the efficient combination of many state-of-the-art coding tools into a fully practical video codec. Experimental results show that the proposed DVC codec consistently outperforms H.264/AVC Intra. For sequences with coherent motion, it even surpasses H.264/AVC zero-motion. Finally, it is also always better than the DISCOVER DVC codec. Therefore, it is expected that the proposed high performing DVC codec will be used by other researchers in the field as a reference to benchmark their results.

39 citations


Journal ArticleDOI
TL;DR: This new audio codec allows efficient transform-domain audio indexing for three different applications, namely beat tracking, chord recognition, and musical genre classification, and it is compared with the standard MP3 and AAC codecs in terms of performance and computation time.
Abstract: Indexing audio signals directly in the transform domain can potentially save a significant amount of computation when working on a large database of signals stored in a lossy compression format, without having to fully decode the signals. Here, we show that the representations used in standard transform-based audio codecs (e.g., MDCT for AAC, or hybrid PQF/MDCT for MP3) have a sufficient time resolution for some rhythmic features, but a poor frequency resolution, which prevents their use in tonality-related applications. Alternatively, a recently developed audio codec based on a sparse multi-scale MDCT transform has a good resolution both for time- and frequency-domain features. We show that this new audio codec allows efficient transform-domain audio indexing for three different applications, namely beat tracking, chord recognition, and musical genre classification. We compare results obtained with this new audio codec and the two standard MP3 and AAC codecs, in terms of performance and computation time.

37 citations
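
As a rough illustration of the transform-domain indexing idea above, the sketch below computes an onset-strength (positive spectral flux) curve directly from per-frame transform magnitudes, the kind of rhythmic feature the paper extracts without fully decoding to PCM. The frame count, bin count, and synthetic data are assumptions, not taken from the paper.

    import numpy as np

    def onset_strength(frame_coeffs):
        """Positive spectral flux over a (frames x bins) matrix of transform-domain
        magnitudes, e.g. |MDCT| coefficients taken from a partially decoded stream."""
        mags = np.abs(frame_coeffs)
        flux = np.diff(mags, axis=0)
        return np.sum(np.maximum(flux, 0.0), axis=1)  # one value per frame transition

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        coeffs = rng.standard_normal((200, 576))      # 576 bins, as in an MP3 long block
        coeffs[::43] *= 5.0                           # inject a "beat" every 43 frames
        env = onset_strength(coeffs)
        ac = np.correlate(env, env, mode="full")[len(env) - 1:]   # lags 0, 1, 2, ...
        lag = int(np.argmax(ac[10:100])) + 10
        print("dominant inter-onset lag (frames):", lag)          # expected near 43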


Patent
23 Jun 2010
TL;DR: An encoding apparatus and a decoding apparatus for a High Quality Multi-channel Audio Codec (HQMAC) are provided; they can perform HQMAC-Channel Based (HQMAC-CB) encoding or decoding in accordance with the characteristics of the input audio signals to provide compatibility with a lower channel.
Abstract: Provided is an encoding apparatus for a High Quality Multi-channel Audio Codec (HQMAC) and a decoding apparatus for the HQMAC. The encoding/decoding apparatuses for the HQMAC may perform a High Quality Multi-channel Audio Codec-Channel Based (HQMAC-CB) encoding or an HQMAC-CB decoding in accordance with characteristics of inputted audio signals to provide compatibility with a lower channel.

36 citations


09 Sep 2010
TL;DR: SILK, a speech codec for real-time, packet-based voice communications, provides scalability in several dimensions through control of bitrate, packet rate, packet loss resilience and use of discontinuous transmission (DTX).
Abstract: This document describes SILK, a speech codec for real-time, packet-based voice communications. Targeting a diverse range of operating environments, SILK provides scalability in several dimensions. Four different sampling frequencies are supported for encoding the audio input signal. Adaptation to network characteristics is provided through control of bitrate, packet rate, packet loss resilience and use of discontinuous transmission (DTX). Several different complexity levels let SILK take advantage of available processing power without relying on it. Each of these properties can be adjusted during operation of the codec on a frame-by-frame basis.

30 citations


Patent
Jinwei Feng1, Chu Peter
01 Jul 2010
TL;DR: A scalable audio codec for a processing device determines first and second bit allocations for each frame of input audio; the allocations are made frame by frame based on the energy ratio between the two bands.
Abstract: A scalable audio codec for a processing device determines first and second bit allocations for each frame of input audio. First bits are allocated for a first frequency band, and second bits are allocated for a second frequency band. The allocations are made on a frame-by-frame basis based on the energy ratio between the two bands. For each frame, the codec transform codes both frequency bands into two sets of transform coefficients, which are then packetized based on the bit allocations. The packets are then transmitted with the processing device. Additionally, the frequency regions of the transform coefficients can be arranged in order of importance determined by power levels and perceptual modeling. Should bit stripping occur, the decoder at a receiving device can produce audio of suitable quality given that bits have been allocated between the bands and the regions of transform coefficients have been ordered by importance.

17 citations
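
A minimal sketch of the frame-by-frame bit split described in this patent abstract: the bit budget is divided between a low and a high band according to their energies. The proportional rule and the minimum-bit floor below are assumptions; the abstract only states that the allocation depends on the energy ratio.

    import numpy as np

    def allocate_bits(low_band, high_band, total_bits, min_bits=8):
        """Split total_bits between two bands in proportion to their energies
        (hypothetical rule; the patent only states that the split depends on
        the energy ratio between the bands)."""
        e_low = float(np.sum(low_band ** 2)) + 1e-12
        e_high = float(np.sum(high_band ** 2)) + 1e-12
        share = e_low / (e_low + e_high)
        low_bits = int(round(total_bits * share))
        low_bits = max(min_bits, min(total_bits - min_bits, low_bits))
        return low_bits, total_bits - low_bits

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        frame = rng.standard_normal(640)
        low, high = frame[:320], 0.25 * frame[320:]   # weaker high band gets fewer bits
        print(allocate_bits(low, high, total_bits=960))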


Journal ArticleDOI
TL;DR: A fully parametric audio coder, which decomposes the audio signal into sinusoids, transients and noise, is proposed here, and the performance of the proposed parametric audio coders is assessed in comparison to widely used audio coders operating at similar bit rates.
Abstract: This paper deals with the application of adaptive signal models for parametric audio coding. A fully parametric audio coder, which decomposes the audio signal into sinusoids, transients and noise, is proposed here. Adaptive signal models for sinusoidal, transient, and noise modeling are therefore included in the parametric scheme in order to achieve high-quality and low bit-rate audio coding. In this paper, a new sinusoidal modeling method based on a perceptual distortion measure is proposed. For transient modeling, a fast and effective method based on matching pursuit with a mixed dictionary is chosen. The residue of the previous models is analyzed as a noise-like signal. The proposed parametric audio coder allows high-quality coding of one-channel audio signals at 16 kbit/s (average bit rate). A bit-rate scalable version of the parametric audio coder is also proposed in this work. Bit-rate scalability is intended for audio streaming applications, which are in high demand nowadays. The performance of the proposed parametric audio coders (nonscalable and scalable coders) is assessed in comparison to widely used audio coders operating at similar bit rates.

14 citations


Proceedings ArticleDOI
01 Aug 2010
TL;DR: A new method for the bandwidth extension of telephone speech is presented; it uses only the information in the narrowband speech and improves speech quality compared with a previously published bandwidth extension method.
Abstract: The limited audio bandwidth used in telephone systems degrades both the quality and the intelligibility of speech. This paper presents a new method for the bandwidth extension of telephone speech. Frequency components are added to the frequency band 4–8 kHz using only the information in the narrowband speech. First, a wideband excitation is generated by spectral folding from the narrowband linear prediction residual. The highband of this signal is divided into four subbands with a filter bank, and a neural network is used to weight the subbands based on features calculated from the narrowband speech. Bandwidth-extended speech is obtained by summing the weighted subbands and the original narrowband signal. Listening tests show that this new method improves speech quality compared with a previously published bandwidth extension method.

13 citations
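
The spectral-folding step described in the abstract above can be illustrated directly: upsampling the narrowband LP residual by zero insertion, without low-pass filtering, mirrors the 0-4 kHz spectrum into 4-8 kHz. The scalar highband gain below is only a stand-in for the paper's neural-network subband weighting.

    import numpy as np

    def spectral_fold(residual_nb, gain_hb=1.0):
        """Create a 16 kHz excitation from an 8 kHz narrowband LP residual by zero
        insertion: upsampling by 2 without low-pass filtering mirrors the 0-4 kHz
        spectrum into 4-8 kHz. gain_hb stands in for the neural-network subband
        weighting described in the paper."""
        wideband = np.zeros(2 * len(residual_nb))
        wideband[::2] = residual_nb
        return gain_hb * wideband

    if __name__ == "__main__":
        fs_nb = 8000
        t = np.arange(fs_nb) / fs_nb
        residual = np.sin(2 * np.pi * 1000 * t)       # toy 1 kHz residual component
        wb = spectral_fold(residual)
        spectrum = np.abs(np.fft.rfft(wb))
        peaks = np.argsort(spectrum)[-2:] * 16000 / len(wb)
        print("components at (Hz):", np.sort(peaks))  # original 1000 Hz and its 7000 Hz image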


Patent
01 Sep 2010
TL;DR: A transceiver consisting of a codec, a microcontroller, and a radio is described; the microcontroller receives a first digital audio signal from the codec and packetizes it into a first packet for transmission over a TCP/IP network.
Abstract: A transceiver including a codec, a microcontroller, and a radio. The codec includes an analog-to-digital converter for receiving a first audio program and converting it to a first digital audio signal; a digital-to-analog converter for receiving a second digital audio signal and converting it to a second audio program; and a control function for managing characteristics of the codec. The microcontroller is in electrical communication with the codec: for receiving the first digital audio signal from the codec and packetizing it into a first packet for transmission over a TCP/IP network; for receiving a second packet from the network and converting it into the second digital audio signal and sending it to the codec; and for receiving control signals from the network. The radio is in electrical communication with the microcontroller for connection to the network, to transmit the first packet to the network and receive the second packet from the network.

12 citations


Journal ArticleDOI
TL;DR: The subjective and objective quality evaluations show that the reconstructed signal quality for the proposed FDLP codec compares well with state-of-the-art audio codecs in the 32-64 kbps range.
Abstract: We present a scalable medium bit-rate wide-band audio coding technique based on frequency-domain linear prediction (FDLP). FDLP is an efficient method for representing the long-term amplitude modulations of speech/audio signals using autoregressive models. For the proposed audio codec, relatively long temporal segments (1000 ms) of the input audio signal are decomposed into a set of critically sampled sub-bands using a quadrature mirror filter (QMF) bank. The technique of FDLP is applied to each sub-band to model the sub-band temporal envelope. The residual of the linear prediction, which represents the frequency modulations in the sub-band signal, is encoded and transmitted along with the envelope parameters. These steps are reversed at the decoder to reconstruct the signal. The proposed codec utilizes a simple, signal-independent, nonadaptive compression mechanism for a wide class of speech and audio signals. The subjective and objective quality evaluations show that the reconstructed signal quality for the proposed FDLP codec compares well with state-of-the-art audio codecs in the 32-64 kbps range.
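
A compact sketch of the FDLP idea behind this codec: fit an autoregressive model to the DCT of a long segment, and the resulting all-pole "spectrum" approximates the segment's temporal (Hilbert) envelope. The model order, envelope resolution, and test signal below are illustrative choices, not the paper's settings.

    import numpy as np
    from scipy.fft import dct
    from scipy.linalg import solve_toeplitz

    def fdlp_envelope(segment, order=30, n_points=200):
        """Approximate the temporal (Hilbert) envelope of a segment by frequency-
        domain linear prediction: fit an AR model to the DCT of the segment and
        evaluate the AR power spectrum, which now lives on the time axis."""
        y = dct(segment, type=2, norm="ortho")
        # autocorrelation of the DCT sequence, then solve the normal equations
        r = np.correlate(y, y, mode="full")[len(y) - 1:len(y) + order]
        a = solve_toeplitz(r[:order], r[1:order + 1])          # predictor coefficients
        gain = r[0] - np.dot(a, r[1:order + 1])
        # AR "spectrum" over half the unit circle == smooth temporal envelope
        w = np.exp(-2j * np.pi * np.outer(np.arange(n_points) / (2.0 * n_points),
                                          np.arange(1, order + 1)))
        return gain / np.abs(1.0 - w @ a) ** 2

    if __name__ == "__main__":
        fs, dur = 8000, 1.0                                    # 1000 ms segment, as in the paper
        t = np.arange(int(fs * dur)) / fs
        sig = np.sin(2 * np.pi * 300 * t) * np.exp(-3.0 * t)   # decaying tone
        env = fdlp_envelope(sig)
        print("envelope peak near t =", float(np.argmax(env)) / len(env) * dur, "s")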

Patent
14 Oct 2010
TL;DR: After a call is established with a codec negotiated during call setup, in-band signaling may be used between the two stations to change the codec in use; if the receiving station detects and reacts to the in-band signals, both stations switch to the second codec.
Abstract: After a call is established between two stations using a codec that has been negotiated during call setup, in-band signaling may be used between the two stations to change the codec that is to be used. The in-band signals indicate that the station transmitting them can operate with a second codec and are used to probe whether the receiving station can also operate with that second codec. If the receiving station detects and reacts to the in-band signals, then both stations change to communicate with the second codec. The second codec has packet sizes compatible with the deployed (originally negotiated) codec, without any need for infrastructure upgrades or quality compromises for legacy phone users (i.e., stations that cannot operate with the second codec).
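
The probe-and-switch behaviour described in the patent can be summarized with a toy state machine. The probe marker, codec names, and message handling below are invented purely for illustration; the abstract does not specify the actual in-band signal pattern.

    # Toy sketch of the probe-and-switch behaviour; the probe marker and codec
    # names are invented, and real implementations embed the probe in the media
    # stream in a codec-specific way not described in the abstract.

    PROBE_MARKER = b"\x7e\x7e"

    class Station:
        def __init__(self, name, supports_second_codec):
            self.name = name
            self.supports_second_codec = supports_second_codec
            self.codec = "negotiated-codec"

        def send_media(self, payload):
            # A capable station keeps probing in-band until the switch happens.
            if self.supports_second_codec and self.codec == "negotiated-codec":
                return PROBE_MARKER + payload
            return payload

        def receive_media(self, frame):
            if frame.startswith(PROBE_MARKER) and self.supports_second_codec:
                # Detect the probe, react, and switch locally.
                self.codec = "second-codec"
                return frame[len(PROBE_MARKER):], True
            return frame, False

    if __name__ == "__main__":
        a, b = Station("A", True), Station("B", True)
        frame = a.send_media(b"voice-frame")
        _, reacted = b.receive_media(frame)
        if reacted:                       # A observes B's reaction and switches too
            a.codec = "second-codec"
        print(a.codec, b.codec)           # both stations now use the second codec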

Patent
Adrian Fratila1
13 Oct 2010
TL;DR: Finite impulse response (FIR) filter coefficients characterizing the echo path between the local audio output and audio input are applied to the received/decompressed audio data, and the predicted echo is subtracted from the uplink signal.
Abstract: Duplex audio communications over a network use compressed audio data, with linear prediction coefficients (LPCs) and variances by which sample values differ from predictions. An adaptive echo canceller for a transceiver develops finite impulse response (FIR) filter coefficients characterizing the echo path between its local audio output and audio input. The received/decompressed audio data is applied to the FIR coefficients, and the predicted echo is subtracted from the uplink signal. Echo is detected as the cross-correlation of the receive signal versus the uplink/send signal over time. In one embodiment, the cross-correlation is determined using a pre-whitened receive signal, obtained using the variance values received over the network by the downlink codec. Apart from the uplink codec, no speech analysis filter or process is needed. The technique is suited to GSM, AMR and similar compressed audio communications.
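
A small numeric sketch of the two mechanisms in this abstract: predicting the echo by filtering the downlink signal with echo-path FIR coefficients and subtracting it from the uplink, and detecting echo via cross-correlation of the receive and send signals. Pre-whitening with the codec variances, as the patent describes, is omitted here; the echo path and signals are synthetic.

    import numpy as np

    def cancel_echo(received, uplink, fir_coeffs):
        """Predict the echo by filtering the received (downlink) signal with the
        estimated echo-path FIR coefficients and subtract it from the uplink."""
        predicted_echo = np.convolve(received, fir_coeffs)[:len(uplink)]
        return uplink - predicted_echo

    def echo_score(received, uplink, max_lag=400):
        """Peak normalized cross-correlation between receive and send signals over
        positive lags; the patent applies this to a pre-whitened receive signal."""
        r = (received - received.mean()) / (received.std() + 1e-12)
        u = (uplink - uplink.mean()) / (uplink.std() + 1e-12)
        xc = np.correlate(u, r, mode="full") / len(u)
        centre = len(r) - 1
        return float(np.max(np.abs(xc[centre:centre + max_lag])))

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        rx = rng.standard_normal(8000)
        path = np.zeros(240)
        path[120] = 0.6                                # toy echo path: 15 ms delay at 8 kHz
        tx = 0.1 * rng.standard_normal(8000) + np.convolve(rx, path)[:8000]
        print("echo score before:", round(echo_score(rx, tx), 3))
        clean = cancel_echo(rx, tx, path)              # assumes a perfectly estimated FIR
        print("echo score after: ", round(echo_score(rx, clean), 3))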

Journal ArticleDOI
TL;DR: A new technique for the class of code-excited linear prediction speech codecs designed to reduce error propagation after lost frames is presented, which consists in replacing the interframe long-term prediction with a glottal-shape codebook in the subframe containing the first glottal impulse in a given frame.
Abstract: This paper presents a new technique for the class of code-excited linear prediction speech codecs designed to reduce error propagation after lost frames. Its principle consists in replacing the interframe long-term prediction with a glottal-shape codebook in the subframe containing the first glottal impulse in a given frame. This technique, independent of previous frames, is of particular interest in voiced speech frames following transitions as these frames are the most sensitive to frame erasures. It is a basis of a structured coding scheme called transition coding (TC). The TC greatly improves codec performance in noisy channels while maintaining clean channel performance. It is a part of the new embedded speech and audio codec recently standardized as Recommendation G.718 by ITU-T.

01 Jan 2010
TL;DR: An experimental design and implementation of the controller is presented, based on the Philips specification for the I2C protocol and the DSP mode of operation of the CODEC, on a Cyclone-II EP2C35F72C6 FPGA in the Altera DE2 board.
Abstract: The trend in hardware design is towards implementing a complete system, intended for various applications, on a single chip. In order to implement any speech application on the Altera DE2 board, a controller is designed to control the CODEC and acquire digital data from it. This paper presents an experimental design and implementation of the controller, using the specification given by Philips for the I2C protocol and the DSP mode of operation of the CODEC, on a Cyclone-II EP2C35F72C6 FPGA in the Altera DE2 board. The controller was designed in VHDL and performs two operations: the I2C protocol operation to drive the Wolfson Codec WM8731, and sound fetching from the Wolfson Codec WM8731 to the FPGA in DSP mode. Altera Quartus II 9.0 SP2 Web Edition is used for synthesis of the VHDL logic on the FPGA, and ModelSim-Altera 6.5b (Quartus II 9.1) Starter Edition is used for simulation of the VHDL logic. Three modules have been created in the design: the I2C bus controller, the virtual sound fetcher, and the clock module. The FPGA communicates with the Wolfson codec via the I2C (Inter-Integrated Circuit) protocol using two pins: 'SDIN' (the data line) and 'SCLK' (the bus clock). The I2C bus controller modifies the internal settings of the codec, un-mutes the microphone input, boosts the microphone volume, and changes the default sound path so that the microphone is given priority over other inputs. After the codec digitizes the input, it puts the digital data on the digital audio interface; the DSP mode of operation of the codec is used in the design to fetch the data on DACDAT from the digital audio interface. DACDAT is the formatted digital audio data stream with the left and right channels multiplexed together. DACLRC (alignment clock) and BCLK (synchronization clock) are used to fetch the data on DACDAT, and this data can be used for any sound application. The clock module is designed to generate the different clocks required by the controller.

Proceedings ArticleDOI
01 Nov 2010
TL;DR: PESQ scores on 11 test sequences show that the proposed switching method introduces no additional noise and achieves higher objective audio quality than either single codec.
Abstract: This paper presents a dual-mode switching method between a time-domain codec and a transform-domain codec for audio coding. It is a key technique for unified speech and audio (music) coding, since the reproduced audio quality depends on suitable codec selection and smooth switching between the codecs. The proposed method consists of two steps: codec mode selection and switching. A binary decision tree (BDT) algorithm is used to make the mode-selection decision because of its high accuracy, low delay and low complexity. To smooth the transition between the two codecs, a pre-coding strategy is suggested in this paper. The classical speech codec, Algebraic Code Excited Linear Prediction (ACELP), and the Advanced Audio Coding (AAC) of MPEG are used to validate the proposed method. PESQ scores on 11 test sequences show that the proposed switching method introduces no additional noise and achieves higher objective audio quality than either single codec.

Patent
Yooseok Kim1, Kyoungjoung Kim1, Younghun Jang1, Youngkook Seo1, Hyeyoung Hong1 
15 Dec 2010

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This work proposes a trellis-based approach to directly optimize the trade-off between the quality of the AAC core and the lossless compression performance of SLS, and shows that such optimization can in fact achieve an AAC core of superior perceptual quality while maintaining state-of-the-art Lossless compression, all this in compliance with the HD-AAC standard.
Abstract: MPEG-4 High-Definition Advanced Audio Coding (HD-AAC) enables scalable-to-lossless (SLS) audio coding with an Advanced Audio Coding (AAC) base layer, and fine-grained enhancements based on the MPEG SLS standard. While the AAC core offers better perceptual quality at lossy bit-rates, its inclusion has been observed to compromise the ultimate lossless compression performance as compared to the SLS ‘non-core’ (i.e., without an AAC base layer) codec. In contrast, the latter provides excellent lossless compression but with significantly degraded audio quality at low bit-rates. We propose a trellis-based approach to directly optimize the trade-off between the quality of the AAC core and the lossless compression performance of SLS. Simulations to test the effectiveness of the approach demonstrate the capability to adjust the trade-off to match application specific needs. Moreover, such optimization can in fact achieve an AAC core of superior perceptual quality while maintaining state-of-the-art (and surprisingly sometimes even better) lossless compression, all this in compliance with the HD-AAC standard.

Journal ArticleDOI
TL;DR: The verification results indicate that audio quality estimated by the proposed parametric packet-layer model has a high correlation with perceived audio quality.
Abstract: We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was the Advanced Audio Codec-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.
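
The abstract lists the model inputs but not the fitted mapping, so the sketch below only shows the general structure of a parametric packet-layer estimator: a coding-quality term driven by bit rate and a degradation term driven by packet-loss frequency and burst length. All coefficients and functional forms are invented placeholders, not the paper's model.

    import math

    def estimate_audio_mos(bitrate_kbps, sampling_khz, frame_ms,
                           loss_freq_per_s, avg_burst_len,
                           c=(1.2, 3.4, 0.05, 0.9, 0.25)):
        """Placeholder parametric packet-layer estimator. In the paper, the
        coefficient set would be fitted per codec/sampling rate/frame length
        from listening tests; here sampling_khz and frame_ms are unused."""
        c0, c1, c2, c3, c4 = c
        # coding quality: saturating function of bit rate (invented form)
        q_coding = c0 + c1 * (1.0 - math.exp(-c2 * bitrate_kbps))
        # degradation grows with loss frequency and burst length (invented form)
        degradation = c3 * loss_freq_per_s * (1.0 + c4 * (avg_burst_len - 1.0))
        return max(1.0, min(5.0, q_coding - degradation))

    if __name__ == "__main__":
        print(round(estimate_audio_mos(96, 48, 21.3, loss_freq_per_s=0.0, avg_burst_len=1.0), 2))
        print(round(estimate_audio_mos(96, 48, 21.3, loss_freq_per_s=1.0, avg_burst_len=3.0), 2))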

Proceedings ArticleDOI
18 Mar 2010
TL;DR: The paper presents a 1.5V, 10mW, full-featured stereo audio CODEC, optimized for low-voltage operation and low power consumption, that is integrated with a Bluetooth radio and a PMU on a single die.
Abstract: Low-power and full-featured stereo audio CODECs are increasingly needed in wireless devices, such as Bluetooth headsets and smart phones. These portable devices are usually powered by low-voltage batteries with limited capacities. It is of particular importance that such CODECs be optimized for low-voltage operation and low-power consumption. The paper presents a 1.5V 10mW full-featured stereo audio CODEC that is integrated with a Bluetooth radio and PMU on a single die. As depicted in Fig. 4.5.1, the CODEC contains microphone PGAs, audio ΔΣ ADCs and DACs, speaker drivers and microphone bias generators.

Patent
Haiting Li1
30 Dec 2010
TL;DR: A coding method, a decoding method, a coding-decoding (codec) method, a codec system, and relevant apparatuses are disclosed; the coding method includes obtaining an amplitude vector and a length vector corresponding to a vector to be coded.
Abstract: A coding method, a decoding method, a coding-decoding (codec) method, a codec system and relevant apparatuses are disclosed. The coding method includes: obtaining an amplitude vector and a length vector corresponding to a vector to be coded; sorting elements of the amplitude vector and elements of the length vector; and obtaining a position index value according to the sorted amplitude vector and the sorted length vector. A decoding method, a codec system, and relevant apparatuses are also provided.

Proceedings ArticleDOI
23 May 2010
TL;DR: A novel technique to identify the voice and silent regions of a speech stream, well suited to VoIP calls, is introduced; it uses an entropy measure based on the spacings of order statistics of speech frames to differentiate the silence zones from the speech zones.
Abstract: Realtime voice communication over the Internet has rapidly gained popularity. It is essential to reduce the total bandwidth consumption in order to use the available bandwidth efficiently, for subscribers with low-speed connectivity and otherwise. In this paper we introduce a novel technique to identify the voice and silent regions of a speech stream that is well suited to VoIP calls. We use an entropy measure, based on the spacings of order statistics of speech frames, to differentiate the silence zones from the speech zones. We developed an algorithm that uses adaptive thresholding to minimize misdetection. The performance of our approach is compared with the built-in VAD of the AMR codec. Our approach yields better bandwidth savings while maintaining good quality of the speech streams. Further, the proposed approach has improved voice detection compared to the AMR schemes under noisy conditions. The ideas presented in this paper have been identified as novel during the WIPO international patent search.
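
One plausible instance of the entropy measure the paper refers to is a Vasicek-style estimator built from the spacings of order statistics, shown below together with a much-simplified adaptive threshold. The estimator choice, frame size, and threshold rule are assumptions for illustration; they are not taken from the paper.

    import numpy as np

    def spacing_entropy(frame, m=None):
        """Vasicek-style entropy estimate from the spacings of order statistics of
        the samples in one frame (one plausible reading of the entropy measure
        named in the abstract; the exact estimator is not given there)."""
        x = np.sort(np.asarray(frame, dtype=float))
        n = len(x)
        m = m or max(1, int(round(np.sqrt(n))))
        upper = np.minimum(np.arange(n) + m, n - 1)
        lower = np.maximum(np.arange(n) - m, 0)
        spacings = np.maximum(x[upper] - x[lower], 1e-12)
        return float(np.mean(np.log(n * spacings / (2.0 * m))))

    def detect_voice(frames, margin=0.5):
        """Flag a frame as voice if its entropy exceeds a crude noise-floor
        estimate by `margin` (much simpler than the paper's adaptive threshold)."""
        ents = np.array([spacing_entropy(f) for f in frames])
        floor = np.percentile(ents, 20)
        return ents > floor + margin, ents

    if __name__ == "__main__":
        rng = np.random.default_rng(4)
        t = np.arange(160) / 8000.0                     # 20 ms frames at 8 kHz
        silence = [0.01 * rng.standard_normal(160) for _ in range(50)]
        speech = [np.sin(2 * np.pi * 200 * t) + 0.01 * rng.standard_normal(160)
                  for _ in range(50)]
        flags, _ = detect_voice(silence + speech)
        print("frames flagged as voice:", int(flags.sum()), "of", len(flags))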

Journal ArticleDOI
TL;DR: A complexity scalability design is proposed for the coding of the dynamic codebook search in the iLBC speech codec and results show that the computational complexity can be effectively reduced with imperceptible degradation of the speech quality.
Abstract: Unlike modern speech codecs, which use long-term prediction, the internet low bit rate codec (iLBC) standard encodes the residual of the linear predictive coding (LPC) independently, frame by frame. In this paper, a complexity scalability design is proposed for the coding of the dynamic codebook search in the iLBC speech codec. In addition, a trade-off between computational complexity and speech quality can be achieved by dynamically setting the parameter of the proposed approach. Simulation results show that the computational complexity can be effectively reduced with imperceptible degradation of the speech quality.

Journal ArticleDOI
TL;DR: An interval Type-2 fuzzy logic controlled scheme for VoIP services is presented; it infers network state from the average delivered perceived quality of service and its degradation due to network congestion, and updates the AMR codec mode to match voice quality to the available network bandwidth.
Abstract: Adaptive VoIP schemes have potentially suboptimal performance owing to imprecision in the metrics used to infer network state. An interval Type-2 fuzzy logic controlled scheme for VoIP services is presented. It infers network state from average delivered perceived quality of service and its degradation due to network congestion and updates an AMR codec mode to match voice quality to available network bandwidth. Tests showed that the scheme maximised delivered voice quality and outperformed an existing adaptive scheme. The scheme achieves robust performance in the presence of input imprecision and can be implemented in VoIP terminals, and the fuzzy rule base is easy to understand and change by non-experts because of its similarity to the human decision-making process.
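
The paper's controller is an interval Type-2 fuzzy system; the crisp rules below are only a stand-in to show the control loop it describes: delivered quality and its degradation in, AMR codec mode out. The thresholds are illustrative and not taken from the paper.

    # AMR-NB source rates; the controller below is a crisp stand-in for the
    # paper's interval Type-2 fuzzy inference, with invented thresholds.
    AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

    def adapt_amr_mode(current_index, avg_mos, mos_degradation):
        """Step the AMR mode down when delivered quality is degrading (congestion)
        and up when quality is consistently good (spare bandwidth)."""
        if mos_degradation > 0.3 or avg_mos < 3.0:
            return max(0, current_index - 1)            # reduce bit rate
        if mos_degradation < 0.05 and avg_mos > 3.8:
            return min(len(AMR_MODES_KBPS) - 1, current_index + 1)
        return current_index

    if __name__ == "__main__":
        mode = 7                                         # start at 12.2 kbit/s
        for mos, deg in [(4.1, 0.00), (3.5, 0.40), (3.2, 0.50), (3.9, 0.02)]:
            mode = adapt_amr_mode(mode, mos, deg)
            print(f"MOS={mos:.1f} degradation={deg:.2f} -> {AMR_MODES_KBPS[mode]} kbit/s")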

Patent
22 Apr 2010
TL;DR: A multi-bus architecture within a video codec that discretely and efficiently transports video components within the codec is presented; the various buses are designed to specifically address unique characteristics of the video components or parameters being processed.
Abstract: Embodiments of the present invention relate to a multi-bus architecture within a video codec that discretely and efficiently transports video components within the codec. This multi-bus architecture provides a relatively more efficient transport mechanism because the various buses are designed to specifically address unique characteristics of the video components or parameters being processed within the codec.

Journal ArticleDOI
TL;DR: This work revisits an original concept of speech coding in which the signal is separated into a carrier modulated by the signal envelope; the result is a codec that does not rely on the linear speech production model but rather uses the well-accepted concept of frequency-selective auditory perception.
Abstract: We revisit an original concept of speech coding in which the signal is separated into the carrier modulated by the signal envelope. A recently developed technique, called frequency-domain linear prediction (FDLP), is applied for the efficient estimation of the envelope. The processing in the temporal domain allows for a straightforward emulation of forward temporal masking. This, combined with an efficient nonuniform sub-band decomposition and the application of noise shaping in the spectral domain instead of the temporal domain (a technique to suppress artifacts in tonal audio signals), yields a codec that does not rely on the linear speech production model but rather uses the well-accepted concept of frequency-selective auditory perception. As such, the codec is not only suitable for coding speech but also well suited for coding other important acoustic signals such as music and mixed content. The quality of the proposed codec at 66 kbps is evaluated using objective and subjective quality assessments. The evaluation indicates competitive performance with the MPEG codecs operating at similar bit rates.

Journal ArticleDOI
TL;DR: Objective and subjective experimental results confirm that the proposed algorithm achieves better speech quality; the estimation of the pitch lag when consecutive frames are lost and the recovery of the codebook gain for good frames after continuous bad frames are also discussed.

Proceedings ArticleDOI
15 Nov 2010
TL;DR: This paper presents the design of a video acquisition and compression codec system that takes the dual-core TMS320DM6446 chip as its core and Linux, which can be trimmed down and ported, as its operating system.
Abstract: This paper presents the design of a video acquisition and compression codec system, which takes the dual-core TMS320DM6446 chip as its core and Linux as its operating system; Linux was chosen because it can be trimmed down and ported. The video capture device driver V4L2 and the Codec Engine are introduced in detail, and the video compression and decompression functions are realized with the H.264 algorithm. Relevant experiments show that the codec algorithm is highly error-resilient and that the video is clear and reliable after compression and decoding. Moreover, the amount of video data is greatly reduced.

Proceedings ArticleDOI
01 Nov 2010
TL;DR: The result is that AVS-M audio performance is on average no worse than that of AMR-WB+; a fixed-point version of the AVS-M codec is also implemented on a DSP platform.
Abstract: The AVS-M audio standard, which targets wireless networks and mobile equipment, is being independently drawn up in China. Its framework is similar to that of AMR-WB+. The performance of the AVS-M audio core algorithms is analyzed in this paper. In order to analyze its complexity, a fixed-point version of the AVS-M codec is implemented on a DSP platform. Finally, a performance evaluation between AVS-M and AMR-WB+ is discussed. The result is that AVS-M audio performance is on average no worse than that of AMR-WB+.

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This paper proposes a low-complexity video codec based on two-dimensional Singular Value Decomposition (2D-SVD), which has higher coding efficiency than relevant existing low-complexity codecs and copes well with the packet loss that is unavoidable in error-prone transmission.
Abstract: In this paper, we propose a low-complexity video codec based on two-dimensional Singular Value Decomposition (2D-SVD). We exploit the common temporal characteristics of video without resorting to motion estimation. It has been demonstrated that this codec has higher coding efficiency than the relevant existing low-complexity codecs. Moreover, the proposed codec copes well with the packet loss that is unavoidable in error-prone transmission. It therefore offers advantages and good potential for wireless video applications such as mobile video calls and wireless surveillance.
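
The 2D-SVD transform at the heart of this codec can be sketched as follows: shared row and column bases are computed from the summed covariance matrices of a group of frames, and each frame is then represented by a small core matrix. Quantization, entropy coding, and the error-resilience mechanisms of the actual codec are omitted; the frame sizes and rank choices below are illustrative.

    import numpy as np

    def two_d_svd_encode(frames, k, l):
        """Shared row/column bases from the summed covariance matrices of a group
        of frames, plus a small k x l core matrix per frame (transform only)."""
        g_row = sum(a @ a.T for a in frames)
        g_col = sum(a.T @ a for a in frames)
        _, u = np.linalg.eigh(g_row)                    # eigenvectors, ascending order
        _, v = np.linalg.eigh(g_col)
        u, v = u[:, -k:], v[:, -l:]                     # keep the top-k / top-l bases
        cores = [u.T @ a @ v for a in frames]
        return u, v, cores

    def two_d_svd_decode(u, v, cores):
        return [u @ m @ v.T for m in cores]

    if __name__ == "__main__":
        rng = np.random.default_rng(5)
        scene = rng.standard_normal((72, 8)) @ rng.standard_normal((8, 88))  # low-rank "scene"
        frames = [scene + 0.05 * rng.standard_normal((72, 88)) for _ in range(8)]
        u, v, cores = two_d_svd_encode(frames, k=16, l=16)
        recon = two_d_svd_decode(u, v, cores)
        err = np.mean([np.mean((a - b) ** 2) for a, b in zip(frames, recon)])
        print("mean squared reconstruction error:", round(float(err), 4))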