
Showing papers on "Adaptive Multi-Rate audio codec" published in 2008


Proceedings ArticleDOI
12 May 2008
TL;DR: The key element of the method is an alternative search strategy for the ACELP codebook which allows for joint data hiding and speech coding and it is pointed out that the method can also be exploited to reduce the codec bit rate.
Abstract: A new method for hiding digital data in the bitstream of an ACELP speech codec is proposed in this paper. The key element of our method is an alternative search strategy for the ACELP codebook which allows for joint data hiding and speech coding. The concept has been applied, by way of example, to the AMR speech codec (12.2 kbit/s mode), and it is shown that steganographic data can be reliably transmitted at a rate of up to 2 kbit/s, both with a negligible effect on the subjective quality of the coded speech and with reasonable computational complexity. Apart from data hiding, it is further pointed out that our method can also be exploited to reduce the codec bit rate.

70 citations
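
The idea of joint data hiding and coding can be illustrated with a toy example. The sketch below is not the authors' ACELP search strategy; it assumes a generic scheme in which the hidden bit selects a sub-codebook (here, the indices with matching parity), so that the transmitted index itself carries the bit. The names `search_with_hidden_bit` and `extract_hidden_bit` and the tiny codebook are hypothetical.

```python
def search_with_hidden_bit(target, codebook, hidden_bit):
    """Return the index of the closest codevector whose index parity
    equals hidden_bit (assumed embedding rule, for illustration only)."""
    best_index, best_error = None, float("inf")
    for index, codevector in enumerate(codebook):
        if index % 2 != hidden_bit:
            continue  # this entry would encode the other bit value
        error = sum((t - c) ** 2 for t, c in zip(target, codevector))
        if error < best_error:
            best_index, best_error = index, error
    return best_index


def extract_hidden_bit(index):
    """The receiver recovers the hidden bit from the index parity."""
    return index % 2


# Hypothetical 2-dimensional codebook with near-duplicate neighbours.
codebook = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
target = (1.0, 0.0)
index_for_0 = search_with_hidden_bit(target, codebook, 0)
index_for_1 = search_with_hidden_bit(target, codebook, 1)
```

In this toy case, hiding a 1 forces the encoder to pick a slightly worse codevector, which is the quality cost that a constrained search has to keep small.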


Journal ArticleDOI
TL;DR: A backward-compatible multichannel audio codec that unifies the above-mentioned conditions: backward compatibility and exploitation of both signal and perceptual redundancies and combines a high audio quality and a low parameter bit rate.
Abstract: We propose in this paper a backward-compatible multichannel audio codec. This codec represents a multichannel audio input signal by a down mix and parametric data. In order to enable backward compatibility, it is necessary to have the possibility of exerting control over the down-mixing procedure. At the same time, in order to achieve a high coding efficiency, both signal and perceptual redundancies should be exploited. In this paper, we describe a codec that unifies the above-mentioned conditions: backward compatibility and exploitation of both signal and perceptual redundancies. The codec combines a high audio quality and a low parameter bit rate. Moreover, its design is flexible, examples of which are the scalability of the audio quality to (in principle) transparency and the possibility to preserve the correlation structure of the original input signals by using synthetic signals. A stereo backward compatible version of the proposed codec is used as a component of the recently standardized MPEG Surround multichannel audio codec.

45 citations


Proceedings ArticleDOI
25 Aug 2008
TL;DR: ITU-T Embedded Variable Bit-Rate (EV-VBR) codec is presented, being standardized by Question 9 of Study Group 16 (Q9/16) as recommendation G.718, robust to significant rates of frame erasures or packet losses and several technologies are used to encode the MDCT coefficients for best performance both for speech and music.
Abstract: This paper presents ITU-T Embedded Variable Bit-Rate (EV-VBR) codec being standardized by Question 9 of Study Group 16 (Q9/16) as recommendation G.718. The codec provides a scalable solution for compression of 16 kHz sampled speech and audio signals at rates between 8 kbit/s and 32 kbit/s, robust to significant rates of frame erasures or packet losses. It comprises 5 layers where higher layer bitstreams can be discarded without affecting the lower layer decoding. The core layer takes advantage of signal-classification based CELP encoding. The second layer reduces the coding error from the first layer by means of additional pitch contribution and another algebraic codebook. The higher layers encode the weighted error signal from lower layers using MDCT transform coding. Several technologies are used to encode the MDCT coefficients for best performance both for speech and music. The codec performance is demonstrated with selected results from ITU-T Characterization test.

42 citations
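
The embedded property described above (higher layers can be dropped without affecting lower-layer decoding) can be sketched as a simple truncation rule. The abstract gives only the 8 and 32 kbit/s end points and the number of layers; the intermediate cumulative rates below and the function names are assumptions made for illustration.

```python
# Cumulative bit rate after each of the 5 layers, in kbit/s.
# Only 8 and 32 come from the abstract; 12, 16 and 24 are assumed.
CUMULATIVE_RATES_KBPS = [8, 12, 16, 24, 32]


def layers_for_rate(available_kbps, cumulative_rates=CUMULATIVE_RATES_KBPS):
    """Number of layers whose cumulative rate fits the available rate."""
    return sum(1 for rate in cumulative_rates if rate <= available_kbps)


def truncate_frame(layers, available_kbps):
    """Keep the lowest layers that fit; drop the rest.
    `layers` is a list of per-layer payloads ordered from core upward."""
    return layers[:layers_for_rate(available_kbps)]


frame = ["L1", "L2", "L3", "L4", "L5"]
kept = truncate_frame(frame, 12)
```

A network node can apply such a rule without re-encoding, because each kept prefix of the frame is still decodable.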


Proceedings ArticleDOI
12 May 2008
TL;DR: The relationship to Ambisonic B-format signals is described and alternative approaches that derive a stereo or mono-downmix signal based on S3AC are presented and evaluated.
Abstract: Spatially squeezed surround audio coding (S3AC) has been previously shown to provide efficient coding with perceptually accurate soundfield reconstruction when applied to ITU 5.1 multichannel audio. This paper investigates the application of S3AC to the coding of Ambisonic audio recordings. Traditional ambisonics achieve compression and backward compatibility through the use of the UHJ matrixing approach to obtain a stereo signal. In this paper the relationship to Ambisonic B-format signals is described and alternative approaches that derive a stereo or mono-downmix signal based on S3AC are presented and evaluated. The mono-downmix approach utilizes side information consisting of spatial cues that are quantized based on novel source localization listening experiments. Objective and subjective tests demonstrate significant improvements in the localization of sound sources resulting from decoding the compressed B-format signals to a 5.1 speaker playback.

36 citations


Proceedings ArticleDOI
12 May 2008
TL;DR: This paper proposes a unified speech/audio codec by adopting a single channel harmonic separation module as a pre-processor that adopts two state-of-the-art international standards (AMR-WB and HE-AAC) to provide an interoperability option.
Abstract: This paper proposes a unified speech/audio codec by adopting a single channel harmonic separation module as a pre-processor. A modulation frequency analysis method is used for harmonic separation, and the separated components are first encoded by an appropriate codec, e.g. speech codec. The error between input and the encoded signal is re-encoded by another codec. Though any type of codec can be used for the purpose, we adopt two state-of-the-art international standards (AMR-WB and HE-AAC) to provide an interoperability option. The amount of allocated bits to each stage is controlled by a power ratio of separated harmonic components to input signal. Subjective listening tests verify the consistency of the proposed method in speech, music and mixed signal inputs.

34 citations


Proceedings ArticleDOI
01 Nov 2008
TL;DR: This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into whispered signal to synthesize normal-sounding speech through the CELP codec.
Abstract: In the following paper, a method for the real-time conversion of whispers to normal phonated speech through a code excited linear prediction analysis-by-synthesis codec is discussed. This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into whispered signal to synthesize normal-sounding speech through the CELP codec. Furthermore, since restoring pitch to whispered speech requires some considerations of quality and accuracy, spectral enhancements are required in terms of formant shifting (LSPs modification) and pitch injection based on voiced/unvoiced decision. Spectral shifting is accomplished through line-spectral pair adjustment. Implementing such methods by using the popular CELP codec allows integration of the technique with any modern speech applications and devices. Subjective testing results are presented to determine the effectiveness of the technique.

31 citations


Journal ArticleDOI
TL;DR: This letter presents two approaches to accelerate the coding operation substantially, component-level parallelism and pipeline techniques capable of processing high-bitrate video data in a macroblock (MB)-level pipelined codec architecture and a specific part of the coding process, i.e., residual block coding.
Abstract: In H.264/AVC and the variants, the coding of context-based adaptive variable length codes (CAVLC) requires demanding operations, particularly at high bitrates such as 100 Mbps. This letter presents two approaches to accelerate the coding operation substantially. Firstly, in the architectural aspect, we propose component-level parallelism and pipeline techniques capable of processing high-bitrate video data in a macroblock (MB)-level pipelined codec architecture. The second approach focuses on a specific part of the coding process, i.e., the residual block coding, in which the coefficient levels are coded without using look-up tables so we minimize the pertaining logic depth in the critical path, and we achieve higher operating clock frequencies. Additionally, two coefficient levels are processed in parallel by exploiting a look-ahead technique. The resulting architecture, merged in the MB-level pipelined codec system, is capable of coding up to 100 Mbps bitstreams in real-time, thus accommodating the real-time encoding of 1080p@60 Hz video.

25 citations


Proceedings ArticleDOI
12 May 2008
TL;DR: The Q9/16 codec is an embedded codec comprising 5 layers where higher layer bitstreams can be discarded without affecting the decoding of the lower layers, and has been designed with the primary objective of a high-performance wideband speech coding for error-prone telecommunications channels, without compromising the quality for narrowband/wideband speech or wideband music signals.
Abstract: We present the Q.EV-VBR winning candidate codec recently selected by Question 9 of Study Group 16 (Q9/16) of ITU-T as a baseline for the development of a scalable solution for wideband speech and audio compression at rates between 8 kb/s and 32 kb/s. The Q9/16 codec is an embedded codec comprising 5 layers where higher layer bitstreams can be discarded without affecting the decoding of the lower layers. The two lower layers are based on the CELP technology where the core layer takes advantage of signal classification based encoding. The higher layers encode the weighted error signal from lower layers using overlap-add transform coding. The codec has been designed with the primary objective of a high-performance wideband speech coding for error-prone telecommunications channels, without compromising the quality for narrowband/wideband speech or wideband music signals. The codec performance is demonstrated with selected test results.

25 citations


Patent
05 Aug 2008
TL;DR: In this article, an adaptive multimedia system for providing multimedia contents and a codec to a user terminal, and a method thereof, is presented, which includes a media server controller that receives profile information from an open codec player of the user terminal.
Abstract: Disclosed are an adaptive multimedia system for providing multimedia contents and a codec to a user terminal, and a method thereof. The adaptive multimedia system includes: a media server controller that receives profile information from an open codec player of the user terminal, and when a codec for decoding the multimedia contents does not exist in the user terminal, transmits a control message to allow the multimedia contents and the decoding codec to be transmitted together; and at least one transmission frame generator that encodes the multimedia contents through a transcoder and an encoding module according to the control message transmitted from the media server controller, generates a transmission frame including the encoded multimedia contents and the decoding codec, and transmits the generated transmission frame to the open codec player.

24 citations


Journal ArticleDOI
TL;DR: The numerical results show that the proposed algorithm can increase the maximum supportable number of voice users by 26% compared to the conventional extended real-time polling service (ertPS), and reduce the waste of uplink bandwidth considering the characteristics of AMR speech codec.
Abstract: This letter proposes an efficient uplink scheduling algorithm for voice over Internet protocol (VoIP) services with adaptive multi-rate (AMR) speech codec in IEEE 802.16e/m systems. The proposed scheduling algorithm adopts the random access scheme during silent-period to reduce the waste of uplink bandwidth considering the characteristics of AMR speech codec. The numerical results show that the proposed algorithm can increase the maximum supportable number of voice users by 26% compared to the conventional extended real-time polling service (ertPS).

22 citations


Patent
25 Mar 2008
TL;DR: In this paper, a scalable audio codec uses perceptual transform coding to encode the base layer and the residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio.
Abstract: A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.

Journal ArticleDOI
TL;DR: A parametric multichannel audio codec dedicated to coding signals consisting of a dense series of transient-type events, which finds the new codec to have a significantly higher audio quality than the MPEG Surround codec for the two multichannel applause signals under test.
Abstract: We develop a parametric multichannel audio codec dedicated to coding signals consisting of a dense series of transient-type events. These signals of which applause is a typical example are known to be problematic for such audio codecs. The codec design is based on preservation of both timbre and transient-type event density. It combines a very low complexity and a low parameter bit rate (0.2 kbps). In a formal listening test, we compared the proposed codec to the recently standardised MPEG Surround multichannel codec, with an associated parameter bit rate of 9 kbps. We found the new codec to have a significantly higher audio quality than the MPEG Surround codec for the two multichannel applause signals under test. Though this seems promising, the technique presented is not fully mature, for example, because issues related to integration of the proposed codec in the MPEG Surround codec were not addressed.

Patent
30 Dec 2008
TL;DR: In this paper, an apparatus and method of transmitting a circuit switched (CS) voice application via an enhanced dedicated channel (E-DCH), implemented in a wireless transmit/receive unit (WTRU), is described.
Abstract: An apparatus and method of transmitting a circuit switched (CS) voice application via an enhanced dedicated channel (E-DCH), implemented in a wireless transmit/receive unit (WTRU). The method includes receiving a grant; performing an E-TFC selection procedure based on the grant, wherein a number of bits that may be transmitted over an enhanced dedicated channel (E-DCH) is determined, determining an adaptive multi-rate (AMR) codec bit-rate based on the number of bits that may be transmitted over the E-DCH, generating AMR voice packets based on the determined AMR codec bit rate, and submitting the AMR voice packets to lower layers for transmission over the E-DCH.
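
The step of deriving an AMR bit rate from the number of bits available on the E-DCH can be sketched as a table lookup. The speech bits per 20 ms frame listed below are the AMR narrowband mode sizes; the selection rule (highest mode that fits) and the `overhead_bits` parameter are assumptions, since the abstract does not state how the mapping is made.

```python
# (bit rate in kbit/s, speech bits per 20 ms frame) for AMR narrowband.
AMR_NB_MODES = [
    (4.75, 95), (5.15, 103), (5.90, 118), (6.70, 134),
    (7.40, 148), (7.95, 159), (10.2, 204), (12.2, 244),
]


def select_amr_mode(available_bits, overhead_bits=0):
    """Return the highest (rate, frame_bits) mode that fits the budget,
    or None if even the lowest mode does not fit.
    overhead_bits is a hypothetical allowance for headers."""
    budget = available_bits - overhead_bits
    best = None
    for rate, frame_bits in AMR_NB_MODES:  # list is sorted by rate
        if frame_bits <= budget:
            best = (rate, frame_bits)
    return best
```

For example, a grant that allows 200 bits per frame would map to the 7.95 kbit/s mode under this rule, because the 10.2 kbit/s mode needs 204 bits.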

Patent
09 Jul 2008
TL;DR: In this paper, a method and apparatus for controlling voice quality in WLAN is provided, where the bit rate of wireless terminals is adjusted by collecting channel state information for determining a channel occupation time of the wireless terminals connected to an access point.
Abstract: A method and apparatus for controlling voice quality in WLAN is provided. The codec bit rate of wireless terminals is adjusted by collecting channel state information for determining a channel occupation time of wireless terminals connected to an access point, adjusting the codec bit rate of the wireless terminals based on channel occupancy of the wireless terminals with respect to total channel capacity, which is determined using the channel state information, and transmitting the adjusted codec bit rate to each of the wireless terminals.

Book ChapterDOI
01 Jan 2008
TL;DR: Low-bit-rate speech coding, at rates below 4 kb/s, is needed for both communication and voice storage applications and a number of different approaches for this modeling are related to the basic linear model of speech production, where an excitation signal drives a vocal-tract filter.
Abstract: Low-bit-rate speech coding, at rates below 4 kb/s, is needed for both communication and voice storage applications. At such low rates, full encoding of the speech waveform is not possible; therefore, low-rate coders rely instead on parametric models to represent only the most perceptually relevant aspects of speech. While there are a number of different approaches for this modeling, all can be related to the basic linear model of speech production, where an excitation signal drives a vocal-tract filter.

Patent
29 Oct 2008
TL;DR: A transcoding apparatus and method between two codecs, each including a deblocking filter, is described, in which the data decoded by the first codec may be unfiltered or adaptively deblocking-filtered and may be used as input when the second codec performs encoding or motion estimation.
Abstract: Disclosed are a transcoding apparatus and method between two codecs each including a deblocking filter. The transcoding method between first and second codecs each including a deblocking filter, may include decoding input data encoded according to the first codec, according to the first codec so as to generate decoded data; and encoding the decoded data according to the second codec. The decoded data may include data on which deblocking filtering is not performed by the first codec, or data on which deblocking filtering is adaptively performed by the first codec. The decoded data may further include data on which deblocking filtering is performed by the first codec. The decoded data may be used as input data when the second codec performs encoding and/or when the second codec performs motion estimation.

Proceedings ArticleDOI
12 May 2008
TL;DR: ITU-T has selected the candidate submitted by Ericsson, Nokia, Motorola, VoiceAge, and Texas Instruments as the baseline for the G.EV-VBR coding standard, an embedded scalable speech codec that uses state-of-the-art technology to provide the most efficient encoded speech available for various real-time applications.
Abstract: ITU-T has selected the candidate submitted by Ericsson, Nokia, Motorola, VoiceAge, and Texas Instruments as the baseline for the G.EV-VBR coding standard. G.EV-VBR is an embedded scalable speech codec that uses state-of-the-art technology to provide the most efficient encoded speech available for various real-time applications. EV-VBR encodes both narrowband (NB) and wideband (WB) speech signals starting at 8 kbps. Near perfect wideband representation is achieved at 32 kbps for all signal types. The bit stream is divided into five robust layers, providing sufficient granularity, in particular for VoIP applications. In addition, an extension to the codec will provide super-wideband and stereo capability by adding layers to the codec. Extensive listening tests were conducted during the ITU-T selection phase to support selection of the best-performing candidate. The selected EV-VBR candidate passed 69 of 70 required and 25 of 28 objective terms of reference.

Journal ArticleDOI
TL;DR: A new, simple-to-use jitter buffer algorithm is proposed as a front-end to conventional static or adaptive jitter buffer algorithms to provide improved performance, in terms of enhanced user-perceived speech quality and reduced end-to-end delay.
Abstract: Jitter buffer plays an important role in Voice over IP (VoIP) applications because it provides a key mechanism for achieving good speech quality to meet technical and commercial requirements. The main objective of this paper is to propose a new, simple-to-use jitter buffer algorithm as a front-end to conventional static or adaptive jitter buffer algorithms to provide improved performance, in terms of enhanced user-perceived speech quality and reduced end-to-end delay. Supported by signal processing features, the new algorithm, the so-called Play Late Algorithm, alters the playout delay inside a speech talkspurt without introducing unnecessary extra end-to-end delay. The results show that the new algorithm achieves the best performance under different network conditions when compared to conventional static and adaptive jitter buffer algorithms. The results reported here are based on live tests and emulated network conditions on real mobile phone prototypes. The mobile phone prototypes use AMR codec and support full IP/UDP/RTP stack with IPSec function in some of the tests. The method for perceived speech quality measurement is based on the ITU-T standard for speech quality evaluation (PESQ).

Proceedings ArticleDOI
05 Nov 2008
TL;DR: This paper presents a novel scheme to code audio signals at low bitrates which uses a traditional scalar quantization followed by entropy coding to code some portions of the spectrum and results in an audio codec which has been shown to be among the best audio codecs available at low bitrates.
Abstract: Audio coding at low bitrates typically suffers from artifacts caused by bandwidth truncation. In this paper we present a novel scheme to code audio signals at low bitrates which uses a traditional scalar quantization followed by entropy coding to code some portions of the spectrum (typically the lower portion). The other portions (typically the higher portions) of the spectrum are coded at a low bitrate using an adaptive gain shape vector quantizer where the codebook for vector quantization is formed by unmodified or modified versions of the portions of the spectrum which have already been coded. Fixed pre-trained codebooks are also available for use in certain cases. The use of such a scheme results in an audio codec which has been shown to be among the best audio codecs available at low bitrates. In addition, the decoder complexity of this audio codec is significantly lower than any other codec of equal quality at low bitrates.

01 Feb 2008
TL;DR: This document specifies real-time transport protocol (RTP) payload formats to be used for the EVRC wideband codec (EVRC-WB) and updates the media type registrations for EVRC-B codec.
Abstract: This document specifies real-time transport protocol (RTP) payload formats to be used for the EVRC wideband codec (EVRC-WB) and updates the media type registrations for EVRC-B codec. Several media type registrations are included for EVRC-WB RTP payload formats. In addition, a file format is specified for transport of EVRC-WB speech data in storage mode applications such as e-mail.

Proceedings ArticleDOI
25 Aug 2008
TL;DR: The purpose of this paper is to demonstrate how efficient noise masking can be applied at the encoder in a G.711-interoperable manner, and how the same noise masks can be extended at the decoder to one or more enhancement layers to implement a perceptually optimized multilayer codec.
Abstract: In the transition from narrowband to wideband speech communications, there is a need in some applications for a high quality wideband coding scheme interoperable with the ITU-T G.711 narrowband coding standard. This can be accomplished using a multi-layer coding scheme with a G.711 compatible core layer. For optimal wideband quality in the upper layers, this requires using full frequency range (50–4000 Hz instead of 300–3400 Hz) in the core layer. In this context, the 8-bit non-uniform PCM quantizer of the ITU-T G.711 standard can produce highly perceptible noise. The purpose of this paper is to demonstrate how efficient noise masking can be applied at the encoder in a G.711-interoperable manner, and how the same noise masking can be extended at the decoder to one or more enhancement layers to implement a perceptually optimized multilayer codec.

Patent
01 Sep 2008
TL;DR: In this article, a shadow codec is proposed to snoop the audio data and commands on the High Definition Audio (HDA) bus that are targeted to the conventional codec to generate a second audio output.
Abstract: Systems and methods for “shadowing” a target codec to provide additional features that are not available in the target codec. In one embodiment, an audio amplification system includes a High Definition Audio (HDA) bus, and an HDA controller, a conventional HDA codec and a shadow HDA codec coupled to the HDA bus. The conventional codec receives audio data and commands from the HDA controller via the bus and processes them to generate an output audio signal. The shadow codec snoops the audio data and commands on the HDA bus that are targeted to the conventional codec. The shadow codec processes the snooped audio data and commands to generate a second audio output. The shadow codec does not communicate with the HDA controller and is transparent to the controller. The shadow codec does not request enumeration from the HDA controller and does not receive an address from the HDA controller.

01 Aug 2008
TL;DR: This paper introduces the OpenCORE multimedia framework and associated optimized audio codecs which are a part of the Android platform, and shows how these components meet the challenging requirements for use in mobile devices.
Abstract: Audio and speech codecs such as MP3, AAC, and AMR are used extensively on mobile devices throughout the world. In the ideal case, such codecs rely on hardware acceleration. However, it is also very common to see software audio codecs running on the main application processor, which is often an ARM core processor. Such codecs must be memory efficient, processing cycle efficient, portable to multiple operating systems, robust to data loss, and must also have a modular interface. In this paper, we introduce the OpenCORE multimedia framework and associated optimized audio codecs which are a part of the Android platform. We show how these components meet the challenging requirements for use in mobile devices. The OpenCORE audio components are currently available from the Open Handset Alliance as part of the Android SDK, and the source code for these components is scheduled for release in late 2008. The components are thus freely available for use in mobile device projects, and for non-mobile projects as well.

Book ChapterDOI
Juin-Hwey Chen1, Jes Thyssen1
01 Jan 2008
TL;DR: This chapter gives an overview of many variations of the analysis-by-synthesis excitation coding paradigm as exemplified by various speech coding standards around the world.
Abstract: Since the early 1980s, advances in speech coding technologies have enabled speech coders to achieve bit-rate reductions of a factor of 4 to 8 while maintaining roughly the same high speech quality. One of the most important driving forces behind this feat is the so-called analysis-by-synthesis paradigm for coding the excitation signal of predictive speech coders. In this chapter, we give an overview of many variations of the analysis-by-synthesis excitation coding paradigm as exemplified by various speech coding standards around the world. We describe the variations of the same basic theme in the context of different coder structures where these techniques are employed. We also attempt to show the relationship between them in the form of a family tree. The goal of this chapter is to give the readers a big-picture understanding of the dominant types of analysis-by-synthesis excitation coding techniques for predictive speech coding.

Patent
Laurent Pilati1, Mickael Jougit1
23 Jan 2008
TL;DR: In this article, the size of a bit pool allocated for encoding is adapted in a manner that is dependent upon the audio content that is being encoded, such as speech or music, and decreased when the audio input signal represents background noise or silence.
Abstract: In a Bluetooth™ Sub-band Codec (SBC), the size of a bit pool allocated for encoding is adapted in a manner that is dependent upon the audio content that is being encoded. In one implementation, the size of the bit pool is increased during periods when the audio input signal represents an active audio signal, such as speech or music, and decreased when the audio input signal represents background noise or silence. This has the effect of increasing the bit rate (and thus the audio quality) in the presence of speech or music but decreasing the bit rate (and thus the audio quality) in the presence of background noise or silence. By adapting the size of the bit pool in this manner, the quality of the encoded bit-stream transmitted from the SBC encoder may be improved or power may be conserved depending upon the implementation.
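
The adaptation described above can be sketched as a per-frame decision between two bit pool sizes. The abstract does not say how activity is detected or which sizes are used; the energy threshold detector and the values 53 and 20 below are hypothetical placeholders.

```python
def frame_energy(samples):
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def choose_bitpool(samples, energy_threshold=1e-4,
                   active_bitpool=53, idle_bitpool=20):
    """Pick a larger bit pool for active audio and a smaller one for
    background noise or silence. All three defaults are assumptions."""
    if frame_energy(samples) >= energy_threshold:
        return active_bitpool
    return idle_bitpool


loud = choose_bitpool([0.5, -0.5, 0.5, -0.5])
quiet = choose_bitpool([0.0, 0.0, 0.0, 0.0])
```

A practical detector would likely smooth the decision over several frames to avoid switching the bit pool on every short pause.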

Patent
28 Nov 2008
TL;DR: In this article, an apparatus and method of improving the quality of a speech codec is presented, in which a first energy of a signal decoded by a low-band codec is calculated, and a second energy of a signal decoded by a low-band enhancement mode is calculated.
Abstract: An apparatus and method of improving the quality of a speech codec are provided. In the method, a first energy of a signal decoded by a low-band codec is calculated, and a second energy of a signal decoded by a low-band enhancement mode is calculated. Then, when the first energy is less than a first threshold value or less than a product of the second energy and a second threshold value, a size of the decoded signal is scaled. Accordingly, generation of a quantization error with respect to a silence segment is reduced.
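
The decision rule in the abstract can be written out directly. The abstract does not say which decoded signal is scaled or by how much; the sketch below assumes that the enhancement-mode output is multiplied by a fixed `gain`, and all parameter names are hypothetical.

```python
def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)


def scale_if_needed(core, enhanced, first_threshold, second_threshold, gain):
    """Scale the enhanced signal when the core (low-band codec) energy is
    below first_threshold, or below enhanced energy * second_threshold."""
    first_energy = energy(core)
    second_energy = energy(enhanced)
    if (first_energy < first_threshold
            or first_energy < second_energy * second_threshold):
        return [gain * x for x in enhanced]
    return list(enhanced)


# Near-silent core output: the enhanced signal is attenuated.
silent = scale_if_needed([0.01, 0.01], [0.1, 0.1], 0.001, 0.5, 0.5)
# Active speech: the enhanced signal passes through unchanged.
active = scale_if_needed([1.0, 1.0], [1.0, 1.0], 0.001, 0.5, 0.5)
```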

Proceedings ArticleDOI
05 May 2008
TL;DR: In this study the application of CELP in AMR is observed, and a MATLAB simulation is used to observe and calculate the errors that occur in the system.
Abstract: In cellular communication, the quality of the voice output at the destination depends on the channel condition. A bad channel produces many errors in the voice output and hence degrades voice quality. AMR is used to maintain voice quality under varying channel conditions. AMR provides several bit-rate modes, from low to high, and the mode is chosen according to the channel condition: low-bit-rate modes are used on a bad channel to leave more bits for channel coding, and high-bit-rate modes are used on a good channel. Various speech (source) coding techniques, such as CELP, ACELP, and RPE-LTP, are currently used in different applications. This study examines the application of CELP in AMR. A MATLAB simulation is used to observe and calculate the errors that occur in the system. The error produced in AMR using CELP does not differ significantly between modes: from the low bit rate (5.9 kbps) to the high bit rate (12.2 kbps), the error difference is less than 1%.
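
The trade-off between source bits and channel-coding bits is simple arithmetic once the gross channel rate is fixed. The abstract does not name the channel; the sketch below assumes a GSM full-rate traffic channel with a gross rate of 22.8 kbit/s, and the function name is hypothetical.

```python
# Assumed gross rate of a GSM full-rate traffic channel, in bit/s.
GROSS_RATE_BPS = 22800


def channel_coding_budget(source_rate_bps, gross_rate_bps=GROSS_RATE_BPS):
    """Bits per second left for channel coding after the speech bits."""
    if source_rate_bps > gross_rate_bps:
        raise ValueError("source rate exceeds the gross channel rate")
    return gross_rate_bps - source_rate_bps


budget_high_mode = channel_coding_budget(12200)  # 12.2 kbit/s mode
budget_low_mode = channel_coding_budget(5900)    # 5.9 kbit/s mode
```

Under this assumption the 5.9 kbit/s mode leaves 16.9 kbit/s for error protection, against 10.6 kbit/s for the 12.2 kbit/s mode.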

01 Jan 2008
TL;DR: Results show that this method exploits both intra-beat and inter-beat correlations of the ECG signals to achieve high compression ratios (CR) and a low percent root mean square difference (PRD).
Abstract: In this paper, we present a method using video codec technology to compress ECG signals. This method exploits both intra-beat and inter-beat correlations of the ECG signals to achieve high compression ratios (CR) and a low percent root mean square difference (PRD). Since ECG signals have both intra-beat and inter-beat redundancies like video signals, which have both intra-frame and inter-frame correlation, video codec technology can be used for ECG compression. In order to do this, some pre-process will be needed. The ECG signals should firstly be segmented and normalized to a sequence of beat cycles with the same length, and then these beat cycles can be treated as picture frames and compressed with video codec technology. We have used records from MIT-BIH arrhythmia database to evaluate our algorithm. Results show that, besides compression efficiently, this algorithm has the advantages of resolution adjustable, random access and flexibility for irregular period and QRS false detection.
inter-frame correlations, video codec technology can be used for ECG compression. For ECG signals, there is a little difference, so some pre-process will be needed: ECG signals should be segmented and period normalized to a sequence of beat cycles with the same size. Then these beat cycles can be treated as 'picture frames' and compressed with a video codec. In this work, we present a method using video codec technology to compress ECG signals. This method exploits both intra-beat and inter-beat correlations of the ECG signals to achieve high compression ratios (CR) and a low percent root mean square difference (PRD). Although video codec technology was developed to compress video signals, it can be used to compress other signals as well, and we illustrate how video codec technology can be used to compress ECG signals. In Section II, we take a brief overview of video codec technology. Section III presents the coding algorithm. Experimental results and comparisons with other algorithm are presented in Section IV. At last, we provide conclusions.
2. OVERVIEW OF VIDEO CODEC TECHNOLOGY
Representing video material in a digital form requires a large number of bits. The volume of data generated by digitizing a video signal is too large for most storage and transmission systems. This means that compression is essential for most digital video applications. Statistical analysis of video signals indi
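
The two figures of merit named in the abstract, CR and PRD, and the period normalization step can be sketched as follows. The PRD form below is the common definition without mean removal (some papers subtract the signal mean first), and linear interpolation is only one possible way to resample a beat; neither choice is confirmed by the text, and the function names are hypothetical.

```python
import math


def prd(original, reconstructed):
    """Percent root mean square difference between two signals."""
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum(x * x for x in original)
    return 100.0 * math.sqrt(num / den)


def compression_ratio(original_bits, compressed_bits):
    """Ratio of original size to compressed size."""
    return original_bits / compressed_bits


def normalize_beat(beat, length):
    """Resample one beat cycle to a fixed length by linear interpolation,
    so that every beat can be stacked as a row of a 'picture frame'."""
    if length == 1 or len(beat) == 1:
        return [beat[0]] * length
    out = []
    for i in range(length):
        pos = i * (len(beat) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(beat) - 1)
        frac = pos - lo
        out.append(beat[lo] * (1 - frac) + beat[hi] * frac)
    return out
```

The original beat length would have to be stored as side information so that the decoder can undo the normalization.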

Proceedings ArticleDOI
15 Aug 2008
TL;DR: A technique for hiding data in audio signals using subband amplitude modulation was evaluated by a computer simulation in terms of robustness with respect to cascaded disturbances of reverberations, background noises, and the speech codec.
Abstract: A technique for hiding data in audio signals using subband amplitude modulation was evaluated by a computer simulation in terms of robustness with respect to cascaded disturbances of reverberations, background noises, and the speech codec. Speech signals from 22 speakers and signals from 100 pieces of music in various genres served as the host audio data. Computer simulation showed that the speech and music signals with background noise and reverberations were able to transmit at least 80% of embedded data at 8 bps after encoding and decoding using the AMR speech codec at a bitrate of 12.2 kbps. Objective measurement of sound quality degradation induced by data hiding was performed by PESQ and PEAQ algorithms. The average PESQ score for speech signals approximately corresponded to the subjective evaluation of 'fair'. The average PEAQ score for the music signals was slightly degraded from the subjective difference grade of 'slightly annoying'.

Patent
17 Jul 2008
TL;DR: An audio codec and a BIST method adapted for the audio codec are provided in this article, where a first channel digital-to-analog converter (DAC) converts a test signal into an analog signal.
Abstract: An audio codec and a BIST method adapted for the audio codec are provided. The BIST method includes the following steps. A first channel digital-to-analog converter (DAC) of the audio codec converts a test signal into an analog signal. A first channel analog-to-digital converter (ADC) of the audio codec converts the analog signal into a digital signal. Use a second channel DAC of the audio codec and a second channel ADC of the audio codec to calculate the magnitudes of a plurality of spectral components of the DFT of the digital signal. Determine whether the audio codec passes the test according to the magnitudes of the spectral components.
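
The final pass/fail step can be sketched in software. In the patent the DFT magnitudes are computed with the second channel DAC and ADC; the sketch below simply evaluates the DFT numerically, and the bins and thresholds are hypothetical test limits.

```python
import cmath
import math


def dft_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
              for i, x in enumerate(samples))
    return abs(acc) / n


def bist_passes(samples, tone_bin, other_bins, min_tone, max_other):
    """Pass if the test tone is strong enough and every monitored
    harmonic or spur bin stays below max_other."""
    if dft_magnitude(samples, tone_bin) < min_tone:
        return False
    return all(dft_magnitude(samples, k) <= max_other for k in other_bins)


# A clean test tone in bin 4 of a 64-point record, and a clipped copy
# that stands in for a faulty DAC/ADC loopback.
clean = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
clipped = [max(-0.5, min(0.5, x)) for x in clean]
clean_ok = bist_passes(clean, 4, [8, 12], 0.4, 0.01)
clipped_ok = bist_passes(clipped, 4, [8, 12], 0.4, 0.01)
```

The clipped record fails because clipping lowers the fundamental and raises the third harmonic in bin 12.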