
Showing papers on "Discrete cosine transform published in 1996"


Journal ArticleDOI
TL;DR: This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods.
Abstract: Many algorithms have been proposed for detecting video shot boundaries and classifying shot and shot transition types. Few published studies compare available algorithms, and those that do have looked at a limited range of test material. This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods. The performance and ease of selecting good thresholds for these algorithms are evaluated based on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application. © 1996 SPIE and IS&T.

634 citations
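The histogram family of methods covered by this comparison can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code; the bin count and threshold are arbitrary choices, and real systems tune the threshold for the recall/precision trade-off the paper discusses.

```python
import numpy as np

def histogram_shot_boundaries(frames, bins=16, threshold=0.5):
    """Flag a shot boundary wherever the normalized gray-level histogram
    difference between consecutive frames exceeds a threshold."""
    boundaries = []
    prev_hist = None
    for i, frame in enumerate(frames):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
        hist = hist / hist.sum()
        if prev_hist is not None:
            # L1 distance between successive histograms
            if np.abs(hist - prev_hist).sum() > threshold:
                boundaries.append(i)
        prev_hist = hist
    return boundaries

# Two dark frames followed by two bright frames -> cut detected at frame 2
rng = np.random.default_rng(0)
dark = rng.integers(0, 64, size=(2, 32, 32))
bright = rng.integers(192, 256, size=(2, 32, 32))
frames = np.concatenate([dark, bright])
print(histogram_shot_boundaries(frames))  # [2]
```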


Journal ArticleDOI
01 Feb 1996
TL;DR: This two-dimensional 8×8 discrete cosine transform (DCT) core processor for portable multimedia equipment with HDTV resolution, fabricated in a 0.3-µm CMOS triple-well double-metal technology, operates at 150 MHz from a 0.9-V power supply and consumes 10 mW, only 2% of the power dissipation of a previous 3.3-V DCT.
Abstract: A 4 mm², two-dimensional (2-D) 8×8 discrete cosine transform (DCT) core processor for HDTV-resolution video compression/decompression in a 0.3-µm CMOS triple-well, double-metal technology operates at 150 MHz from a 0.9-V power supply and consumes 10 mW, only 2% of the power dissipation of a previous 3.3-V design. Circuit techniques for dynamically varying the threshold voltage (VT scheme) are introduced to reduce active power dissipation with negligible overhead in speed, standby power dissipation, and chip area. A way to explore the VDD-Vth design space is also studied.

523 citations


Journal ArticleDOI
TL;DR: In this article, the spectral representation of the stochastic field is used to obtain the mean value, autocorrelation function, and power spectral density function of a multi-dimensional, homogeneous Gaussian field.
Abstract: The subject of this paper is the simulation of multi-dimensional, homogeneous, Gaussian stochastic fields using the spectral representation method. Following this methodology, sample functions of the stochastic field can be generated using a cosine series formula. These sample functions accurately reflect the prescribed probabilistic characteristics of the stochastic field when the number of terms in the cosine series is large. The ensemble-averaged power spectral density or autocorrelation function approaches the corresponding target function as the sample size increases. In addition, the generated sample functions possess ergodic characteristics in the sense that the spatially-averaged mean value, autocorrelation function and power spectral density function are identical with the corresponding targets, when the averaging takes place over the multi-dimensional domain associated with the fundamental period of the cosine series. Another property of the simulated stochastic field is that it is asymptotically Gaussian as the number of terms in the cosine series approaches infinity. The most important feature of the method is that the cosine series formula can be numerically computed very efficiently using the Fast Fourier Transform technique. The main area of application of this method is the Monte Carlo solution of stochastic problems in structural engineering, engineering mechanics and physics. Specifically, the method has been applied to problems involving random loading (random vibration theory) and random material and geometric properties (response variability due to system stochasticity).

421 citations
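The cosine-series formula at the heart of the spectral representation method is compact enough to sketch in one dimension. The target spectrum and all numeric choices below are illustrative; the paper's formulation is multi-dimensional and FFT-accelerated, while this direct evaluation only demonstrates the ergodicity property: over one fundamental period of the cosine series, the spatial variance matches the target for any realization of the random phases.

```python
import numpy as np

def simulate_field(S, dk, N, x, rng):
    """One sample function of a 1-D homogeneous Gaussian field via the
    spectral-representation cosine series with random phases."""
    kappa = np.arange(1, N + 1) * dk            # discrete frequencies k_n
    amp = np.sqrt(2.0 * S(kappa) * dk)          # A_n = sqrt(2 S(k_n) dk)
    phi = rng.uniform(0.0, 2.0 * np.pi, N)      # independent random phases
    # f(x) = sqrt(2) * sum_n A_n cos(k_n x + phi_n)
    return np.sqrt(2.0) * (amp * np.cos(np.outer(x, kappa) + phi)).sum(axis=1)

S = lambda k: np.exp(-k ** 2)                   # illustrative target spectrum
dk, N = 0.0625, 64
rng = np.random.default_rng(1)
# Sample exactly one fundamental period 2*pi/dk of the cosine series
x = np.linspace(0.0, 2 * np.pi / dk, 4096, endpoint=False)
f = simulate_field(S, dk, N, x, rng)
# Spatial variance over the period equals sum(2 S(k_n) dk) -- ergodicity
target = (2 * S(np.arange(1, N + 1) * dk) * dk).sum()
print(abs(f.var() - target) < 1e-9)  # True
```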


Patent
13 Sep 1996
TL;DR: In this paper, a digital watermarking method and apparatus allows for the watermarking of a digital video signal in compressed form, thereby allowing watermarking of a pre-compressed video sequence without requiring decoding and re-coding of the signal.
Abstract: A digital watermarking method and apparatus allows for the watermarking of a digital video signal in a compressed form, thereby allowing watermarking of a pre-compressed video sequence without requiring the decoding and re-coding of the signal. The watermark signal is a sequence of information bits which has been modulated by a pseudo-random noise sequence to spread it in the frequency domain. The video signal is transform coded, preferably with a discrete cosine transform, and a watermark signal, which has been transform coded using the same type of transform, is added to the coded video signal. The system also includes bitstream control to prevent an increase in the bit rate of the video signal. This allows the system to be used with transmission channels having strict bit rate constraints. For each transform coefficient of the video signal, the number of bits necessary to encode the watermarked coefficient is compared to the number of bits necessary to encode the unwatermarked coefficient. If more bits are required to transmit a watermarked coefficient than to transmit the corresponding unwatermarked coefficient, the watermarked coefficient is not output, and the unwatermarked coefficient is output in its place. When watermarking interframe coded data, a drift compensation signal may be used to compensate for the accumulating variations in the decoded video signal stored at the receiver. The system may also include an encryption/decryption capability, with the watermarking apparatus located at either the transmitting or receiving end of the transmission channel.

336 citations
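The bitstream-control rule, comparing the bit cost of the watermarked and unwatermarked coefficient and falling back to the original whenever watermarking would increase the rate, reduces to a few lines. The `bit_cost` function below is a toy stand-in for a real entropy coder's code lengths, not the patent's coder:

```python
def bit_cost(c):
    """Toy stand-in for an entropy coder's cost: bits in |c| plus a sign bit."""
    return int(abs(c)).bit_length() + 1

def embed_with_rate_control(coeffs, watermark):
    """Add the watermark per transform coefficient, but keep the original
    coefficient whenever the watermarked one needs more bits to encode."""
    out = []
    for c, w in zip(coeffs, watermark):
        out.append(c + w if bit_cost(c + w) <= bit_cost(c) else c)
    return out

coeffs    = [5, -3, 8, 0, 12]
watermark = [1, -1, 1, 1, -5]
# -3 and 0 stay unwatermarked: their marked versions would cost extra bits
print(embed_with_rate_control(coeffs, watermark))  # [6, -3, 9, 0, 7]
```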


Proceedings ArticleDOI
16 Sep 1996
TL;DR: The algorithms proposed select certain blocks in the image based on a Gaussian network classifier such that their discrete cosine transform (DCT) coefficients fulfil a constraint imposed by the watermark code.
Abstract: Watermarking algorithms are used for image copyright protection. The algorithms proposed select certain blocks in the image based on a Gaussian network classifier. The pixel values of the selected blocks are modified such that their discrete cosine transform (DCT) coefficients fulfil a constraint imposed by the watermark code. Two different constraints are considered. The first approach consists of embedding a linear constraint among selected DCT coefficients and the second one defines circular detection regions in the DCT domain. A rule for generating the DCT parameters of distinct watermarks is provided. The watermarks embedded by the proposed algorithms are resistant to JPEG compression.

283 citations


Proceedings ArticleDOI
TL;DR: This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods.
Abstract: Many algorithms have been proposed for detecting video shot boundaries and classifying shot and shot transition types. Few published studies compare available algorithms, and those that do have looked at a limited range of test material. This paper presents a comparison of several shot boundary detection and classification techniques and their variations including histograms, discrete cosine transform, motion vector, and block matching methods. The performance and ease of selecting good thresholds for these algorithms are evaluated based on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application. © (1996) COPYRIGHT SPIE--The International Society for Optical Engineering.

273 citations


Journal ArticleDOI
TL;DR: The error-resilient entropy code (EREC) is introduced as a method for adapting existing schemes to give increased resilience to random and burst errors while maintaining high compression.
Abstract: Many source and data compression schemes work by splitting the input signal into blocks and producing variable-length coded data for each block. If these variable-length blocks are transmitted consecutively, then the resulting coder is highly sensitive to channel errors. Synchronization code words are often used to provide occasional resynchronization at the expense of some added redundant information. This paper introduces the error-resilient entropy code (EREC) as a method for adapting existing schemes to give increased resilience to random and burst errors while maintaining high compression. The EREC has been designed to exhibit graceful degradation with worsening channel conditions. The EREC is applicable to many problems and is particularly effective when the more important information is transmitted near the start of each variable-length block and is not dependent on following data. The EREC has been applied to both still image and video compression schemes, using the discrete cosine transform (DCT) and variable-length coding. The results have been compared to schemes using synchronization code words, and a large improvement in performance for noisy channels has been observed.

271 citations
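A stripped-down sketch of the EREC slot-filling idea follows: variable-length blocks each get a fixed-size slot, and leftover bits migrate to slots with spare space found at increasing offsets. This is illustrative only; the real EREC works on bits, uses a pseudo-random offset sequence shared with the decoder, and the decoder reverses the same search to recover the blocks.

```python
def erec_pack(blocks, slot_len):
    """Simplified EREC: place variable-length blocks into equal-size slots.
    Stage 0 fills each block's own slot; later stages push leftover data
    into slots (i + offset) mod n that still have spare space."""
    n = len(blocks)
    assert sum(map(len, blocks)) <= n * slot_len, "total data must fit"
    slots = [b[:slot_len] for b in blocks]   # stage 0: own slot
    left = [b[slot_len:] for b in blocks]    # unplaced tails
    for off in range(1, n):                  # deterministic offset search
        for i in range(n):
            j = (i + off) % n
            spare = slot_len - len(slots[j])
            if left[i] and spare > 0:
                slots[j] += left[i][:spare]
                left[i] = left[i][spare:]
    return slots

# Strings stand in for variable-length coded bitstreams
blocks = ["aaaaaa", "bb", "cccc", "d"]
slots = erec_pack(blocks, 4)
print(slots)  # ['aaaa', 'bbaa', 'cccc', 'd']
```

Because every slot boundary is at a known fixed position, a channel error corrupts at most the tail ends of blocks rather than desynchronizing everything downstream, which is the graceful-degradation property the abstract describes.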


Journal ArticleDOI
TL;DR: The results indicate that the FFT and DCT computations of oxygen saturation were as accurate, without averaging, as the weighted moving average (WMA) algorithms currently in use, and directly indicate when erroneous calculations occur.

240 citations


Journal ArticleDOI
TL;DR: A class of lapped orthogonal transforms with extended overlap (GenLOTs) is developed as a subclass of the general class of LPPUFBs, including a method to process finite-length signals.
Abstract: The general factorization of a linear-phase paraunitary filter bank (LPPUFB) is revisited. From this new perspective, a class of lapped orthogonal transforms with extended overlap (generalized linear-phase lapped orthogonal transforms (GenLOTs)) is developed as a subclass of the general class of LPPUFB. In this formulation, the discrete cosine transform (DCT) is the order-1 GenLOT, the lapped orthogonal transform is the order-2 GenLOT, and so on, for any filter length that is an integer multiple of the block size. The GenLOTs are based on the DCT and have fast implementation algorithms. The implementation of GenLOTs is explained, including the method to process finite-length signals. The degrees of freedom in the design of GenLOTs are described, and design examples are presented along with image compression tests.

235 citations


Journal ArticleDOI
TL;DR: This work points out that the wavelet transform is just one member of a family of linear transformations and that the discrete cosine transform (DCT) can also be coupled with an embedded zerotree quantizer, and presents an image coder that outperforms any other DCT-based coder published in the literature.
Abstract: Since Shapiro (see ibid., vol.41, no.12, p. 445, 1993) published his work on embedded zerotree wavelet (EZW) image coding, there have been increased research activities in image coding centered around wavelets. We first point out that the wavelet transform is just one member in a family of linear transformations, and the discrete cosine transform (DCT) can also be coupled with an embedded zerotree quantizer. We then present such an image coder that outperforms any other DCT-based coder published in the literature, including that of the Joint Photographers Expert Group (JPEG). Moreover, our DCT-based embedded image coder gives higher peak signal-to-noise ratios (PSNR) than the quoted results of Shapiro's EZW coder.

225 citations


Journal ArticleDOI
TL;DR: In this article, a cosine transform between the order and angle of the Chebyshev operator is identified, and the cosine transformation is considered as an evolution state in the order domain, analogous to a time dependent wave packet.
Abstract: A cosine transform between the order and angle of the Chebyshev operator is identified. Because the order and angle form a conjugate pair similar to energy and time, the Chebyshev state can be considered as a cosine-type evolution state in the order domain, analogous to a time-dependent wave packet. The order/angle formulation is analytically equivalent to the time/energy formulation, but the former may have some numerical advantages in certain applications. This is illustrated by examining the spectral method and the filter-diagonalization method in both formulations.

Proceedings ArticleDOI
TL;DR: This paper considers the detection of areas of interest and edges in images compressed using the discrete cosine transform (DCT) and shows how a measure based on certain DCT coefficients of a block can provide an indication of underlying activity.
Abstract: This paper examines the issue of direct extraction of low level features from compressed images. Specifically, we consider the detection of areas of interest and edges in images compressed using the discrete cosine transform (DCT). For interest areas, we show how a measure based on certain DCT coefficients of a block can provide an indication of underlying activity. For edges, we show using an ideal edge model how the relative values of different DCT coefficients of a block can be used to estimate the strength and orientation of an edge. Our experimental results indicate that coarse edge information from compressed images can be extracted up to 20 times faster than conventional edge detectors.
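The activity measure described, certain DCT coefficients of a block indicating underlying busyness, can be illustrated with the AC energy of an orthonormal 8×8 DCT. The measure and thresholds below are illustrative, not the paper's exact definitions:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block, in matrix form."""
    N = block.shape[0]
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0] /= np.sqrt(2.0)            # DC row scaling for orthonormality
    return C @ block @ C.T

def activity(block):
    """AC energy of the DCT block: total energy minus the DC term.
    A flat block scores ~0; edges and texture score high."""
    d = dct2(block.astype(float))
    return (d ** 2).sum() - d[0, 0] ** 2

flat = np.full((8, 8), 100.0)                                  # uniform block
edge = np.hstack([np.zeros((8, 4)), np.full((8, 4), 200.0)])   # vertical edge
print(activity(flat) < 1e-6, activity(edge) > 1e4)  # True True
```

For the edge case, the AC energy also concentrates in the first row of coefficients (horizontal frequencies), which is the kind of asymmetry the paper exploits to estimate edge orientation.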

Journal ArticleDOI
TL;DR: This work addresses the problem of retrieving images from a large database using an image as a query, specifically aimed at databases that store images in JPEG format, and works in the compressed domain to create index keys.
Abstract: We address the problem of retrieving images from a large database using an image as a query. The method is specifically aimed at databases that store images in JPEG format, and works in the compressed domain to create index keys. A key is generated for each image in the database and is matched with the key generated for the query image. The keys are independent of the size of the image. Images that have similar keys are assumed to be similar, but there is no semantic meaning to the similarity.

ReportDOI
01 Jun 1996
TL;DR: The data embedding method applies to host data compressed with transform, or 'lossy', compression algorithms, for example ones based on discrete cosine transform and wavelet functions.
Abstract: Data embedding is a new steganographic method for combining digital information sets. This paper describes the data embedding method and gives examples of its application using software written in the C programming language. Sandford and Handel produced a computer program (BMPEMBED, Ver. 1.51, written for the IBM PC/AT or compatible, MS/DOS Ver. 3.3 or later) that implements data embedding in an application for digital imagery. Information is embedded into, and extracted from, Truecolor or color-palette images in Microsoft® bitmap (.BMP) format. Hiding data in the noise component of a host, by means of an algorithm that modifies or replaces the noise bits, is termed 'steganography.' Data embedding differs markedly from conventional steganography, because it uses the noise component of the host to insert information with few or no modifications to the host data values or their statistical properties. Consequently, the entropy of the host data is affected little by using data embedding to add information. The data embedding method applies to host data compressed with transform, or 'lossy', compression algorithms, for example ones based on discrete cosine transform and wavelet functions. Analysis of the host noise generates a key required for embedding and extracting the auxiliary data from the combined data. The key is stored easily in the combined data. Images without the key cannot be processed to extract the embedded information. To provide security for the embedded data, one can remove the key from the combined data and manage it separately. The image key can be encrypted and stored in the combined data or transmitted separately as a ciphertext much smaller in size than the embedded data. The key size is typically ten to one hundred bytes, and it is derived from the data by an analysis algorithm.
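For contrast with the conventional steganography the abstract mentions, here is a minimal LSB-style bit-hiding sketch. Note this is the noise-bit *replacement* approach that the data embedding method distinguishes itself from, not the method itself; sample values are invented:

```python
def embed(host, bits):
    """Replace the least-significant bit of each host sample with a
    payload bit (classic LSB steganography)."""
    return [(h & ~1) | b for h, b in zip(host, bits)] + host[len(bits):]

def extract(stego, n):
    """Read the first n payload bits back out of the LSBs."""
    return [s & 1 for s in stego[:n]]

host = [200, 73, 15, 88, 129]     # e.g. pixel values
bits = [1, 0, 1]                  # payload bits to hide
stego = embed(host, bits)
print(stego, extract(stego, 3))   # [201, 72, 15, 88, 129] [1, 0, 1]
```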

Patent
Hiroki Fukuoka, Tetsuya Hashimoto
21 Feb 1996
TL;DR: In this paper, a digital electronic still camera has an image pickup device to pick up an image to be photographed and the picked-up image data is compressed by a data compressing unit using the two-dimensional discrete cosine transformation (DCT), the optimum quantization table, and the Huffman coding.
Abstract: A digital electronic still camera has an image pickup device to pick up an image to be photographed. The picked-up image data is compressed by a data compressing unit using the two-dimensional discrete cosine transformation (DCT), the optimum quantization table, and the Huffman coding. The optimum quantization table is selected by comparing a characteristic parameter of the image data with preliminarily obtained values for respective image kinds. The two-dimensional DCT provides transformation coefficients of the image, which are linearly quantized with reference to the optimum quantization table. The quantized data is coded by the Huffman coding method to have a minimum length. The coded data is sent to a recording device, which records the coded data in a recording medium such as a memory card. Fuzzy control theory may be applied to determine whether the image corresponds to a document, a portrait, or a landscape. The image attribute may also be set manually, for example, on an input device.
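The linear quantization step between the 2-D DCT and the Huffman stage can be sketched as follows. The flat quantization table is hypothetical; a real camera, as the patent describes, would select among stored tables according to the image class:

```python
import numpy as np

def quantize(dct_block, qtable):
    """Linear quantization of DCT coefficients against a table, as done
    between the 2-D DCT and the Huffman coding stage."""
    return np.round(dct_block / qtable).astype(int)

def dequantize(q, qtable):
    """Decoder-side reconstruction: quantization loses the remainder."""
    return q * qtable

qtable = np.full((8, 8), 16)               # hypothetical flat table
dct = np.zeros((8, 8))
dct[0, 0], dct[0, 1] = 500.0, -37.0        # a DC and one AC coefficient
q = quantize(dct, qtable)
print(q[0, 0], q[0, 1])                    # 31 -2
print(dequantize(q, qtable)[0, :2])        # [496 -32]
```

A coarser table shrinks the quantized integers (cheaper Huffman codes) at the price of larger reconstruction error, which is why table selection per image kind matters.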

01 Jan 1996
TL;DR: A 4 mm², two-dimensional (2-D) 8×8 discrete cosine transform (DCT) core processor for HDTV-resolution video compression/decompression in a 0.3-µm CMOS triple-well, double-metal technology operates at 150 MHz from a 0.9-V power supply and consumes 10 mW, only 2% of the power dissipation of a previous 3.3-V design.
Abstract: A 4 mm², two-dimensional (2-D) 8×8 discrete cosine transform (DCT) core processor for HDTV-resolution video compression/decompression in a 0.3-µm CMOS triple-well, double-metal technology operates at 150 MHz from a 0.9-V power supply and consumes 10 mW, only 2% of the power dissipation of a previous 3.3-V design. Circuit techniques for dynamically varying the threshold voltage (VT scheme) are introduced to reduce active power dissipation with negligible overhead in speed, standby power dissipation, and chip area. A way to explore the VDD-Vth design space is also studied.

Book
01 Jan 1996
TL;DR: PREFACE INTRODUCTION Digital Signal Processing How to Read this Text Introduction to MATLAB Signals, Vectors, and Arrays Review of Vector and Matrix Algebra using Matlab Notation Geometric Series and Other Formulas Matlab Functions in DSP.
Abstract: Table of contents:
PREFACE
INTRODUCTION: Digital Signal Processing; How to Read this Text; Introduction to MATLAB; Signals, Vectors, and Arrays; Review of Vector and Matrix Algebra Using Matlab Notation; Geometric Series and Other Formulas; Matlab Functions in DSP; The Chapters Ahead; References
LEAST SQUARES, ORTHOGONALITY, AND THE FOURIER SERIES: Introduction; Least Squares; Orthogonality; The Discrete Fourier Series; Exercises; References
CORRELATION, FOURIER SPECTRA, AND THE SAMPLING THEOREM: Introduction; Correlation; The Discrete Fourier Transform (DFT); Redundancy in the DFT; The FFT Algorithm; Amplitude and Phase Spectra; The Inverse DFT; Properties of the DFT; Continuous Transforms; The Sampling Theorem; Waveform Reconstruction and Aliasing; Exercises; References
LINEAR SYSTEMS AND TRANSFER FUNCTIONS: Continuous and Discrete Linear Systems; Properties of Discrete Linear Systems; Discrete Convolution; The z-Transform and Linear Transfer Functions; Poles and Zeros; Transient Response and Stability; System Response via the Inverse z-Transform; Cascade, Parallel, and Feedback Structures; Direct Algorithms; State-Space Algorithms; Lattice Algorithms and Structures; FFT Algorithms; Discrete Linear Systems and Digital Filters; Exercises; References
FIR FILTER DESIGN: Introduction; An Ideal Lowpass Filter; The Realizable Version; Improving an FIR Filter with Window Functions; Highpass, Bandpass, and Bandstop Filters; A Complete FIR Filtering Example; Other Types of FIR Filters; Exercises; References
IIR FILTER DESIGN: Introduction; Linear Phase; Butterworth Filters; Chebyshev Filters; Frequency Translations; The Bilinear Transformation; IIR Digital Filters; Other Types of IIR Filters; Exercises; References
RANDOM SIGNAL AND SPECTRAL ESTIMATION: Introduction; Amplitude Distributions; Uniform, Gaussian, and Other Distributions; Power and Power Density Spectra; Properties of the Power Spectrum; Power Spectral Estimation; Data Windows in Spectral Estimation; The Cross-Power Spectrum; Algorithms; Exercises; References
LEAST-SQUARES SYSTEM DESIGN: Introduction; Applications of Least-Squares Design; System Design via the Mean-Squared Error; A Design Example; Least-Squares Design with Finite Signal Vectors; Correlation and Covariance Computation; Channel Equalization; System Identification; Interference Canceling; Linear Prediction and Recovery; Effects of Independent Broadband Noise; Exercises; References
ADAPTIVE SIGNAL PROCESSING: Introduction; The Mean-Squared Error Performance Surface; Searching the Performance Surface; Steepest Descent and the LMS Algorithm; LMS Example; Direct Descent and the RLS Algorithm; Measures of Adaptive System Performance; Other Adaptive Structures and Algorithms; Exercises; References
SIGNAL INFORMATION, CODING AND COMPRESSION: Introduction; Measuring Information; Two Ways to Compress Signals; Entropy Coding; Transform Coding and the Discrete Cosine Transform; Multirate Signal Decomposition and Subband Coding; Time-Frequency Analysis and Wavelet Transforms; Exercises; References
INDEX


Proceedings ArticleDOI
16 Sep 1996
TL;DR: This work compute the perceptual error for each block based upon the DCT quantization error adjusted according to the contrast sensitivity, light adaptation, and contrast masking, and pick the set of multipliers which yield maximally flat perceptual error over the blocks of the image.
Abstract: An extension to the JPEG standard (ISO/IEC DIS 10918-3) allows spatial adaptive coding of still images. As with baseline JPEG coding, one quantization matrix applies to an entire image channel, but in addition the user may specify a multiplier for each 8/spl times/8 block, which multiplies the quantization matrix, yielding the new matrix for that block. MPEG 1 and 2 use much the same scheme, except there the multiplier changes only on macroblock boundaries. We propose a method for perceptual optimization of the set of multipliers. We compute the perceptual error for each block based upon the DCT quantization error adjusted according to the contrast sensitivity, light adaptation, and contrast masking, and pick the set of multipliers which yield maximally flat perceptual error over the blocks of the image. We investigate the bit rate savings due to this adaptive coding scheme and the relative importance of the different sorts of masking on adaptive coding.
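The "maximally flat perceptual error" idea can be sketched with a toy model: if each block's perceptual error grows linearly with its multiplier at a block-dependent sensitivity, flattening the error means giving each block the multiplier that hits a common target, clipped to the allowed range. The linear error model and all numbers are invented for illustration; the paper's model involves contrast sensitivity, light adaptation, and masking:

```python
def flatten_multipliers(sensitivities, target, m_min=0.5, m_max=4.0):
    """Toy 'maximally flat perceptual error': give each block the
    multiplier that brings its (assumed linear) error to the target,
    clipped to the permitted multiplier range."""
    return [min(max(target / s, m_min), m_max) for s in sensitivities]

# Strong masking means low sensitivity, so those blocks get big
# multipliers (coarse quantization) without visible damage.
sens = [2.0, 1.0, 0.25, 8.0]
print(flatten_multipliers(sens, target=1.0))  # [0.5, 1.0, 4.0, 0.5]
```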

Journal ArticleDOI
TL;DR: An iterative unwrapping technique recently published, based on least-squares minimization, obtained by the discrete cosine transform is investigated, and is shown to be fast, easy to implement, robust in the presence of noise, and able to handle phase inconsistencies without propagating local errors.
Abstract: Current whole-field interferometric techniques yield a phase distribution in modulo 2π. Removal of the resulting cyclic discontinuities is a process known as unwrapping, which must be performed before the data can be interpreted. We investigate an iterative unwrapping technique recently published by Ghiglia and Romero [J. Opt. Soc. Am. A 11, 107 (1994)], which is based on least-squares minimization, obtained by the discrete cosine transform. We apply this technique to remove phase wraps from electronic speckle pattern interferometry data, using modest personal computer hardware. The algorithm is shown to be fast, easy to implement, robust in the presence of noise, and able to handle phase inconsistencies without propagating local errors.
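The Ghiglia-Romero least-squares unwrapper the paper evaluates solves a discrete Poisson equation whose right-hand side is the divergence of the wrapped phase gradients, and the DCT diagonalizes that system under reflective (Neumann) boundaries. A compact sketch using SciPy's `dctn`, tested on a synthetic ramp, is below; it is a minimal reconstruction of the published algorithm, not the authors' code:

```python
import numpy as np
from scipy.fft import dctn, idctn

def wrap(p):
    """Wrap phase values into (-pi, pi]."""
    return (p + np.pi) % (2 * np.pi) - np.pi

def unwrap_ls_dct(psi):
    """Least-squares phase unwrapping (Ghiglia & Romero style):
    solve Laplacian(phi) = div(wrapped gradients) via the 2-D DCT."""
    M, N = psi.shape
    dx = wrap(np.diff(psi, axis=1))
    dy = wrap(np.diff(psi, axis=0))
    rho = np.zeros_like(psi)                   # divergence of wrapped grads
    rho[:, :-1] += dx; rho[:, 1:] -= dx
    rho[:-1, :] += dy; rho[1:, :] -= dy
    d = dctn(rho, norm="ortho")
    i, j = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
    denom = 2 * (np.cos(np.pi * i / M) + np.cos(np.pi * j / N) - 2)
    denom[0, 0] = 1.0                          # DC is arbitrary; pin it
    phi = d / denom
    phi[0, 0] = 0.0
    return idctn(phi, norm="ortho")

# A smooth ramp exceeding 2*pi, rewrapped, is recovered up to a constant
x = np.linspace(0, 6 * np.pi, 64)
true = np.tile(x, (64, 1))
rec = unwrap_ls_dct(wrap(true))
print(np.ptp(rec - true) < 1e-6)   # only a flat offset remains -> True
```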

Journal ArticleDOI
TL;DR: This paper investigates a modified DCT computation scheme, to be called the subband DCT (SB-DCT), that provides a simple, efficient solution to the reduction of the block artifacts while achieving faster computation.
Abstract: The discrete cosine transform (DCT) is well known for its highly efficient coding performance and is widely used in many image compression applications. However, in low bit rate coding, it produces undesirable block artifacts that are visually not pleasing. In addition, in many practical applications, faster computation and easier VLSI implementation of DCT coefficients are also important issues. The removal of the block artifacts and faster DCT computation are therefore of practical interest. In this paper, we investigate a modified DCT computation scheme, to be called the subband DCT (SB-DCT), that provides a simple, efficient solution to the reduction of the block artifacts while achieving faster computation. We have applied the new approach for the low bit rate coding and decoding of images. Simulation results on real images have verified the improved performance obtained using the proposed method over the standard JPEG method.

Patent
21 Jun 1996
TL;DR: In this paper, a method and apparatus for video image compression using a unique operand decomposition technique combined with an innovative data scatter and retrieve process is presented, which allows the use of single ported RAM structures where multiported RAMS would normally be used, such as when retrieving two operands in the same time cycle.
Abstract: A method and apparatus is presented for video image compression using a unique operand decomposition technique combined with an innovative data scatter and retrieve process. This combination of features allows the use of single-ported RAM structures where multi-ported RAMs would normally be used, such as when retrieving two operands in the same time cycle. As applied to the Discrete Cosine Transformation, this method and apparatus additionally allows elimination of the usual prior-art use of a separate transpose matrix buffer. The elimination of the separate transpose matrix buffer is accomplished by combining the transpose matrix intermediate-results memory storage with the memory buffer used for the other intermediate results in a double buffer system. The double buffer memory locations are chosen so that the intermediate storage register addresses are orthogonal to the initial source addresses, thereby using one of the properties of the Discrete Cosine Transform to improve speed of operation and reduce the circuit area and system cost.

Patent
12 Dec 1996
TL;DR: In this paper, a decoder for decoding MPEG video bitstreams encoded in any color space encoding format and outputting the decoded video bit stream to different sized windows is disclosed.
Abstract: A decoder is disclosed for decoding MPEG video bitstreams encoded in any color space encoding format and outputting the decoded video bitstream to different sized windows. Both MPEG decompression and color space decoding and conversion are performed on the bitstreams within the same decoder. The disclosed decoder may be programmed to output the decoded video bitstream in any of three primary color space formats comprising YUV 4:2:0, YUV 4:2:2, and YUV 4:4:4. The decoder may also output the decoded bitstream to different sized windows using Discrete Cosine Transform (DCT) based image resizing.

Proceedings ArticleDOI
16 Sep 1996
TL;DR: This work proposes a new approach that compresses image blocks using a layered representation, derived from progressive JPEG, and has been combined with CR and optimized for efficient software implementation to provide an improved solution for Internet packet video.
Abstract: Several compression schemes for Internet video utilize block-based conditional replenishment (CR) where block updates are coded independently of the past. In the current Internet video tools, blocks are compressed with a single-layer representation. We propose a new approach that compresses image blocks using a layered representation. Our layered-DCT (LDCT) compression algorithm, derived from progressive JPEG, has been combined with CR and optimized for efficient software implementation to provide an improved solution for Internet packet video. Although LDCT is constrained to a layered representation, its compression performance is as good or better than the single layer Intra-H.261 and baseline JPEG coding schemes.
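Layering a block's zigzag-ordered DCT coefficients, progressive-JPEG style, can be sketched as a simple partition; the cut points below are arbitrary, not the LDCT's actual layer boundaries:

```python
def layer_blocks(zigzag_coeffs, cuts=(1, 6, 64)):
    """Split a block's zigzag-ordered DCT coefficients into layers:
    DC first, then bands of AC coefficients (progressive-JPEG style).
    Decoders that receive only the early layers still get a coarse block."""
    layers, start = [], 0
    for end in cuts:
        layers.append(zigzag_coeffs[start:end])
        start = end
    return layers

coeffs = list(range(64))          # stand-in for zigzag-scanned coefficients
layers = layer_blocks(coeffs)
print([len(l) for l in layers])   # [1, 5, 58]
```

Combined with conditional replenishment, only updated blocks are sent, and each update degrades gracefully as packets carrying later layers are lost.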

PatentDOI
TL;DR: A method and apparatus for encoding an input signal, such as a broad-range speech signal, in which a number of decoding operations with different bit rates are enabled for assuring a high encoding bit rate and for minimizing deterioration of the reproduced sound even with a low bit rate.
Abstract: A method and apparatus for encoding an input signal, such as a broad-range speech signal, in which a number of decoding operations with different bit rates are enabled for assuring a high encoding bit rate and for minimizing deterioration of the reproduced sound even with a low bit rate. The signal encoding method includes a band-splitting step for splitting an input signal into a number of bands and a step of encoding signals of the bands in a different manner depending on signal characteristics of the bands. Specifically, a low-range side signal is taken out by a low-pass filter from an input signal entering a terminal, and analyzed for Linear Predictive coding by an Linear Predictive coding analysis quantization unit. After finding the Linear Predictive coding residuals, as short-term prediction residuals by an Linear Predictive coding inverted filter, the pitch is found by a pitch analysis circuit. Then, pitch residuals are found by long-term prediction by a pitch inverted filter. The pitch residuals are processed with modified discrete cosine transform by a modified discrete cosine transform (MDCT) circuit and vector-quantized by a vector-quantization circuit. The resulting quantization indices are transmitted along with the pitch lag and the pitch gain. The linear spectral pairs linear spectral pairs are also sent as parameter representing LPC coefficients.

Proceedings ArticleDOI
D. Sinha, J.D. Johnston
07 May 1996
TL;DR: A novel switched filter-bank scheme is proposed which switches between a modified discrete cosine transform and a wavelet filter-bank based on the signal characteristics, allowing for the optimum exploitation of perceptual irrelevancies.
Abstract: A perceptual audio coder typically consists of a filter-bank which breaks the signal into its frequency components. These components are then quantized using a perceptual masking model. Previous efforts have indicated that a high resolution filter-bank, e.g., the modified discrete cosine transform (MDCT) with 1024 subbands, is able to minimize the bit rate requirements for most music samples. The high resolution MDCT, however, is not suitable for the encoding of non-stationary segments of music. A long/short resolution or "window" switching scheme has been employed to overcome this problem, but it has certain inherent disadvantages which become prominent at lower bit rates (<64 kbps for stereo). We propose a novel switched filter-bank scheme which switches between an MDCT and a wavelet filter-bank based on the signal characteristics. A tree structured wavelet filter-bank with properly designed filters offers natural advantages for the representation of non-stationary segments such as attacks. Furthermore, it allows for the optimum exploitation of perceptual irrelevancies.
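A toy version of the switching decision (not the paper's detector) can key on short-time energy jumps inside a frame: a sharp rise suggests an attack, where the wavelet path would be preferred over the long-window MDCT. Subframe count and ratio threshold are invented:

```python
import numpy as np

def choose_filterbank(frame, n_sub=8, ratio=8.0):
    """Toy transient detector for filter-bank switching: compare
    short-time energies inside the frame; a sharp jump between
    consecutive subframes suggests an attack."""
    sub = frame.reshape(n_sub, -1)
    e = (sub ** 2).sum(axis=1) + 1e-12        # avoid divide-by-zero
    return "wavelet" if (e[1:] / e[:-1]).max() > ratio else "mdct"

t = np.linspace(0, 1, 1024, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)            # stationary tone
attack = tone * (t > 0.6)                     # silence, then an onset
print(choose_filterbank(tone), choose_filterbank(attack))  # mdct wavelet
```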

Journal ArticleDOI
TL;DR: Subjective results confirm the efficacy of the proposed classified coder over the RM8-based H.261 coder in two ways: it consistently produces better quality sequences and achieves a bit rate saving of 35% when measuring at the same picture quality.
Abstract: A new technique of adaptively classifying the scene content of an image block has been developed in the proposed perceptual coder. It measures the texture masking energy of an image block and classifies it into one of four perceptual classes: flat, edge, texture, and fine-texture. Each class has an associated factor to adapt the quantizer with the aim of achieving constant quality across an image. A second feature of the perceptual coder is visual thresholding, a process that reduces bit rate by discarding subthreshold discrete cosine transform (DCT) coefficients without degrading the image's perceived quality. Finally, further quality gain is achieved by an improved reference model 8 (RM8) intramode decision, which removes sticking noise artifacts from newly uncovered background found in H.261 coded sequences. Subjective viewing tests, guided by Rec. 500-5, were conducted with 30 subjects. Subjective results confirm the efficacy of the proposed classified coder over the RM8-based H.261 coder in two ways: (i) it consistently produces better quality sequences (with a mean opinion score, MOS, of approximately 2.0) when comparing at any fixed bit rate; and (ii) it achieves a bit rate saving of 35% when measuring at the same picture quality (i.e., same MOS).
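A minimal sketch of the flat/edge/texture/fine-texture classification idea follows. The paper's actual texture masking energy measure and thresholds are not reproduced here; mean-removed block energy and a maximum-gradient test stand in for them, and all numeric thresholds and quantizer scale factors are assumptions.

```python
def texture_energy(block):
    """Mean-removed energy of an 8x8 block -- a crude stand-in for the
    paper's texture masking energy."""
    flat = [p for row in block for p in row]
    mean = sum(flat) / len(flat)
    return sum((p - mean) ** 2 for p in flat) / len(flat)

def max_gradient(block):
    """Largest horizontal or vertical neighbour difference."""
    g = 0
    for r in range(8):
        for c in range(7):
            g = max(g, abs(block[r][c + 1] - block[r][c]),
                       abs(block[c + 1][r] - block[c][r]))
    return g

# Illustrative per-class quantizer scale factors (not the paper's)
QUANT_SCALE = {"flat": 0.6, "edge": 0.8, "texture": 1.2, "fine-texture": 1.5}

def classify_block(block, e_flat=25.0, e_fine=900.0, g_edge=100.0):
    """Classify an 8x8 block; thresholds are illustrative."""
    e = texture_energy(block)
    if e < e_flat:
        return "flat"
    if max_gradient(block) > g_edge:
        return "edge"
    return "fine-texture" if e > e_fine else "texture"
```

Flat regions then receive a finer quantizer (low masking) and busy texture a coarser one (high masking), which is the mechanism the abstract describes for holding quality constant across the image.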

Proceedings ArticleDOI
31 Mar 1996
TL;DR: This work describes a wavelet-based algorithm that operates directly in the highest dimension available, and which has been used to successfully compress geophysical data with no observable loss of geophysical information at compression ratios substantially greater than 100:1.
Abstract: Seismic data have a number of unique characteristics that differentiate them from the still image and video data that are the focus of most lossy coding research efforts. Seismic data occupy three or four dimensions, and have a high degree of anisotropy with substantial amounts of noise. Two-dimensional coding approaches based on wavelets or the DCT achieve only modest compression ratios on such data because of these statistical properties, and because 2D approaches fail to fully leverage the redundancy in the higher dimensions of the data. We describe here a wavelet-based algorithm that operates directly in the highest dimension available, and which has been used to successfully compress geophysical data with no observable loss of geophysical information at compression ratios substantially greater than 100:1. This algorithm was successfully field tested on a vessel in the North Sea in July 1995, demonstrating the feasibility of performing on-board real-time compression and satellite downloading from marine seismic data acquisition platforms.

Patent
Byeungwoo Jeon1, Jechang Jeong1
19 Jan 1996
TL;DR: In this paper, a post-processing device for eliminating a blocking artifact generated upon reconstructing an image compressed by block transform operation and a method thereof minimize blocking artifacts at block boundaries by selecting a predetermined discrete cosine transform (DCT), estimating transform coefficients with respect to the information lost upon quantization or inverse quantization, performing an inverse transform operation on the estimated transform coefficients and adding the thus-obtained adjustment value to an inverse-transform-operated reconstructed image signal.
Abstract: A post-processing device for eliminating a blocking artifact generated upon reconstructing an image compressed by block transform operation and a method thereof minimize a blocking artifact at block boundaries by selecting a predetermined discrete cosine transform (DCT), estimating transform coefficients with respect to the information lost upon quantization or inverse quantization to have the highest continuity with respect to adjacent blocks, performing an inverse transform operation on the estimated transform coefficients, and adding the thus-obtained adjustment value to an inverse-transform-operated reconstructed image signal.
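The patent's correction is computed in the DCT domain (estimate lost coefficients for maximum cross-boundary continuity, inverse-transform, add). As a loose pixel-domain illustration of the underlying goal only, the sketch below shrinks the step across a block boundary by adding a small adjustment on each side; the `strength` parameter and the whole pixel-domain formulation are assumptions, not the patented method.

```python
def deblock_boundary(left, right, strength=0.5):
    """Reduce the step across a block boundary between two pixel rows.
    Only the two pixels adjacent to the boundary are adjusted.
    A crude pixel-domain analogue of 'add an adjustment that raises
    continuity with the neighbouring block'."""
    step = right[0] - left[-1]
    adj = strength * step / 2
    left = left[:-1] + [left[-1] + adj]
    right = [right[0] - adj] + right[1:]
    return left, right
```

With the default strength, a boundary step of 10 grey levels is halved, moving each side 2.5 levels toward the other.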

Proceedings ArticleDOI
TL;DR: An analysis of a broad suite of images confirms previous findings that a Laplacian distribution can be used to model the luminance ac coefficients and the distribution model is applied to improve dynamic generation of quantization matrices.
Abstract: Many image and video compression schemes perform the discrete cosine transform (DCT) to represent image data in frequency space. An analysis of a broad suite of images confirms previous findings that a Laplacian distribution can be used to model the luminance ac coefficients. This model is expanded and applied to color space (Cr/Cb) coefficients. In MPEG, the DCT is used to code interframe prediction error terms. The distribution of these coefficients is explored. Finally, the distribution model is applied to improve dynamic generation of quantization matrices.
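The Laplacian model mentioned above has a closed-form maximum-likelihood fit (the rate is the reciprocal of the mean absolute coefficient), and one hedged way such a fit can feed quantizer design is to pick a deadzone step that zeroes a target fraction of coefficients. The sampling helper and the zero-fraction criterion below are illustrative choices, not the paper's procedure.

```python
import math
import random

def fit_laplacian(samples):
    """MLE for the Laplacian p(x) = (lam/2) * exp(-lam * |x|):
    lam_hat = 1 / mean(|x|)."""
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    return 1.0 / mean_abs

def sample_laplacian(lam, n, rng):
    """Draw n Laplacian samples by inverse-CDF transform."""
    out = []
    for _ in range(n):
        u = rng.random()
        if u < 0.5:
            out.append(math.log(2 * u) / lam)
        else:
            out.append(-math.log(2 * (1 - u)) / lam)
    return out

def quant_step_for_zero_fraction(lam, p0):
    """Deadzone quantizer step so a fraction p0 of coefficients fall
    in the zero bin: P(|x| < step/2) = 1 - exp(-lam * step / 2)."""
    return -2.0 * math.log(1.0 - p0) / lam
```

Fitting lam per DCT frequency band and converting it to a step size is one way a measured coefficient distribution can drive dynamic quantization-matrix generation.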