Journal Article

Interpolative multiresolution coding of advanced television with compatible subchannels

TL;DR: A multiresolution representation for video signals is introduced, and interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction.
Abstract: A multiresolution representation for video signals is introduced. A three-dimensional spatiotemporal pyramid algorithm for high-quality compression of advanced television sequences is presented. The scheme utilizes a finite memory structure and is robust to channel errors, provides compatible subchannels, and can handle different scan formats, making it well suited for the broadcast environment. Additional features such as fast random access and reverse playback make it suitable for digital storage as well. Model-based processing is used both over space and time, where motion-based interpolation is used. Interpolation in an FIR (finite impulse response) scheme solves uncovered area problems, considerably improving the temporal prediction. The complexity is comparable to that of previous recursive schemes. Computer simulations indicate that high compression factors (about an order of magnitude) are easily achieved with no apparent loss of quality. The scheme also has a number of commonalities with the emerging MPEG standard.
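The finite-memory, interpolative structure described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: frames are flat lists of pixel values, plain averaging of the two neighbouring frames stands in for motion-based interpolation, there is a single temporal level, and a uniform quantizer is used. The names `quantize`, `encode` and `decode` are hypothetical.

```python
def quantize(x, step):
    """Uniform scalar quantizer: round to the nearest multiple of step."""
    return step * round(x / step)


def encode(frames, step):
    """Split frames into a coarse temporal layer and interpolation residuals.

    Even-indexed frames form the coarse layer (a half-frame-rate subchannel).
    Each odd frame is predicted from its two *decoded* even neighbours, so the
    predictor only needs a fixed, finite window of frames.
    """
    coarse = [[quantize(p, step) for p in f] for f in frames[0::2]]
    residuals = []
    for k in range(1, len(frames), 2):
        prev = coarse[k // 2]
        # The last odd frame may have no following even frame: reuse prev.
        nxt = coarse[k // 2 + 1] if k // 2 + 1 < len(coarse) else prev
        pred = [(a + b) / 2 for a, b in zip(prev, nxt)]  # assumption: no motion
        residuals.append([quantize(x - p, step) for x, p in zip(frames[k], pred)])
    return coarse, residuals


def decode(coarse, residuals):
    """Rebuild all frames; dropping `residuals` still leaves the coarse layer."""
    frames = []
    for i, c in enumerate(coarse):
        frames.append(list(c))
        if i < len(residuals):
            nxt = coarse[i + 1] if i + 1 < len(coarse) else c
            pred = [(a + b) / 2 for a, b in zip(c, nxt)]
            frames.append([p + r for p, r in zip(pred, residuals[i])])
    return frames
```

Because the prediction in this sketch is formed from decoded rather than original frames, the error of every reconstructed pixel stays within half a quantizer step, and an error in one frame cannot propagate beyond the frames interpolated from it.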
Citations
Book
01 Mar 1995
TL;DR: Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding and developed the theory in both continuous and discrete time.
Abstract: First published in 1995, Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding. The book developed the theory in both continuous and discrete time, and presented important applications. During the past decade, it filled a useful need in explaining a new view of signal processing based on flexible time-frequency analysis and its applications. Since 2007, the authors have retained the copyright and allowed open access to the book.

2,793 citations


Cites background or methods from "Interpolative multiresolution codin..."

  • ...This video compression scheme was studied in [301, 302, 303]....

    [...]

  • ...The property holds also for multilevel pyramids if one uses quantization error feedback [303]....

    [...]

  • ...This coding scheme was implemented for high quality coding of HDTV with a compatible subchannel and it performed well at medium compression (of the order of 10-15 to 1) with essentially no visible degradation [301, 303]....

    [...]

Journal Article
TL;DR: The article provides arguments in favor of an alternative approach that uses splines, which is equally justifiable on a theoretical basis, and which offers many practical advantages, and brings out the connection with the multiresolution theory of the wavelet transform.
Abstract: The article provides arguments in favor of an alternative approach that uses splines, which is equally justifiable on a theoretical basis, and which offers many practical advantages. To reassure the reader who may be afraid to enter new territory, it is emphasized that one is not losing anything because the traditional theory is retained as a particular case (i.e., a spline of infinite degree). The basic computational tools are also familiar to a signal processing audience (filters and recursive algorithms), even though their use in the present context is less conventional. The article also brings out the connection with the multiresolution theory of the wavelet transform. This article attempts to fulfil three goals. The first is to provide a tutorial on splines that is geared to a signal processing audience. The second is to gather all their important properties and provide an overview of the mathematical and computational tools available; i.e., a road map for the practitioner with references to the appropriate literature. The third goal is to give a review of the primary applications of splines in signal and image processing.

1,732 citations

Journal Article
TL;DR: The scheme is interpreted more generally, viewed as a motion-compensated short-time spectral analysis of video sequences, which can adapt to the quickness of changes.
Abstract: Three-dimensional (3-D) frequency coding is an alternative approach to hybrid coding concepts used in today's standards. The first part of this paper presents a study on concepts for temporal-axis frequency decomposition along the motion trajectory in video sequences. It is shown that, if a two-band split is used, it is possible to overcome the problem of spatial inhomogeneity in the motion vector field (MVF), which occurs at the positions of uncovered and covered areas. In these cases, original pixel values from one frame are placed into the lowpass-band signal, while displaced-frame-difference values are embedded into the highpass band. This technique is applicable with arbitrary MVF's; examples with block-matching and interpolative motion compensation are given. Derivations are first performed for the example of two-tap quadrature mirror filters (QMF's), and then generalized to any linear-phase QMF's. With two-band analysis and synthesis stages arranged as cascade structures, higher resolution frequency decompositions are realizable. In the second part of the paper, encoding of the temporal-axis subband signals is discussed. A parallel filterbank scheme was used for spatial subband decomposition, and adaptive lattice vector quantization was employed to approach the entropy rate of the 3-D subband samples. Coding results suggest that high-motion video sequences can be encoded at significantly lower rates than those achievable with conventional hybrid coders. Main advantages are the high energy compaction capability and the nonrecursive decoder structure. In the conclusion, the scheme is interpreted more generally, viewed as a motion-compensated short-time spectral analysis of video sequences, which can adapt to the quickness of changes. Although a 3-D multiresolution representation of the picture information is produced, a true multiresolution representation of motion information, based on spatio-temporal decimation and interpolation of the MVF, is regarded as the still-missing part.

625 citations

Journal Article
TL;DR: The search speed of the proposed ARPS-ZMP is about two to three times faster than that of the diamond search (DS), and the method even achieves higher peak signal-to-noise ratio (PSNR) particularly for those video sequences containing large and/or complex motion contents.
Abstract: We propose a novel and simple fast block-matching algorithm (BMA), called adaptive rood pattern search (ARPS), which consists of two sequential search stages: (1) initial search and (2) refined local search. For each macroblock (MB), the initial search is performed only once at the beginning in order to find a good starting point for the follow-up refined local search. By doing so, unnecessary intermediate search and the risk of being trapped into local minimum matching error points could be greatly reduced in long search case. For the initial search stage, an adaptive rood pattern (ARP) is proposed, and the ARP's size is dynamically determined for each MB, based on the available motion vectors (MVs) of the neighboring MBs. In the refined local search stage, a unit-size rood pattern (URP) is exploited repeatedly, and unrestrictedly, until the final MV is found. To further speed up the search, zero-motion prejudgment (ZMP) is incorporated in our method, which is particularly beneficial to those video sequences containing small motion contents. Extensive experiments conducted based on the MPEG-4 Verification Model (VM) encoding platform show that the search speed of our proposed ARPS-ZMP is about two to three times faster than that of the diamond search (DS), and our method even achieves higher peak signal-to-noise ratio (PSNR) particularly for those video sequences containing large and/or complex motion contents.
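The two search stages and the zero-motion prejudgment described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: frames are lists of rows, the matching error is the sum of absolute differences (SAD), the rood arm length is taken as the larger component of a single predicted motion vector `pred_mv` (how that prediction is derived from neighbouring macroblocks is left to the caller), and `zmp_threshold` is a hypothetical parameter.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of `ref`
    displaced by (dx, dy). Candidates falling outside the frame cost infinity."""
    h, w = len(ref), len(ref[0])
    if not (0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h):
        return float("inf")
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def arps(cur, ref, bx, by, n, pred_mv=(0, 0), zmp_threshold=0):
    """Return a motion vector (dx, dy) for one block."""

    def cost(mv):
        return sad(cur, ref, bx, by, mv[0], mv[1], n)

    best, best_cost = (0, 0), cost((0, 0))

    # Zero-motion prejudgment: stop at once if the static match is good enough.
    if best_cost <= zmp_threshold:
        return best

    # Stage 1, performed once: adaptive rood pattern plus the predicted vector.
    arm = max(abs(pred_mv[0]), abs(pred_mv[1]))
    candidates = [pred_mv]
    if arm > 0:
        candidates += [(arm, 0), (-arm, 0), (0, arm), (0, -arm)]
    for mv in candidates:
        c = cost(mv)
        if c < best_cost:
            best, best_cost = mv, c

    # Stage 2: unit-size rood pattern, repeated until the centre is the minimum.
    while True:
        cx, cy = best
        improved = False
        for mv in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]:
            c = cost(mv)
            if c < best_cost:
                best, best_cost, improved = mv, c, True
        if not improved:
            return best
```

The refinement loop terminates because the best cost strictly decreases on every pass. Like any greedy descent, it can still stop at a local minimum when the predicted vector is poor.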

605 citations


Additional excerpts

  • ...In [18], constant block size is used so that one block in the coarser level covers several corresponding blocks at its next finer level....

    [...]

Journal Article
TL;DR: It is shown how a certain monotonicity property of the dependent R-D curves can be exploited in formulating fast ways to obtain optimal and near-optimal solutions and how to obtain fast solutions that provide nearly optimal full resolution quality while providing much better performance for the subresolution layer.
Abstract: We address the problem of efficient bit allocation in a dependent coding environment. While optimal bit allocation for independently coded signal blocks has been studied in the literature, we extend these techniques to the more general temporally and spatially dependent coding scenarios. Of particular interest are the topical MPEG video coder and multiresolution coders. Our approach uses an operational rate-distortion (R-D) framework for arbitrary quantizer sets. We show how a certain monotonicity property of the dependent R-D curves can be exploited in formulating fast ways to obtain optimal and near-optimal solutions. We illustrate the application of this property in specifying intelligent pruning conditions to eliminate suboptimal operating points for the MPEG allocation problem, for which we also point out fast nearly-optimal heuristics. Additionally, we formulate an efficient allocation strategy for multiresolution coders, using the spatial pyramid coder as an example. We then extend this analysis to a spatio-temporal 3-D pyramidal coding scheme. We tackle the compatibility problem of optimizing full-resolution quality while simultaneously catering to subresolution bit rate or quality constraints. We show how to obtain fast solutions that provide nearly optimal (typically within 0.3 dB) full resolution quality while providing much better performance for the subresolution layer (typically 2-3 dB better than the full-resolution optimal solution).

492 citations

References
Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^j$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^j$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
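The pyramidal algorithm mentioned in this abstract can be illustrated with the shortest possible filter pair. The sketch below assumes the two-tap Haar filters (a choice made here for brevity, not taken from the abstract), a 1-D signal, and a length divisible by 2 raised to the number of levels; the function names are hypothetical.

```python
import math

SCALE = 1 / math.sqrt(2)  # normalisation that keeps the transform orthonormal


def haar_step(signal):
    """One analysis stage: low-pass and high-pass filtering, each followed by
    downsampling by two."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Iterate the analysis stage on the approximation only (the pyramid).

    Returns the coarsest approximation and the detail bands, finest first.
    """
    details, approx = [], list(signal)
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert the pyramid, starting from the coarsest level."""
    current = list(approx)
    for d in reversed(details):
        finer = []
        for a, b in zip(current, d):
            finer += [(a + b) * SCALE, (a - b) * SCALE]
        current = finer
    return current
```

Each detail band holds the information lost when going from one resolution to the next coarser one. Since the transform is orthonormal, the total energy of the coefficients equals that of the input signal.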

20,028 citations

Journal Article
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
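The steps listed in this abstract (low-pass filter, subsample, subtract the expanded low-pass copy, repeat) can be sketched in one dimension. The assumptions are mine rather than the paper's: a 1-D signal instead of an image, a 3-tap `[1, 2, 1] / 4` kernel as the low-pass filter, linear interpolation for expansion, and no quantization of the difference signals. The function names are hypothetical.

```python
def reduce_level(signal):
    """Low-pass filter with [1, 2, 1] / 4 (edges replicated), then keep every
    other sample."""
    n = len(signal)
    smooth = [
        (signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]
    return smooth[::2]


def expand_level(coarse, n):
    """Upsample to length n by linear interpolation between coarse samples."""
    out = []
    for i in range(n):
        k = i // 2
        if i % 2 == 0 or k + 1 >= len(coarse):
            out.append(coarse[k])
        else:
            out.append((coarse[k] + coarse[k + 1]) / 2)
    return out


def build_pyramid(signal, levels):
    """Return [difference_0, ..., difference_{levels-1}, coarsest low-pass]."""
    pyramid, current = [], list(signal)
    for _ in range(levels):
        coarse = reduce_level(current)
        prediction = expand_level(coarse, len(current))
        pyramid.append([a - b for a, b in zip(current, prediction)])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Expand the coarsest level and add back each difference signal in turn."""
    current = pyramid[-1]
    for diff in reversed(pyramid[:-1]):
        prediction = expand_level(current, len(diff))
        current = [p + d for p, d in zip(prediction, diff)]
    return current
```

Without quantization the round trip is exact up to floating-point rounding. In a coder, the difference signals would be quantized, which is where the further compression described in the abstract comes from.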

6,975 citations

Journal Article
TL;DR: The author describes the mathematical properties of such decompositions and introduces the wavelet transform, which relates to the decomposition of an image into a wavelet orthonormal basis.
Abstract: The author reviews recent multichannel models developed in psychophysiology, computer vision, and image processing. In psychophysiology, multichannel models have been particularly successful in explaining some low-level processing in the visual cortex. The expansion of a function into several frequency channels provides a representation which is intermediate between a spatial and a Fourier representation. The author describes the mathematical properties of such decompositions and introduces the wavelet transform. He reviews the classical multiresolution pyramidal transforms developed in computer vision and shows how they relate to the decomposition of an image into a wavelet orthonormal basis. He discusses the properties of the zero crossings of multifrequency channels. Zero-crossing representations are particularly well adapted for pattern recognition in computer vision.

2,109 citations

Journal Article
TL;DR: The perfect reconstruction condition is posed as a Bezout identity, and it is shown how it is possible to find all higher-degree complementary filters based on an analogy with the theory of Diophantine equations.
Abstract: The wavelet transform is compared with the more classical short-time Fourier transform approach to signal analysis. Then the relations between wavelets, filter banks, and multiresolution signal processing are explored. A brief review is given of perfect reconstruction filter banks, which can be used both for computing the discrete wavelet transform, and for deriving continuous wavelet bases, provided that the filters meet a constraint known as regularity. Given a low-pass filter, necessary and sufficient conditions for the existence of a complementary high-pass filter that will permit perfect reconstruction are derived. The perfect reconstruction condition is posed as a Bezout identity, and it is shown how it is possible to find all higher-degree complementary filters based on an analogy with the theory of Diophantine equations. An alternative approach based on the theory of continued fractions is also given. These results are used to design highly regular filter banks, which generate biorthogonal continuous wavelet bases with symmetries.
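The Bezout formulation mentioned in this abstract can be written out in the usual polyphase notation for a two-channel FIR filter bank. The notation below is an assumption, since the abstract does not spell it out: $H_0$ is the given low-pass filter, $H_1$ the complementary high-pass filter being sought, and $H_{i0}$, $H_{i1}$ are their polyphase components.

```latex
% Polyphase decomposition of the analysis filters (i = 0, 1):
H_i(z) = H_{i0}(z^2) + z^{-1} H_{i1}(z^2)

% FIR perfect reconstruction requires the determinant of the polyphase
% matrix to be a pure delay (c \neq 0, l an integer):
H_{00}(z)\, H_{11}(z) - H_{01}(z)\, H_{10}(z) = c\, z^{-l}

% With H_{00} and H_{01} given, this is a Bezout identity in the unknowns
% H_{10} and H_{11}; it has a solution exactly when H_{00} and H_{01}
% are coprime.

% Any solution plus a multiple of the given pair is again a solution,
% as with linear Diophantine equations:
H_{10}'(z) = H_{10}(z) + E(z)\, H_{00}(z), \qquad
H_{11}'(z) = H_{11}(z) + E(z)\, H_{01}(z)
```

Substituting the last line into the determinant shows that the terms containing $E(z)$ cancel, so the right-hand side is unchanged. This is one way to read the abstract's statement that higher-degree complementary filters can be generated from a given solution.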

1,804 citations

01 Jan 1986

1,696 citations