
Showing papers on "Upsampling" published in 2008


Journal Article
TL;DR: Three new algorithms for 2D translation image registration to within a small fraction of a pixel, based on nonlinear optimization and matrix-multiply discrete Fourier transforms, are compared in accuracy and computation time for the purpose of evaluating a translation-invariant error metric.
Abstract: Three new algorithms for 2D translation image registration to within a small fraction of a pixel that use nonlinear optimization and matrix-multiply discrete Fourier transforms are compared. These algorithms can achieve registration with an accuracy equivalent to that of the conventional fast Fourier transform upsampling approach in a small fraction of the computation time and with greatly reduced memory requirements. Their accuracy and computation time are compared for the purpose of evaluating a translation-invariant error metric.

1,715 citations
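
The matrix-multiply DFT idea in the registration abstract above can be illustrated with a minimal 1D sketch. It is not the authors' algorithms (which are 2D and also involve nonlinear optimization); it only shows how a cross-correlation can be evaluated on a fine grid around a coarse peak by direct summation over DFT bins, instead of zero-padding a full FFT. The function names, the test signal, and the `upsample` parameter are assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Plain O(N^2) discrete Fourier transform (kept simple on purpose)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def cross_correlation(F, G, shifts):
    """Circular cross-correlation of g against f at arbitrary, possibly
    fractional, shifts. The direct sum over bins is the matrix-multiply
    form of an inverse DFT restricted to the requested shifts."""
    n = len(F)
    values = []
    for tau in shifts:
        acc = 0j
        for k in range(n):
            freq = k if k < n / 2 else k - n  # signed frequency index
            acc += G[k] * F[k].conjugate() * cmath.exp(2j * math.pi * freq * tau / n)
        values.append(acc.real)
    return values


def register_1d(f, g, upsample=100):
    """Estimate the shift s such that g(t) = f(t - s), to 1/upsample of a sample."""
    n = len(f)
    F, G = dft(f), dft(g)
    # Coarse pass: whole-sample peak of the cross-correlation.
    coarse = cross_correlation(F, G, range(n))
    t0 = max(range(n), key=lambda t: coarse[t])
    # Fine pass: evaluate only within one sample of the coarse peak.
    fine_shifts = [t0 - 1 + j / upsample for j in range(2 * upsample + 1)]
    fine = cross_correlation(F, G, fine_shifts)
    best = fine_shifts[max(range(len(fine)), key=lambda j: fine[j])]
    return best - n if best > n / 2 else best


def sample(shift, n=32):
    """Hypothetical band-limited periodic test signal, delayed by `shift` samples."""
    return [math.cos(2 * math.pi * (t - shift) / n)
            + 0.5 * math.sin(2 * math.pi * 3 * (t - shift) / n) for t in range(n)]


estimated = register_1d(sample(0.0), sample(2.37))
```

The fine pass costs only as many sums as there are candidate shifts near the peak, which is the source of the memory and time savings the abstract refers to.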


Journal Article
01 Dec 2008
TL;DR: A feedback-control framework which faithfully recovers the high-resolution image information from the input data, without imposing additional local structure constraints learned from other examples, which makes the method independent of the quality and number of the selected examples, while producing high-quality results without observable unsightly artifacts.
Abstract: We propose a simple but effective upsampling method for automatically enhancing the image/video resolution, while preserving the essential structural information. The main advantage of our method lies in a feedback-control framework which faithfully recovers the high-resolution image information from the input data, without imposing additional local structure constraints learned from other examples. This makes our method independent of the quality and number of the selected examples, which are issues typical of learning-based algorithms, while producing high-quality results without observable unsightly artifacts. Another advantage is that our method naturally extends to video upsampling, where the temporal coherence is maintained automatically. Finally, our method runs very fast. We demonstrate the effectiveness of our algorithm by experimenting with different image/video data.

330 citations


05 Oct 2008
TL;DR: This work presents an adaptive multi-lateral upsampling filter that takes into account the inherent noisy nature of real-time depth data and can greatly improve reconstruction quality, boost the resolution of the data to that of the video sensor, and prevent unwanted artifacts like texture copy into geometry.
Abstract: A new generation of active 3D range sensors, such as time-of-flight cameras, enables recording of full-frame depth maps at video frame rate. Unfortunately, the captured data are typically starkly contaminated by noise and the sensors feature only a rather limited image resolution. We therefore present a pipeline to enhance the quality and increase the spatial resolution of range data in real-time by upsampling the range information with the data from a high resolution video camera. Our algorithm is an adaptive multi-lateral upsampling filter that takes into account the inherent noisy nature of real-time depth data. Thus, we can greatly improve reconstruction quality, boost the resolution of the data to that of the video sensor, and prevent unwanted artifacts like texture copy into geometry. Our technique has been crafted to achieve improvement in depth map quality while maintaining high computational efficiency for a real-time application. By implementing our approach on the GPU, the creation of a real-time 3D camera with video camera resolution is feasible.

323 citations
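
As a rough illustration of guiding depth upsampling with a high-resolution image, the sketch below implements a plain 1D joint bilateral upsampling filter. This is a simpler, swapped-in technique, not the adaptive multi-lateral filter of the abstract above (it has no noise-adaptive term and no GPU implementation). All names and parameter values (`sigma_s`, `sigma_r`, `radius`) are assumptions.

```python
import math


def joint_bilateral_upsample(depth_lo, guide_hi, factor,
                             sigma_s=1.0, sigma_r=0.1, radius=2):
    """Upsample low-resolution depth using a high-resolution guide signal.

    Assumes len(guide_hi) == len(depth_lo) * factor and that low-res
    sample q sits at high-res position q * factor.
    """
    out = []
    for p in range(len(guide_hi)):
        center = p / factor  # position of p in low-res coordinates
        base = int(math.floor(center))
        lo = max(0, base - radius)
        hi = min(len(depth_lo) - 1, base + radius)
        num = den = 0.0
        for q in range(lo, hi + 1):
            # Spatial weight: distance measured on the low-res grid.
            w_s = math.exp(-((center - q) ** 2) / (2 * sigma_s ** 2))
            # Range weight: similarity measured in the high-res guide.
            diff = guide_hi[p] - guide_hi[q * factor]
            w_r = math.exp(-(diff ** 2) / (2 * sigma_r ** 2))
            num += w_s * w_r * depth_lo[q]
            den += w_s * w_r
        out.append(num / den)
    return out


# Guide with a sharp edge; low-res depth with a step at the same place.
guide = [0.0] * 4 + [1.0] * 4
depth_lo = [1.0, 1.0, 5.0, 5.0]
depth_hi = joint_bilateral_upsample(depth_lo, guide, factor=2)
```

Because the range weight suppresses low-res samples that lie across the guide edge, the upsampled depth keeps the step sharp instead of blurring it.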


Proceedings Article
23 Jun 2008
TL;DR: It is shown that ideas from traditional color image superresolution can be applied to TOF cameras in order to obtain 3D data of higher X-Y resolution and less noise.
Abstract: Time-of-flight (TOF) cameras robustly provide depth data of real world scenes at video frame rates. Unfortunately, currently available camera models provide rather low X-Y resolution. Also, their depth measurements are starkly influenced by random and systematic errors which renders them inappropriate for high-quality 3D scanning. In this paper we show that ideas from traditional color image superresolution can be applied to TOF cameras in order to obtain 3D data of higher X-Y resolution and less noise. We will also show that our approach, which works using depth images only, bears many advantages over alternative depth upsampling methods that combine information from separate high-resolution color and low-resolution depth data.

192 citations


Patent
Yan Ye, Yiliang Bao
08 Jan 2008
TL;DR: In this article, the authors describe techniques for coding information in a scalable video coding (SVC) scheme that supports spatial scalability, including interpolating values for one or more pixel locations of the upsampled base layer residual video data.
Abstract: This disclosure describes techniques for coding information in a scalable video coding (SVC) scheme that supports spatial scalability. In one example, a method for coding video data with spatial scalability comprises upsampling base layer residual video data to a spatial resolution of enhancement layer residual video data, and coding the enhancement layer residual video data based on the upsampled base layer residual video data. In accordance with this disclosure, upsampling base layer residual video data includes interpolating values for one or more pixel locations of the upsampled base layer residual video data that correspond to locations between different base layer residual video data blocks.

89 citations


Patent
01 Oct 2008
TL;DR: In this paper, a scalable video bitstream may have an H.264/AVC compatible base layer and a scalable enhancement layer, where scalability refers to color bit-depth.
Abstract: A scalable video bitstream may have an H.264/AVC compatible base layer and a scalable enhancement layer, where scalability refers to color bit-depth. According to the invention, BL information is bit-depth upsampled using separate look-up tables for inverse tone mapping on two or more hierarchy levels, such as picture level, slice level or MB level. The look-up tables are differentially encoded and included in header information. Bit-depth upsampling is a process that increases the number of values that each pixel can have, corresponding to the pixel's color intensity. The upsampled base layer data are used to predict the collocated enhancement layer, based on said look-up tables. The upsampling is done at the encoder side and in the same manner at the decoder side, wherein the upsampling may refer to temporal, spatial and bit-depth characteristics. Thus, the bit-depth upsampling is compatible with texture upsampling.

61 citations


Proceedings Article
18 May 2008
TL;DR: An adaptive downsampling/upsampling video coding scheme is presented in order to achieve better video quality at low bit rates in terms of both measure and visual quality.
Abstract: To transmit video contents over limited bandwidth network, video bitstreams may need to reduce the bit rate by encoding with coarse quantization parameters at the expense of degrading quality. At low bit rates, better coding quality can be achieved by downsampling the video prior to compression and upsampling later after decompression. In this paper, we present an adaptive downsampling/upsampling video coding scheme in order to achieve better video quality at low bit rates in terms of both measure and visual quality. In particular, appropriate downsampling directions/ratios and quantization step sizes are adaptively decided for encoding different regions of video frame with the consideration of local contents. Experimental results have shown the better performance of the proposed scheme over the regular coding and downsampling-based coding scheme with fixed downscaling ratio. In addition, the proposed scheme significantly raises the critical bit rate below which a downsampling-based coding scheme outperforms the regular coding.

60 citations


Journal Article
TL;DR: This paper presents an approach that enables the direct layered manufacturing of point set surfaces based on adaptive slicing of moving least squares (MLS) surfaces, which bypasses the laborious surface reconstruction and avoids model conversion induced accuracy loss.
Abstract: Rapid advancement of 3D sensing techniques has made dense and accurate point clouds of an object readily available. The growing use of such scanned point sets in product design, analysis, and manufacturing necessitates research on direct processing of point set surfaces. In this paper, we present an approach that enables the direct layered manufacturing of point set surfaces. This new approach is based on adaptive slicing of moving least squares (MLS) surfaces. Salient features of this new approach include the following: (1) It bypasses the laborious surface reconstruction and avoids model conversion induced accuracy loss. (2) The resulting layer thickness and layer contours are adaptive to local curvatures, and thus it leads to better surface quality and more efficient fabrication. (3) The curvatures are computed from a set of closed-form formulas based on the MLS surface. The MLS surface naturally smooths the point cloud and allows upsampling and downsampling, and thus it is robust even for noisy or sparse point sets. Experimental results on both synthetic and scanned point sets are presented.

51 citations


Journal Article
TL;DR: By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, several new and promising approaches to LP-based audio modeling are obtained.
Abstract: While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.

26 citations
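
The statement in the abstract above that noise-free sinusoids are perfectly predicted by an all-pole model of order twice the number of sinusoids can be checked for the single-sinusoid case. The sketch relies on the standard identity x[n] = 2 cos(ω) x[n-1] - x[n-2] for x[n] = cos(ωn + φ). The function name and the chosen frequency and phase are assumptions.

```python
import math


def sinusoid_prediction_residual(omega, phase=0.3, n=64):
    """Residual of the order-2 linear predictor for one noise-free sinusoid."""
    x = [math.cos(omega * t + phase) for t in range(n)]
    # Predictor coefficients follow from the trigonometric identity
    # cos(a + w) + cos(a - w) = 2 cos(w) cos(a).
    a1, a2 = 2.0 * math.cos(omega), -1.0
    return [x[t] - (a1 * x[t - 1] + a2 * x[t - 2]) for t in range(2, n)]


residual = sinusoid_prediction_residual(0.7)
max_error = max(abs(e) for e in residual)
```

The residual vanishes up to rounding only in this noise-free setting; the abstract's point is that the result does not carry over once noise is included.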


Proceedings Article
01 Aug 2008
TL;DR: A novel wavelet-domain image upsampling algorithm based on iterative spatially adaptive filtering and the Block-Matching and 3D filtering technique, which results in high-quality upsampled images, with sharp edges and practically no artifacts.
Abstract: In this paper we present a novel wavelet-domain image upsampling algorithm based on iterative spatially adaptive filtering. A high-resolution image is reconstructed by alternating two procedures: spatially adaptive filtering and projection on the observation-constrained subspace. The Block-Matching and 3D filtering (BM3D) [3] technique is used to suppress ringing, and reconstruct missing wavelet detail coefficients. The BM3D algorithm exploits the local image statistics collected from similar blocks to extract local and non-local image features by 3D transform-domain shrinkage. It results in high-quality upsampled images, with sharp edges and practically no artifacts.

22 citations


Patent
02 Dec 2008
TL;DR: In this paper, the upsampling mechanism is used to interpolate a discrete-time input sample stream with time alignment utilizing the addition of randomized high frequency noise (i.e. dithering) in order to eliminate spectral regrowth spurs that would otherwise appear in the output after rounding.
Abstract: A novel and useful apparatus for and method of upsampling/interpolating a discrete-time input sample stream with time alignment utilizing the addition of randomized high frequency noise. The upsampling mechanism is an effective implementation of a second order interpolator that eliminates the need for a conventional filter as the filtering action is effectively built into the mechanism. The upsampling mechanism takes the derivative of the discrete-time input sample stream, thereby effectively providing another order of interpolation over a conventional interpolator. Before outputting the interpolated signal, an integrator takes the integral of the interpolated samples. Any processing performed between the derivative and integrator blocks effectively provides an additional order of interpolation. High frequency noise (i.e. dithering) is added to the differentiated samples in order to eliminate the spectral regrowth spurs that would otherwise appear in the output after rounding. Delay alignment is performed on the differentiated samples in order to time align both phase/frequency and amplitude samples that are processed on different paths.
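
The differentiate, process, integrate structure described in the abstract above can be sketched in a few lines. Here the processing between the two blocks is a simple sample-and-hold, which turns the chain into a linear interpolator. Dithering, rounding, and delay alignment are deliberately omitted, and the function name is an assumption, so this is an illustration of the principle rather than the patented mechanism.

```python
def upsample_diff_integrate(x, factor):
    """Upsample by differentiating, holding each difference, and integrating."""
    # Differentiator: first differences of the input stream.
    diffs = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    out = [float(x[0])]
    for d in diffs:
        # Hold each difference for `factor` output samples (scaled so the
        # total step is preserved), then integrate with a running sum.
        for _ in range(factor):
            out.append(out[-1] + d / factor)
    return out


interpolated = upsample_diff_integrate([0, 2, 1], 2)
```

A plain sample-and-hold applied directly to the input would give a staircase; placing it between the differentiator and the integrator yields straight-line segments, which is the extra order of interpolation the abstract mentions.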

Journal Article
TL;DR: It is shown that PCOFDM systems are a special case of precoded OFDM systems that offer advantageous complexity-performance trade-offs and a proposed partial spreading scheme results in a low complexity decoupled detector.
Abstract: This paper discusses the design and analysis of post coded OFDM (PC-OFDM) systems. Coded or precoded OFDM systems are generally employed to overcome the symbol recovery problem in uncoded OFDM systems. We show that PCOFDM systems are a special case of precoded OFDM systems that offer advantageous complexity-performance trade-offs. In particular, PC-OFDM systems introduce frequency diversity by manipulating the OFDM symbols in the time domain so that the computational complexity of the system can be significantly reduced. We discuss the design principles of PC-OFDM transmitter that uses upsampling operation and the spreading codes to introduce frequency diversity. We obtain the spreading code construction criterion for minimum error performance and give examples of spreading codes for PC-OFDM systems. We also describe the design of low-complexity receiver for PC-OFDM systems. In particular, our proposed partial spreading scheme results in a low complexity decoupled detector. The probability of error analysis of the receiver leads us to postulate different design criteria. We investigate different choices for detection algorithms suitable for PC-OFDM receiver and compare their performance through simulations over Rayleigh and IEEE UWB channels.

Patent
21 Jan 2008
TL;DR: An upscaling (upsampling) technique for digital images and video for presentation on a display is described.
Abstract: An upscaling (upsampling) technique for digital images and video for presentation on a display.

Patent
Yonghua Zhang, Lin Ma, Feng Wu
22 Sep 2008
TL;DR: Images are upsampled using a knowledge base derived from a plurality of high-quality training images; the knowledge base is used to refine the high-frequency component of a high-resolution, low-frequency image that has been interpolated from a low-resolution, full-frequency image.
Abstract: Images are upsampled using a knowledge base derived from a plurality of high-quality training images. The knowledge base is used to refine a high-frequency component including high-frequency aspects of a high-resolution, low-frequency image, interpolated from a low-resolution full-frequency image, into a high-frequency component. An enhancement step is performed without using a knowledge base to construct a high-compatibility component from the low-resolution, full-frequency image. The low-resolution, full-frequency image is combined with the coarse high-frequency component to yield an enhanced high-frequency component. A second knowledge base step is performed to construct an improved high-frequency component from the enhanced high-frequency component. The improved high-frequency component is blended with a high-resolution, low-frequency image to yield a high-resolution image.

Journal Article
TL;DR: This paper surveys common subsampling applications providing examples of the advantages and disadvantages of various approaches and outlines a novel chroma upsampling technique that minimizes erroneous out-of-gamut colors.
Abstract: Chroma subsampling is a lossy process often compounded by concatenation of dissimilar techniques. This paper surveys common subsampling applications providing examples of the advantages and disadvantages of various approaches. It also outlines a novel chroma upsampling technique that minimizes erroneous out-of-gamut colors.

Journal Article
TL;DR: This work outlines three approaches to decimate non-uniformly sampled signals, which are all based on interpolation, and computes an approximate Fourier transform, after which truncation and IFFT give the desired result.
Abstract: Decimating a uniformly sampled signal by a factor D involves low-pass antialias filtering with normalized cutoff frequency 1/D followed by picking out every Dth sample. Alternatively, decimation can be done in the frequency domain using the fast Fourier transform (FFT) algorithm, after zero-padding the signal and truncating the FFT. We outline three approaches to decimate non-uniformly sampled signals, which are all based on interpolation. The interpolation is done in different domains, and the intersample behavior does not need to be known. The first one interpolates the signal to a uniform sampling grid, after which standard decimation can be applied. The second one interpolates a continuous-time convolution integral that implements the antialias filter, after which every Dth sample can be picked out. The third, a frequency-domain approach, computes an approximate Fourier transform, after which truncation and IFFT give the desired result. Simulations indicate that the second approach is particularly useful. A thorough analysis is therefore performed for this case, using the assumption that the non-uniformly distributed sampling instants are generated by a stochastic process.
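
For the uniformly sampled case described at the start of the abstract above, frequency-domain decimation can be sketched as: transform, keep only the low-frequency bins, and transform back at the shorter length. The sketch below uses a naive DFT, treats the signal as periodic (so no zero-padding is applied), and zeroes the output Nyquist bin. Those simplifications and all function names are assumptions, and the non-uniform methods of the paper are not covered.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def decimate_fft(x, d):
    """Decimate a periodic, uniformly sampled real signal by an integer factor d."""
    n = len(x)
    if n % d != 0:
        raise ValueError("signal length must be a multiple of the decimation factor")
    m = n // d
    X = dft(x)
    Y = [0j] * m
    half = (m - 1) // 2
    # Truncation: keep signed frequencies -half..half; everything above acts
    # as the stop band of an ideal antialias filter.
    for k in range(-half, half + 1):
        Y[k % m] = X[k % n]
    # Dividing by n combines the 1/m of the inverse DFT with the 1/d
    # amplitude correction for the shorter length.
    return [v.real / n for v in dft(Y, sign=+1)]


# Low-frequency tone plus a tone that would alias if samples were simply dropped.
x = [math.cos(2 * math.pi * t / 16) + 0.5 * math.cos(2 * math.pi * 6 * t / 16)
     for t in range(16)]
y = decimate_fft(x, 2)
naive = x[::2]  # no antialias filtering
```

Comparing `y` with `naive` shows the role of the antialias step: the truncated version retains only the low-frequency tone, while plain sample dropping folds the high tone into the result.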

Proceedings Article
14 Oct 2008
TL;DR: The results showed that a number of phases between 16–24 is a good compromise to accurately estimate both ejection and regurgitant flows and a corrective linear model was proposed to reduce flow and velocity measurement errors in normal and pathological conditions.
Abstract: Magnetic resonance imaging is a very efficient tool for assessing velocity and flow in the cardiovascular system under normal and pathological conditions. However, this technique still has some limitations that produce different types of errors. In this study, velocities and flow were measured in vivo using the phase-contrast method to determine the optimal number of phases that minimizes these errors. The effect of velocity encoding upsampling was also investigated. The results showed that a number of phases between 16–24 is a good compromise to accurately estimate both ejection and regurgitant flows. Furthermore, a time shift effect caused by velocity encoding upsampling was found and a corrective linear model was proposed. These considerations may reduce flow and velocity measurement errors in normal and pathological conditions.

Proceedings Article
01 Nov 2008
TL;DR: In this paper, SIC detection with modified time-domain interpolation estimation is presented and its performance is compared to the existing time-domain and low-pass interpolation channel estimations via computer simulation; the results show that the modified method performs better than existing estimation schemes for BPSK and QPSK modulation.
Abstract: The performance of OFDM modulation degrades in a Doppler spread channel due to intercarrier interference (ICI). Multirate sampling theory and a detection scheme, called sequential interference cancellation (SIC), are applied to a standard OFDM system to reduce ICI. The increased sampling rate (upsampling) causes less ICI for the edge subcarriers. The SIC detection scheme recovers the data from the outer carriers first and then detection moves from the outer carriers to the inner carriers to lower the error floor. In this paper, SIC detection with modified time-domain interpolation estimation is presented and its performance is compared to the existing time-domain and low-pass interpolation channel estimations via computer simulation. The results show that the modified method performs better than existing estimation schemes for BPSK and QPSK modulation.

Patent
Sang-Hee Lee, Yi-Jen Chiu
30 Jan 2008
Abstract: Upsampling for video signals is described. In particular chroma pixels may be upsampled in a luminance, chrominance signal using motion adaptive approaches. In one example, the operations include selecting an absent chroma pixel of the current frame, developing a spatial candidate value for the absent chroma pixel using pixel values for nearby pixels in the current frame, developing a temporal candidate value for the absent chroma pixel using pixel values for nearby pixels in the previous and subsequent frames, computing a value for the absent chroma pixel by combining the spatial candidate value and the temporal candidate value, and producing an output video frame including the computed absent chroma pixel values.

Patent
02 Oct 2008
TL;DR: A computed-tomography method and apparatus are described in which flying focal spot x-ray interpolation interlacing is used.
Abstract: A method of computed-tomography and a computed-tomography apparatus where a flying focal spot x-ray interpolation interlacing is used. Weighted or non-weighted interlacing of zero values is performed, or interpolation interlacing is performed. The interpolation interlacing may be implemented as part of backprojection and/or may be a separate process prior to backprojection. In both cases interlacing is performed on post-logged convolved data. The interpolation interlacing may also be incorporated into different parts of the processing chain, such as before convolution.

Proceedings Article
22 Oct 2008
TL;DR: This paper presents an FPGA design of speech compression using different discrete wavelet transform (DWT) schemes, including the Daubechies DWT and the Daubechies lifting scheme DWT.
Abstract: Compared with the traditional digital signal processor, the field programmable gate array (FPGA) has advantages of configurable datapath, low cost, reprogrammability, and high performance. As a result, the FPGA device becomes more and more popular in the field of digital signal processing applications. This paper presents FPGA design of speech compression by using different discrete wavelet transform (DWT) schemes including the Daubechies DWT and the Daubechies lifting scheme DWT. In this design work, the audio CODEC chip is used to convert analog speech into the digital format. The digital streams can either be stored inside the SDRAM for DWT postprocessing or compressed in real time by using the Daubechies lifting scheme DWT. The low pass filtering part of the DWT result represents the compressed speech. It can be read back from the SDRAM, converted to analog signal and then played clearly in speakers after upsampling is performed.

Journal Article
TL;DR: A bit rate adaptive approach is proposed to consider downsampling certain views prior to encoding with relevant downscaling ratios, suggesting that up to 0.9 dB gain or 20% reduction in bit rate can be achieved, reducing the computational complexity in the encoder significantly at the same time.
Abstract: The effect of using downsampling for arbitrary views inside a multi-view sequence on the multi-view coding (MVC) efficiency is explored. A bit rate adaptive approach is proposed to consider downsampling certain views prior to encoding with relevant downscaling ratios. The inter-view references, if any, are downsampled to the same resolution and the decoded view is upsampled back to the original resolution. The results over several multi-view test sequences imply that up to 0.9 dB gain or 20% reduction in bit rate can be achieved, reducing the computational complexity in the encoder significantly at the same time.

Journal Article
TL;DR: To satisfy the steady state frequency response, the signals are extended and different extension methods are evaluated and it is demonstrated that anti-aliasing and anti-imaging filters significantly improve the tracking performance, however, the extension methods have little influence on the tracking accuracy.

Patent
11 Apr 2008
TL;DR: In this article, the principle that human eyes are far less sensitive to a chrominance component than to a luminance component is utilized, and a simpler filter is adopted for the chrominance components than that for the luminance components during upsampling in I_BLINTRA_Base interlayer prediction or residual samples image inter-layer prediction.
Abstract: The invention relates to video image compression technologies, and discloses a method and system for upsampling a spatial scalable coded video image so that during upsampling, computation complexity may be reduced while coding performance is substantially unchanged. In the invention, the principle that human eyes are far less sensitive to a chrominance component than to a luminance component is utilized, and a simpler filter is adopted for the chrominance components than that for the luminance components during upsampling in I_BLINTRA_Base inter-layer prediction or residual samples image inter-layer prediction, thereby effectively reducing calculation complexity while coding performance is substantially unchanged.

Journal Article
TL;DR: A layered resizing scheme is proposed based on the difference of short and long basis vector truncation, which can cater for user terminals' display limitation and channel bandwidth constraints, with additional scalability.
Abstract: As an efficient unitary transform, the discrete cosine transform (DCT) has been widely adopted in compression standards. Most compressed images and videos are stored in the DCT format, and from time to time, they need to be resized for various transmission channels and consumer terminals. In this paper, we investigate existing resizing schemes, focusing on the difference of short and long basis vector truncation. A layered resizing scheme is then proposed based on the above analysis, where compressed images are divided into low- and high-frequency layers. The DCT vectors of these two layers are truncated with different word lengths, and then form the elementary layer (EL) and the enhancement layer (EH) of the downsampled image. The EL and EH can be transmitted together or separately according to the bandwidth available. An upsampling scheme is also provided in this paper to recover visual details. Experimental results show improvements of the proposed approach over existing resizing schemes, which can be explained since its frequency response is closer to the ideal downsample filter. The new approach can cater for user terminals' display limitation and channel bandwidth constraints, with additional scalability.

01 Jun 2008
TL;DR: A new coordinate transformation operator is used to construct a synthesized coordinate map based on different exemplars at different scales, which is effective for upsampling texture-rich images, because the result preserves texture detail well.
Abstract: We synthesize a texture with different structures at different scales. Our technique is based on deterministic parallel synthesis allowing real-time processing on a GPU. A new coordinate transformation operator is used to construct a synthesized coordinate map based on different exemplars at different scales. The runtime overhead is minimal because this operator can be precalculated as a small lookup table. Our technique is effective for upsampling texture-rich images, because the result preserves texture detail well. In addition, a user can design a texture by coloring a low-resolution control image. This design tool can also be used for the interactive synthesis of terrain in the style of a particular exemplar, using the familiar ‘raise and lower’ airbrush to specify elevation.

Book Chapter
25 Jun 2008
TL;DR: Tests involving the re-enlargement of images downsampled with box filtering suggest that natural biquadratic histopolation is the best linear upsampling reconstructor.
Abstract: Interpreting pixel values as averages over abutting squares mimics the image capture process. Average Matching (AM) exact area resampling involves the construction of a surface with averages given by the pixel values; the surface is then averaged over new pixel areas. AM resampling approximately preserves local averages (error bounds are given). Also, original images are recovered by box filtering when the magnification factor is an integer in both directions. Natural biquadratic histosplines, which satisfy a minimal norm property like bicubic splines, are used to construct the AM surface. Recurrence relations associated with tridiagonal systems allow the computation of tensor B-Spline coefficients at modest cost and their storage in reduced precision with little accuracy loss. Pixel values are then obtained by multiplication by narrow band matrices computed from B-Spline antiderivatives. Tests involving the re-enlargement of images downsampled with box filtering suggest that natural biquadratic histopolation is the best linear upsampling reconstructor.

01 Jan 2008
TL;DR: The speech compression system design detail on how to interface DWT block with SDRAM and audio codec chip is given and compressed speech can be heard clearly with some introduced background noise.
Abstract: This paper presents the Discrete Wavelet Transform (DWT) for real-world speech compression design by using the Field Programmable Gate Array (FPGA) device. Compared with many reports which only focused on the DWT architecture design, this paper gives the speech compression system design detail on how to interface the DWT block with SDRAM and an audio codec chip. Speech compression was achieved by keeping only the approximation part of the DWT result. The compressed speech signal was read back after upsampling was performed. The resulting compressed speech can be heard clearly with some introduced background noise. Future work is to reduce the noise.

Patent
15 Jul 2008
TL;DR: In this article, a video signal processing apparatus includes a downsampling means which generates 4:2:0-format color difference first and second fields using a down-sampling low-pass filter, respectively, in a vertical direction.
Abstract: PROBLEM TO BE SOLVED: To prevent positions of color difference pixels from being shifted even when format conversion between 4:2:2 format and 4:2:0 format is repeated. SOLUTION: A video signal processing apparatus includes a downsampling means which generates 4:2:0-format color difference first and second fields by downsampling, in the vertical direction and with a respective downsampling low-pass filter, the pixels in 4:2:2-format color difference first and second fields. For the downsampling low-pass filter coefficients for the first field, the modulo-1 remainder of the group delay at frequency ω=0 is substantially equal to 0.25. An upsampling low-pass filter satisfies a perfect reconstruction condition, within a predetermined error tolerance range, together with the first-field downsampling low-pass filter. The sum of the group delay value, at frequency ω=0, of a normalized filter obtained by making the sum of the coefficients equal to 1 and the group delay value, at frequency ω=0, of the first downsampling low-pass filter is an integer within a predetermined error tolerance range. COPYRIGHT: (C)2010,JPO&INPIT

Journal Article
TL;DR: This paper aims to improve the compression efficiency of the LP detail layers through improved interlayer prediction and orthogonal spatial transforms, and the subsequent application of the spatial transforms to the new detail layers aims to achieve better energy compaction.
Abstract: Scalable representation of visual signals, such as image and video signals, has been a subject of active research since the early 1980s. Scalability allows the adaptation of the bit rate and/or the resolution of the transmitted data to the network bandwidth and/or the rendering capability of the receiving device. For many years, spatial scalability has been achieved through wavelets, but recently the Laplacian pyramid (LP) has become an alternative choice because of reduced aliasing in the lower resolutions. In this paper, we focus on the coding efficiency of the LP with a view to transmitting it over a communication channel. In particular, we aim to improve the compression efficiency of the LP detail layers through improved interlayer prediction and orthogonal spatial transforms. First, we consider an LP in the open-loop configuration and propose to improve its rate-distortion performance by compressing it to a critically sampled representation. We derive four different orthogonal spatial transforms from the upsampling and downsampling filters that can achieve this representation, and apply them on the detail layers. The application of these transforms to the detail layers renders a fixed number of transform coefficients either zero or redundant, thus making their transmission unnecessary. Then we consider the compression of an LP in the closed-loop configuration through similar spatial transforms. Because of the introduction of quantization in the prediction loop, these spatial transforms applied on the detail layers do not produce the same number of zero or redundant transform coefficients as in the open-loop case. Nevertheless, the insight obtained from the open-loop coding leads us to enhance the interlayer prediction, and the subsequent application of the spatial transforms to the new detail layers aims to achieve better energy compaction.
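
The open-loop Laplacian pyramid mentioned in the abstract above can be sketched in 1D with one level. The pair-averaging downsampling filter and sample-repetition upsampling filter used here are assumptions chosen for simplicity; they are not the filters or the orthogonal transforms derived in the paper.

```python
def downsample(x):
    """Average adjacent pairs (assumes an even-length input)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]


def upsample(c):
    """Repeat every coarse sample twice."""
    out = []
    for v in c:
        out.extend([v, v])
    return out


def lp_analyze(x):
    """One open-loop pyramid level: coarse layer plus prediction residual."""
    coarse = downsample(x)
    prediction = upsample(coarse)
    detail = [a - b for a, b in zip(x, prediction)]
    return coarse, detail


def lp_synthesize(coarse, detail):
    """Invert lp_analyze by adding the detail layer back to the prediction."""
    return [p + d for p, d in zip(upsample(coarse), detail)]


signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 0.0, 4.0]
coarse, detail = lp_analyze(signal)
restored = lp_synthesize(coarse, detail)
```

With these particular filters the representation holds 12 numbers for 8 input samples, and each pair of detail coefficients sums to zero, so half of the detail layer is redundant. That gives a small-scale picture of the redundancy that a critically sampled representation would remove.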