
Showing papers in "IEEE Transactions on Circuits and Systems for Video Technology in 1996"


Journal Article•DOI•
TL;DR: The image coding results, calculated from actual file sizes and images reconstructed by the decoding algorithm, are either comparable to or surpass previous results obtained through much more sophisticated and computationally complex methods.
Abstract: Embedded zerotree wavelet (EZW) coding, introduced by Shapiro (see IEEE Trans. Signal Processing, vol.41, no.12, p.3445, 1993), is a very effective and computationally simple technique for image compression. We offer an alternative explanation of the principles of its operation, so that the reasons for its excellent performance can be better understood. These principles are partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit plane transmission, and exploitation of self-similarity across different scales of an image wavelet transform. Moreover, we present a new and different implementation based on set partitioning in hierarchical trees (SPIHT), which provides even better performance than our previously reported extension of EZW that surpassed the performance of the original EZW. The image coding results, calculated from actual file sizes and images reconstructed by the decoding algorithm, are either comparable to or surpass previous results obtained through much more sophisticated and computationally complex methods. In addition, the new coding and decoding procedures are extremely fast, and they can be made even faster, with only small loss in performance, by omitting entropy coding of the bit stream by the arithmetic code.
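
For illustration, a minimal sketch of the magnitude-ordering / bit-plane idea that underlies EZW and SPIHT; the set-partitioning trees and sorting/refinement passes of the paper are omitted, and the function name and toy coefficients are purely illustrative.

```python
import numpy as np

def bitplane_significance(coeffs, num_planes=4):
    """Per bit plane (descending threshold 2^n), list the coefficients that
    first become significant; an embedded coder transmits these positions
    and signs before refining already-significant coefficients."""
    c = np.asarray(coeffs, dtype=float)
    n_max = int(np.floor(np.log2(np.max(np.abs(c)))))
    significant = np.zeros(c.shape, dtype=bool)
    passes = []
    for n in range(n_max, n_max - num_planes, -1):
        threshold = 2.0 ** n
        newly = (~significant) & (np.abs(c) >= threshold)
        passes.append((threshold, np.flatnonzero(newly).tolist()))
        significant |= newly
    return passes

# toy "wavelet coefficients"
for threshold, idx in bitplane_significance([63, -34, 49, 10, 7, 13, -12, 7]):
    print(f"threshold {threshold}: newly significant {idx}")
```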

5,890 citations


Journal Article•DOI•
TL;DR: Simulation results show that the proposed 4SS performs better than the well-known three-step search and has similar performance to the new three-step search (N3SS) in terms of motion compensation errors.
Abstract: Based on the center-biased motion vector distribution characteristic of real-world image sequences, a new four-step search (4SS) algorithm with a center-biased checking point pattern for fast block motion estimation is proposed in this paper. A halfway-stop technique is employed in the new algorithm, which uses 2 to 4 searching steps, and the total number of checking points varies from 17 to 27. Simulation results show that the proposed 4SS performs better than the well-known three-step search and has similar performance to the new three-step search (N3SS) in terms of motion compensation errors. In addition, the 4SS also reduces the worst-case computational requirement from 33 to 27 search points and the average computational requirement from 21 to 19 search points, as compared with N3SS.
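
A rough sketch of the search pattern described above, with the halfway-stop rule; `cost(dx, dy)` stands in for any block-matching error such as SAD, and the toy error surface at the end is only for demonstration.

```python
def four_step_search(cost, max_disp=7):
    """Center-biased search: up to three coarse 3x3 steps of size 2 (a 5x5
    window), stopping early when the minimum stays at the center, then one
    final 3x3 step of size 1."""
    def probe(center, step):
        best, best_cost = center, cost(*center)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand = (center[0] + dx, center[1] + dy)
                if cand == center or max(abs(cand[0]), abs(cand[1])) > max_disp:
                    continue
                c = cost(*cand)
                if c < best_cost:
                    best, best_cost = cand, c
        return best, best_cost

    center = (0, 0)
    for _ in range(3):                      # coarse steps
        new_center, _ = probe(center, 2)
        if new_center == center:            # halfway stop: minimum stayed centered
            break
        center = new_center
    return probe(center, 1)                 # final fine step

# toy quadratic error surface with its minimum at displacement (3, -2)
print(four_step_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2))  # ((3, -2), 0)
```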

1,619 citations


Journal Article•DOI•
Lurng-Kuo Liu, Ephraim Feig
TL;DR: A block-based gradient descent search (BBGDS) algorithm is proposed to perform block motion estimation in video coding; it provides competitive performance with reduced computational complexity.
Abstract: A block-based gradient descent search (BBGDS) algorithm is proposed in this paper to perform block motion estimation in video coding. The BBGDS evaluates the values of a given objective function starting from a small centralized checking block. The minimum within the checking block is found, and the gradient descent direction where the minimum is expected to lie is used to determine the search direction and the position of the new checking block. The BBGDS is compared with full search (FS), three-step search (TSS), one-at-a-time search (OTS), and new three-step search (NTSS). Experimental results show that the proposed technique provides competitive performance with reduced computational complexity.
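
A minimal sketch of the descent loop described above: evaluate a small 3×3 checking block, move to its minimum, and stop when the center itself is the minimum. The `cost(dx, dy)` callback is an assumed stand-in for the objective function (e.g., SAD of a candidate block).

```python
def bbgds(cost, max_disp=7):
    """Block-based gradient descent search over integer displacements."""
    pos, pos_cost = (0, 0), cost(0, 0)
    while True:
        best, best_cost = pos, pos_cost
        for dy in (-1, 0, 1):               # 3x3 checking block around `pos`
            for dx in (-1, 0, 1):
                cand = (pos[0] + dx, pos[1] + dy)
                if cand == pos or max(abs(cand[0]), abs(cand[1])) > max_disp:
                    continue
                c = cost(*cand)
                if c < best_cost:
                    best, best_cost = cand, c
        if best == pos:                     # center is the minimum: converged
            return pos, pos_cost
        pos, pos_cost = best, best_cost     # descend toward the minimum

print(bbgds(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2))  # ((3, -2), 0)
```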

638 citations


Journal Article•DOI•
TL;DR: An efficient solution is proposed in which the optimum combination of macroblock modes and the associated mode parameters is jointly selected so as to minimize the overall distortion for a given bit-rate budget; the method is successfully applied to the emerging H.263 video coding standard.
Abstract: This paper addresses the problem of encoder optimization in a macroblock-based multimode video compression system. An efficient solution is proposed in which, for a given image region, the optimum combination of macroblock modes and the associated mode parameters is jointly selected so as to minimize the overall distortion for a given bit-rate budget. Conditions for optimizing the encoder operation are derived within a rate-constrained product code framework using a Lagrangian formulation. The instantaneous rate of the encoder is controlled by a single Lagrange multiplier, which makes the method amenable to mobile wireless networks with time-varying capacity. When rate and distortion dependencies are introduced between adjacent blocks (as is the case when the motion vectors are differentially encoded and/or overlapped block motion compensation is employed), the ensuing encoder complexity is surmounted using dynamic programming. Due to the generic nature of the algorithm, it can be successfully applied to the problem of encoder control in numerous video coding standards, including H.261, MPEG-1, and MPEG-2. Moreover, the strategy is especially relevant for very low bit rate coding over wireless communication channels, where the low dimensionality of the images associated with these bit rates makes real-time implementation very feasible. Accordingly, in this paper, the method is successfully applied to the emerging H.263 video coding standard, with excellent results at rates as low as 8.0 kb/s. Direct comparisons with the H.263 test model, TMN5, demonstrate that gains in peak signal-to-noise ratio (PSNR) are achievable over a wide range of rates.
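
The core of the rate-constrained mode decision can be illustrated with the usual Lagrangian cost J = D + λR; the mode names and numbers below are invented for demonstration, and the inter-block dependencies that the paper resolves with dynamic programming are ignored.

```python
def select_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, rate_bits).
    Return the candidate minimizing J = D + lambda * R."""
    return min(candidates, key=lambda m: m[1] + lam * m[2])

modes = [("SKIP", 950.0, 1), ("INTER", 420.0, 96), ("INTRA", 310.0, 230)]
print(select_mode(modes, lam=1.0))   # small lambda favours distortion: INTER wins
print(select_mode(modes, lam=10.0))  # large lambda favours rate: SKIP wins
```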

408 citations


Journal Article•DOI•
Wei Ding, Bede Liu
TL;DR: A feedback re-encoding method with a rate-quantization model, which can be adapted to changes in picture activities, is developed and used for quantization parameter selection at the frame and slice level.
Abstract: For MPEG video coding and recording applications, it is important to select the quantization parameters at the slice and macroblock levels to produce consistent image quality for a given bit budget. A well-designed rate control strategy can improve the overall image quality for video transmission over a constant-bit-rate channel and fulfil the editing requirement of video recording, where a certain number of new pictures are encoded to replace consecutive frames on the storage media using, at most, the same number of bits. We developed a feedback re-encoding method with a rate-quantization model, which can be adapted to changes in picture activities. The model is used for quantization parameter selection at the frame and slice level. The extra computations needed are modest. Experiments show the accuracy of the model and the effectiveness of the proposed rate control method. A new bit allocation algorithm is also proposed for MPEG video coding.
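
As an illustration only, a generic feedback rate-quantization model of the R ≈ X/Q form (not the specific model derived in the paper); the activity parameter X is refit from the bits the encoder actually produced on the previous pass.

```python
class RQModel:
    """Toy rate-quantization model: bits ~ X / Q, with feedback adaptation."""
    def __init__(self, x_init=100_000.0):
        self.x = x_init                      # picture-activity parameter

    def choose_q(self, target_bits, q_min=1, q_max=31):
        q = self.x / max(float(target_bits), 1.0)
        return int(min(max(round(q), q_min), q_max))

    def update(self, actual_bits, q_used):
        # blend the old estimate with the one implied by the last encoding pass
        self.x = 0.5 * self.x + 0.5 * actual_bits * q_used

model = RQModel()
q = model.choose_q(target_bits=20_000)        # quantizer for the next slice/frame
model.update(actual_bits=23_500, q_used=q)    # re-encode feedback adapts the model
```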

377 citations


Journal Article•DOI•
TL;DR: This paper presents several bitstream scaling methods for the purpose of reducing the rate of constant bit rate (CBR) encoded bitstreams and shows typical performance trade-offs of the methods.
Abstract: The idea of moving picture expert group (MPEG) bitstream scaling relates to altering or scaling the amount of data in a previously compressed MPEG bitstream. The new scaled bitstream conforms to constraints that were neither known nor considered when the original precoded bitstream was constructed. Numerous applications for video transmission and storage are being developed based on the MPEG video coding standard. Applications such as video on demand, trick-play track on digital video tape recorders (VTR's), and extended-play recording on VTR's motivate the idea of bitstream scaling. In this paper, we present several bitstream scaling methods for the purpose of reducing the rate of constant bit rate (CBR) encoded bitstreams. The different methods have varying hardware implementation complexity and associated trade-offs in resulting image quality. Simulation results on MPEG test sequences demonstrate the typical performance trade-offs of the methods.

351 citations


Journal Article•DOI•
TL;DR: This paper compares the performance of these techniques (excluding temporal scalability) under various loss rates using realistic length material and discusses their relative merits.
Abstract: Transmission of compressed video over packet networks with nonreliable transport benefits when packet loss resilience is incorporated into the coding. One promising approach to packet loss resilience, particularly for transmission over networks offering dual priorities such as ATM networks, is based on layered coding, which uses at least two bitstreams to encode video. The base-layer bitstream, which can be decoded independently to produce a lower quality picture, is transmitted over a high priority channel. The enhancement-layer bitstream(s) contain the less important information, so that packet losses there are more easily tolerated. The MPEG-2 standard provides four methods to produce a layered video bitstream: data partitioning, signal-to-noise ratio (SNR) scalability, spatial scalability, and temporal scalability. Each was included in the standard in part for motivations other than loss resilience. This paper compares the performance of these techniques (excluding temporal scalability) under various loss rates using realistic-length material and discusses their relative merits. Nonlayered MPEG-2 coding gives generally unacceptable video quality for packet loss ratios of 10^-3 for small packet sizes. Better performance can be obtained using layered coding and dual-priority transmission. With data partitioning, cell loss ratios of 10^-4 in the low-priority layer are definitely acceptable, while for SNR scalable encoding, cell loss ratios of 10^-3 are generally invisible. Spatial scalable encoding can provide even better visual quality under packet losses; however, it has a high implementation complexity.

227 citations


Journal Article•DOI•
TL;DR: This work considers the transmission of QCIF resolution (176×144 pixels) video signals over wireless channels at transmission rates of 64 kb/s and below and proposes an automatic repeat request (ARQ) error control technique to retransmit erroneous data frames.
Abstract: We consider the transmission of QCIF resolution (176×144 pixels) video signals over wireless channels at transmission rates of 64 kb/s and below. The bursty nature of the errors on the wireless channel requires careful control of transmission performance without unduly increasing the overhead for error protection. A dual-rate source coder is presented that adaptively selects a coding rate according to the current channel conditions. An automatic repeat request (ARQ) error control technique is employed to retransmit erroneous data frames. The source coding rate is selected based on the occupancy level of the ARQ transmission buffer. Error detection followed by retransmission results in less overhead than forward error correction for the same quality. Simulation results are provided for the statistics of the frame-error bursts of the proposed system over code division multiple access (CDMA) channels with average bit error rates of 10^-3 to 10^-4.
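
A sketch of the buffer-occupancy-driven rate switch described above; the thresholds, rates, and hysteresis rule are illustrative assumptions rather than the paper's parameters.

```python
def select_coding_rate(buffer_occupancy, buffer_size, current_rate,
                       high_rate=48_000, low_rate=24_000,
                       upper=0.75, lower=0.25):
    """Pick the source-coder rate from the ARQ transmission-buffer fullness:
    a full buffer (many pending retransmissions) means a bad channel."""
    fill = buffer_occupancy / buffer_size
    if fill > upper:
        return low_rate          # channel is bad: back off to the low coding rate
    if fill < lower:
        return high_rate         # buffer drained: return to the high coding rate
    return current_rate          # in between: keep the current rate (hysteresis)
```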

176 citations


Journal Article•DOI•
TL;DR: The algorithm has been designed mainly for 50 Hz to 75 Hz frame rate up-conversion with applications in a multimedia environment, but it can also be used in advanced television receivers to remove artifacts due to low scan rate.
Abstract: A frame interpolation algorithm for frame rate up-conversion of progressive image sequences is proposed. The algorithm is based on simple motion compensation and linear interpolation. A motion vector is searched for each pixel in the interpolated image and the resulting motion field is median filtered to remove inconsistent vectors. Averaging along the motion trajectory is used to produce the interpolated pixel values. The main novelty of the proposed method is the motion compensation algorithm which has been designed with low computational complexity as an important criterion. Subsampled blocks are used in block matching and the vector search range is constrained to the most likely motion vectors. Simulation results show that good visual quality has been obtained with moderate complexity. The algorithm has been designed mainly for 50 Hz to 75 Hz frame rate up-conversion with applications in a multimedia environment, but it can also be used in advanced television receivers to remove artifacts due to low scan rate.
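
A simplified sketch of averaging along the motion trajectory for an interpolated frame at temporal position t; the block-matching search and the median filtering of the vector field described above are assumed to have been done already, and the array layout is an assumption.

```python
import numpy as np

def interpolate_frame(prev, nxt, mvs, t=0.5):
    """prev, nxt: grayscale frames; mvs[y, x] = (dy, dx) motion from prev to nxt,
    assigned to pixel (y, x) of the frame being interpolated at time t in (0, 1)."""
    H, W = prev.shape
    out = np.empty((H, W), dtype=float)
    for y in range(H):
        for x in range(W):
            dy, dx = mvs[y, x]
            # sample the trajectory backwards into prev and forwards into nxt
            py = min(max(int(round(y - t * dy)), 0), H - 1)
            px = min(max(int(round(x - t * dx)), 0), W - 1)
            ny = min(max(int(round(y + (1 - t) * dy)), 0), H - 1)
            nx = min(max(int(round(x + (1 - t) * dx)), 0), W - 1)
            out[y, x] = (1 - t) * prev[py, px] + t * nxt[ny, nx]
    return out
```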

169 citations


Journal Article•DOI•
TL;DR: Simulation results show that acceptable visual quality can be maintained when transmitting video sequences at low bit rates over wireless channels with high error rates, and that the distortion due to erroneous transmission of coded data can be effectively suppressed.
Abstract: Visual communication over wireless channels is becoming important in multimedia. Because of the limited bandwidth and high error rates of the wireless channel, the video codec should be designed to have high coding efficiency, maintaining acceptable visual quality at low bit rates, and robustness, suppressing the distortion due to transmission errors. The coding efficiency of a 3D subband video codec is optimized by removing not only the redundancy due to spatial and temporal correlation but also perceptually insignificant components from video signals. Unequal error protection is applied to the source code bits of different perceptual importance. An error concealment method is employed to hide the distortion due to erroneous transmission of perceptually important signals. The evaluation of each signal's perceptual importance is made first by measuring the just-noticeable distortion (JND) profile as the perceptual redundancy inherent in video signals, and then by allocating JND energy to signals of different subbands according to the sensitivity of human visual responses to spatio-temporal frequencies. Simulation results show that acceptable visual quality can be maintained in transmitting video sequences at low bit rates (<64 kbps) over wireless channels with high error rates (up to BER = 10^-2), and that the distortion due to erroneous transmission of coded data can be effectively suppressed. In the simulation, the noisy channel is assumed to be corrupted by random errors, whose rate depends on the average strength of the received wave, and by burst errors due to Rayleigh fading.

164 citations


Journal Article•DOI•
TL;DR: A new image coding approach in which a 4-ary arithmetic coder is used to represent significant coefficient values and the lengths of zero runs between coefficients, which involves much lower addressing complexity than other algorithms such as zerotree coding.
Abstract: We describe a new image coding approach in which a 4-ary arithmetic coder is used to represent significant coefficient values and the lengths of zero runs between coefficients. This algorithm works by raster scanning within subbands, and therefore involves much lower addressing complexity than other algorithms such as zerotree coding that require the creation and maintenance of lists of dependencies across different decomposition levels. Despite its simplicity, and the fact that these dependencies are not explicitly utilized, the algorithm presented here is competitive with the best enhancements of zerotree coding. In addition, it performs comparably with adaptive subband splitting approaches that involve much higher implementation complexity.
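
A sketch of the symbol stream such a coder consumes: raster-scan a quantized subband and emit zero-run lengths followed by significant values (the 4-ary arithmetic coder itself is not shown, and the pair representation is a simplification).

```python
import numpy as np

def run_length_symbols(subband):
    """Return (zero_run_length, value) pairs from a raster scan of one subband;
    a trailing all-zero run is emitted with value None."""
    symbols, run = [], 0
    for v in np.asarray(subband).ravel():        # raster scan within the subband
        if v == 0:
            run += 1
        else:
            symbols.append((run, int(v)))        # run of zeros, then a significant value
            run = 0
    if run:
        symbols.append((run, None))
    return symbols

print(run_length_symbols([[0, 0, 5], [0, -3, 0], [0, 0, 0]]))
# [(2, 5), (1, -3), (4, None)]
```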

Journal Article•DOI•
TL;DR: This paper selects the most representative pixels in each block, based on image content, for the matching criterion, exploiting the fact that high activity in the luminance signal, such as edges and texture, contributes most to the matching criterion.
Abstract: A new adaptive technique based on pixel decimation for the estimation of motion vectors is presented. In the traditional approach, uniform pixel decimation is used. Since some of the pixels in each block do not enter into the matching criterion, this approach limits the accuracy of the motion vector. In this paper, we select the most representative pixels in each block, based on image content, for the matching criterion. This exploits the fact that high activity in the luminance signal, such as edges and texture, contributes most to the matching criterion. Our approach compensates for the drawback of standard pixel decimation techniques. Computer simulations show that this technique approaches the performance of exhaustive search with a significant reduction in computation.
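
A possible sketch of content-adaptive pixel decimation: keep only the pixels with the highest local activity and evaluate the matching criterion on those. The gradient-magnitude activity measure and the fraction kept are assumptions for illustration, not the paper's selection rule.

```python
import numpy as np

def representative_pixels(block, keep_fraction=0.25):
    """Return the indices of the most 'active' pixels (edges/texture) in a block."""
    gy, gx = np.gradient(block.astype(float))
    activity = np.abs(gx) + np.abs(gy)
    k = max(1, int(keep_fraction * block.size))
    flat = np.argsort(activity.ravel())[-k:]       # k highest-activity pixels
    return np.unravel_index(flat, block.shape)

def decimated_sad(cur_block, ref_block, sel):
    """SAD evaluated only on the selected (representative) pixel positions."""
    return np.abs(cur_block[sel].astype(int) - ref_block[sel].astype(int)).sum()
```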

Journal Article•DOI•
TL;DR: This work presents a block motion estimation scheme which is based on matching of integral projections of motion blocks with those of the search area in the previous frame, and takes advantage of the similarity of motion vectors in adjacent blocks in typical imagery by subsampling the motion vector field.
Abstract: Several efficient techniques have previously been proposed to reduce the computational burden of block matching for motion estimation in video coding. The goal is efficient motion estimation with minimal error in the motion-compensated predicted image. We present a block motion estimation scheme which is based on matching of integral projections of motion blocks with those of the search area in the previous frame. Like many other techniques, ours operates in a sequence of decreasing search radii, but it performs an exhaustive search at each level of the hierarchy. The projection method is much less computationally costly than block matching and has a prediction accuracy of competitive quality with both full block matching and other efficient techniques. Our algorithm also takes advantage of the similarity of motion vectors in adjacent blocks in typical imagery by subsampling the motion vector field. It has the added advantage of allowing parallel computation of vertical and horizontal displacements.
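
A minimal sketch of matching by integral projections: a block and a candidate are compared through the SAD of their row sums and of their column sums rather than over all pixel pairs.

```python
import numpy as np

def projection_cost(block_a, block_b):
    """Matching cost from horizontal and vertical integral projections."""
    a = block_a.astype(int)
    b = block_b.astype(int)
    row_cost = np.abs(a.sum(axis=1) - b.sum(axis=1)).sum()   # horizontal projections
    col_cost = np.abs(a.sum(axis=0) - b.sum(axis=0)).sum()   # vertical projections
    return row_cost + col_cost
```

For a 16×16 block this reduces each candidate comparison from 256 pixel differences to 32 projection differences, once the projections of the search area have been computed.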

Journal Article•DOI•
TL;DR: An adaptive technique for scanning rate conversion and interpolation that performs better than the edge-based line average algorithm, especially for an image with more horizontal edges is proposed.
Abstract: An adaptive technique for scanning rate conversion and interpolation is proposed. This technique performs better than the edge-based line average algorithm, especially for images with more horizontal edges. Moreover, it is easy to implement, and a simple VLSI architecture is proposed. Computer simulation shows that a 37.0 dB image can be obtained with the proposed technique, while the edge-based line average algorithm achieves only 35.2 dB.

Journal Article•DOI•
C. Brown, K. Feher
TL;DR: The result is that an average video frame rate of up to 14 frames per second can be supported within a single TDMA time slot, doubling the number of conventional GMSK multimedia transmissions per GSM channel.
Abstract: A reconfigurable global mobile standard (GSM) compatible radio modem interface which doubles the number of simultaneous video and voice transmissions per channel is presented for personal communication systems (PCSs). The result is that an average video frame rate of up to 14 frames per second (fps) can be supported within a single TDMA time slot, doubling the number of conventional GMSK multimedia transmissions per GSM channel. The design employs an in-circuit reconfigurable (ICR) cross-correlated Feher's quadrature phase shift keyed (FQPSK) signal processing technique (see Englewood Cliffs, NJ: Prentice-Hall, 1995) to support a data rate of 357.5 kb/s in a 200 kHz bandlimited GSM channel. This network modem is reconfigured in-circuit for high bit rate data transmission or voice operation that is compatible with existing GSM equipment. Spectrum efficiency η_f (b/s/Hz) is investigated in a nonlinear amplified (NLA) environment, providing a 6 to 9 dB advantage in power efficiency for increased battery life of hand-held terminals. Results show that RF power efficient nonlinear amplified spectrum efficiency is increased from 1.35 b/s/Hz to 1.85 b/s/Hz. Bit error rate (BER) performance is evaluated in Gaussian and Rayleigh fading channels, and the merits of coherent demodulation in microcellular PCS are examined. Our results show that the GSM compatible configuration of a specific cross-correlated FQPSK-KF implementation offers up to 2 dB improvement in Eb/N0 over conventional GMSK with BTb = 0.3. The PCS network capacity η_T (Erlangs/Hz·m^2) may be increased 37% over GMSK with BTb = 0.3.

Journal Article•DOI•
TL;DR: This paper explores the use of the deformable mesh structure for motion/shape analysis and synthesis in an image sequence and presents algorithms for the analysis problem, including scene-adaptive mesh generation and node tracking over successive frames.
Abstract: For pt.I see ibid., vol.6, no.6, p.636-46 (1996). This paper explores the use of the deformable mesh structure for motion/shape analysis and synthesis in an image sequence. We present algorithms for the analysis problem, including scene-adaptive mesh generation and node tracking over successive frames. We also describe a region-based video coder that integrates the analysis and synthesis algorithms presented. The coder describes each region by an ensemble of connected quadrilateral elements embedded in a mesh structure. For each region, its shape and texture are described by the nodal positions and image functions of the elements in this region in an initial frame, while its motion (including shape deformation) is characterized by the nodal trajectories in the following frames, which are in turn specified by a few motion parameters. This coder has been applied to a typical common intermediate format (CIF) resolution, head-and-shoulder type sequence. The visual quality is significantly better than the H.263-TMN4 algorithm at about 50 kb/s (for the luminance component only, 30 Hz).

Journal Article•DOI•
Yao Wang, O. Lee
TL;DR: It is shown that the concepts of shape functions and master elements are crucial for developing computationally efficient algorithms for both the analysis and synthesis problems.
Abstract: This paper explores the use of a deformable mesh (also known as the control grid) structure for motion analysis and synthesis in an image sequence. We focus on the synthesis problem, i.e., how to interpolate an image function given nodal positions and values and how to predict a present image frame from a reference one given nodal displacements between the two images. For this purpose, we review the fundamental theory and numerical techniques that have been developed in the finite element method for function approximation and mapping using a mesh structure. Specifically, we focus on (i) the use of shape functions for node-based function interpolation and mapping; and (ii) the use of regular master elements to simplify numerical calculations involved in dealing with irregular mesh structures. In addition to a general introduction that is applicable to an arbitrary mesh structure, we also present specific results for triangular and quadrilateral mesh structures, which are the most useful two-dimensional (2-D) meshes. Finally, we describe how to apply the above results for motion compensated frame prediction and interpolation. It is shown that the concepts of shape functions and master elements are crucial for developing computationally efficient algorithms for both the analysis and synthesis problems.
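
For illustration, the standard bilinear shape functions on the quadrilateral master element [-1, 1] × [-1, 1], used for node-based interpolation as discussed above; this is a textbook finite-element construction, shown only as a sketch.

```python
import numpy as np

def quad_shape_functions(xi, eta):
    """Bilinear shape functions N1..N4 of the 4-node master element."""
    return 0.25 * np.array([(1 - xi) * (1 - eta),   # node 1 at (-1, -1)
                            (1 + xi) * (1 - eta),   # node 2 at (+1, -1)
                            (1 + xi) * (1 + eta),   # node 3 at (+1, +1)
                            (1 - xi) * (1 + eta)])  # node 4 at (-1, +1)

def interpolate(nodal_values, xi, eta):
    """Interpolate nodal values (or map nodal coordinates, if given a 4x2 array)
    at master-element coordinates (xi, eta)."""
    return quad_shape_functions(xi, eta) @ np.asarray(nodal_values, dtype=float)

print(interpolate([10.0, 20.0, 30.0, 40.0], 0.0, 0.0))   # element centroid: 25.0
```

The same routine maps a point of the master element into an arbitrary (possibly irregular) quadrilateral when the nodal values are the four node coordinates, which is what makes the master-element formulation convenient for deformed meshes.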

Journal Article•DOI•
TL;DR: A fast hierarchical feature matching-motion estimation scheme (HFM-ME) that can be used in H.263, H.261, MPEG 1, MPEG 2, and HDTV applications, where the sign truncated feature is defined and used for block template matching, as opposed to the pixel intensity values used in conventional block matching methods.
Abstract: This paper presents a fast hierarchical feature matching-motion estimation scheme (HFM-ME) that can be used in H.263, H.261, MPEG 1, MPEG 2, and HDTV applications. In the HFM-ME scheme, the sign truncated feature (STF) is defined and used for block template matching, as opposed to the pixel intensity values used in conventional block matching methods. The STF extraction process can be considered as a zero-crossing phase detection with the mean as the bias and binary sign pattern as the phase deviation. Using the STF definition, a data block can be represented by a mean and a set of binary features with a much reduced data set. The block matching motion estimation is then divided into mean matching and binary phase matching. The proposed technique enables a significant reduction in computational complexity compared with the conventional full-search block matching ME because binary phase matching only involves Boolean logic operations. This feature also significantly reduces the data transfer time between the frame buffer and motion estimator. The proposed HFM-ME algorithm is implemented and compared with the conventional full-search block matching schemes. Our test results using three full-motion MPEG sequences indicate that the performance of the HFM-ME is comparable with the full-search block matching under the same search ranges, however, HFM-ME can be implemented about 64 times faster than the conventional full-search schemes. The proposed scheme can be combined with other fast algorithms to further reduce the computational complexity, at the expense of picture quality.
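
A rough sketch of the sign-truncated-feature idea: summarize a block by its mean plus the binary pattern of pixels at or above the mean, then compare candidates by a mean difference plus a Hamming distance computed with Boolean operations. The weighting of the two terms is an illustrative assumption.

```python
import numpy as np

def stf(block):
    """Sign-truncated feature: (mean, binary sign pattern) of a 2-D block."""
    m = float(block.mean())
    return m, (block >= m)

def stf_cost(block_a, block_b, w_mean=1.0, w_sign=4.0):
    """Mean matching plus binary phase (sign-pattern) matching."""
    ma, sa = stf(block_a)
    mb, sb = stf(block_b)
    hamming = int(np.count_nonzero(sa ^ sb))    # Boolean XOR + popcount
    return w_mean * abs(ma - mb) + w_sign * hamming
```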

Journal Article•DOI•
TL;DR: This paper investigates a modified DCT computation scheme, to be called the subband DCT (SB-DCT), that provides a simple, efficient solution to the reduction of the block artifacts while achieving faster computation.
Abstract: The discrete cosine transform (DCT) is well known for its highly efficient coding performance and is widely used in many image compression applications. However, in low bit rate coding, it produces undesirable block artifacts that are visually not pleasing. In addition, in many practical applications, faster computation and easier VLSI implementation of DCT coefficients are also important issues. The removal of the block artifacts and faster DCT computation are therefore of practical interest. In this paper, we investigate a modified DCT computation scheme, to be called the subband DCT (SB-DCT), that provides a simple, efficient solution to the reduction of the block artifacts while achieving faster computation. We have applied the new approach for the low bit rate coding and decoding of images. Simulation results on real images have verified the improved performance obtained using the proposed method over the standard JPEG method.

Journal Article•DOI•
TL;DR: Data structures for highly scalable compressed video are described, which are able to support simple, generic scaling approaches for both constant bit rate and constant distortion scaling criteria, and the performance of the proposed scaling methodologies is experimentally investigated.
Abstract: Scalability refers to the ability to modify the resolution and/or bit rate associated with an already compressed data source in order to satisfy requirements which could not be foreseen at the time of compression. A number of researchers have already demonstrated the feasibility of efficient scalable image and video compression. The principal focus of this paper is to describe data structures for highly scalable compressed video, which are able to support simple, generic scaling approaches for both constant bit rate and constant distortion scaling criteria. Interactive video material presents particular challenges when the data stream is to be scaled to maintain an approximately constant level of distortion, rather than just a constant bit rate. Special attention is paid, therefore, to the development of generic, robust scaling algorithms for such applications. The data structures and scaling methodologies developed are particularly appealing for the distribution of highly scalable compressed video over heterogeneous media, because they simultaneously support both variable bit rate (VBR) and constant bit rate (CBR) services with a wide range of available service qualities, using only simple, generic mechanisms for scaling. The performance of the proposed scaling methodologies is experimentally investigated using a highly scalable video compression algorithm, which is able to achieve comparable compression performance to that of the inherently nonscalable MPEG-1 compression standard.

Journal Article•DOI•
TL;DR: This paper addresses the problem of noise attenuation for multichannel data by utilizing adaptively determined data-dependent coefficients based on a novel distance measure which combines vector directional with vector magnitude filtering.
Abstract: This paper addresses the problem of noise attenuation for multichannel data. The proposed filter utilizes adaptively determined data-dependent coefficients based on a novel distance measure which combines vector directional with vector magnitude filtering. The special case of color image processing is studied as an important example of multichannel signal processing.
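
One plausible way (illustrative, not the paper's exact rule) to combine a directional term and a magnitude term into a single dissimilarity, used vector-median-style to pick the output pixel of a filter window.

```python
import numpy as np

def combined_distance(a, b, eps=1e-9):
    """Dissimilarity of two color vectors: angle between them, scaled up when
    their magnitudes also differ."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos_ang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    angle = np.arccos(np.clip(cos_ang, -1.0, 1.0))             # directional term
    magnitude = abs(np.linalg.norm(a) - np.linalg.norm(b))     # magnitude term
    return angle * (1.0 + magnitude)

def filter_window(pixels):
    """Return the window sample with the smallest aggregate dissimilarity to the rest."""
    costs = [sum(combined_distance(p, q) for q in pixels) for p in pixels]
    return pixels[int(np.argmin(costs))]
```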

Journal Article•DOI•
S. Dutta, Wayne Wolf
TL;DR: A novel architecture that offers the flexibility of implementing widely varying motion-estimation algorithms by employing multiple processing elements which communicate with multiple memory banks via a multistage interconnection network is described.
Abstract: This paper describes a novel architecture that offers the flexibility of implementing widely varying motion-estimation algorithms. To achieve real-time performance, we employ multiple processing elements (PE's) which communicate with multiple memory banks via a multistage interconnection network. Three different block-matching algorithms-full search, three-step search, and conjugate-direction search-have been mapped onto this architecture to illustrate its programmability. We schedule the desired operations and design the required data-flow in such a way that processor utilization is high and memory bandwidth is at a feasible level. The details regarding the flow of the pixel data and the scheduling and allocation of the desired ALU operations (which pixels are processed on which processors in which clock cycles) are described in the paper. We analyze the performance of the proposed architecture for several different interconnection networks and data-memory organizations.

Journal Article•DOI•
TL;DR: Subjective results confirm the efficacy of the proposed classified coder over the RM8-based H.261 coder in two ways: it consistently produces better quality sequences and achieves a bit rate saving of 35% when measured at the same picture quality.
Abstract: A new technique of adaptively classifying the scene content of an image block has been developed in the proposed perceptual coder. It measures the texture masking energy of an image block and classifies it into one of four perceptual classes: flat, edge, texture, and fine-texture. Each class has an associated factor to adapt the quantizer with the aim of achieving constant quality across an image. A second feature of the perceptual coder is visual thresholding, a process that reduces bit rate by discarding subthreshold discrete cosine transform (DCT) coefficients without degrading the perceived image quality. Finally, further quality gain is achieved by an improved reference model 8 (RM8) intramode decision, which removes sticking noise artifacts from newly uncovered background found in H.261 coded sequences. Subjective viewing tests, guided by Rec. 500-5, were conducted with 30 subjects. Subjective results confirm the efficacy of the proposed classified coder over the RM8-based H.261 coder in two ways: (i) it consistently produces better quality sequences (with a mean opinion score, MOS, of approximately 2.0) when comparing at any fixed bit rate; and (ii) it achieves a bit rate saving of 35% when measuring at the same picture quality (i.e., the same MOS).
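
A heavily simplified sketch of the block-classification idea: estimate activity and edge measures for a block and map them to one of the four perceptual classes, each of which scales the quantizer. The measures, thresholds, and scaling factors below are illustrative assumptions, not the paper's values.

```python
import numpy as np

# hypothetical per-class quantizer scaling factors
CLASS_Q_FACTOR = {"flat": 0.8, "edge": 0.9, "texture": 1.1, "fine-texture": 1.3}

def classify_block(block):
    b = block.astype(float)
    ac_energy = ((b - b.mean()) ** 2).mean()        # overall activity
    gy, gx = np.gradient(b)
    edge_strength = np.hypot(gx, gy).mean()         # dominant-edge measure
    if ac_energy < 20:
        return "flat"
    if edge_strength > 12 and ac_energy < 300:
        return "edge"
    return "texture" if ac_energy < 1500 else "fine-texture"

def adapted_quantizer(base_q, block):
    """Coarser quantization where texture masking hides the noise."""
    return base_q * CLASS_Q_FACTOR[classify_block(block)]
```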

Journal Article•DOI•
Sheila S. Hemami
TL;DR: This paper presents and solves the dual problem: a block-based coding technique, namely a family of lapped orthogonal transforms (LOTs), is designed to maximize the reconstruction performance of a specified reconstruction algorithm.
Abstract: Wireless transmission of compressed visual information presents new challenges in image coding and reconstruction techniques. Wireless channels do not offer guaranteed transmission, and data loss over such channels can result in catastrophic errors in the decoded visual information. Visual data, however, can be reconstructed using lossy signal processing techniques. To date, reconstruction algorithms have been developed for fixed coding techniques. This paper presents and solves the dual problem: a block-based coding technique, namely a family of lapped orthogonal transforms (LOTs), is designed to maximize the reconstruction performance of a specified reconstruction algorithm. Mean reconstruction, in which a missing coefficient block is replaced with the average of its available neighbors, is selected for its simplicity and ease of implementation. A reconstruction criterion is defined as the equal distribution of reconstruction errors across all transform coefficients, and a family of LOTs is then designed to meet the reconstruction criterion as well as to consider the transform coding gain. Reconstruction capability and coding gain are traded off, and the LOT family consists of transforms that provide increasing reconstruction capability with lower coding gain. The reconstruction-optimized LOT family provides excellent reconstruction capability, and a transform can be selected based on the loss characteristics of the channel, the desired reconstruction performance, and the desired compression.
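
The mean-reconstruction rule the transforms are optimized for is simple to state: a lost coefficient block is replaced by the average of whichever of its four neighbors arrived. A small sketch, with the block storage layout as an assumption:

```python
import numpy as np

def mean_reconstruct(blocks, missing_key, block_shape=(8, 8)):
    """blocks: dict mapping (row, col) -> received coefficient block.
    Replace the missing block by the element-wise mean of its available
    north/south/west/east neighbors (zeros if none arrived)."""
    r, c = missing_key
    neighbors = [blocks[k] for k in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                 if k in blocks]
    if not neighbors:
        return np.zeros(block_shape)
    return np.mean(neighbors, axis=0)
```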

Journal Article•DOI•
Xiaoming Li, C.A. Gonzales
TL;DR: This locally quadratic functional model decomposes the motion estimation optimization at subpixel resolutions into a two-stage pipelinable process: full search at full-pixel resolution and interpolation at any subpixel resolution.
Abstract: Accurate motion estimation is essential to effective motion compensated video signal processing, and subpixel resolutions are required for high quality applications. It is observed that around the optimum point of the motion estimation process, the error criterion function is well modeled as a quadratic function with respect to the motion vector offsets. This locally quadratic functional model decomposes the motion estimation optimization at subpixel resolutions into a two-stage pipelinable process: full search at full-pixel resolution and interpolation at any subpixel resolution. Practical approximation formulas lead to explicit computations of both motion vectors and error criterion functional values at subpixel resolutions.
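
In one dimension, the locally quadratic model gives a closed-form sub-pixel offset from the matching errors at integer offsets -1, 0, +1 around the full-pel minimum; a small sketch of that interpolation step (the separable 2-D handling and the paper's exact approximation formulas are not reproduced).

```python
def subpel_offset(e_m1, e_0, e_p1):
    """Fit e(x) = a*x^2 + b*x + c through (-1, e_m1), (0, e_0), (+1, e_p1) and
    return (sub-pixel offset of the minimum, interpolated minimum error)."""
    denom = e_m1 - 2.0 * e_0 + e_p1            # equals 2a; positive for a convex fit
    if denom <= 0:
        return 0.0, e_0                         # degenerate fit: keep the full-pel result
    d = 0.5 * (e_m1 - e_p1) / denom             # -b / (2a)
    e_min = e_0 - 0.125 * (e_m1 - e_p1) ** 2 / denom
    return d, e_min

print(subpel_offset(12.0, 5.0, 9.0))   # ≈ (0.136, 4.898): minimum shifted toward +1
```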

Journal Article•DOI•
TL;DR: It is shown that the VLSI implementation of this class of DCT/IDCT algorithms can easily meet the high-speed requirements of high-definition television (HDTV) due to its modularity, regularity, local connectivity, and scalability.
Abstract: In this paper we present a full-custom VLSI design of a high-speed 2-D DCT/IDCT processor based on a new class of time-recursive algorithms and architectures that had not previously been implemented to demonstrate its performance. We show that the VLSI implementation of this class of DCT/IDCT algorithms can easily meet the high-speed requirements of high-definition television (HDTV) due to its modularity, regularity, local connectivity, and scalability. Our design of the 8×8 DCT/IDCT can operate at 50 MHz (a 50 MSamples/s throughput) based on a very conservative estimate under 1.2 µm CMOS technology. In comparison to existing designs, our approach offers many advantages that can be further explored for even higher performance.

Journal Article•DOI•
Mohammed Ghanbari
TL;DR: It is shown that although the picture quality due to cell loss is temporarily degraded, it is immediately brought back to its original quality upon the reception of the late cells, as if no loss has occurred.
Abstract: A method for preventing accumulation of image artifacts due to cell loss in packet video is presented. At each ATM switching node an auxiliary buffer is used to store the overflow traffic of the main switching buffer. The main buffer is served with an absolute priority over the auxiliary buffer. Decoded pictures are normally reconstructed from the cells of the main buffer. The late cells received from the auxiliary buffer are processed and properly added to the current decoded picture. Postprocessing of these cells for the standard video codecs such as H.261 and MPEG is presented. It is shown that although the picture quality due to cell loss is temporarily degraded, it is immediately brought back to its original quality upon the reception of the late cells, as if no loss had occurred.

Journal Article•DOI•
TL;DR: The proposed hardware architectures for the two-stage BMA and FS BMA are faster than conventional hardware architectures and have lower hardware complexity, and the functional validity of the proposed architecture is shown.
Abstract: We investigate hardware implementation of block matching algorithms (BMAs) for motion estimation of moving sequences. Using systolic arrays, we propose VLSI architectures for the two-stage BMA and the full search (FS) BMA. The two-stage BMA, using integral projections, greatly reduces the computational complexity while its performance remains comparable to that of the FS BMA. The proposed hardware architectures for the two-stage BMA and FS BMA are faster than conventional hardware architectures and have lower hardware complexity. Also, the proposed architecture for the first stage of the two-stage BMA is modeled in VHDL and simulated. Simulation results show the functional validity of the proposed architecture.

Journal Article•DOI•
TL;DR: The architecture of a highly parallel DSP (HiPAR-DSP) as a flexible and programmable processor for image and video processing is proposed, based on an analysis of image processing algorithms in terms of available parallelization resources, demands on program control, and required data access mechanisms.
Abstract: We propose the architecture of a highly parallel DSP (HiPAR-DSP) as a flexible and programmable processor for image and video processing. The design is based on an analysis of image processing algorithms in terms of available parallelization resources, demands on program control, and required data access mechanisms. This led to a very long instruction word (VLIW)-controlled ASIMD RISC architecture with four or sixteen data paths, employing data-level parallelism, parallel instructions, micro-instruction pipelining, and data transfer concurrent with data processing. Common data access patterns for image processing algorithms are supported by a shared on-chip memory with parallel matrix-type access patterns and a separate data cache per data path. By properly balancing processing and control capabilities as well as internal and external memory bandwidth, this approach is optimized to make the best use of currently available silicon resources. A high clock frequency is achieved by implementation of classic RISC features. The architecture fully supports high-level language programming. With the 16-data-path version and a 100 MHz clock, a sustained performance of more than 2 billion arithmetic operations per second (GOPS) is achieved for a wide range of algorithms. The examples show the parallel implementation of image processing algorithms such as histogramming, the Hough transform, and search in a sorted list with efficient use of the processor resources. A prototype of the architecture with four parallel data paths is available, using a 0.6 µm CMOS technology.

Journal Article•DOI•
TL;DR: An efficient video coding scheme is presented as an extension of the MPEG-2 standard to accommodate the transmission of multiple viewpoint sequences on bandwidth-limited channels, thus providing fast and accurate constructions of multiple perspectives.
Abstract: An efficient video coding scheme is presented as an extension of the MPEG-2 standard to accommodate the transmission of multiple viewpoint sequences on bandwidth-limited channels. With the goal of compression and speed, the proposed approach incorporates a variety of existing computer graphics tools and techniques. Construction of each viewpoint image is predicted using a combination of perspective projection of three-dimensional (3-D) models, texture mapping, and digital image warping. Immediate application of the coding specification is foreseeable in systems with hardware-based real-time rendering capabilities, thus providing fast and accurate constructions of multiple perspectives.