Bio: H. Aydinoglu is an academic researcher from the Georgia Institute of Technology. The author has contributed to research in topics: Data compression & Subspace topology. The author has an h-index of 5, co-authored 12 publications receiving 179 citations.
TL;DR: A new stereo image coding algorithm that is based on disparity compensation and subspace projection is described, and empirical results suggest that the SPT approach outperforms current stereo coding techniques.
Abstract: Due to advances in display technology, three-dimensional (3-D) imaging systems are becoming increasingly popular. One way of stimulating 3-D perception is to use stereo pairs, a pair of images of the same scene acquired from different perspectives. Since there is an inherent redundancy between the images of a stereo pair, data compression algorithms should be employed to represent stereo pairs efficiently. This paper focuses on the stereo image coding problem. We begin with a description of the problem and a survey of current stereo coding techniques. A new stereo image coding algorithm that is based on disparity compensation and subspace projection is described. This algorithm, the subspace projection technique (SPT), is a transform domain approach with a space-varying transformation matrix and may be interpreted as a spatial-transform domain representation of the stereo data. The advantage of the proposed approach is that it can locally adapt to the changes in the cross-correlation characteristics of the stereo pairs. Several design issues and implementations of the algorithm are discussed. Finally, we present empirical results suggesting that the SPT approach outperforms current stereo coding techniques.
13 Nov 1994
TL;DR: The authors investigate efficient storage and transmission techniques for multi-view images and propose a block-based disparity compensated coder that solves the occlusion problem and is a good candidate for real-time implementation.
Abstract: Three-dimensional display of video sequences with look-around capability enhances realism and introduces a unique sense of "being there". This final step of display realism can be achieved by stereoscopic techniques with the use of multi-view image sequences. A multi-view image system is an extended stereoscopic system with look-around capability. One technical problem that keeps these systems from widespread application is the large amount of information contained in a multi-view system, generally 5 to 10 times the information content of a single image. However, there is high inter-frame correlation in a multi-view image set which can be exploited for compression purposes. Therefore, the authors investigate efficient storage and transmission techniques for multi-view images. A block-based disparity compensated coder is proposed. This proposed algorithm solves the occlusion problem and is a good candidate for real-time implementation.
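The block-based disparity compensation described above can be illustrated with a minimal sketch: for each fixed-size block of one view, search along the horizontal direction in the other view for the best match under a sum-of-absolute-differences (SAD) criterion. This is an illustrative NumPy reimplementation of the general idea, not the authors' coder; the block size, search range, and SAD criterion are assumptions.

```python
import numpy as np

def estimate_disparity(left, right, block=8, max_d=16):
    """Block-based disparity estimation sketch: for each block of the right
    image, search horizontally in the left image for the best SAD match."""
    h, w = right.shape
    disp = np.zeros((h // block, w // block), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            target = right[y:y + block, x:x + block]
            best, best_d = np.inf, 0
            for d in range(max_d + 1):            # candidate horizontal shifts
                if x + d + block > w:             # stay inside the left image
                    break
                cand = left[y:y + block, x + d:x + d + block]
                sad = np.abs(target.astype(int) - cand.astype(int)).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[by, bx] = best_d
    return disp
```

For a rectified stereo pair, a single horizontal search suffices; a real-time coder would typically restrict or prune this exhaustive search.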
23 Oct 1995
TL;DR: A region-based stereo image coding algorithm is proposed and evaluated and is shown to outperform standard block-based and independent coding algorithms.
Abstract: A region-based stereo image coding algorithm is proposed and evaluated. Three types of regions are considered: occlusion, edge and smooth regions. The region in the right image that is occluded due to finite viewing area is estimated using a new approach. This region is independently coded. The non-occluded region is segmented into edge and smooth regions. Each region is composed of fixed size blocks. The disparity for each block in a non-occluded region is estimated using a block-based approach. The estimated disparity field is encoded employing a lossy residual uniform scalar quantizer and an adaptive arithmetic coder based on the segmentation information. The decoded vectors are used for the subspace projection technique, which is a combined disparity and illumination compensation algorithm. The proposed approach is shown to outperform standard block-based and independent coding algorithms.
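The lossy residual coding of the disparity field can be sketched as a simple DPCM loop: each disparity is predicted from the previously reconstructed value, and the prediction residual is quantized with a uniform scalar quantizer. This is a minimal 1-D illustration of that idea, assuming a hypothetical step size; the entropy-coding stage (the adaptive arithmetic coder) is omitted, and only the quantizer indices it would consume are produced.

```python
def quantize_disparity_field(disp, step=3):
    """Lossy DPCM sketch for a 1-D disparity field: predict each disparity
    from the previously reconstructed one and quantize the residual with a
    uniform scalar quantizer of the given step size."""
    indices = []   # quantizer indices (what an arithmetic coder would encode)
    recon = []     # reconstructed disparities, as seen by the decoder
    pred = 0
    for d in disp:
        residual = d - pred
        q = int(round(residual / step))   # uniform scalar quantization
        pred = pred + q * step            # decoder-side reconstruction
        indices.append(q)
        recon.append(pred)
    return indices, recon
```

Because the predictor runs on reconstructed (not original) values, encoder and decoder stay in sync despite the quantization loss.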
28 Apr 1995
TL;DR: An approach to stereo image coding based on disparity compensation and the subspace projection technique (SPT), an orthogonal basis approach that performs better than disparity compensation alone in the rate-distortion sense.
Abstract: An approach to stereo image coding based on disparity compensation and subspace projection is proposed and evaluated. Traditional stereo image coding techniques employing block-based disparity compensation have problems with occlusion regions and photometric variations. We show that the results of these algorithms can be improved if the subspace projection technique (SPT) is employed as a post-processing technique after disparity estimation. SPT is an orthogonal basis approach. It combines block-based disparity estimation with a two-dimensional first-order approximation. For the regions where disparity estimation fails, such as occlusion regions, the first-order approximation provides a reasonable estimate. The zeroth-order term, in turn, compensates for photometric variations. Empirical results verified that SPT performs better than disparity compensation alone in the rate-distortion sense.
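The projection idea described above can be sketched in a few lines: each target block is approximated by its least-squares projection onto the span of a constant (zeroth-order) term, linear x and y (first-order) terms, and the disparity-compensated prediction block. This is an illustrative reading of the technique, not the authors' implementation; the basis choice and least-squares solver are assumptions.

```python
import numpy as np

def spt_block(target, dc_pred, block=8):
    """Subspace-projection sketch: approximate a target block by its
    least-squares projection onto span{1, x, y, disparity-compensated block}.
    The constant term absorbs photometric (DC) shifts; the linear terms give
    a first-order fallback where disparity compensation fails."""
    ys, xs = np.mgrid[0:block, 0:block]
    # Basis columns: zeroth-order, first-order (x, y), DC-predicted block
    B = np.stack([np.ones(block * block),
                  xs.ravel().astype(float),
                  ys.ravel().astype(float),
                  dc_pred.ravel().astype(float)], axis=1)
    coeffs, *_ = np.linalg.lstsq(B, target.ravel().astype(float), rcond=None)
    approx = (B @ coeffs).reshape(block, block)
    return coeffs, approx
```

When the target block is the predicted block plus a pure brightness offset, the projection recovers it exactly: the offset is absorbed by the constant term.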
09 May 1995
TL;DR: A new framework for the compression of multi-view image sequences that employs a bidirectional disparity estimator and a modified version of the subspace projection technique (SPT).
Abstract: We propose a new framework for the compression of multi-view image sequences. We define three types of frames, and each type is coded with a different strategy. The first type of frame is independently coded and is called an I-frame. The second is a B-frame and is coded using a bidirectional disparity estimator and a modified version of the subspace projection technique (SPT). The SPT algorithm compensates for the photometric variations between the multi-view frames. The projection block size is chosen to be small so that coding of the residual image is not necessary. On the other hand, to decrease the overhead information, both disparity vectors and projection coefficients are coded with a lossy scheme. Finally, the third type of frame is a P-frame and is coded by employing a unidirectional disparity estimator and DC level compensation.
TL;DR: The perceptual requirements for 3-D TV that can be extracted from the literature are summarized, and issues that require further investigation are addressed in order for 3-D TV to be a success.
Abstract: A high-quality three-dimensional (3-D) broadcast service (3-D TV) is becoming increasingly feasible based on various recent technological developments combined with an enhanced understanding of 3-D perception and human factors issues surrounding 3-D TV. In this paper, 3-D technology and perceptually relevant issues, in particular 3-D image quality and visual comfort, in relation to 3-D TV systems are reviewed. The focus is on near-term displays for broadcast-style single- and multiple-viewer systems. We discuss how an image quality model for conventional two-dimensional images needs to be modified to be suitable for image quality research for 3-D TV. In this respect, studies are reviewed that have focused on the relationship between subjective attributes of 3-D image quality and physical system parameters that induce them (e.g., parameter choices in image acquisition, compression, and display). In particular, artifacts that may arise in 3-D TV systems are addressed, such as keystone distortion, depth-plane curvature, puppet theater effect, cross talk, cardboard effect, shear distortion, picket-fence effect, and image flipping. In conclusion, we summarize the perceptual requirements for 3-D TV that can be extracted from the literature and address issues that require further investigation in order for 3-D TV to be a success.
TL;DR: The techniques for image-based rendering (IBR), which renders novel views directly from input images rather than from known 3-D scene geometry, are surveyed, and the issues in trading off the use of images and geometry are explored by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies.
Abstract: We survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which the 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in their extent of use of images and geometry in representing 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.
19 Jun 2008
TL;DR: In this paper, centre-view frames, a depth map for the centre-view frames, and an occlusion data frame are encoded; on the basis of the depth map, a distinction is made between functional and non-functional data in the occlusion data frame.
Abstract: In a method for encoding and an encoder for a 3-D video signal, centre-view frames, a depth map for the centre-view frames, and an occlusion data frame are encoded. On the basis of the depth map for the centre-view frame, a distinction is made between functional and non-functional data in the occlusion data frame. This allows a strong reduction in the bits needed for the encoded occlusion data frame. In the decoder, a combined data stream is formed from the functional data in the encoded occlusion data frames and the centre-view frames. Preferably, the centre-view frames are used as reference frames in encoding the occlusion data frames.
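The functional/non-functional split can be sketched as follows: occlusion data is only needed near large depth discontinuities in the centre-view depth map, where de-occluded background becomes visible; elsewhere it can be replaced by a constant before encoding. This is an illustrative reading of the idea, not the patented method; the discontinuity threshold and band width are hypothetical parameters.

```python
import numpy as np

def functional_mask(depth, threshold=5, width=2):
    """Mark occlusion data as 'functional' only in a band around large
    horizontal depth discontinuities of the centre-view depth map."""
    jumps = np.abs(np.diff(depth.astype(int), axis=1)) > threshold
    mask = np.zeros(depth.shape, dtype=bool)
    mask[:, 1:] |= jumps
    # Dilate horizontally so a band of `width` pixels around each jump is kept
    for _ in range(width):
        grown = mask.copy()
        grown[:, :-1] |= mask[:, 1:]
        grown[:, 1:] |= mask[:, :-1]
        mask = grown
    return mask

def strip_occlusion(occ, mask, fill=0):
    """Replace non-functional occlusion data with a constant before encoding,
    so it costs almost no bits."""
    out = np.full_like(occ, fill)
    out[mask] = occ[mask]
    return out
```

Flattening the non-functional regions to a constant is what enables the strong bit-rate reduction the abstract mentions: the encoder spends bits only where the occlusion data can actually be used at the decoder.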
TL;DR: This work proposes disparity-compensated lifting for wavelet compression of light fields, which solves the irreversibility limitations of previous light field wavelet coding approaches, using the lifting structure.
Abstract: We propose disparity-compensated lifting for wavelet compression of light fields. With this approach, we obtain the benefits of wavelet coding, such as scalability in all dimensions, as well as superior compression performance. Additionally, the proposed approach solves the irreversibility limitations of previous light field wavelet coding approaches, using the lifting structure. Our scheme incorporates disparity compensation into the lifting structure for the transform across the views in the light field data set. Another transform is performed to exploit the coherence among neighboring pixels, followed by a modified SPIHT coder and rate-distortion optimized bitstream assembly. A view-sequencing algorithm is developed to organize the views for encoding. For light fields of an object, we propose to use shape adaptation to improve the compression efficiency and visual quality of the images. The necessary shape information is efficiently coded based on prediction from the existing geometry model. Experimental results show that the proposed scheme exhibits superior compression performance over existing light field compression techniques.
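The key property of lifting is that the transform is invertible no matter what operator is used in the predict step, which is what makes disparity compensation usable inside a wavelet transform. A toy Haar-style sketch across two "views", using an integer circular shift as a stand-in for real disparity warping, illustrates this; the actual scheme uses per-pixel disparity compensation, SPIHT coding, and view sequencing, none of which are shown here.

```python
import numpy as np

def shift(view, d):
    """Toy 'disparity compensation': integer circular shift of a 1-D view."""
    return np.roll(view, d)

def lift_forward(even, odd, d):
    """Haar-style lifting across two views with disparity compensation in
    the predict step. Invertible for any compensation operator."""
    high = odd - shift(even, d)           # predict: detail (residual) signal
    low = even + shift(high, -d) / 2      # update: smoothed signal
    return low, high

def lift_inverse(low, high, d):
    """Exactly undo lift_forward by replaying its steps in reverse order."""
    even = low - shift(high, -d) / 2
    odd = high + shift(even, d)
    return even, odd
```

Perfect reconstruction follows by construction: the inverse subtracts and adds back exactly the terms the forward transform added and subtracted, which is how the lifting structure resolves the irreversibility limitation of earlier light-field wavelet coders.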
TL;DR: Two different approaches which exploit three-dimensional scene geometry for multi-view compression are presented, which show that texture-based coding is more sensitive to geometry inaccuracies than predictive coding, while model-aided predictive coding performs best.
Abstract: To store and transmit the large amount of image data necessary for Image-based Rendering (IBR), efficient coding schemes are required. This paper presents two different approaches which exploit three-dimensional scene geometry for multi-view compression. In texture-based coding, images are converted to view-dependent texture maps for compression. In model-aided predictive coding, scene geometry is used for disparity compensation and occlusion detection between images. While both coding strategies are able to attain compression ratios exceeding 2000:1, individual coding performance is found to depend on the accuracy of the available geometry model. Experiments with real-world as well as synthetic image sets show that texture-based coding is more sensitive to geometry inaccuracies than predictive coding. A rate-distortion theoretical analysis of both schemes supports these findings. For reconstructed approximate geometry models, model-aided predictive coding performs best, while texture-based coding yields superior coding results if scene geometry is exactly known.