Journal ArticleDOI

Object-based coding of stereo image sequences using joint 3-D motion/disparity compensation

TL;DR: An object-based coding scheme is proposed for the coding of a stereoscopic image sequence using motion and disparity information and the use of the depth map information for the generation of intermediate views at the receiver is discussed.
Abstract: An object-based coding scheme is proposed for the coding of a stereoscopic image sequence using motion and disparity information. A hierarchical block-based motion estimation approach is used for initialization, while disparity estimation is performed using a pixel-based hierarchical dynamic programming algorithm. A split-and-merge segmentation procedure based on three-dimensional (3-D) motion modeling is then used to determine regions with similar motion parameters. The segmentation part of the algorithm is interleaved with the estimation part in order to optimize the coding performance of the procedure. Furthermore, a technique is examined for propagating the segmentation information with time. A 3-D motion-compensated prediction technique is used for both intensity and depth image sequence coding. Error images and depth maps are encoded using discrete cosine transform (DCT) and Huffman methods. Alternately, an efficient wireframe depth modeling technique may be used to convey depth information to the receiver. Motion and wireframe model parameters are then quantized and transmitted to the decoder along with the segmentation information. As a straightforward application, the use of the depth map information for the generation of intermediate views at the receiver is also discussed. The performance of the proposed compression methods is evaluated experimentally and is compared to other stereoscopic image sequence coding schemes.
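The dynamic programming disparity step mentioned in the abstract can be illustrated with a minimal single-scanline sketch. This is not the authors' hierarchical algorithm: it assumes rectified images (matches lie on the same row), a single resolution level, an absolute-difference matching cost, and a linear smoothness penalty. The function name `scanline_disparity` and the parameters `max_disp` and `smooth` are hypothetical names chosen for this illustration.

```python
def scanline_disparity(left, right, max_disp, smooth=1.0):
    """Assign one disparity to each right-image pixel of a rectified scanline.

    Assumption for this sketch: right pixel x matches left pixel x + d,
    with d in [0, max_disp]. The path cost is the sum of absolute intensity
    differences plus smooth * |d - d_previous| between neighbouring pixels.
    """
    n = len(right)
    inf = float("inf")
    # cost[x][d]: best accumulated cost of a path ending at (x, d)
    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    # back[x][d]: disparity chosen at x - 1 on that best path
    back = [[0] * (max_disp + 1) for _ in range(n)]

    def data_term(x, d):
        # Candidates falling outside the left scanline are not allowed.
        if x + d >= len(left):
            return inf
        return abs(right[x] - left[x + d])

    for d in range(max_disp + 1):
        cost[0][d] = data_term(0, d)

    for x in range(1, n):
        for d in range(max_disp + 1):
            match = data_term(x, d)
            if match == inf:
                continue
            best_prev, best_cost = 0, inf
            for p in range(max_disp + 1):
                c = cost[x - 1][p] + smooth * abs(d - p)
                if c < best_cost:
                    best_prev, best_cost = p, c
            cost[x][d] = best_cost + match
            back[x][d] = best_prev

    # Backtrack from the cheapest final state.
    d = min(range(max_disp + 1), key=lambda k: cost[n - 1][k])
    disparities = [0] * n
    for x in range(n - 1, -1, -1):
        disparities[x] = d
        d = back[x][d]
    return disparities


left_row = [0, 0, 10, 20, 30, 40, 50, 60]
right_row = [10, 20, 30, 40, 50, 60, 60, 60]  # left_row shifted by two pixels
print(scanline_disparity(left_row, right_row, max_disp=3))
```

A hierarchical variant would typically run a search like this on a coarse image first and use the result to narrow the disparity range at finer levels; that refinement is omitted here.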
Citations
Patent
18 Nov 2013
TL;DR: A modular intelligent transportation system is described, comprising an environmentally protected enclosure, a system communications bus, a processor module, communicating with said bus, having an image data input and an audio input, the processor module analyzing the image data and/or audio input for data patterns represented therein, having at least one available option slot, a power supply, and a communication link for external communications.
Abstract: A modular intelligent transportation system, comprising an environmentally protected enclosure, a system communications bus, a processor module, communicating with said bus, having an image data input and an audio input, the processor module analyzing the image data and/or audio input for data patterns represented therein, having at least one available option slot, a power supply, and a communication link for external communications, in which at least one available option slot can be occupied by a wireless local area network access point, having a communications path between said communications link and said wireless access point, or other modular components.

377 citations

Journal ArticleDOI
TL;DR: The perceptual requirements for 3-D TV that can be extracted from the literature are summarized and issues that require further investigation are addressed in order for 3-D TV to be a success.
Abstract: A high-quality three-dimensional (3-D) broadcast service (3-D TV) is becoming increasingly feasible based on various recent technological developments combined with an enhanced understanding of 3-D perception and human factors issues surrounding 3-D TV. In this paper, 3-D technology and perceptually relevant issues, in particular 3-D image quality and visual comfort, in relation to 3-D TV systems are reviewed. The focus is on near-term displays for broadcast-style single- and multiple-viewer systems. We discuss how an image quality model for conventional two-dimensional images needs to be modified to be suitable for image quality research for 3-D TV. In this respect, studies are reviewed that have focused on the relationship between subjective attributes of 3-D image quality and physical system parameters that induce them (e.g., parameter choices in image acquisition, compression, and display). In particular, artifacts that may arise in 3-D TV systems are addressed, such as keystone distortion, depth-plane curvature, puppet theater effect, cross talk, cardboard effect, shear distortion, picket-fence effect, and image flipping. In conclusion, we summarize the perceptual requirements for 3-D TV that can be extracted from the literature and address issues that require further investigation in order for 3-D TV to be a success.

333 citations

Journal ArticleDOI
TL;DR: The correlation between subjective and objective evaluation of color plus depth video and transmission over Internet protocol (IP) is investigated, and subjective results are used to determine more accurate objective quality assessment metrics for 3D color plus depth video.
Abstract: In the near future, many conventional video applications are likely to be replaced by immersive video to provide a sense of “being there.” This transition is facilitated by the recent advancement of 3D capture, coding, transmission, and display technologies. Stereoscopic video is the simplest form of 3D video available in the literature. “Color plus depth map” based stereoscopic video has attracted significant attention, as it can reduce storage and bandwidth requirements for the transmission of stereoscopic content over communication channels. However, quality assessment of coded video sequences can currently only be performed reliably using expensive and inconvenient subjective tests. To enable researchers to optimize 3D video systems in a timely fashion, it is essential that reliable objective measures are found. This paper investigates the correlation between subjective and objective evaluation of color plus depth video. The investigation is conducted for different compression ratios, and different video sequences. Transmission over Internet protocol (IP) is also investigated. Subjective tests are performed to determine the image quality and depth perception of a range of differently coded video sequences, with packet loss rates ranging from 0% to 20%. The subjective results are used to determine more accurate objective quality assessment metrics for 3D color plus depth video.

169 citations


Cites methods from "Object-based coding of stereo image..."

  • ...D compression techniques could be used to encode both color and depth map sequences simultaneously [12]....


Journal ArticleDOI
TL;DR: It is shown that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time.
Abstract: In this paper we study the problem of recovering the 3D shape, reflectance, and non-rigid motion properties of a dynamic 3D scene. Because these properties are completely unknown and because the scene's shape and motion may be non-smooth, our approach uses multiple views to build a piecewise-continuous geometric and radiometric representation of the scene's trace in space-time. A basic primitive of this representation is the dynamic surfel, which (1) encodes the instantaneous local shape, reflectance, and motion of a small and bounded region in the scene, and (2) enables accurate prediction of the region's dynamic appearance under known illumination conditions. We show that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time. Experimental results with the Phong reflectance model and complex real scenes (clothing, shiny objects, skin) illustrate our method's ability to explain pixels and pixel variations in terms of their underlying causes—shape, reflectance, motion, illumination, and visibility.

154 citations


Cites background from "Object-based coding of stereo image..."


  • ...…flow calculations (Vedula et al., 1999, 2000) and on the brightness constancy assumption (Vedula et al., 1999, 2000; Zhang and Kambhamettu, 2000; Tzovaras and Grammalidis, 1997) restricts them to slowly-moving Lambertian scenes, where the effects of shading and shadows on scene appearance is…...


Journal ArticleDOI
TL;DR: The optimal predictors of a lifting scheme in the general n-dimensional case are obtained and applied for the lossless compression of still images using first quincunx sampling and then simple row-column sampling, and the best of the resulting coders produces better results than other known algorithms for multiresolution-based lossless image coding.
Abstract: The optimal predictors of a lifting scheme in the general n-dimensional case are obtained and applied for the lossless compression of still images using first quincunx sampling and then simple row-column sampling. In each case, the efficiency of the linear predictors is enhanced nonlinearly. Directional postprocessing is used in the quincunx case, and adaptive-length postprocessing in the row-column case. Both methods are seen to perform well. The resulting nonlinear interpolation schemes achieve extremely efficient image decorrelation. We further investigate context modeling and adaptive arithmetic coding of wavelet coefficients in a lossless compression framework. Special attention is given to the modeling contexts and the adaptation of the arithmetic coder to the actual data. Experimental evaluation shows that the best of the resulting coders produces better results than other known algorithms for multiresolution-based lossless image coding.

145 citations


Cites background from "Object-based coding of stereo image..."

  • ...Many applications such as the transmission of depth maps for the construction of 3-D views of a scene [2] or the efficient storage and communications of medical images require lossless coding [3], [4]....


References
Journal ArticleDOI
TL;DR: A new approach for the interpretation of optical flow fields is presented, where the flow field is partitioned into connected segments of flow vectors, where each segment is consistent with a rigid motion of a roughly planar surface.
Abstract: A new approach for the interpretation of optical flow fields is presented. The flow field, which can be produced by a sensor moving through an environment with several independently moving, rigid objects, is allowed to be sparse, noisy, and partially incorrect. The approach is based on two main stages. In the first stage, the flow field is partitioned into connected segments of flow vectors, where each segment is consistent with a rigid motion of a roughly planar surface. In the second stage, segments are grouped under the hypothesis that they are induced by a single, rigidly moving object. Each hypothesis is tested by searching for three-dimensional (3-D) motion parameters which are compatible with all the segments in the corresponding group. Once the motion parameters are recovered, the relative environmental depth can be estimated as well. Experiments based on real and simulated data are presented.

902 citations

Journal ArticleDOI
TL;DR: An algorithm for matching images of real world scenes is presented, which quickly converges to good estimates of disparity, which reflect the spatial organization of the scene.
Abstract: An algorithm for matching images of real world scenes is presented. The matching is a specification of the geometrical disparity between the images and may be used to partially reconstruct the three-dimensional structure of the scene. Sets of candidate matching points are selected independently in each image. These points are the locations of small, distinct features which are likely to be detectable in both images. An initial network of possible matches between the two sets of candidates is constructed. Each possible match specifies a possible disparity of a candidate point in a selected reference image. An initial estimate of the probability of each possible disparity is made, based on the similarity of subimages surrounding the points. These estimates are iteratively improved by a relaxation labeling technique making use of the local continuity property of disparity that is a consequence of the continuity of real world surfaces. The algorithm is effective for binocular parallax, motion parallax, and object motion. It quickly converges to good estimates of disparity, which reflect the spatial organization of the scene.

891 citations


"Object-based coding of stereo image..." refers background in this paper

  • ...Due to the epipolar line constraint [21], the search area for each pixel of the right image is the interval in the left image, where is the maximum allowed disparity and ....


  • ...Since the stereo camera configuration is known, the depth estimation problem reduces to that of disparity estimation [21]–[23]....
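
As a rough illustration of why a known camera configuration turns depth estimation into disparity estimation, the sketch below uses the standard parallel-axis relation Z = f·B/d. The excerpt does not state the camera geometry, so the parallel-axis setup is an assumption here, and the function and parameter names (`depth_from_disparity`, `focal_px`, `baseline_m`) are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a parallel-axis stereo pair (assumed geometry).

    disparity_px: horizontal shift between the two views, in pixels
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity in this model.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Larger disparity means a closer point.
print(depth_from_disparity(10, focal_px=1000, baseline_m=0.1))
print(depth_from_disparity(20, focal_px=1000, baseline_m=0.1))
```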


Journal ArticleDOI
TL;DR: An object-oriented analysis-synthesis coder is presented which encodes objects instead of blocks of N × N picture elements, and whose object-dependent parameter coding allows geometrical distortions to be introduced instead of quantization errors.
Abstract: An object-oriented analysis-synthesis coder is presented which encodes objects instead of blocks of N × N picture elements. The objects are described by three parameter sets defining the motion, shape and colour of an object. The parameter sets are obtained by image analysis based on source models of either moving 2D-objects or moving 3D-objects. Known coding techniques are used to encode the parameter sets. An object-depending parameter coding allows to introduce geometrical distortions instead of quantization errors. Using the transmitted parameter sets an image can be reconstructed by model-based image synthesis. Experimental results achieved with a first implementation of the coder are given and are discussed.

451 citations

Journal ArticleDOI
TL;DR: An object-oriented analysis-synthesis coder is presented which encodes arbitrarily shaped objects instead of rectangular blocks, and experimental results on the efficient coding of motion and shape parameters are given.
Abstract: An object-oriented analysis-synthesis coder is presented which encodes arbitrarily shaped objects instead of rectangular blocks. The objects are described by three parameter sets defining their motion, shape and colour. Throughout this contribution, the colour parameters denote the luminance and chrominance values of the object surface. The parameter sets of each object are obtained by image analysis based on source models of moving 2D-objects and coded by an object-dependent parameter coding. Using the coded parameter sets an image can be reconstructed by model-based image synthesis. In order to cut down the generated bit-rate of the parameter coding, the colour updating of an object is suppressed if the modelling of the object by the source model is sufficiently exact, i.e., if only a relatively small colour update information would be needed for an errorless image synthesis. Omitting colour update information, small position errors of objects denoted as geometrical distortions are allowed for image synthesis instead of quantization error distortions. Tolerating geometrical distortions, the image area to be updated by colour coding can be decreased to 4% of the image size without introducing annoying distortions. As motion and shape parameters can efficiently be coded, about 1 bit per pel remains for colour updating in a 64 kbit/s coder compared to about 0.1 bit per pel in the standard reference coder (RM8) of the CCITT. Experimental results concerning the efficient coding of motion and shape parameters are given and discussed. The coding of the colour information will be dealt with in further research.

161 citations

Journal ArticleDOI
TL;DR: Results show that transmitting shape information and allowing small position errors (geometrical distortions) avoids the mosquito and blocking artefacts of a block-oriented coder, and the reconstructed image of an object-oriented analysis-synthesis coder appears sharper compared to block-oriented hybrid coding.
Abstract: An object-oriented analysis-synthesis coder is presented concentrating on the optimal relationship of its components image analysis, image synthesis and parameter coding and on a comparison of its coding efficiency for block-oriented hybrid coding. As the block-oriented hybrid coder, the RM8 of the CCITT is used. The presented object-oriented analysis-synthesis coder is based on the source model of moving flexible 2D-objects and encodes arbitrarily shaped objects instead of rectangular blocks. The objects are described by three parameter sets defining their motion, shape and colour (colour parameters denoting luminance as well as chrominance values of the object surface). The parameter sets of each object are obtained by image analysis and coded by an object dependent parameter coding. Using the coded parameter sets, an image can be reconstructed by model-based image synthesis. Experimental results show that transmitting shape information and allowing small position errors (geometrical distortions) avoids the mosquito and blocking artefacts of a block-oriented coder. Furthermore, important image areas such as facial areas can be reconstructed with an image quality improvement up to 4 dB using the image analysis. As a whole, the reconstructed image of an object-oriented analysis-synthesis coder appears sharper compared to block-oriented hybrid coding.

95 citations


"Object-based coding of stereo image..." refers background in this paper

  • ...Object-based techniques have been extensively investigated for monoscopic image sequence coding [6]–[9]....
