Author

N. Atzpadin

Bio: N. Atzpadin is an academic researcher at the Fraunhofer Society. The author has contributed to research on depth maps and videoconferencing. The author has an h-index of 4 and has co-authored 4 publications receiving 524 citations.
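
The h-index stated in the bio can be checked against the per-paper citation counts listed under Papers on this page. The following is a minimal sketch of the standard h-index definition; the function name `h_index` is an illustrative choice and not part of any SciSpace API.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the paper at this rank still has enough citations
        else:
            break         # counts only decrease from here on
    return h

# Citation counts of the four papers listed on this page.
listed = [434, 59, 33, 7]
print(h_index(listed))
```

With four papers that each have at least four citations, the sketch yields an h-index of 4, in line with the bio.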

Papers
Journal ArticleDOI
TL;DR: This paper discusses an advanced approach for a 3DTV service, which is based on the concept of video-plus-depth data representations, and provides a modular and flexible system architecture supporting a wide range of multi-view structures.
Abstract: Due to enormous progress in the areas of auto-stereoscopic 3D displays, digital video broadcast and computer vision algorithms, 3D television (3DTV) has reached a high technical maturity and many people now believe in its readiness for marketing. Experimental prototypes of entire 3DTV processing chains have been demonstrated successfully during the last few years, and the motion picture experts group (MPEG) of ISO/IEC has launched related ad hoc groups and standardization efforts envisaging the emerging market segment of 3DTV. In this context the paper discusses an advanced approach for a 3DTV service, which is based on the concept of video-plus-depth data representations. It particularly considers aspects of interoperability and multi-view adaptation for the case that different multi-baseline geometries are used for multi-view capturing and 3D display. Furthermore it presents algorithmic solutions for the creation of depth maps and depth image-based rendering related to this framework of multi-view adaptation. In contrast to other proposals, which are more focused on specialized configurations, the underlying approach provides a modular and flexible system architecture supporting a wide range of multi-view structures.

434 citations

Proceedings ArticleDOI
01 Jan 2008
TL;DR: An overall concept, which includes the geometrical design of the whole prototype demonstrator, the arrangement of the cameras and displays and the general multi-view video analysis chain, is presented to fulfil the requirements of a novel 3D immersive videoconferencing system, including directional eye gaze and gesture awareness.
Abstract: Traditional set-top camera video-conferencing systems still fail to meet the 'telepresence challenge' of providing a viable alternative for physical business travel, which is nowadays characterized by unacceptable delays, costs, inconvenience, and an increasingly large ecological footprint. Even recent high-end commercial solutions, while partially removing some of these traditional shortcomings, still present the problems of not scaling easily, expensive implementations, not utilizing 3D life-sized representations of the remote participants and addressing only eye contact and gesture-based interactions in very limited ways. The European FP7 project 3DPresence will develop a multi-party, high-end 3D videoconferencing concept that will tackle the problem of transmitting the feeling of physical presence in real-time to multiple remote locations in a transparent and natural way. In this paper, we present an overall concept, which includes the geometrical design of the whole prototype demonstrator, the arrangement of the cameras and displays and the general multi-view video analysis chain. The driving force behind the design strategy is to fulfil the requirements of a novel 3D immersive videoconferencing system, including directional eye gaze and gesture awareness. (8 pages)

59 citations

Proceedings ArticleDOI
07 Jun 2010
TL;DR: A novel scalable and high performance 3D acquisition framework for immersive 3D videoconference systems is proposed that exploits both fast graphics boards and multi-core CPU processing, and is able to integrate complex computer vision algorithms, such as Visual Hull, multi-view stereo matching, segmentation, image rectification, lens distortion correction and virtual view synthesis, in one real-time framework.
Abstract: The interest in immersive 3D video conference systems exists now for many years from both sides, the commercialization point of view as well as from a research perspective. Still, one of the major bottlenecks in this context is the computational complexity of the required algorithmic modules. This paper discusses this problem from a hardware point of view. We use new fast graphics board solutions, which allow high algorithmic parallelization in consumer PC environments on one hand and look at state-of-the-art powerful multi-core CPU processing capabilities on the other hand. We propose a novel scalable and high performance 3D acquisition framework for immersive 3D videoconference systems which takes benefit from both sides. In this way we are able to integrate complex computer vision algorithms, such as Visual Hull, multi-view stereo matching, segmentation, image rectification, lens distortion correction and virtual view synthesis as well as data encoding, network signaling and capturing for 16 HD cameras in one real-time framework. This paper is based on results and experiences of the European FP7 research project 3DPresence which aims to build a real-time three party and multi-user 3D videoconferencing system.

33 citations

Proceedings ArticleDOI
07 Nov 2009
TL;DR: The fusion of two competing approaches whose properties are contrary to each other from a camera configuration point of view is discussed: the volumetric Visual Hull (VH) approach is combined with the stereo matching based Hybrid Recursive Matching (HRM) into a new method which benefits from the advantages of both techniques and discards their weak points.
Abstract: This paper discusses the problem of high quality depth map estimation for real-time systems. Our work is based on the European FP7 project 3DPresence which aims to build a multi-view and multiuser 3D videoconferencing system. Based on new multi-view auto-stereoscopic display technology the remote conferees will be rendered as an integral part of a three dimensional virtual shared environment. In order to create the related views for the 3D displays as well as to virtually correct the eye contact problem robust depth maps are required. For this purpose, in this paper we will discuss the fusion of two competing approaches which have, from a camera configuration point of view, contrary to each other properties. Namely, we will combine the volumetric Visual Hull (VH) approach with the stereo matching based Hybrid Recursive Matching (HRM) to a new method which benefits from the advantages of both techniques and discards their weak points.

7 citations


Cited by
Proceedings ArticleDOI
12 Nov 2007
TL;DR: The impact on image quality of rendered arbitrary intermediate views is investigated and analyzed in a second part, comparing compressed multi-view video plus depth data at different bit rates with the uncompressed original.
Abstract: A study on the video plus depth representation for multi-view video sequences is presented. Such a 3D representation enables functionalities like 3D television and free viewpoint video. Compression is based on algorithms for multi-view video coding, which exploit statistical dependencies from both temporal and inter-view reference pictures for prediction of both color and depth data. Coding efficiency of prediction structures with and without inter-view reference pictures is analyzed for multi-view video plus depth data, reporting gains in luma PSNR of up to 0.5 dB for depth and 0.3 dB for color. The main benefit from using a multi-view video plus depth representation is that intermediate views can be easily rendered. Therefore the impact on image quality of rendered arbitrary intermediate views is investigated and analyzed in a second part, comparing compressed multi-view video plus depth data at different bit rates with the uncompressed original.

485 citations
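
The coding gains in the abstract above are reported as luma PSNR differences in dB. As a rough illustration of what that metric measures, here is a minimal sketch of PSNR over flattened luma samples. The function name `psnr`, the flat-list input, and the 8-bit peak value of 255 are assumptions made for this example; the paper's actual evaluation setup is not described on this page.

```python
import math

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized sequences of luma samples.

    peak is the maximum sample value (255 assumes 8-bit video).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the decoded samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf   # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 4-sample example: every decoded sample is off by one level.
print(round(psnr([100, 120, 140, 160], [101, 121, 141, 161]), 2))
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.5 dB corresponds to an MSE that is roughly 11% lower (a factor of about 0.89).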

Journal ArticleDOI
01 Apr 2011
TL;DR: This paper describes efficient coding methods for video and depth data and presents view synthesis methods that mitigate errors from depth estimation and coding.
Abstract: Current 3-D video (3DV) technology is based on stereo systems. These systems use stereo video coding for pictures delivered by two input cameras. Typically, such stereo systems only reproduce these two camera views at the receiver and stereoscopic displays for multiple viewers require wearing special 3-D glasses. On the other hand, emerging autostereoscopic multiview displays emit a large number of views to enable 3-D viewing for multiple users without requiring 3-D glasses. For representing a large number of views, a multiview extension of stereo video coding is used, typically requiring a bit rate that is proportional to the number of views. However, since the quality improvement of multiview displays will be governed by an increase of emitted views, a format is needed that allows the generation of arbitrary numbers of views with the transmission bit rate being constant. Such a format is the combination of video signals and associated depth maps. The depth maps provide disparities associated with every sample of the video signal that can be used to render arbitrary numbers of additional views via view synthesis. This paper describes efficient coding methods for video and depth data. For the generation of views, synthesis methods are presented, which mitigate errors from depth estimation and coding.

420 citations
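
The abstract above notes that per-sample disparities can be used to render additional views. A minimal sketch of that idea, for a single image row, is a forward warp that shifts each sample by a scaled disparity. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `warp_row`, the sign convention (samples shift left as the virtual camera moves), the `alpha` position parameter, and the use of `None` to mark holes. The error-mitigating synthesis methods the paper presents are not reproduced.

```python
import math

def warp_row(colors, disparities, alpha):
    """Forward-warp one image row to a virtual view.

    colors       -- sample values of the reference row
    disparities  -- per-sample disparity in pixels (larger means closer)
    alpha        -- virtual camera position; 0.0 is the reference view
    Returns a row of the same width; None marks holes (disocclusions).
    """
    width = len(colors)
    out = [None] * width
    best = [-math.inf] * width   # disparity of the sample stored at each target
    for x in range(width):
        target = x - int(round(alpha * disparities[x]))
        # When two samples land on the same target, keep the closer one.
        if 0 <= target < width and disparities[x] > best[target]:
            out[target] = colors[x]
            best[target] = disparities[x]
    return out

# Hypothetical row: two foreground samples (disparity 2) over a flat background.
print(warp_row([10, 20, 30, 40, 50], [0, 0, 2, 2, 0], alpha=1.0))
# prints [30, 40, None, None, 50]
```

The `None` entries are the disoccluded background positions that a real renderer would have to fill, for example from a second reference view or by inpainting.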

Journal ArticleDOI
TL;DR: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data, and integrates a novel encoder control which guarantees that high quality intermediate views can be generated based on the decoded data.
Abstract: This paper describes an extension of the high efficiency video coding (HEVC) standard for coding of multi-view video and depth data. In addition to the known concept of disparity-compensated prediction, inter-view motion parameter, and inter-view residual prediction for coding of the dependent video views are developed and integrated. Furthermore, for depth coding, new intra coding modes, a modified motion compensation and motion vector coding as well as the concept of motion parameter inheritance are part of the HEVC extension. A novel encoder control uses view synthesis optimization, which guarantees that high quality intermediate views can be generated based on the decoded data. The bitstream format supports the extraction of partial bitstreams, so that conventional 2D video, stereo video, and the full multi-view video plus depth format can be decoded from a single bitstream. Objective and subjective results are presented, demonstrating that the proposed approach provides 50% bit rate savings in comparison with HEVC simulcast and 20% in comparison with a straightforward multi-view extension of HEVC without the newly developed coding tools.

365 citations

Journal ArticleDOI
TL;DR: 3DTV coding technology is maturing; however, the research area is relatively young compared to coding of other types of media, and there is still a lot of room for improvement and new development of algorithms.
Abstract: Research efforts on 3DTV technology have been strengthened worldwide recently, covering the whole media processing chain from capture to display. Different 3DTV systems rely on different 3D scene representations that integrate various types of data. Efficient coding of these data is crucial for the success of 3DTV. Compression of pixel-type data including stereo video, multiview video, and associated depth or disparity maps extends available principles of classical video coding. Powerful algorithms and open international standards for multiview video coding and coding of video plus depth data are available and under development, which will provide the basis for introduction of various 3DTV systems and services in the near future. Compression of 3D mesh models has also reached a high level of maturity. For static geometry, a variety of powerful algorithms are available to efficiently compress vertices and connectivity. Compression of dynamic 3D geometry is currently a more active field of research. Temporal prediction is an important mechanism to remove redundancy from animated 3D mesh sequences. Error resilience is important for transmission of data over error prone channels, and multiple description coding (MDC) is a suitable way to protect data. MDC of still images and 2D video has already been widely studied, whereas multiview video and 3D meshes have been addressed only recently. Intellectual property protection of 3D data by watermarking is a pioneering research area as well. The 3D watermarking methods in the literature are classified into three groups, considering the dimensions of the main components of scene representations and the resulting components after applying the algorithm. In general, 3DTV coding technology is maturating. Systems and services may enter the market in the near future. However, the research area is relatively young compared to coding of other types of media. Therefore, there is still a lot of room for improvement and new development of algorithms.

326 citations

Journal ArticleDOI
TL;DR: The eight articles in this special section are devoted to multi-view imaging and three dimensional television displays.
Abstract: The eight articles in this special section are devoted to multi-view imaging and three dimensional television displays.

324 citations