Book Chapter

Fixed Point Learning Based 3D Conversion of 2D Videos

30 Jun 2015, pp. 85-94
TL;DR: A fast automatic 2D-to-3D conversion technique is proposed which utilizes a fixed point learning framework for the accurate estimation of depth maps of query images, using a model trained on a database of 2D color and depth images.
Abstract: Depth cues from a single still image, also called monocular cues, are more versatile, while depth cues from multiple images give more accurate depth extraction. Machine learning is a promising and relatively new research direction for this type of conversion. In this paper, a fast automatic 2D-to-3D conversion technique is proposed which uses a fixed point learning framework to accurately estimate the depth maps of query images from a model trained on a database of 2D color and depth images. The depth maps obtained from the monocular and motion depth cues of the input images/video, together with ground truth depths, form the training database for the fixed point iteration. The results produced with the fixed point model are more accurate and reliable than an MRF fusion of both types of depth cues. Stereo pairs are then generated from the input video frames and their corresponding depth maps obtained from the fixed point learning framework. These stereo pairs are assembled into the final 3D video, which can be displayed on any 3DTV and viewed with 3D glasses.
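
The last stage of the pipeline described above, turning each video frame and its estimated depth map into a stereo pair, can be illustrated with a minimal depth-image-based rendering (DIBR) sketch. Everything below is an illustrative simplification rather than the paper's implementation: the max_disparity value, the symmetric pixel-shift scheme, and the hole filling are all assumptions.

```python
# Minimal DIBR sketch: synthesize a left/right view pair from one color
# frame and a normalised depth map by shifting pixels horizontally.
import numpy as np

def render_stereo_pair(frame, depth, max_disparity=24):
    """frame: HxWx3 uint8, depth: HxW float in [0, 1], 1 = nearest."""
    h, w, _ = frame.shape
    disparity = (depth * max_disparity).astype(np.int32)  # near = large shift
    left = np.zeros_like(frame)
    right = np.zeros_like(frame)
    for y in range(h):
        for x in range(w):
            d = disparity[y, x] // 2          # split the shift between views
            xl, xr = x + d, x - d
            if 0 <= xl < w:
                left[y, xl] = frame[y, x]
            if 0 <= xr < w:
                right[y, xr] = frame[y, x]
    # Naive hole filling: propagate the previous column's color.
    for view in (left, right):
        for y in range(h):
            for x in range(1, w):
                if not view[y, x].any():
                    view[y, x] = view[y, x - 1]
    return left, right
```

The resulting left and right views can then be packed side by side (or interleaved) per frame to assemble the 3D video mentioned in the abstract.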


References
Proceedings Article
13 Jun 2010
TL;DR: It is discovered that “classical” flow formulations perform surprisingly well when combined with modern optimization and implementation techniques, and while median filtering of intermediate flow fields during optimization is a key to recent performance gains, it leads to higher energy solutions.
Abstract: The accuracy of optical flow estimation algorithms has been improving steadily as evidenced by results on the Middlebury optical flow benchmark. The typical formulation, however, has changed little since the work of Horn and Schunck. We attempt to uncover what has made recent advances possible through a thorough analysis of how the objective function, the optimization method, and modern implementation practices influence accuracy. We discover that “classical” flow formulations perform surprisingly well when combined with modern optimization and implementation techniques. Moreover, we find that while median filtering of intermediate flow fields during optimization is a key to recent performance gains, it leads to higher energy solutions. To understand the principles behind this phenomenon, we derive a new objective that formalizes the median filtering heuristic. This objective includes a nonlocal term that robustly integrates flow estimates over large spatial neighborhoods. By modifying this new term to include information about flow and image boundaries we develop a method that ranks at the top of the Middlebury benchmark.

1,529 citations


"Fixed Point Learning Based 3D Conve..." refers methods in this paper

  • ...In our experiments, optical flow is used for depth extraction and it is calculated between two consecutive frames taken from the scene as described in [6]....

    [...]
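
The excerpt above notes that the motion depth cue comes from optical flow computed between consecutive frames. The sketch below shows that idea under clearly labelled assumptions: OpenCV's Farnebäck flow is used as a stand-in for the Classic+NL flow method of [6], and mapping inverse flow magnitude to relative depth assumes a roughly translating camera.

```python
# Hedged sketch of a motion depth cue: dense optical flow between two
# consecutive frames, with flow magnitude inverted into relative depth
# (larger apparent motion = closer to the camera).
import cv2
import numpy as np

def motion_depth_cue(prev_bgr, next_bgr):
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY)
    # Farneback dense flow (stand-in for the Classic+NL method of [6]).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    magnitude /= magnitude.max() + 1e-6
    return 1.0 - magnitude   # near objects (large motion) get small depth
```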

Journal Article
TL;DR: This work considers the problem of estimating detailed 3D structure from a single still image of an unstructured environment and uses a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch.
Abstract: We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate as well as visually pleasing. For each small homogeneous patch in the image, we use a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues as well as the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3D structure than does prior art and also give a much richer experience in the 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.
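
To make the data-plus-smoothness idea behind such MRF models concrete, the toy sketch below regularises per-patch unary depth predictions with a pairwise smoothness term over a 4-connected patch grid. It is a plain Gaussian MRF solved by Gauss-Seidel sweeps, not the plane-parameter model of the cited paper, and `unary_depth` stands in for the output of a trained per-patch regressor.

```python
# Toy Gaussian MRF over a patch grid: minimise, per patch,
#   (d - u)^2 + smoothness * sum_over_neighbours (d - n)^2
# by repeated coordinate-wise (Gauss-Seidel) updates.
import numpy as np

def smooth_patch_depths(unary_depth, smoothness=0.5, sweeps=50):
    d = unary_depth.astype(np.float64).copy()
    h, w = d.shape
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                neigh = []
                if y > 0:
                    neigh.append(d[y - 1, x])
                if y < h - 1:
                    neigh.append(d[y + 1, x])
                if x > 0:
                    neigh.append(d[y, x - 1])
                if x < w - 1:
                    neigh.append(d[y, x + 1])
                # Closed-form minimiser of the local quadratic energy.
                d[y, x] = ((unary_depth[y, x] + smoothness * sum(neigh))
                           / (1.0 + smoothness * len(neigh)))
    return d
```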

1,522 citations

Proceedings Article
20 Jun 2009
TL;DR: Compared to existing object recognition approaches that require training for each object category, the proposed nonparametric scene parsing system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
Abstract: In this paper we propose a novel nonparametric approach for object recognition and scene parsing using dense scene alignment. Given an input image, we retrieve its best matches from a large database with annotated images using our modified, coarse-to-fine SIFT flow algorithm that aligns the structures within two images. Based on the dense scene correspondence obtained from the SIFT flow, our system warps the existing annotations, and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on a challenging database. Compared to existing object recognition approaches that require training for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
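
The retrieval stage that precedes dense SIFT-flow alignment can be sketched as a k-nearest-neighbour search over a cheap global descriptor. The "tiny image" descriptor below is an illustrative simplification, not the paper's descriptor, and the dense alignment, annotation warping, and MRF segmentation steps are omitted.

```python
# Sketch of coarse retrieval: find the k database images that best match
# a query under a global "tiny image" descriptor (illustrative only).
import cv2
import numpy as np

def tiny_descriptor(image_bgr, size=16):
    thumb = cv2.resize(image_bgr, (size, size)).astype(np.float32).ravel()
    thumb -= thumb.mean()
    return thumb / (np.linalg.norm(thumb) + 1e-6)

def retrieve_best_matches(query_bgr, database_bgr, k=5):
    q = tiny_descriptor(query_bgr)
    dists = [np.linalg.norm(q - tiny_descriptor(img)) for img in database_bgr]
    return np.argsort(dists)[:k]   # indices of the k nearest database images
```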

396 citations


"Fixed Point Learning Based 3D Conve..." refers background in this paper

  • ...Semantic labels and a more complex supervised model are incorporated in [3] to achieve more accurate depth maps....

    [...]

Journal Article
TL;DR: This paper presents a simple yet effective approach to estimate the amount of spatially varying defocus blur at edge locations, and demonstrates the effectiveness of this method in providing a reliable estimation of the defocus map.
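
One common edge-based formulation of this idea, shown below under stated assumptions rather than as the cited paper's exact method, re-blurs the image with a known Gaussian and uses the ratio of gradient magnitudes at edge pixels to recover the local blur: for a Gaussian-blurred step edge the ratio is R = sqrt(sigma^2 + sigma0^2) / sigma, hence sigma = sigma0 / sqrt(R^2 - 1). Parameter values are illustrative.

```python
# Hedged sketch of sparse defocus estimation at edge pixels via the
# re-blur gradient-ratio trick; a dense defocus map would additionally
# require propagating these sparse estimates.
import cv2
import numpy as np

def sparse_defocus_map(gray_u8, sigma0=1.0, canny_lo=100, canny_hi=200):
    gray = gray_u8.astype(np.float32)
    reblurred = cv2.GaussianBlur(gray, (0, 0), sigma0)
    g1 = np.hypot(cv2.Sobel(gray, cv2.CV_32F, 1, 0),
                  cv2.Sobel(gray, cv2.CV_32F, 0, 1))
    g2 = np.hypot(cv2.Sobel(reblurred, cv2.CV_32F, 1, 0),
                  cv2.Sobel(reblurred, cv2.CV_32F, 0, 1))
    edges = cv2.Canny(gray_u8, canny_lo, canny_hi) > 0
    ratio = g1 / (g2 + 1e-6)
    sigma = np.zeros_like(gray)
    valid = edges & (ratio > 1.01)          # guard against division blow-up
    sigma[valid] = sigma0 / np.sqrt(ratio[valid] ** 2 - 1.0)
    return sigma                            # nonzero only at edge pixels
```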

370 citations

Proceedings Article
16 Jun 2012
TL;DR: A simplified and computationally efficient version of a recent 2D-to-3D image conversion algorithm, which is validated quantitatively on a Kinect-captured image+depth dataset against the Make3D algorithm.
Abstract: Among 2D-to-3D image conversion methods, those involving human operators have been most successful but also time-consuming and costly. Automatic methods, that typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality as they often rely on assumptions that are easily violated in practice. In this paper, we adopt the radically different approach of “learning” the 3D scene structure. We develop a simplified and computationally-efficient version of our recent 2D-to-3D image conversion algorithm. Given a repository of 3D images, either as stereopairs or image+depth pairs, we find k pairs whose photometric content most closely matches that of a 2D query to be converted. Then, we fuse the k corresponding depth fields and align the fused depth with the 2D query. Unlike in our original work, we validate the simplified algorithm quantitatively on a Kinect-captured image+depth dataset against the Make3D algorithm. While far from perfect, the presented results demonstrate that online repositories of 3D content can be used for effective 2D-to-3D image conversion.
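
The fusion step described above, combining the k retrieved depth fields before aligning the result with the 2D query, can be sketched as a per-pixel median. The retrieved depths are assumed to be already resized to the query resolution and normalised to a common scale, and the subsequent query-guided alignment/filtering is only indicated by a comment.

```python
# Minimal sketch of k-NN depth fusion by per-pixel median.
import numpy as np

def fuse_depths(retrieved_depths):
    """retrieved_depths: list of k HxW depth maps at the query resolution."""
    fused = np.median(np.stack(retrieved_depths, axis=0), axis=0)
    # Alignment step (not shown): smooth `fused` with the query image as
    # guidance, e.g. a cross-bilateral filter, so depth edges follow
    # image edges before stereo rendering.
    return fused
```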

131 citations


"Fixed Point Learning Based 3D Conve..." refers methods in this paper

  • ...The machine learning approach has also been adopted in [2], which replaces label transfer with the direct transfer of depth map data....

    [...]