Book Chapter

Fixed Point Learning Based 3D Conversion of 2D Videos

30 Jun 2015, pp. 85-94
TL;DR: A fast automatic 2D-to-3D conversion technique is proposed which utilizes a fixed point learning framework for the accurate estimation of depth maps of query images, using a model trained on a database of 2D color and depth images.
Abstract: Depth cues from a single still image, also called monocular cues, are more versatile, while depth cues from multiple images give more accurate depth extraction. Machine learning is a promising and relatively new research direction for this type of conversion. In this paper, a fast automatic 2D-to-3D conversion technique is proposed which uses a fixed point learning framework to accurately estimate the depth maps of query images from a model trained on a database of 2D color and depth images. The depth maps obtained from the monocular and motion depth cues of the input images/video, together with ground truth depths, form the training database for the fixed point iteration. The results produced with the fixed point model are more accurate and reliable than an MRF fusion of both types of depth cues. Stereo pairs are then generated from the input video frames and their corresponding depth maps obtained from the fixed point learning framework. These stereo pairs are assembled into the final 3D video, which can be displayed on any 3DTV and viewed with 3D glasses.
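
The last stage of the pipeline described above, turning each video frame and its estimated depth map into a stereo pair, can be illustrated with a minimal depth-image-based rendering (DIBR) sketch. Everything below is an illustrative simplification rather than the paper's implementation: the max_disparity value, the symmetric pixel-shift scheme, and the hole filling are all assumptions.

```python
# Minimal DIBR sketch: synthesize a left/right view pair from one color
# frame and a normalised depth map by shifting pixels horizontally.
import numpy as np

def render_stereo_pair(frame, depth, max_disparity=24):
    """frame: HxWx3 uint8, depth: HxW float in [0, 1], 1 = nearest."""
    h, w, _ = frame.shape
    disparity = (depth * max_disparity).astype(np.int32)  # near = large shift
    left = np.zeros_like(frame)
    right = np.zeros_like(frame)
    for y in range(h):
        for x in range(w):
            d = disparity[y, x] // 2          # split the shift between views
            xl, xr = x + d, x - d
            if 0 <= xl < w:
                left[y, xl] = frame[y, x]
            if 0 <= xr < w:
                right[y, xr] = frame[y, x]
    # Naive hole filling: propagate the previous column's color.
    for view in (left, right):
        for y in range(h):
            for x in range(1, w):
                if not view[y, x].any():
                    view[y, x] = view[y, x - 1]
    return left, right
```

The resulting left and right views can then be packed side by side (or interleaved) per frame to assemble the 3D video mentioned in the abstract.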


References
Proceedings Article
13 Jun 2010
TL;DR: It is discovered that “classical” flow formulations perform surprisingly well when combined with modern optimization and implementation techniques, and while median filtering of intermediate flow fields during optimization is a key to recent performance gains, it leads to higher energy solutions.
Abstract: The accuracy of optical flow estimation algorithms has been improving steadily as evidenced by results on the Middlebury optical flow benchmark. The typical formulation, however, has changed little since the work of Horn and Schunck. We attempt to uncover what has made recent advances possible through a thorough analysis of how the objective function, the optimization method, and modern implementation practices influence accuracy. We discover that “classical” flow formulations perform surprisingly well when combined with modern optimization and implementation techniques. Moreover, we find that while median filtering of intermediate flow fields during optimization is a key to recent performance gains, it leads to higher energy solutions. To understand the principles behind this phenomenon, we derive a new objective that formalizes the median filtering heuristic. This objective includes a nonlocal term that robustly integrates flow estimates over large spatial neighborhoods. By modifying this new term to include information about flow and image boundaries we develop a method that ranks at the top of the Middlebury benchmark.

1,529 citations


"Fixed Point Learning Based 3D Conve..." refers methods in this paper

  • ...In our experiments, optical flow is used for depth extraction and it is calculated between two consecutive frames taken from the scene as described in [6]....

    [...]
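
The excerpt above notes that the motion depth cue comes from optical flow computed between consecutive frames. The sketch below shows that idea under clearly labelled assumptions: OpenCV's Farnebäck flow is used as a stand-in for the Classic+NL flow method of [6], and mapping inverse flow magnitude to relative depth assumes a roughly translating camera.

```python
# Hedged sketch of a motion depth cue: dense optical flow between two
# consecutive frames, with flow magnitude inverted into relative depth
# (larger apparent motion = closer to the camera).
import cv2
import numpy as np

def motion_depth_cue(prev_bgr, next_bgr):
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY)
    # Farneback dense flow (stand-in for the Classic+NL method of [6]).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    magnitude /= magnitude.max() + 1e-6
    return 1.0 - magnitude   # near objects (large motion) get small depth
```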

Journal Article
TL;DR: This work considers the problem of estimating detailed 3D structure from a single still image of an unstructured environment and uses a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch.
Abstract: We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate as well as visually pleasing. For each small homogeneous patch in the image, we use a Markov random field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues as well as the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3D structure than does prior art and also give a much richer experience in the 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.
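
To make the data-plus-smoothness idea behind such MRF models concrete, the toy sketch below regularises per-patch unary depth predictions with a pairwise smoothness term over a 4-connected patch grid. It is a plain Gaussian MRF solved by Gauss-Seidel sweeps, not the plane-parameter model of the cited paper, and `unary_depth` stands in for the output of a trained per-patch regressor.

```python
# Toy Gaussian MRF over a patch grid: minimise, per patch,
#   (d - u)^2 + smoothness * sum_over_neighbours (d - n)^2
# by repeated coordinate-wise (Gauss-Seidel) updates.
import numpy as np

def smooth_patch_depths(unary_depth, smoothness=0.5, sweeps=50):
    d = unary_depth.astype(np.float64).copy()
    h, w = d.shape
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                neigh = []
                if y > 0:
                    neigh.append(d[y - 1, x])
                if y < h - 1:
                    neigh.append(d[y + 1, x])
                if x > 0:
                    neigh.append(d[y, x - 1])
                if x < w - 1:
                    neigh.append(d[y, x + 1])
                # Closed-form minimiser of the local quadratic energy.
                d[y, x] = ((unary_depth[y, x] + smoothness * sum(neigh))
                           / (1.0 + smoothness * len(neigh)))
    return d
```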

1,522 citations

Proceedings Article
20 Jun 2009
TL;DR: Compared to existing object recognition approaches that require training for each object category, the proposed nonparametric scene parsing system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
Abstract: In this paper we propose a novel nonparametric approach for object recognition and scene parsing using dense scene alignment. Given an input image, we retrieve its best matches from a large database with annotated images using our modified, coarse-to-fine SIFT flow algorithm that aligns the structures within two images. Based on the dense scene correspondence obtained from the SIFT flow, our system warps the existing annotations, and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on a challenging database. Compared to existing object recognition approaches that require training for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
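
The retrieval stage that precedes dense SIFT-flow alignment can be sketched as a k-nearest-neighbour search over a cheap global descriptor. The "tiny image" descriptor below is an illustrative simplification, not the paper's descriptor, and the dense alignment, annotation warping, and MRF segmentation steps are omitted.

```python
# Sketch of coarse retrieval: find the k database images that best match
# a query under a global "tiny image" descriptor (illustrative only).
import cv2
import numpy as np

def tiny_descriptor(image_bgr, size=16):
    thumb = cv2.resize(image_bgr, (size, size)).astype(np.float32).ravel()
    thumb -= thumb.mean()
    return thumb / (np.linalg.norm(thumb) + 1e-6)

def retrieve_best_matches(query_bgr, database_bgr, k=5):
    q = tiny_descriptor(query_bgr)
    dists = [np.linalg.norm(q - tiny_descriptor(img)) for img in database_bgr]
    return np.argsort(dists)[:k]   # indices of the k nearest database images
```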

396 citations


"Fixed Point Learning Based 3D Conve..." refers background in this paper

  • ...Semantic labels and a more complex supervised model are incorporated in [3] to achieve more accurate depth maps....

    [...]

Journal Article
TL;DR: This paper presents a simple yet effective approach to estimate the amount of spatially varying defocus blur at edge locations, and demonstrates the effectiveness of this method in providing a reliable estimation of the defocus map.
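
One common edge-based formulation of this idea, shown below under stated assumptions rather than as the cited paper's exact method, re-blurs the image with a known Gaussian and uses the ratio of gradient magnitudes at edge pixels to recover the local blur: for a Gaussian-blurred step edge the ratio is R = sqrt(sigma^2 + sigma0^2) / sigma, hence sigma = sigma0 / sqrt(R^2 - 1). Parameter values are illustrative.

```python
# Hedged sketch of sparse defocus estimation at edge pixels via the
# re-blur gradient-ratio trick; a dense defocus map would additionally
# require propagating these sparse estimates.
import cv2
import numpy as np

def sparse_defocus_map(gray_u8, sigma0=1.0, canny_lo=100, canny_hi=200):
    gray = gray_u8.astype(np.float32)
    reblurred = cv2.GaussianBlur(gray, (0, 0), sigma0)
    g1 = np.hypot(cv2.Sobel(gray, cv2.CV_32F, 1, 0),
                  cv2.Sobel(gray, cv2.CV_32F, 0, 1))
    g2 = np.hypot(cv2.Sobel(reblurred, cv2.CV_32F, 1, 0),
                  cv2.Sobel(reblurred, cv2.CV_32F, 0, 1))
    edges = cv2.Canny(gray_u8, canny_lo, canny_hi) > 0
    ratio = g1 / (g2 + 1e-6)
    sigma = np.zeros_like(gray)
    valid = edges & (ratio > 1.01)          # guard against division blow-up
    sigma[valid] = sigma0 / np.sqrt(ratio[valid] ** 2 - 1.0)
    return sigma                            # nonzero only at edge pixels
```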

370 citations

Proceedings Article
16 Jun 2012
TL;DR: A simplified and computationally efficient version of a recent 2D-to-3D image conversion algorithm, which is validated quantitatively on a Kinect-captured image+depth dataset against the Make3D algorithm.
Abstract: Among 2D-to-3D image conversion methods, those involving human operators have been most successful but also time-consuming and costly. Automatic methods, that typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality as they often rely on assumptions that are easily violated in practice. In this paper, we adopt the radically different approach of “learning” the 3D scene structure. We develop a simplified and computationally-efficient version of our recent 2D-to-3D image conversion algorithm. Given a repository of 3D images, either as stereopairs or image+depth pairs, we find k pairs whose photometric content most closely matches that of a 2D query to be converted. Then, we fuse the k corresponding depth fields and align the fused depth with the 2D query. Unlike in our original work, we validate the simplified algorithm quantitatively on a Kinect-captured image+depth dataset against the Make3D algorithm. While far from perfect, the presented results demonstrate that online repositories of 3D content can be used for effective 2D-to-3D image conversion.
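
The fusion step described above, combining the k retrieved depth fields before aligning the result with the 2D query, can be sketched as a per-pixel median. The retrieved depths are assumed to be already resized to the query resolution and normalised to a common scale, and the subsequent query-guided alignment/filtering is only indicated by a comment.

```python
# Minimal sketch of k-NN depth fusion by per-pixel median.
import numpy as np

def fuse_depths(retrieved_depths):
    """retrieved_depths: list of k HxW depth maps at the query resolution."""
    fused = np.median(np.stack(retrieved_depths, axis=0), axis=0)
    # Alignment step (not shown): smooth `fused` with the query image as
    # guidance, e.g. a cross-bilateral filter, so depth edges follow
    # image edges before stereo rendering.
    return fused
```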

131 citations


"Fixed Point Learning Based 3D Conve..." refers methods in this paper

  • ...The machine learning approach has also been adopted in [2], which replaces label transfer with the direct transfer of depth map data....

    [...]