Open Access

Learning to Segment and Track in RGBD.

Alex Teichman, +1 more
- pp 575-590
TLDR
It is shown that it is possible to achieve an order of magnitude speedup and thus real-time performance on a laptop computer by applying simple algorithmic optimizations to the original work, which makes this approach applicable to a broader range of tasks.
Abstract
We consider the problem of segmenting and tracking deformable objects in color video with depth (RGBD) data available from commodity sensors such as the Asus Xtion Pro Live or Microsoft Kinect. We frame this problem with very few assumptions (no prior object model, no stationary sensor, and no prior 3-D map), thus making a solution potentially useful for a large number of applications, including semi-supervised learning, 3-D model capture, and object recognition. Our approach makes use of a rich feature set, including local image appearance, depth discontinuities, optical flow, and surface normals, to inform the segmentation decision in a conditional random field model. In contrast to previous work in this field, the proposed method learns how to best make use of these features from ground-truth segmented sequences. We provide qualitative and quantitative analyses that demonstrate substantial improvement over the state of the art. This paper is an extended version of our previous work. Building on that work, we show that it is possible to achieve an order of magnitude speedup, and thus real-time performance (~20 FPS) on a laptop computer, by applying simple algorithmic optimizations. This speedup comes at only a minor cost in overall accuracy and thus makes the approach applicable to a broader range of tasks. We demonstrate one such task: real-time, online, interactive segmentation to efficiently collect training data for an off-the-shelf object detector.
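
As a rough illustration of the idea in the abstract, a conditional random field scores a candidate foreground/background labeling with a weighted sum of per-pixel terms and neighbor terms, and the weights are what would be learned from ground-truth sequences. The sketch below is a toy and not the authors' implementation: the `energy` function, the single `appearance` feature, the depth-attenuated smoothness term, the 4-connected grid, and the hand-set weights are all assumptions made for illustration.

```python
import math
from typing import Dict, List

Grid = List[List[float]]


def energy(labels: List[List[int]], node_feats: Dict[str, Grid], depth: Grid,
           weights: Dict[str, float], sigma: float = 0.5) -> float:
    """Toy CRF energy for a binary labeling (1 = foreground, 0 = background).

    Lower energy means the labeling agrees better with the features.
    All feature names and the functional form are hypothetical.
    """
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            # Node term: positive feature scores favor foreground.
            sign = 1.0 if labels[r][c] == 1 else -1.0
            for name, grid in node_feats.items():
                total -= weights[name] * grid[r][c] * sign
            # Edge term: penalize label changes between right/down neighbors,
            # but less so across a large depth gap (a depth discontinuity).
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols and labels[r][c] != labels[nr][nc]:
                    gap = abs(depth[r][c] - depth[nr][nc])
                    total += weights["smooth"] * math.exp(-gap / sigma)
    return total


# Hypothetical 2x2 example: the left column looks like the object and is closer.
node_feats = {"appearance": [[0.9, -0.8], [0.7, -0.9]]}
depth = [[1.0, 2.0], [1.0, 2.0]]
weights = {"appearance": 1.0, "smooth": 0.5}  # hand-set here; learned in a real system

good = [[1, 0], [1, 0]]  # foreground on the left column
bad = [[0, 1], [0, 1]]   # labels flipped

print(energy(good, node_feats, depth, weights) < energy(bad, node_feats, depth, weights))
```

In this toy setup, the labeling that matches the appearance feature receives the lower energy. Segmentation then amounts to searching for the minimum-energy labeling, for which graph-cut solvers are a common choice when the edge terms have this form.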


Citations
Book Chapter

Estimating Depth from RGB and Sparse Sensing

TL;DR: A deep model that can accurately produce dense depth maps given an RGB image with known depth at a very sparse set of pixels is presented, and it is demonstrated that sparse depth measurements obtained using, e.g., lower-power depth sensors or SLAM systems can be efficiently transformed into high-quality dense depth maps.
Proceedings Article

Efficient Hierarchical Graph-Based Segmentation of RGBD Videos

TL;DR: It is shown that a multistage segmentation with depth then color yields better results than a linear combination of depth and color, and bipartite graph matching at a given level of the hierarchical tree yields the final segmentation of the point clouds.
Journal Article

Behavioral decision‐making model of the intelligent vehicle based on driving risk assessment

TL;DR: This paper presents a new behavioral decision‐making model to achieve both safety and high efficiency and also to reduce the adverse effect of autonomous vehicles on the other road users while driving and proposes a combined spring model for assessing driving risk.
Journal Article

Use of a time-of-flight camera with an Omek Beckon™ framework to analyze, evaluate and correct in real time the verticality of multiple sclerosis patients during exercise.

TL;DR: A human body verticality detection system using a time-of-flight camera as a tool to detect incorrect postures and improve them in real time to improve patients’ evolution through better positions and performance of the exercises.
Journal Article

Parameterized Distortion-Invariant Feature for Robust Tracking in Omnidirectional Vision

TL;DR: To robustly handle challenging occlusion in the distorted image, a flexible fragment-based joint-feature framework is presented for robust non-rigid human target tracking, and the proposed tracking approach leads to much better performance in terms of efficiency and robustness.
References
Journal Article

Snakes: Active Contour Models

TL;DR: This work uses snakes for interactive interpretation, in which user-imposed constraint forces guide the snake near features of interest, and uses scale-space continuation to enlarge the capture region surrounding a feature.
Proceedings Article

3D is here: Point Cloud Library (PCL)

TL;DR: PCL (Point Cloud Library) is presented, an advanced and extensive approach to the subject of 3D perception that contains state-of-the-art algorithms for filtering, feature estimation, surface reconstruction, registration, model fitting, and segmentation.
Journal Article

Tracking-Learning-Detection

TL;DR: A novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection, and develops a novel learning method (P-N learning) which estimates the errors by a pair of “experts”: the P-expert estimates missed detections, and the N-expert estimates false alarms.
Book Chapter

An Experimental Comparison of Min-cut/Max-flow Algorithms for Energy Minimization in Vision

TL;DR: The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision, comparing the running times of several standard algorithms as well as a recently developed new algorithm.
Journal Article

Real-time human pose recognition in parts from single depth images

TL;DR: This work takes an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem, and generates confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.