Home
/
Authors
/
David Kim

Author

David Kim

Other affiliations: Newcastle University, Ludwig Maximilian University of Munich, Google

Bio: David Kim is an academic researcher from Microsoft. The author has contributed to research in topics: Augmented reality & Depth map. The author has an hindex of 36, co-authored 55 publications receiving 11020 citations. Previous affiliations of David Kim include Newcastle University & Ludwig Maximilian University of Munich.

Papers published on a yearly basis

2020
2018
2017
2016
2015
2014
2013
2012
2011
2010
2008
2007

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

KinectFusion: Real-time dense surface mapping and tracking

[...]

Richard Newcombe¹, Shahram Izadi², Otmar Hilliges², David Molyneaux³, David Kim⁴, Andrew J. Davison¹, Pushmeet Kohi², Jamie Shotton², Steve Hodges⁴, Andrew Fitzgibbon² - Show less +6 more•Institutions (4)

Imperial College London¹, Microsoft², Lancaster University³, Newcastle University⁴

26 Oct 2011

TL;DR: A system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware, which fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real- time.

...read moreread less

Abstract: We present a system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware. We fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time. The current sensor pose is simultaneously obtained by tracking the live depth frame relative to the global model using a coarse-to-fine iterative closest point (ICP) algorithm, which uses all of the observed depth data available. We demonstrate the advantages of tracking against the growing full surface model compared with frame-to-frame tracking, obtaining tracking and mapping results in constant time within room sized scenes with limited drift and high accuracy. We also show both qualitative and quantitative results relating to various aspects of our tracking and mapping system. Modelling of natural scenes, in real-time with only commodity sensor and GPU hardware, promises an exciting step forward in augmented reality (AR), in particular, it allows dense surfaces to be reconstructed in real-time, with a level of detail and robustness beyond any solution yet presented using passive computer vision.

...read moreread less

4,184 citations

Proceedings Article•DOI•

KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera

[...]

Shahram Izadi¹, David Kim¹, Otmar Hilliges¹, David Molyneaux¹, Richard Newcombe², Pushmeet Kohli¹, Jamie Shotton¹, Steve Hodges¹, Dustin Freeman³, Andrew J. Davison², Andrew Fitzgibbon¹ - Show less +7 more•Institutions (3)

Microsoft¹, Imperial College London², University of Toronto³

16 Oct 2011

TL;DR: Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction, to enable real-time multi-touch interactions anywhere.

...read moreread less

Abstract: KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct, geometrically precise, 3D models of the physical scene in real-time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline are described in full. Uses of the core system for low-cost handheld scanning, and geometry-aware augmented reality and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.

...read moreread less

2,373 citations

Proceedings Article•DOI•

Holoportation: Virtual 3D Teleportation in Real-time

[...]

Sergio Orts-Escolano¹, Christoph Rhemann¹, Sean Fanello¹, Wayne Chang¹, Adarsh Kowdle¹, Yury Degtyarev¹, David Kim¹, Philip Davidson¹, Sameh Khamis¹, Mingsong Dou¹, Vladimir Tankovich¹, Charles Loop¹, Qin Cai¹, Philip A. Chou¹, Sarah Mennicken¹, Julien Valentin¹, Vivek Pradeep¹, Shenlong Wang¹, Sing Bing Kang¹, Pushmeet Kohli¹, Yuliya Lutchyn¹, Cem Keskin¹, Shahram Izadi¹ - Show less +19 more•Institutions (1)

Microsoft¹

16 Oct 2016

TL;DR: This paper demonstrates high-quality, real-time 3D reconstructions of an entire space, including people, furniture and objects, using a set of new depth cameras, and allows users wearing virtual or augmented reality displays to see, hear and interact with remote participants in 3D, almost as if they were present in the same physical space.

...read moreread less

Abstract: We present an end-to-end system for augmented and virtual reality telepresence, called Holoportation. Our system demonstrates high-quality, real-time 3D reconstructions of an entire space, including people, furniture and objects, using a set of new depth cameras. These 3D models can also be transmitted in real-time to remote users. This allows users wearing virtual or augmented reality displays to see, hear and interact with remote participants in 3D, almost as if they were present in the same physical space. From an audio-visual perspective, communicating and interacting with remote users edges closer to face-to-face communication. This paper describes the Holoportation technical system in full, its key interactive capabilities, the application scenarios it enables, and an initial qualitative study of using this new communication medium.

...read moreread less

543 citations

Proceedings Article•DOI•

Digits: freehand 3D interactions anywhere using a wrist-worn gloveless sensor

[...]

David Kim¹, Otmar Hilliges², Shahram Izadi², Alex Butler², Jiawen Chen², Iason Oikonomidis³, Patrick Olivier¹ - Show less +3 more•Institutions (3)

Newcastle University¹, Microsoft², University of Crete³

07 Oct 2012

TL;DR: Digits is a wrist-worn sensor that recovers the full 3D pose of the user's hand, which enables a variety of freehand interactions on the move and is specifically designed to be low-power and easily reproducible using only off-the-shelf hardware.

...read moreread less

Abstract: Digits is a wrist-worn sensor that recovers the full 3D pose of the user's hand. This enables a variety of freehand interactions on the move. The system targets mobile settings, and is specifically designed to be low-power and easily reproducible using only off-the-shelf hardware. The electronics are self-contained on the user's wrist, but optically image the entirety of the user's hand. This data is processed using a new pipeline that robustly samples key parts of the hand, such as the tips and lower regions of each finger. These sparse samples are fed into new kinematic models that leverage the biomechanical constraints of the hand to recover the 3D pose of the user's hand. The proposed system works without the need for full instrumentation of the hand (for example using data gloves), additional sensors in the environment, or depth cameras which are currently prohibitive for mobile scenarios due to power and form-factor considerations. We demonstrate the utility of Digits for a variety of application scenarios, including 3D spatial interaction with mobile devices, eyes-free interaction on-the-move, and gaming. We conclude with a quantitative and qualitative evaluation of our system, and discussion of strengths, limitations and future work.

...read moreread less

488 citations

Journal Article•DOI•

Fusion4D: real-time performance capture of challenging scenes

[...]

Mingsong Dou¹, Sameh Khamis¹, Yury Degtyarev¹, Philip Davidson¹, Sean Fanello¹, Adarsh Kowdle¹, Sergio Orts-Escolano¹, Christoph Rhemann¹, David Kim¹, Jonathan Taylor¹, Pushmeet Kohli¹, Vladimir Tankovich¹, Shahram Izadi¹ - Show less +9 more•Institutions (1)

Microsoft¹

11 Jul 2016

TL;DR: This work contributes a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time, highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes.

...read moreread less

Abstract: We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time. Our algorithm supports both incremental reconstruction, improving the surface estimation over time, as well as parameterizing the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods which require orders of magnitude more processing time and many more RGBD cameras.

...read moreread less

487 citations

1
2
3
4
…
5
6
7
8
9
10
11

Collapse

Cited by

PDF

Open Access

More filters

Proceedings Article•DOI•

Going deeper with convolutions

[...]

Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³, Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich - Show less +5 more•Institutions (3)

Google¹, University of North Carolina at Chapel Hill², University of Michigan³

07 Jun 2015

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

...read moreread less

Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

...read moreread less

40,257 citations

Proceedings Article•DOI•

KinectFusion: Real-time dense surface mapping and tracking

[...]

Imperial College London¹, Microsoft², Lancaster University³, Newcastle University⁴

26 Oct 2011

...read moreread less

4,184 citations

Journal Article•DOI•

ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras

[...]

Raul Mur-Artal¹, Juan D. Tardós¹•Institutions (1)

University of Zaragoza¹

12 Jun 2017-IEEE Transactions on Robotics

TL;DR: ORB-SLAM2, a complete simultaneous localization and mapping (SLAM) system for monocular, stereo and RGB-D cameras, including map reuse, loop closing, and relocalization capabilities, is presented, being in most cases the most accurate SLAM solution.

...read moreread less

Abstract: We present ORB-SLAM2, a complete simultaneous localization and mapping (SLAM) system for monocular, stereo and RGB-D cameras, including map reuse, loop closing, and relocalization capabilities. The system works in real time on standard central processing units in a wide variety of environments from small hand-held indoors sequences, to drones flying in industrial environments and cars driving around a city. Our back-end, based on bundle adjustment with monocular and stereo observations, allows for accurate trajectory estimation with metric scale. Our system includes a lightweight localization mode that leverages visual odometry tracks for unmapped regions and matches with map points that allow for zero-drift localization. The evaluation on 29 popular public sequences shows that our method achieves state-of-the-art accuracy, being in most cases the most accurate SLAM solution. We publish the source code, not only for the benefit of the SLAM community, but with the aim of being an out-of-the-box SLAM solution for researchers in other fields.

...read moreread less

3,499 citations

Proceedings Article•DOI•

A benchmark for the evaluation of RGB-D SLAM systems

[...]

Jrgen Sturm¹, Nikolas Engelhard², Felix Endres², Wolfram Burgard², Daniel Cremers¹ - Show less +1 more•Institutions (2)

Technische Universität München¹, University of Freiburg²

24 Dec 2012

TL;DR: A large set of image sequences from a Microsoft Kinect with highly accurate and time-synchronized ground truth camera poses from a motion capture system is recorded for the evaluation of RGB-D SLAM systems.

...read moreread less

Abstract: In this paper, we present a novel benchmark for the evaluation of RGB-D SLAM systems. We recorded a large set of image sequences from a Microsoft Kinect with highly accurate and time-synchronized ground truth camera poses from a motion capture system. The sequences contain both the color and depth images in full sensor resolution (640 × 480) at video frame rate (30 Hz). The ground-truth trajectory was obtained from a motion-capture system with eight high-speed tracking cameras (100 Hz). The dataset consists of 39 sequences that were recorded in an office environment and an industrial hall. The dataset covers a large variety of scenes and camera motions. We provide sequences for debugging with slow motions as well as longer trajectories with and without loop closures. Most sequences were recorded from a handheld Kinect with unconstrained 6-DOF motions but we also provide sequences from a Kinect mounted on a Pioneer 3 robot that was manually navigated through a cluttered indoor environment. To stimulate the comparison of different approaches, we provide automatic evaluation tools both for the evaluation of drift of visual odometry systems and the global pose error of SLAM systems. The benchmark website [1] contains all data, detailed descriptions of the scenes, specifications of the data formats, sample code, and evaluation tools.

...read moreread less

3,050 citations

Journal Article•DOI•

ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras

[...]

Raul Mur-Artal¹, Juan D. Tardós¹•Institutions (1)

University of Zaragoza¹

20 Oct 2016-arXiv: Robotics

TL;DR: ORB-SLAM2 as mentioned in this paper is a complete SLAM system for monocular, stereo and RGB-D cameras, including map reuse, loop closing and relocalization capabilities.

...read moreread less

Abstract: We present ORB-SLAM2 a complete SLAM system for monocular, stereo and RGB-D cameras, including map reuse, loop closing and relocalization capabilities. The system works in real-time on standard CPUs in a wide variety of environments from small hand-held indoors sequences, to drones flying in industrial environments and cars driving around a city. Our back-end based on bundle adjustment with monocular and stereo observations allows for accurate trajectory estimation with metric scale. Our system includes a lightweight localization mode that leverages visual odometry tracks for unmapped regions and matches to map points that allow for zero-drift localization. The evaluation on 29 popular public sequences shows that our method achieves state-of-the-art accuracy, being in most cases the most accurate SLAM solution. We publish the source code, not only for the benefit of the SLAM community, but with the aim of being an out-of-the-box SLAM solution for researchers in other fields.

...read moreread less

2,857 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse