Author

Shahram Izadi

Other affiliations: PARC, Xerox, Microsoft
Bio: Shahram Izadi is an academic researcher from Google. His research focuses on augmented reality and depth maps. He has an h-index of 82 and has co-authored 304 publications receiving 25,952 citations. Previous affiliations of Shahram Izadi include PARC and Xerox.


Papers
Proceedings ArticleDOI
26 Oct 2011
TL;DR: A system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware, which fuses all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time.
Abstract: We present a system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware. We fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time. The current sensor pose is simultaneously obtained by tracking the live depth frame relative to the global model using a coarse-to-fine iterative closest point (ICP) algorithm, which uses all of the observed depth data available. We demonstrate the advantages of tracking against the growing full surface model compared with frame-to-frame tracking, obtaining tracking and mapping results in constant time within room sized scenes with limited drift and high accuracy. We also show both qualitative and quantitative results relating to various aspects of our tracking and mapping system. Modelling of natural scenes, in real-time with only commodity sensor and GPU hardware, promises an exciting step forward in augmented reality (AR), in particular, it allows dense surfaces to be reconstructed in real-time, with a level of detail and robustness beyond any solution yet presented using passive computer vision.
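To make the tracking step concrete, here is a minimal NumPy sketch of one linearized point-to-plane ICP update, the kind of solve a coarse-to-fine tracker of this sort iterates. The function name is illustrative, and correspondences are assumed to be given already (the system finds them against the global model rather than frame-to-frame):

```python
import numpy as np

def point_to_plane_icp_step(src_pts, dst_pts, dst_normals):
    """One linearized point-to-plane ICP update (small-angle approximation).

    src_pts:     (N, 3) points from the live depth frame, already matched
    dst_pts:     (N, 3) corresponding points on the global surface model
    dst_normals: (N, 3) surface normals at dst_pts
    Returns a 4x4 rigid transform moving src_pts toward dst_pts.
    """
    # Linearize R ~ I + [r]_x and solve the least-squares system
    # A x = b for x = (rx, ry, rz, tx, ty, tz).
    c = np.cross(src_pts, dst_normals)      # rotational part of the Jacobian
    A = np.hstack([c, dst_normals])         # (N, 6)
    b = np.sum(dst_normals * (dst_pts - src_pts), axis=1)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)

    rx, ry, rz, tx, ty, tz = x
    T = np.eye(4)
    T[:3, :3] = np.array([[1.0, -rz,  ry],
                          [ rz, 1.0, -rx],
                          [-ry,  rx, 1.0]])  # small-angle rotation
    T[:3, 3] = [tx, ty, tz]
    return T
```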

4,184 citations

Proceedings ArticleDOI
16 Oct 2011
TL;DR: Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction, enabling real-time multi-touch interactions anywhere.
Abstract: KinectFusion enables a user holding and moving a standard Kinect camera to rapidly create detailed 3D reconstructions of an indoor scene. Only the depth data from Kinect is used to track the 3D pose of the sensor and reconstruct, geometrically precise, 3D models of the physical scene in real-time. The capabilities of KinectFusion, as well as the novel GPU-based pipeline are described in full. Uses of the core system for low-cost handheld scanning, and geometry-aware augmented reality and physics-based interactions are shown. Novel extensions to the core GPU pipeline demonstrate object segmentation and user interaction directly in front of the sensor, without degrading camera tracking or reconstruction. These extensions are used to enable real-time multi-touch interactions anywhere, allowing any planar or non-planar reconstructed physical surface to be appropriated for touch.
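The reconstruction half of such a pipeline folds each registered depth map into the volume as a weighted running average of truncated signed distances, following the volumetric method of Curless and Levoy that this line of work builds on. A minimal CPU sketch, with array names and parameter values chosen for illustration:

```python
import numpy as np

def fuse_depth_into_tsdf(tsdf, weights, voxel_pts, depth, K, cam_pose,
                         trunc=0.03, max_weight=128):
    """Fold one registered depth map into a truncated signed distance
    volume, updating tsdf and weights in place.

    tsdf, weights: flat (V,) arrays of current SDF values and fusion weights
    voxel_pts:     (V, 3) world-space voxel centers
    depth:         (H, W) depth image in meters
    K:             3x3 camera intrinsics; cam_pose: 4x4 world-from-camera
    """
    # Transform voxel centers into the camera frame and project them.
    world_to_cam = np.linalg.inv(cam_pose)
    pts_cam = voxel_pts @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    z = pts_cam[:, 2]
    z_safe = np.where(z > 1e-6, z, 1.0)
    uv = pts_cam @ K.T
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)

    H, W = depth.shape
    valid = (z > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0

    # Signed distance along the ray, truncated to [-trunc, trunc];
    # voxels far behind the observed surface are left untouched.
    sdf = np.clip(d - z, -trunc, trunc)
    update = valid & (d - z >= -trunc)

    # Weighted running average of the signed distances.
    w_new = weights[update] + 1.0
    tsdf[update] = (tsdf[update] * weights[update] + sdf[update]) / w_new
    weights[update] = np.minimum(w_new, max_weight)
```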

2,373 citations

Journal ArticleDOI
01 Nov 2013
TL;DR: An online system for large- and fine-scale volumetric reconstruction based on a memory- and speed-efficient data structure that compresses space and allows real-time access and updates of implicit surface data, without the need for a regular or hierarchical grid data structure.
Abstract: Online 3D reconstruction is gaining newfound interest due to the availability of real-time consumer depth cameras. The basic problem takes live overlapping depth maps as input and incrementally fuses these into a single 3D model. This is challenging particularly when real-time performance is desired without trading quality or scale. We contribute an online system for large and fine scale volumetric reconstruction based on a memory and speed efficient data structure. Our system uses a simple spatial hashing scheme that compresses space, and allows for real-time access and updates of implicit surface data, without the need for a regular or hierarchical grid data structure. Surface data is only stored densely where measurements are observed. Additionally, data can be streamed efficiently in or out of the hash table, allowing for further scalability during sensor motion. We show interactive reconstructions of a variety of scenes, reconstructing both fine-grained details and large scale environments. We illustrate how all parts of our pipeline from depth map pre-processing, camera pose estimation, depth map fusion, and surface rendering are performed at real-time rates on commodity graphics hardware. We conclude with a comparison to current state-of-the-art online systems, illustrating improved performance and reconstruction quality.
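The spatial hashing idea can be sketched in a few lines: integer voxel-block coordinates are hashed with three large primes (the hash of Teschner et al., commonly used in this line of work), and only blocks touched by measurements are ever allocated. The bucket count, voxel size, and block size below are illustrative assumptions, not values prescribed by the abstract:

```python
from collections import defaultdict
import numpy as np

P1, P2, P3 = 73856093, 19349669, 83492791   # primes from Teschner et al. 2003
NUM_BUCKETS = 2 ** 21
VOXEL_SIZE, BLOCK_DIM = 0.004, 8            # illustrative values

def block_hash(bx, by, bz):
    """Hash integer voxel-block coordinates into a bucket index."""
    return ((bx * P1) ^ (by * P2) ^ (bz * P3)) % NUM_BUCKETS

# Each bucket chains (block_coord, dense TSDF chunk) pairs, so hash
# collisions are resolved and only observed blocks get allocated.
buckets = defaultdict(list)

def get_or_allocate_block(p):
    """Return the BLOCK_DIM^3 TSDF chunk covering world-space point p."""
    coord = tuple(int(c) for c in
                  np.floor(np.asarray(p) / (VOXEL_SIZE * BLOCK_DIM)))
    bucket = buckets[block_hash(*coord)]
    for stored_coord, chunk in bucket:
        if stored_coord == coord:
            return chunk
    chunk = np.full((BLOCK_DIM,) * 3, np.nan, dtype=np.float32)  # unobserved
    bucket.append((coord, chunk))
    return chunk
```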

940 citations

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work addresses the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image, and employs a regression forest capable of inferring an estimate of each pixel's correspondence to 3D points in the scene's world coordinate frame.
Abstract: We address the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image. Our approach employs a regression forest that is capable of inferring an estimate of each pixel's correspondence to 3D points in the scene's world coordinate frame. The forest uses only simple depth and RGB pixel comparison features, and does not require the computation of feature descriptors. The forest is trained to be capable of predicting correspondences at any pixel, so no interest point detectors are required. The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.
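A pose hypothesis in this kind of scheme is a rigid transform computed from a handful of forest-predicted 3D-3D correspondences (camera-space points from depth, scene coordinates from the forest), then scored by its inlier count. The forest itself is omitted here; a minimal sketch using the standard Kabsch/SVD solution, with the inlier threshold as an illustrative assumption:

```python
import numpy as np

def kabsch_pose(cam_pts, scene_pts):
    """Rigid (R, t) mapping camera-space points onto predicted scene
    coordinates, via SVD (the Kabsch / Procrustes solution).

    cam_pts, scene_pts: (N, 3) with N >= 3; a minimal RANSAC hypothesis
    would use N = 3 correspondences sampled from the forest's predictions.
    """
    mu_c, mu_s = cam_pts.mean(axis=0), scene_pts.mean(axis=0)
    H = (cam_pts - mu_c).T @ (scene_pts - mu_s)
    U, _, Vt = np.linalg.svd(H)
    # Correct the sign to avoid reflections.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_s - R @ mu_c
    return R, t

def count_inliers(R, t, cam_pts, scene_pts, thresh=0.05):
    """Score a pose hypothesis by correspondences within thresh meters."""
    residual = np.linalg.norm(cam_pts @ R.T + t - scene_pts, axis=1)
    return int(np.sum(residual < thresh))
```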

796 citations

Book ChapterDOI
17 Sep 2006
TL;DR: The results of this initial evaluation of the SenseCam are extremely promising; periodic review of images of events recorded by SenseCam results in significant recall of those events by the patient, which was previously impossible.
Abstract: This paper presents a novel ubiquitous computing device, the SenseCam, a sensor-augmented wearable stills camera. SenseCam is designed to capture a digital record of the wearer's day by recording a series of images and capturing a log of sensor data. We believe that reviewing this information will help the wearer recollect aspects of earlier experiences that have subsequently been forgotten, and thereby form a powerful retrospective memory aid. In this paper we review existing work on memory aids and conclude that there is scope for an improved device. We then report on the design of SenseCam in some detail for the first time. We explain the details of a first in-depth user study of this device, a 12-month clinical trial with a patient suffering from amnesia. The results of this initial evaluation are extremely promising; periodic review of images of events recorded by SenseCam results in significant recall of those events by the patient, which was previously impossible. We end the paper with a discussion of future work, including the application of SenseCam to a wider audience, such as those with neurodegenerative conditions like Alzheimer's disease.

753 citations


Cited by
Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
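The building block behind GoogLeNet is the Inception module: parallel 1x1, 3x3, and 5x5 convolutions plus a pooled branch, concatenated channel-wise, with 1x1 bottlenecks keeping the computational budget in check. A minimal PyTorch sketch (activations omitted for brevity; the channel counts are illustrative, not the published configuration):

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """One Inception block in the spirit of GoogLeNet."""

    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Sequential(                       # 1x1 bottleneck, then 3x3
            nn.Conv2d(in_ch, c3_red, kernel_size=1),
            nn.Conv2d(c3_red, c3, kernel_size=3, padding=1),
        )
        self.b5 = nn.Sequential(                       # 1x1 bottleneck, then 5x5
            nn.Conv2d(in_ch, c5_red, kernel_size=1),
            nn.Conv2d(c5_red, c5, kernel_size=5, padding=2),
        )
        self.bp = nn.Sequential(                       # pooled branch
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1),
        )

    def forward(self, x):
        # All branches preserve spatial size, so outputs concatenate cleanly.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

# e.g. a 192-channel feature map in, 64 + 128 + 32 + 32 = 256 channels out
block = InceptionModule(192, 64, 96, 128, 16, 32, 32)
out = block(torch.randn(1, 192, 28, 28))   # -> (1, 256, 28, 28)
```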

40,257 citations

Proceedings ArticleDOI
01 Jun 2016
TL;DR: This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.
Abstract: Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in the streets of 50 different cities. 5,000 of these images have high-quality pixel-level annotations; 20,000 additional images have coarse annotations to enable methods that leverage large volumes of weakly labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark.
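Benchmarks of this kind are typically scored by per-class intersection-over-union over dense pixel labels. A minimal sketch of that metric (the ignore-label value is an assumption, not taken from the abstract):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes, ignore_label=255):
    """Per-class intersection-over-union from dense label maps, the usual
    pixel-level metric for semantic labeling benchmarks.

    pred, gt: integer label arrays of identical shape.
    """
    mask = gt != ignore_label
    # Confusion matrix via bincount over combined (gt, pred) indices:
    # conf[i, j] counts pixels with ground truth i predicted as j.
    combined = num_classes * gt[mask].astype(np.int64) + pred[mask]
    conf = np.bincount(combined, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    return inter / np.maximum(union, 1)   # averaging this gives "mean IoU"
```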

7,547 citations

Journal ArticleDOI
TL;DR: As an example of how the current "war on terrorism" could generate a durable civic renewal, Putnam points to the burst in civic practices that occurred during and after World War II, which he says "permanently marked" the generation that lived through it and had a "terrific effect on American public life over the last half-century."
Abstract: The present historical moment may seem a particularly inopportune time to review Bowling Alone, Robert Putnam's latest exploration of civic decline in America. After all, the outpouring of volunteerism, solidarity, patriotism, and self-sacrifice displayed by Americans in the wake of the September 11 terrorist attacks appears to fly in the face of Putnam's central argument: that "social capital", defined as "social networks and the norms of reciprocity and trustworthiness that arise from them" (p. 19), has declined to dangerously low levels in America over the last three decades. However, Putnam is not fazed in the least by the recent effusion of solidarity. Quite the contrary, he sees in it the potential to "reverse what has been a 30- to 40-year steady decline in most measures of connectedness or community." As an example of how the current "war on terrorism" could generate a durable civic renewal, Putnam points to the burst in civic practices that occurred during and after World War II, which he says "permanently marked" the generation that lived through it and had a "terrific effect on American public life over the last half-century." If Americans can follow this example and channel their current civic …

5,309 citations
