Author

Davi Geiger

Bio: Davi Geiger is an academic researcher at New York University. His research spans topics including image segmentation and Markov random fields. He has an h-index of 31 and has co-authored 114 publications that have received 4,372 citations. His previous affiliations include Siemens and Case Western Reserve University.


Papers
Journal ArticleDOI
TL;DR: Explores the information provided by the user's selected points and applies an optimal, dynamic-programming (DP) based method to detect contours and segment the image; the method is exact, non-iterative, and applies to a wide variety of shapes.
Abstract: The problem of segmenting an image into separate regions and tracking them over time is one of the most significant problems in vision. Terzopoulos et al. (1987) proposed an approach to detect the contour regions of complex shapes, assuming a user selected initial contour not very far from the desired solution. We propose to further explore the information provided by the user's selected points and apply an optimal method to detect contours which allows a segmentation of the image. The method is based on dynamic programming (DP), and applies to a wide variety of shapes. It is exact and not iterative. We also consider a multiscale approach capable of speeding up the algorithm by a factor of 20, although at the expense of losing the guaranteed optimality characteristic. The problem of tracking and matching these contours is addressed. For tracking, the final contour obtained at one frame is sampled and used as initial points for the next frame. Then, the same DP process is applied. For matching, a novel strategy is proposed where the solution is a smooth displacement field in which unmatched regions are allowed while cross vectors are not. The algorithm is again based on DP and the optimal solution is guaranteed. We have demonstrated the algorithms on natural objects in a large spectrum of applications, including interactive segmentation and automatic tracking of the regions of interest in medical images.

512 citations
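
The core of the approach is a shortest-path search over candidate contour positions. The following is a minimal sketch of that dynamic-programming idea, not the paper's exact formulation: it assumes each contour sample point has a set of candidate positions (e.g. offsets along the normal of the user-initialized contour), a per-candidate image cost in `unary`, and a quadratic smoothness penalty between neighboring samples.

```python
import numpy as np

def dp_contour(unary, pairwise_weight):
    """Exact, non-iterative DP over a chain: minimize
    sum_i unary[i, s_i] + w * (s_i - s_{i-1})**2, where stage i is the
    i-th contour sample and state s_i is its candidate position."""
    n_stages, n_states = unary.shape
    states = np.arange(n_states)
    smooth = pairwise_weight * (states[:, None] - states[None, :]) ** 2
    cost = unary[0].astype(float)                 # best cost to reach each state
    backptr = np.zeros((n_stages, n_states), dtype=int)
    for i in range(1, n_stages):
        trans = cost[None, :] + smooth            # trans[j, k]: reach j from k
        backptr[i] = np.argmin(trans, axis=1)
        cost = trans[states, backptr[i]] + unary[i]
    path = np.empty(n_stages, dtype=int)          # backtrack the optimum
    path[-1] = int(np.argmin(cost))
    for i in range(n_stages - 1, 0, -1):
        path[i - 1] = backptr[i, path[i]]
    return path

# toy usage: 50 samples, 21 candidate offsets, cost = -edge strength
rng = np.random.default_rng(0)
edge_strength = rng.random((50, 21))
print(dp_contour(-edge_strength, pairwise_weight=0.1)[:5])
```

Because every stage is solved exactly, the returned path is the global optimum of this energy, matching the abstract's "exact and not iterative" claim; a multiscale variant would run the same routine on coarsened candidate grids.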

Journal ArticleDOI
TL;DR: Deterministic approximations to Markov random field (MRF) models are derived, and one of the models is shown to yield, in a natural way, the graduated nonconvexity (GNC) algorithm proposed by A. Blake and A. Zisserman (1987).
Abstract: Deterministic approximations to Markov random field (MRF) models are derived. One of the models is shown to yield, in a natural way, the graduated nonconvexity (GNC) algorithm proposed by A. Blake and A. Zisserman (1987). This model can be applied to smooth a field while preserving its discontinuities. A class of more complex models is then proposed in order to deal with a variety of vision problems. All the theoretical results are obtained in the framework of statistical mechanics and mean field techniques. A parallel, iterative algorithm to solve the deterministic equations of the two models is presented, together with some experiments on synthetic and real images.

486 citations
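
To make the mean-field machinery concrete, here is a minimal 1-D sketch under stated assumptions (a weak-membrane energy with binary line variables, Jacobi updates); it illustrates the idea, not the paper's parallel algorithm or exact model. Averaging the line process at inverse temperature `beta` gives a sigmoid "mean line field" that switches smoothing off across large jumps; annealing `beta` upward recovers a GNC-like continuation.

```python
import numpy as np

def mean_field_smooth(d, lam=2.0, alpha=0.5, beta=10.0, iters=200):
    """Discontinuity-preserving smoothing of a 1-D signal d under
    E(f, l) = sum_i (f_i - d_i)^2
            + sum_i [lam * (1 - l_i) * (f_{i+1} - f_i)^2 + alpha * l_i],
    with the binary line variables l_i replaced by their mean-field
    average at inverse temperature beta."""
    f = d.astype(float)
    for _ in range(iters):
        jump = np.diff(f) ** 2
        l = 1.0 / (1.0 + np.exp(-beta * (lam * jump - alpha)))  # mean line field
        w = lam * (1.0 - l)                  # effective coupling between neighbors
        wl = np.concatenate(([0.0], w))      # coupling to the left neighbor
        wr = np.concatenate((w, [0.0]))      # coupling to the right neighbor
        fl = np.concatenate(([f[0]], f[:-1]))
        fr = np.concatenate((f[1:], [f[-1]]))
        f = (d + wl * fl + wr * fr) / (1.0 + wl + wr)  # one Jacobi sweep
    return f

# toy usage: a noisy step edge stays sharp while flat parts are smoothed
x = np.concatenate((np.zeros(50), np.ones(50)))
x += 0.1 * np.random.default_rng(1).standard_normal(100)
print(mean_field_smooth(x)[48:52].round(2))
```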

Journal ArticleDOI
TL;DR: A Bayesian theory for stereo that incorporates occlusion is described, using adaptive windows and a weak smoothness prior; occlusions are shown to help stereo computation by providing cues for depth discontinuities.
Abstract: Binocular stereo is the process of obtaining depth information from a pair of left and right cameras. In the past occlusions have been regions where stereo algorithms have failed. We show that, on the contrary, they can help stereo computation by providing cues for depth discontinuities.

280 citations

Journal ArticleDOI
TL;DR: This paper identifies a number of possibly desirable properties of a shape similarity method, and determines the extent to which these properties can be captured by approaches that compare local properties of the contours of the shapes, through elastic matching.

236 citations
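
As one way to make "elastic matching of local contour properties" concrete, the following is a hedged sketch: it assumes each shape is reduced to a sequence of local descriptors (e.g. curvature samples along the contour), and the alignment-with-stretching recursion is the standard dynamic-programming one, not necessarily the exact formulation evaluated in the paper.

```python
import numpy as np

def elastic_match_cost(a, b):
    """Elastic (DP) matching of two descriptor sequences a, b:
    D[i, j] is the cheapest alignment of a[:i] with b[:j], where each
    step may match one element to one element or stretch either
    sequence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])        # local dissimilarity
            D[i, j] = local + min(D[i - 1, j - 1],  # match
                                  D[i - 1, j],      # stretch b
                                  D[i, j - 1])      # stretch a
    return D[n, m]

# toy usage: similar curvature profiles should score lower than dissimilar ones
s1 = np.sin(np.linspace(0, 2 * np.pi, 40))
s2 = np.sin(np.linspace(0, 2 * np.pi, 50))          # same shape, resampled
s3 = np.cos(np.linspace(0, 4 * np.pi, 40))
print(elastic_match_cost(s1, s2) < elastic_match_cost(s1, s3))  # expected: True
```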

Book ChapterDOI
02 Jun 1998
TL;DR: A new approach to compute the disparity map by solving a global optimization problem that models occlusions, discontinuities, and epipolar-line interactions is presented.
Abstract: Binocular stereo is the process of obtaining depth information from a pair of left and right views of a scene. We present a new approach to compute the disparity map by solving a global optimization problem that models occlusions, discontinuities, and epipolar-line interactions.

224 citations
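
A compact way to see how occlusions, discontinuities, and the epipolar-line structure enter a global optimization is a scanline DP with explicit occlusion moves. The sketch below is illustrative, with an assumed absolute-difference match cost and a constant occlusion penalty `occ`; it is not the paper's exact energy.

```python
import numpy as np

def scanline_stereo(left, right, occ=2.0):
    """Match one epipolar line pair by DP. States (i, j) pair left
    pixel i with right pixel j; moves are match, left-occlusion, or
    right-occlusion. Returns a disparity per left pixel, -1 where the
    pixel is marked occluded."""
    n, m = len(left), len(right)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = occ * np.arange(m + 1)       # leading right-only pixels
    D[:, 0] = occ * np.arange(n + 1)       # leading left-only pixels
    move = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cands = (D[i-1, j-1] + abs(float(left[i-1]) - float(right[j-1])),
                     D[i-1, j] + occ,      # left pixel occluded
                     D[i, j-1] + occ)      # right pixel occluded
            move[i, j] = int(np.argmin(cands))
            D[i, j] = cands[move[i, j]]
    disp = np.full(n, -1)
    i, j = n, m
    while i > 0 and j > 0:                 # backtrack the optimal path
        if move[i, j] == 0:
            disp[i-1] = (i-1) - (j-1)
            i, j = i-1, j-1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return disp
```

On rectified images this runs once per scanline; modeling inter-scanline interactions, as the abstract describes, would require coupling these per-line problems.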


Cited by
Book
01 Jan 1998
TL;DR: A textbook treatment of wavelet analysis, from Fourier methods and time-frequency analysis to wavelet bases, wavelet packets and local cosine bases, approximation, estimation, and transform coding.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Journal ArticleDOI
TL;DR: This work addresses the task of semantic image segmentation with Deep Learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets a new state of the art on the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.

11,856 citations
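
Since this entry and the preprint version below both hinge on atrous convolution and ASPP, here is a minimal numpy sketch of the two operations under simplifying assumptions (single channel, 'same' padding, sum fusion instead of the network's learned fusion); it illustrates the mechanism, not the DeepLab implementation.

```python
import numpy as np

def atrous_conv2d(x, k, rate):
    """Single-channel atrous (dilated) correlation with 'same' padding:
    the kernel k is applied with rate-1 holes between taps, enlarging
    the field of view with no extra parameters or computation."""
    kh, kw = k.shape
    ph, pw = (kh - 1) * rate // 2, (kw - 1) * rate // 2
    xp = np.pad(x.astype(float), ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for di in range(kh):
        for dj in range(kw):
            out += k[di, dj] * xp[di * rate : di * rate + x.shape[0],
                                  dj * rate : dj * rate + x.shape[1]]
    return out

def aspp(x, kernels, rates):
    """ASPP sketch: parallel atrous branches at several sampling rates,
    fused here by summation."""
    return sum(atrous_conv2d(x, k, r) for k, r in zip(kernels, rates))

# toy usage: one 3x3 kernel reused at four rates
x = np.random.default_rng(0).random((64, 64))
k = np.ones((3, 3)) / 9.0
y = aspp(x, [k] * 4, rates=(6, 12, 18, 24))
print(y.shape)  # (64, 64)
```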

Posted Content
TL;DR: DeepLab, as discussed by the authors, proposes atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales by probing an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets a new state of the art on the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.

10,120 citations

Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

9,658 citations
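
The classification rule is simple once the sparse code is in hand. Below is a hedged sketch of the sparse-representation-classification idea: ISTA (proximal gradient) stands in for whichever ℓ1 solver is used, and the dictionary columns are assumed to be training faces tagged with class labels.

```python
import numpy as np

def ista_l1(A, y, lam=0.01, iters=500):
    """Solve min_x 0.5*||Ax - y||^2 + lam*||x||_1 by ISTA: a gradient
    step on the quadratic term followed by soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * (A.T @ (A @ x - y))
        x = np.sign(x) * np.maximum(np.abs(x) - lam * step, 0.0)
    return x

def src_classify(A, labels, y, lam=0.01):
    """Sparse-representation classification: code y over the training
    dictionary A (one column per training image), then pick the class
    whose coefficients reconstruct y with the smallest residual."""
    x = ista_l1(A, y, lam)
    classes = np.unique(labels)
    resid = [np.linalg.norm(y - A[:, labels == c] @ x[labels == c])
             for c in classes]
    return classes[int(np.argmin(resid))]
```

One standard way to realize the occlusion handling the abstract describes is to augment A with an identity block, so gross but sparse pixel errors are absorbed into extra coefficients rather than corrupting the class residuals.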

Journal ArticleDOI
TL;DR: This paper presents a taxonomy of dense, two-frame stereo methods together with a stand-alone, flexible C++ implementation that enables the evaluation of individual components and can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations
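
In the taxonomy's terms, even the simplest local method decomposes into matching cost, aggregation, and disparity optimization. A minimal winner-take-all baseline in that mold is sketched below, assuming rectified single-channel images; it illustrates the pipeline, not the paper's C++ platform.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def block_match(left, right, max_disp=16, win=7):
    """Local stereo: squared-difference matching cost, box-window
    aggregation, winner-take-all disparity selection."""
    h, w = left.shape
    big = 1e9                                      # sentinel for invalid shifts
    costs = np.full((max_disp + 1, h, w), big)
    for d in range(max_disp + 1):
        sd = np.full((h, w), big)
        sd[:, d:] = (left[:, d:].astype(float) - right[:, :w - d]) ** 2
        costs[d] = uniform_filter(sd, size=win)    # aggregate over win x win
    return np.argmin(costs, axis=0)                # per-pixel disparity

# toy usage: a shifted copy should come back as a constant disparity
rng = np.random.default_rng(0)
right = rng.random((60, 80))
left = np.roll(right, 5, axis=1)                   # left view shifted by 5
print(np.bincount(block_match(left, right).ravel()).argmax())  # expect 5
```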