Home
/
Authors
/
Carlo Tomasi

Author

Carlo Tomasi

Other affiliations: Carnegie Mellon University, Cornell University, University of Padua ...read more

Bio: Carlo Tomasi is an academic researcher from Duke University. The author has contributed to research in topics: Motion estimation & Image retrieval. The author has an hindex of 55, co-authored 150 publications receiving 32233 citations. Previous affiliations of Carlo Tomasi include Carnegie Mellon University & Cornell University.

Topics: Motion estimation, Image retrieval, Motion field, Real image, Pixel ...read more

Papers published on a yearly basis

2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1987
1984
1982

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Bilateral filtering for gray and color images

[...]

Carlo Tomasi¹, Roberto Manduchi²•Institutions (2)

Stanford University¹, Apple Inc.²

04 Jan 1998

TL;DR: In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception.

...read moreread less

Abstract: Bilateral filtering smooths images while preserving edges, by means of a nonlinear combination of nearby image values. The method is noniterative, local, and simple. It combines gray levels or colors based on both their geometric closeness and their photometric similarity, and prefers near values to distant values in both domain and range. In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception. Also, in contrast with standard filtering, bilateral filtering produces no phantom colors along edges in color images, and reduces phantom colors where they appear in the original image.

...read moreread less

8,738 citations

Journal Article•DOI•

The Earth Mover's Distance as a Metric for Image Retrieval

[...]

Yossi Rubner¹, Carlo Tomasi¹, Leonidas J. Guibas¹•Institutions (1)

Stanford University¹

01 Nov 2000-International Journal of Computer Vision

TL;DR: This paper investigates the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval, and compares the retrieval performance of the EMD with that of other distances.

...read moreread less

Abstract: We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.

...read moreread less

4,593 citations

Journal Article•DOI•

Shape and motion from image streams under orthography: a factorization method

[...]

Carlo Tomasi¹, Takeo Kanade²•Institutions (2)

Cornell University¹, Carnegie Mellon University²

01 Nov 1992-International Journal of Computer Vision

TL;DR: In this paper, the singular value decomposition (SVDC) technique is used to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively, and two of the three translation components are computed in a preprocessing stage.

...read moreread less

Abstract: Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step. An image stream can be represented by the 2FxP measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.

...read moreread less

2,696 citations

Detection and Tracking of Point Features

[...]

Carlo Tomasi

01 Jan 1991

2,443 citations

Proceedings Article•DOI•

A metric for distributions with applications to image databases

[...]

Yossi Rubner¹, Carlo Tomasi¹, Leonidas J. Guibas¹•Institutions (1)

Stanford University¹

04 Jan 1998

TL;DR: This paper uses the Earth Mover's Distance to exhibit the structure of color-distribution and texture spaces by means of Multi-Dimensional Scaling displays, and proposes a novel approach to the problem of navigating through a collection of color images, which leads to a new paradigm for image database search.

...read moreread less

Abstract: We introduce a new distance between two distributions that we call the Earth Mover's Distance (EMD), which reflects the minimal amount of work that must be performed to transform one distribution into the other by moving "distribution mass" around. This is a special case of the transportation problem from linear optimization, for which efficient algorithms are available. The EMD also allows for partial matching. When used to compare distributions that have the same overall mass, the EMD is a true metric, and has easy-to-compute lower bounds. In this paper we focus on applications to image databases, especially color and texture. We use the EMD to exhibit the structure of color-distribution and texture spaces by means of Multi-Dimensional Scaling displays. We also propose a novel approach to the problem of navigating through a collection of color images, which leads to a new paradigm for image database search.

...read moreread less

1,828 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Mean shift: a robust approach toward feature space analysis

[...]

Dorin Comaniciu¹, Peter Meer¹•Institutions (1)

Princeton University¹

01 May 2002-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is proved the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density.

...read moreread less

Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and the robust M-estimators; of location is also established. Algorithms for two low-level vision tasks discontinuity-preserving smoothing and image segmentation - are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.

...read moreread less

11,727 citations

Proceedings Article•DOI•

Non-local Neural Networks

[...]

Xiaolong Wang¹, Ross Girshick¹, Abhinav Gupta², Kaiming He¹•Institutions (2)

Facebook¹, Carnegie Mellon University²

18 Jun 2018

TL;DR: In this article, the non-local operation computes the response at a position as a weighted sum of the features at all positions, which can be used to capture long-range dependencies.

...read moreread less

Abstract: Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method [4] in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our nonlocal models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code will be made available.

...read moreread less

8,059 citations

Journal Article•DOI•

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

[...]

Daniel Scharstein¹, Richard Szeliski², Ramin Zabih³•Institutions (3)

Middlebury College¹, Microsoft², Cornell University³

09 Dec 2001-International Journal of Computer Vision

TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.

...read moreread less

Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

...read moreread less

7,458 citations

Journal Article•DOI•

Fast approximate energy minimization via graph cuts

[...]

Yuri Boykov¹, Olga Veksler¹, Ramin Zabih²•Institutions (2)

Princeton University¹, Cornell University²

01 Nov 2001-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves that allow important cases of discontinuity preserving energies.

...read moreread less

Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

...read moreread less

7,413 citations

Proceedings Article•DOI•

A non-local algorithm for image denoising

[...]

Antoni Buades, Bartomeu Coll, Jean-Michel Morel¹•Institutions (1)

École normale supérieure de Cachan¹

20 Jun 2005

TL;DR: A new measure, the method noise, is proposed, to evaluate and compare the performance of digital image denoising methods, and a new algorithm, the nonlocal means (NL-means), based on a nonlocal averaging of all pixels in the image is proposed.

...read moreread less

Abstract: We propose a new measure, the method noise, to evaluate and compare the performance of digital image denoising methods. We first compute and analyze this method noise for a wide class of denoising algorithms, namely the local smoothing filters. Second, we propose a new algorithm, the nonlocal means (NL-means), based on a nonlocal averaging of all pixels in the image. Finally, we present some experiments comparing the NL-means algorithm and the local smoothing filters.

...read moreread less

6,804 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse