Anupam Gupta

Other affiliations: Cincinnati Children's Hospital Medical Center, University of California, Berkeley, Cornell University ...

Bio: Anupam Gupta is an academic researcher from Carnegie Mellon University. The author has contributed to research on topics including approximation algorithms and the Steiner tree problem, has an h-index of 55, and has co-authored 314 publications receiving 11,295 citations. Previous affiliations of Anupam Gupta include Cincinnati Children's Hospital Medical Center and University of California, Berkeley.

Papers published on a yearly basis: [chart; years shown: 1995, 1999–2023]

Papers

Journal Article

An elementary proof of a theorem of Johnson and Lindenstrauss

Sanjoy Dasgupta¹, Anupam Gupta²

AT&T Labs¹, Bell Labs²

01 Jan 2003-Random Structures and Algorithms

TL;DR: A result of Johnson and Lindenstrauss shows that a set of n points in high-dimensional Euclidean space can be mapped into an O(log n/ε²)-dimensional Euclidean space such that the distance between any two points changes by only a factor of (1 ± ε).

Abstract: A result of Johnson and Lindenstrauss [13] shows that a set of n points in high-dimensional Euclidean space can be mapped into an O(log n/ε²)-dimensional Euclidean space such that the distance between any two points changes by only a factor of (1 ± ε). In this note, we prove this theorem using elementary probabilistic techniques.
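
The distance-preservation claim can be checked numerically. The sketch below is not the proof from the note: it assumes one common construction, a random Gaussian matrix scaled by 1/√k, and the helper names (`random_projection`, `distance_ratios`) and the sizes n, d, k are illustrative choices rather than anything taken from the paper.

```python
import math
import random

def random_projection(points, k, rng):
    """Map d-dimensional points to k dimensions with a Gaussian matrix scaled by 1/sqrt(k)."""
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]

def distance_ratios(original, projected):
    """Return projected/original Euclidean distance for every pair of points."""
    ratios = []
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            ratios.append(math.dist(projected[i], projected[j]) / math.dist(original[i], original[j]))
    return ratios

# Assumed toy setup: 10 random points in 500 dimensions, projected down to 200.
rng = random.Random(0)
n, d, k = 10, 500, 200
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
ratios = distance_ratios(points, random_projection(points, k, rng))
print(f"min ratio {min(ratios):.3f}, max ratio {max(ratios):.3f}")
```

Ratios close to 1 for every pair correspond to a small ε; increasing k should tighten the spread.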

1,036 citations

Proceedings Article

Bounded geometries, fractals, and low-distortion embeddings

Anupam Gupta¹, Robert Krauthgamer², James R. Lee²

Carnegie Mellon University¹, University of California, Berkeley²

11 Oct 2003

TL;DR: This work gives tight bounds for embedding doubling metrics, a robust class containing many families of metrics that occur in applied settings, into low-dimensional normed spaces. It considers both general doubling metrics and more restricted families such as those arising from trees, from graphs excluding a fixed minor, and from snowflaked metrics.

Abstract: The doubling constant of a metric space (X, d) is the smallest value λ such that every ball in X can be covered by λ balls of half the radius. The doubling dimension of X is then defined as dim(X) = log₂ λ. A metric (or sequence of metrics) is called doubling precisely when its doubling dimension is bounded. This is a robust class of metric spaces which contains many families of metrics that occur in applied settings. We give tight bounds for embedding doubling metrics into (low-dimensional) normed spaces. We consider both general doubling metrics, as well as more restricted families such as those arising from trees, from graphs excluding a fixed minor, and from snowflaked metrics. Our techniques include decomposition theorems for doubling metrics, and an analysis of a fractal in the plane according to T. J. Laakso (2002). Finally, we discuss some applications and point out a central open question regarding dimensionality reduction in L₂.

511 citations

Proceedings Article

Near-optimal sensor placements: maximizing information while minimizing communication cost

Andreas Krause¹, Carlos Guestrin¹, Anupam Gupta¹, Jon Kleinberg²

Carnegie Mellon University¹, Cornell University²

19 Apr 2006

TL;DR: A data-driven approach to measuring the predictive quality of a set of sensor locations, predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff is presented.

Abstract: When monitoring spatial phenomena with wireless sensor networks, selecting the best sensor placements is a fundamental task. Not only should the sensors be informative, but they should also be able to communicate efficiently. In this paper, we present a data-driven approach that addresses the three central aspects of this problem: measuring the predictive quality of a set of sensor locations (regardless of whether sensors were ever placed at these locations), predicting the communication cost involved with these placements, and designing an algorithm with provable quality guarantees that optimizes the NP-hard tradeoff. Specifically, we use data from a pilot deployment to build non-parametric probabilistic models called Gaussian Processes (GPs) both for the spatial phenomena of interest and for the spatial variability of link qualities, which allows us to estimate predictive power and communication cost of un-sensed locations. Surprisingly, uncertainty in the representation of link qualities plays an important role in estimating communication costs. Using these models, we present a novel, polynomial-time, data-driven algorithm, pSPIEL, which selects Sensor Placements at Informative and cost-Effective Locations. Our approach exploits two important properties of this problem: submodularity, formalizing the intuition that adding a node to a small deployment can help more than adding a node to a large deployment; and locality, under which nodes that are far from each other provide almost independent information. Exploiting these properties, we prove strong approximation guarantees for our pSPIEL approach. We also provide extensive experimental validation of this practical approach on several real-world placement problems, and build a complete system implementation on 46 Tmote Sky motes, demonstrating significant advantages over existing methods.
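
The submodularity property named in the abstract can be illustrated with a toy coverage objective. This is not pSPIEL and does not use Gaussian Processes: the sensor names and the cells they observe are made up for illustration, and `coverage` is a hypothetical stand-in for the paper's measure of predictive quality.

```python
def coverage(placement, covers):
    """Number of distinct cells observed by the sensors in `placement`."""
    covered = set()
    for sensor in placement:
        covered |= covers[sensor]
    return len(covered)

def marginal_gain(sensor, placement, covers):
    """Extra coverage obtained by adding `sensor` to `placement`."""
    return coverage(placement | {sensor}, covers) - coverage(placement, covers)

# Hypothetical sensors and the grid cells each one observes.
covers = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {2, 3, 4, 5},
}

small = {"a"}
large = {"a", "b"}
gain_small = marginal_gain("c", small, covers)  # c adds cells 4 and 5
gain_large = marginal_gain("c", large, covers)  # c adds only cell 5
print(gain_small, gain_large)
```

Adding sensor `c` to the smaller deployment gains two cells, while adding it to the larger deployment gains one, which is the diminishing-returns behavior the abstract describes.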

495 citations

Proceedings Article

Provisioning a virtual private network: a network design problem for multicommodity flow

Anupam Gupta¹, Jon Kleinberg¹, Amit Kumar¹, Rajeev Rastogi², Bülent Yener²

Cornell University¹, Bell Labs²

06 Jul 2001

TL;DR: This work establishes a relation between this collection of network design problems and a variant of the facility location problem introduced by Karger and Minkoff, and provides optimal and approximate algorithms for several variants of this problem, depending on whether the traffic matrix is required to be symmetric.

Abstract: Consider a setting in which a group of nodes, situated in a large underlying network, wishes to reserve bandwidth on which to support communication. Virtual private networks (VPNs) are services that support such a construct; rather than building a new physical network on the group of nodes that must be connected, bandwidth in the underlying network is reserved for communication within the group, forming a virtual “sub-network.” Provisioning a virtual private network over a set of terminals gives rise to the following general network design problem. We have bounds on the cumulative amount of traffic each terminal can send and receive; we must choose a path for each pair of terminals, and a bandwidth allocation for each edge of the network, so that any traffic matrix consistent with the given upper bounds can be feasibly routed. Thus, we are seeking to design a network that can support a continuum of possible traffic scenarios. We provide optimal and approximate algorithms for several variants of this problem, depending on whether the traffic matrix is required to be symmetric, and on whether the designed network is required to be a tree (a natural constraint in a number of basic applications). We also establish a relation between this collection of network design problems and a variant of the facility location problem introduced by Karger and Minkoff; we extend their results by providing a stronger approximation algorithm for this latter problem.

318 citations

Journal Article

Robust Submodular Observation Selection

Andreas Krause¹, H. Brendan McMahan, Carlos Guestrin, Anupam Gupta

Carnegie Mellon University¹

01 Jan 2008-Journal of Machine Learning Research

TL;DR: This paper presents the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for cases where the possible objective functions exhibit submodularity, an intuitive diminishing returns property, and proves that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms.

Abstract: In many applications, one has to actively select among a set of expensive observations before making an informed decision. For example, in environmental monitoring, we want to select locations to measure in order to most effectively predict spatial phenomena. Often, we want to select observations which are robust against a number of possible objective functions. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for cases where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We show how our algorithm can be extended to handle complex cost functions (incorporating non-unit observation cost or communication and path costs). We also show how the algorithm can be used to near-optimally trade off expected-case (e.g., the Mean Square Prediction Error in Gaussian Process regression) and worst-case (e.g., maximum predictive variance) performance. We show that many important machine learning problems fit our robust submodular observation selection formalism, and provide extensive empirical evaluation on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms. © 2008 Andreas Krause, H. Brendan McMahan, Carlos Guestrin and Anupam Gupta.

307 citations

Cited by

Journal Article

Soil Chemical Analysis

C. I. Rich

01 May 1958-Agronomy Journal

7,335 citations

Journal Article

Statistics for Spatial Data.

Andrew B. Lawson¹, Noel A Cressie

University of Dundee¹

01 Mar 1993-The Statistician

6,278 citations

On Two or Three Developments in Convex Analysis (Convex Analysisの二,三の進展について)

徹丸山

01 Feb 1977

5,933 citations

Book

The Algorithmic Foundations of Differential Privacy

Cynthia Dwork¹, Aaron Roth²

Microsoft¹, University of Pennsylvania²

11 Aug 2014

TL;DR: The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example.

Abstract: The problem of privacy-preserving data analysis has a long history spanning multiple disciplines. As electronic data about individuals becomes increasingly detailed, and as technology enables ever more powerful collection and curation of these data, the need increases for a robust, meaningful, and mathematically rigorous definition of privacy, together with a computationally rich class of algorithms that satisfy this definition. Differential Privacy is such a definition. After motivating and discussing the meaning of differential privacy, the preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example. A key point is that, by rethinking the computational goal, one can often obtain far better results than would be achieved by methodically replacing each step of a non-private computation with a differentially private implementation. Despite some astonishingly powerful computational results, there are still fundamental limitations, not just on what can be achieved with differential privacy but on what can be achieved with any method that protects against a complete breakdown in privacy. Virtually all the algorithms discussed herein maintain differential privacy against adversaries of arbitrary computational power. Certain algorithms are computationally intensive, others are efficient. Computational complexity for the adversary and the algorithm are both discussed. We then turn from fundamentals to applications other than query-release, discussing differentially private methods for mechanism design and machine learning. The vast majority of the literature on differentially private algorithms considers a single, static database that is subject to many analyses. Differential privacy in other models, including distributed databases and computations on data streams, is discussed. Finally, we note that this work is meant as a thorough introduction to the problems and techniques of differential privacy, but is not intended to be an exhaustive survey; there is by now a vast amount of work in differential privacy, and we can cover only a small portion of it.

5,190 citations

Journal Article

The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells

Cole Trapnell¹, Davide Cacchiarelli¹,², Jonna Grimsby², Prapti Pokharel², Shuqiang Li³, Michael A. Morse¹,², Niall J. Lennon², Kenneth J. Livak³, Tarjei S. Mikkelsen¹,², John L. Rinn¹,²,⁴

Harvard University¹, Broad Institute², Fluidigm Corporation³, Beth Israel Deaconess Medical Center⁴

23 Mar 2014-Nature Biotechnology

TL;DR: This paper describes Monocle, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points. Applied to the differentiation of primary human myoblasts, it revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation.

Abstract: Defining the transcriptional dynamics of a temporal process such as cell differentiation is challenging owing to the high variability in gene expression between individual cells. Time-series gene expression analyses of bulk cells have difficulty distinguishing early and late phases of a transcriptional cascade or identifying rare subpopulations of cells, and single-cell proteomic methods rely on a priori knowledge of key distinguishing markers. Here we describe Monocle, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points. Applied to the differentiation of primary human myoblasts, Monocle revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation. We validated some of these predicted regulators in a loss-of-function screen. Monocle can in principle be used to recover single-cell gene expression kinetics from a wide array of cellular processes, including differentiation, proliferation and oncogenic transformation.

4,119 citations
