# Sandeep Sen

Other affiliations: University of North Carolina at Chapel Hill, Duke University, Shiv Nadar University

Bio: Sandeep Sen is an academic researcher from the Indian Institute of Technology Delhi. The author has contributed to research on parallel algorithms and randomized algorithms. The author has an h-index of 30 and has co-authored 123 publications, which have received 2,719 citations. Previous affiliations of Sandeep Sen include the University of North Carolina at Chapel Hill and Duke University.

##### Papers

17 Oct 2004

TL;DR: This work presents the first linear time (1 + ε)-approximation algorithm for the k-means problem for fixed k and ε, which runs in O(nd) time.

Abstract: We present the first linear time (1 + ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the input. Another feature of our algorithm is its simplicity: the only technique involved is random sampling.
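The abstract credits random sampling as the only technique involved. The sketch below is not the paper's algorithm; it only illustrates a standard fact that sampling-based clustering relies on: the mean of m points drawn uniformly with replacement is, in expectation, a single center whose cost is (1 + 1/m) times the optimal 1-means cost. The function names, the sample size, and the synthetic data are assumptions made for illustration.

```python
import random


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    d = len(points[0])
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(d))


def one_means_cost(points, center):
    """Sum of squared Euclidean distances from every point to `center`."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, center)) for p in points)


def sampled_center(points, m, rng):
    """Mean of m points drawn uniformly at random with replacement."""
    sample = [rng.choice(points) for _ in range(m)]
    return centroid(sample)


# Hypothetical data: 2000 Gaussian points in the plane.
rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]

optimal_cost = one_means_cost(points, centroid(points))
sample_cost = one_means_cost(points, sampled_center(points, 50, rng))
ratio = sample_cost / optimal_cost  # never below 1; near 1 for moderate m
```

The sampled center is computed from 50 points rather than all 2000, which is the kind of saving that makes a running time linear in the input size plausible when k and ε are fixed.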

263 citations

TL;DR: The size of the t-spanner computed essentially matches the worst case lower bound implied by a 43-year-old girth lower bound conjecture made independently by Erdős, Bollobás, and Bondy & Simonovits.

Abstract: Let G = (V, E) be an undirected weighted graph on |V| = n vertices and |E| = m edges. A t-spanner of the graph G, for any t ≥ 1, is a subgraph (V, E_S), E_S ⊆ E, such that the distance between any pair of vertices in the subgraph is at most t times the distance between them in the graph G. Computing a t-spanner of minimum size (number of edges) has been a widely studied and well-motivated problem in computer science. In this paper we present the first linear time randomized algorithm that computes a t-spanner of a given weighted graph. Moreover, the size of the t-spanner computed essentially matches the worst case lower bound implied by a 43-year-old girth lower bound conjecture made independently by Erdős, Bollobás, and Bondy & Simonovits.
Our algorithm uses a novel clustering approach that avoids any distance computation altogether. This feature is somewhat surprising since all the previously existing algorithms employ computation of some sort of local or global distance information, which involves growing either breadth first search trees up to Θ(t) levels or full shortest path trees on a large fraction of vertices. The truly local approach of our algorithm also leads to equally simple and efficient algorithms for computing spanners in other important computational environments like distributed, parallel, and external memory. © 2006 Wiley Periodicals, Inc. Random Struct. Alg., 2007
Preliminary version of this work appeared in the 30th International Colloquium on Automata, Languages and Programming, pages 384–396, 2003.
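To make the t-spanner definition in the abstract concrete, here is a minimal sketch that checks whether a given edge subset is a t-spanner. It does not construct a spanner and is not the paper's method: it runs Dijkstra from every vertex, which is exactly the kind of distance computation the paper's algorithm avoids. The function names and the triangle example are assumptions made for illustration.

```python
import heapq


def shortest_distances(n, adj, source):
    """Dijkstra from `source`; adj[u] is a list of (v, weight) pairs."""
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def build_adjacency(n, edges):
    """Undirected adjacency lists from (u, v, weight) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def is_t_spanner(n, edges, spanner_edges, t):
    """True if every pairwise distance in the subgraph is at most t times
    the corresponding distance in the full graph."""
    full = build_adjacency(n, edges)
    sub = build_adjacency(n, spanner_edges)
    for s in range(n):
        d_full = shortest_distances(n, full, s)
        d_sub = shortest_distances(n, sub, s)
        for v in range(n):
            if d_full[v] < float("inf") and d_sub[v] > t * d_full[v] + 1e-12:
                return False
    return True


# Unit-weight triangle: dropping one edge stretches that pair from 1 to 2.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
path = [(0, 1, 1.0), (1, 2, 1.0)]
```

With these inputs, `is_t_spanner(3, triangle, path, 2)` holds while `is_t_spanner(3, triangle, path, 1.5)` does not, since vertices 0 and 2 are at distance 2 in the path but distance 1 in the triangle.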

159 citations

TL;DR: This work presents a general approach for designing approximation algorithms for a fundamental class of geometric clustering problems in arbitrary dimensions and leads to simple randomized algorithms for the k-means, k-median and discrete k-means problems.

Abstract: We present a general approach for designing approximation algorithms for a fundamental class of geometric clustering problems in arbitrary dimensions. More specifically, our approach leads to simple randomized algorithms for the k-means, k-median and discrete k-means problems that yield (1 + ε)-approximations with probability ≥ 1/2 and running times of O(2^{(k/ε)^{O(1)}} dn). These are the first algorithms for these problems whose running times are linear in the size of the input (nd for n points in d dimensions) assuming k and ε are fixed. Our method is general enough to be applicable to clustering problems satisfying certain simple properties and is likely to have further applications.

153 citations

TL;DR: This article shows that one can actually construct approximate distance oracles in expected O(n^2) time if the graph is unweighted, and leads to the first expected linear-time algorithm for computing an optimal size (2, 1)-spanner of an unweighted graph.

Abstract: Let G = (V, E) be an undirected graph on n vertices, and let δ(u, v) denote the distance in G between two vertices u and v. Thorup and Zwick showed that for any positive integer t, the graph G can be preprocessed to build a data structure that can efficiently report t-approximate distance between any pair of vertices. That is, for any u, v ∈ V, the distance reported is at least δ(u, v) and at most tδ(u, v). The remarkable feature of this data structure is that, for t ≥ 3, it occupies subquadratic space, that is, it does not store all-pairs distances explicitly, and still it can answer any t-approximate distance query in constant time. They named the data structure “approximate distance oracle” because of this feature. Furthermore, the trade-off between the stretch t and the size of the data structure is essentially optimal. In this article, we show that we can actually construct approximate distance oracles in expected O(n^2) time if the graph is unweighted. One of the new ideas used in the improved algorithm also leads to the first expected linear-time algorithm for computing an optimal size (2, 1)-spanner of an unweighted graph. A (2, 1)-spanner of an undirected unweighted graph G = (V, E) is a subgraph (V, E_S), E_S ⊆ E, such that for any two vertices u and v in the graph, their distance in the subgraph is at most 2δ(u, v) + 1.

101 citations

Duke University

TL;DR: This paper presents an algorithm for hidden surface removal for a class of polyhedral surfaces that, like terrain maps, can be ordered relatively quickly, and presents a parallel algorithm based on a similar approach.

Abstract: In this paper we present an algorithm for hidden surface removal for a class of polyhedral surfaces which have a property that they can be ordered relatively quickly, like the terrain maps. A distinguishing feature of this algorithm is that its running time is sensitive to the actual size of the visible image rather than the total number of intersections in the image plane, which can be much larger than the visible image. The time complexity of this algorithm is O((k + n) log n log log n), where n and k are respectively the input and the output sizes. Thus, in a significant number of situations this will be faster than the worst case optimal algorithms, which have running time O(n^2) irrespective of the output size (whereas the output size k is O(n^2) only in the worst case). We also present a parallel algorithm based on a similar approach which runs in time O(log^4(n + k)) using O((n + k)/log(n + k)) processors in a CREW PRAM model. All our bounds are obtained using amortized analysis.

98 citations

##### Cited by

TL;DR: A new superpixel algorithm is introduced, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels and is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

Abstract: Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort to understand the benefits and drawbacks of existing methods, we empirically compare five state-of-the-art superpixel algorithms for their ability to adhere to image boundaries, speed, memory efficiency, and their impact on segmentation performance. We then introduce a new superpixel algorithm, simple linear iterative clustering (SLIC), which adapts a k-means clustering approach to efficiently generate superpixels. Despite its simplicity, SLIC adheres to boundaries as well as or better than previous methods. At the same time, it is faster and more memory efficient, improves segmentation performance, and is straightforward to extend to supervoxel generation.

7,849 citations

07 Jan 2007

TL;DR: By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(log k)-competitive with the optimal clustering.

Abstract: The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
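The abstract does not spell out the seeding rule. Assuming it refers to the widely known D² (k-means++) seeding, a minimal sketch could look like the following: the first center is chosen uniformly at random, and each later center is chosen with probability proportional to its squared distance from the nearest center picked so far. The function names, the uniform fallback for degenerate inputs, and the toy data are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_seeding(points, k, rng):
    """Pick k initial centers: the first uniformly at random, each later one
    with probability proportional to its squared distance to the nearest
    center chosen so far."""
    centers = [rng.choice(points)]
    # nearest[i] = squared distance from points[i] to its closest center
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0:
            # Every point coincides with a center; fall back to uniform.
            new_center = rng.choice(points)
        else:
            new_center = rng.choices(points, weights=nearest, k=1)[0]
        centers.append(new_center)
        nearest = [min(d, squared_distance(p, new_center))
                   for d, p in zip(nearest, points)]
    return centers


# Two tight groups: the second seed must come from the other group,
# because points at distance zero from the first seed have zero weight.
rng = random.Random(1)
points = [(0.0, 0.0)] * 5 + [(10.0, 10.0)] * 5
seeds = d2_seeding(points, 2, rng)
```

The seeds returned here would then be handed to an ordinary k-means iteration as its starting centers.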

7,539 citations

TL;DR: Two algorithms for the approximate nearest neighbor problem in high dimensional spaces for data sets of size n living in ℝ^d are presented, achieving query times that are sub-linear in n and polynomial in d.

Abstract: We present two algorithms for the approximate nearest neighbor problem in high dimensional spaces. For data sets of size n living in ℝ^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the approximate minimum spanning tree.

1,182 citations

Bell Labs

TL;DR: Asymptotically tight bounds for a combinatorial quantity of interest in discrete and computational geometry, related to halfspace partitions of point sets, are given.

Abstract: Random sampling is used for several new geometric algorithms. The algorithms are “Las Vegas,” and their expected bounds are with respect to the random behavior of the algorithms. One algorithm reports all the intersecting pairs of a set of line segments in the plane, and requires O(A + n log n) expected time, where A is the size of the answer, the number of intersecting pairs reported. The algorithm requires O(n) space in the worst case. Another algorithm computes the convex hull of a point set in E^3 in O(n log A) expected time, where n is the number of points and A is the number of points on the surface of the hull. A simple Las Vegas algorithm triangulates simple polygons in O(n log log n) expected time. Algorithms for half-space range reporting are also given. In addition, this paper gives asymptotically tight bounds for a combinatorial quantity of interest in discrete and computational geometry, related to halfspace partitions of point sets.

1,163 citations