Author

Kasturi Varadarajan

Other affiliations: Rutgers University, Duke University
Bio: Kasturi Varadarajan is an academic researcher at the University of Iowa. He has contributed to research in topics including approximation algorithms and metric spaces. He has an h-index of 37 and has co-authored 117 publications receiving 4,454 citations. His previous affiliations include Rutgers University and Duke University.


Papers
Book Chapter
18 Feb 2009
TL;DR: It is shown that this well-known problem is NP-hard even for instances in the plane, answering an open question posed by Dasgupta [6].
Abstract: In the k-means problem, we are given a finite set S of points in $\mathbb{R}^m$ and an integer k ≥ 1, and we want to find k points (centers) so as to minimize the sum of the squared Euclidean distances from each point in S to its nearest center. We show that this well-known problem is NP-hard even for instances in the plane, answering an open question posed by Dasgupta [6].
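To make the objective concrete, here is a minimal Python sketch (using NumPy) that evaluates the k-means cost of a candidate set of centers; the data and center values are illustrative and not taken from the paper.

```python
import numpy as np

def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    # Pairwise squared distances: shape (n_points, n_centers).
    diffs = points[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Each point contributes the squared distance to its closest center.
    return float(np.sum(np.min(sq_dists, axis=1)))

# Illustrative planar instance (the hardness result concerns exactly such 2-D inputs).
S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
centers = np.array([[0.3, 0.3], [5.0, 5.0]])
print(kmeans_cost(S, centers))
```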

494 citations

Journal Article
TL;DR: The specific applications of the technique include ϵ-approximation algorithms for computing diameter, width, and smallest bounding box, ball, and cylinder of P, and maintaining all the previous measures for a set of moving points.
Abstract: We present a general technique for approximating various descriptors of the extent of a set P of n points in $\mathbb{R}^d$ when the dimension d is an arbitrary fixed constant. For a given extent measure μ and a parameter ε > 0, it computes in time $O(n + 1/\varepsilon^{O(1)})$ a subset Q ⊆ P of size $1/\varepsilon^{O(1)}$, with the property that (1 − ε)μ(P) ≤ μ(Q) ≤ μ(P). The specific applications of our technique include ε-approximation algorithms for (i) computing the diameter, width, and smallest bounding box, ball, and cylinder of P, (ii) maintaining all the previous measures for a set of moving points, and (iii) fitting spheres and cylinders through a point set P. Our algorithms are considerably simpler, and faster in many cases, than previously known algorithms.
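As an illustration of this kind of extent approximation, the following Python sketch keeps the extreme points of P along a small set of directions and reports the diameter of that subset. This directional-sampling construction is one standard way to obtain a (1 − ε)-approximate diameter in the plane; it is offered only as a hedged example, not as the paper's exact algorithm.

```python
import numpy as np

def diameter(points):
    """Exact diameter by brute force (vectorized; fine for moderate n or a small coreset)."""
    diffs = points[:, None, :] - points[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=2).max()))

def directional_coreset(points, eps):
    """Keep the extreme points of P along O(1/sqrt(eps)) planar directions.

    If the diameter is realized along direction u, some sampled direction is within
    a small angle theta of u (or of -u), and the two extremes along it are at least
    cos(theta) >= 1 - eps times the diameter apart.
    """
    k = int(np.ceil(np.pi / np.sqrt(2.0 * eps)))
    angles = np.pi * np.arange(k) / k                  # directions spread over a half-circle
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    proj = points @ dirs.T                             # projections, shape (n, k)
    idx = np.unique(np.concatenate([proj.argmax(axis=0), proj.argmin(axis=0)]))
    return points[idx]

rng = np.random.default_rng(0)
P = rng.normal(size=(2000, 2))
Q = directional_coreset(P, eps=0.01)
print(len(Q), diameter(Q), "vs", diameter(P))
```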

400 citations

01 Jan 2007
TL;DR: The paradigm of coresets has recently emerged as a powerful tool for efficiently approximating various extent measures of a point set P and has been successfully applied to various optimization and extent measure problems.
Abstract: The paradigm of coresets has recently emerged as a powerful tool for efficiently approximating various extent measures of a point set P. Using this paradigm, one quickly computes a small subset Q of P, called a coreset, that approximates the original set P, and then solves the problem on Q using a relatively inefficient algorithm. The solution for Q is then translated into an approximate solution for the original point set P. This paper describes the ways in which this paradigm has been successfully applied to various optimization and extent measure problems.
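The workflow described here can be summarized as a small generic routine. The Python sketch below is only an illustration of the paradigm (the function and parameter names are invented for the example), not an algorithm from the survey.

```python
def solve_with_coreset(points, build_coreset, expensive_solver):
    """Generic coreset paradigm: shrink the input, then run a slower solver on the small set.

    build_coreset(points)  -> small subset Q that approximately preserves the measure of interest
    expensive_solver(Q)    -> solution computed on Q, reused as an approximate answer for P
    """
    Q = build_coreset(points)      # fast: reduces n points to a small subset
    return expensive_solver(Q)     # the relatively inefficient algorithm is affordable on Q

# Hypothetical wiring, reusing the sketches shown earlier on this page:
# approx_diam = solve_with_coreset(P, lambda pts: directional_coreset(pts, 0.01), diameter)
```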

391 citations

Journal Article
TL;DR: It is shown that this well-known problem is NP-hard even for instances in the plane, answering an open question posed by Dasgupta (2007).

234 citations

Journal Article
TL;DR: It is shown that polynomial-time approximation algorithms with provable performance exist, under a certain general condition: that for a random subset $R\subset S$ and nondecreasing function f(·), there is a decomposition of the complement ${\Bbb U}\backslash\bigcup (R)$ into an expected at most f(|R|) regions, each region of a particular simple form.
Abstract: Given a collection S of subsets of some set ${\Bbb U},$ and ${\Bbb M}\subset{\Bbb U},$ the set cover problem is to find the smallest subcollection $C\subset S$ that covers ${\Bbb M},$ that is, ${\Bbb M} \subseteq \bigcup (C),$ where $\bigcup(C)$ denotes $\bigcup_{Y \in C} Y.$ We assume of course that S covers ${\Bbb M}.$ While the general problem is NP-hard to solve, even approximately, here we consider some geometric special cases, where usually ${\Bbb U} = {\Bbb R}^d.$ Combining previously known techniques [4], [5], we show that polynomial-time approximation algorithms with provable performance exist, under a certain general condition: that for a random subset $R\subset S$ and nondecreasing function f(·), there is a decomposition of the complement ${\Bbb U}\backslash\bigcup (R)$ into an expected at most f(|R|) regions, each region of a particular simple form. Under this condition, a cover of size O(f(|C|)) can be found in polynomial time. Using this result, and combinatorial geometry results implying bounding functions f(c) that are nearly linear, we obtain o(log c) approximation algorithms for covering by fat triangles, by pseudo-disks, by a family of fat objects, and others. Similarly, constant-factor approximations follow for similar-sized fat triangles and fat objects, and for fat wedges. With more work, we obtain constant-factor approximation algorithms for covering by unit cubes in ${\Bbb R}^3,$ and for guarding an x-monotone polygonal chain.
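For orientation, the following Python sketch implements the classical greedy set-cover heuristic, which achieves an O(log n) approximation in general. It is only a baseline: the paper's point is that the geometric structure described above allows strictly better (o(log c) or constant-factor) guarantees, which this greedy procedure does not provide.

```python
def greedy_set_cover(universe, sets):
    """Classical greedy heuristic: repeatedly pick the set covering the most uncovered elements.

    universe: set of elements to cover (the role of M above)
    sets:     dict mapping a set's name to its frozenset of elements (the collection S)
    Returns a list of chosen set names; O(log n)-approximate in general.
    """
    uncovered = set(universe)
    cover = []
    while uncovered:
        # Choose the set with maximum marginal coverage.
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        gained = sets[best] & uncovered
        if not gained:
            raise ValueError("the collection does not cover the universe")
        cover.append(best)
        uncovered -= gained
    return cover

# Tiny illustrative instance.
U = {1, 2, 3, 4, 5}
S = {"a": frozenset({1, 2, 3}), "b": frozenset({3, 4}),
     "c": frozenset({4, 5}), "d": frozenset({1, 5})}
print(greedy_set_cover(U, S))
```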

223 citations


Cited by
Journal Article
TL;DR: This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
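The two-stage scheme described here (random sampling to capture the range of the matrix, then a deterministic factorization of the compressed matrix) can be sketched in a few lines of NumPy. This is a simplified prototype of a basic randomized SVD, without the power iterations and refinements analyzed in the paper; the oversampling value is an illustrative choice.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Basic randomized SVD sketch: sample the range of A, compress, then factor deterministically."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    # Stage 1: random sampling to identify a subspace capturing most of A's action.
    Omega = rng.standard_normal((n, k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)            # orthonormal basis for the sampled range
    # Stage 2: compress A to that subspace and factor the small matrix deterministically.
    B = Q.T @ A                               # (k + oversample) x n
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]

# Quick check on a synthetic low-rank matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((500, 20)) @ rng.standard_normal((20, 300))
U, s, Vt = randomized_svd(A, k=20)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```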

3,248 citations

Posted Content
TL;DR: In this article, a modular framework for constructing randomized algorithms that compute partial matrix decompositions is presented. Random sampling is used to identify a subspace that captures most of the action of a matrix; the input matrix is then compressed to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization.
Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed---either explicitly or implicitly---to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis.

2,356 citations

Journal Article
TL;DR: Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.
Abstract: In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges. This article is an overview and survey of data stream algorithmics and is an updated version of [1].
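To give a concrete feel for the constraints mentioned here (one pass over the data, memory far smaller than the input), below is a Python sketch of the Misra-Gries frequent-items summary, a standard one-pass, small-space streaming algorithm. It is offered as a generic illustration of the model rather than as an algorithm singled out by this survey.

```python
def misra_gries(stream, k):
    """One-pass frequent-items summary using at most k - 1 counters.

    Every item with true frequency above len(stream) / k is guaranteed to appear
    among the returned candidates; reported counts underestimate true counts by
    at most len(stream) / k.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement every counter; drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

print(misra_gries(["a", "b", "a", "c", "a", "a", "d", "a", "b"], k=3))
```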

1,598 citations

Book
01 Jan 2005
TL;DR: In this paper, the authors present a survey of data streaming, covering its formal aspects, the basic mathematical ideas and algorithmic techniques that underlie it, streaming systems, and new directions.
Abstract: 1 Introduction 2 Map 3 The Data Stream Phenomenon 4 Data Streaming: Formal Aspects 5 Foundations: Basic Mathematical Ideas 6 Foundations: Basic Algorithmic Techniques 7 Foundations: Summary 8 Streaming Systems 9 New Directions 10 Historic Notes 11 Concluding Remarks Acknowledgements References

1,506 citations

Book
02 Jan 1991

1,377 citations