Author

Mihai Patrascu

Other affiliations: AT&T, University of Twente, ASML Holding
Bio: Mihai Patrascu is an academic researcher from AT&T Labs. The author has contributed to research in the topics Upper and lower bounds and Hash function. The author has an h-index of 30 and has co-authored 72 publications receiving 3188 citations. Previous affiliations of Mihai Patrascu include AT&T and University of Twente.


Papers
Proceedings ArticleDOI
Mihai Patrascu
05 Jun 2010
TL;DR: This work describes a carefully chosen dynamic version of set disjointness (the "multiphase problem"), conjectures that it requires n^Ω(1) time per operation, and gives the first nonalgebraic reduction from 3SUM, which allows 3SUM-hardness results for combinatorial problems.
Abstract: We consider a number of dynamic problems with no known poly-logarithmic upper bounds, and show that they require n^Ω(1) time per operation, unless 3SUM has strongly subquadratic algorithms. Our result is modular: (1) We describe a carefully chosen dynamic version of set disjointness (the "multiphase problem"), and conjecture that it requires n^Ω(1) time per operation. All our lower bounds follow by easy reduction. (2) We reduce 3SUM to the multiphase problem. Ours is the first nonalgebraic reduction from 3SUM, and allows 3SUM-hardness results for combinatorial problems. For instance, it implies hardness of reporting all triangles in a graph. (3) It is plausible that an unconditional lower bound for the multiphase problem can be established via a number-on-forehead communication game.
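For orientation, 3SUM asks whether some three numbers in a set sum to zero, and "strongly subquadratic" is measured against the classic quadratic-time approach. The following is a minimal sketch of that textbook baseline (sort, then a two-pointer scan); it is not taken from the paper, and the function name `has_3sum` is an illustrative assumption.

```python
def has_3sum(values):
    """Return True if entries at three distinct positions sum to zero.

    Textbook O(n^2) baseline: sort once, then for each first element
    sweep two pointers inward over the remaining suffix.
    """
    a = sorted(values)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1   # sum too small: move the left pointer up
            else:
                hi -= 1   # sum too large: move the right pointer down
    return False


if __name__ == "__main__":
    print(has_3sum([-5, 1, 4, 10]))  # -5 + 1 + 4 == 0
    print(has_3sum([1, 2, 3]))
```

The conditional lower bounds in the abstract assume that no algorithm improves on this quadratic running time by a polynomial factor.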

286 citations

Proceedings ArticleDOI
Mihai Patrascu, Ryan Williams
17 Jan 2010
TL;DR: Reductions from the problem of determining the satisfiability of Boolean CNF formulas (CNF-SAT) to several natural algorithmic problems are described, showing that attaining any of the following bounds would improve the state of the art in algorithms for SAT.
Abstract: We describe reductions from the problem of determining the satisfiability of Boolean CNF formulas (CNF-SAT) to several natural algorithmic problems. We show that attaining any of the following bounds would improve the state of the art in algorithms for SAT: • an O(n^{k-ε}) algorithm for k-Dominating Set, for any k ≥ 3, • a (computationally efficient) protocol for 3-party set disjointness with o(m) bits of communication, • an n^{o(d)} algorithm for d-SUM, • an O(n^{2-ε}) algorithm for 2-SAT formulas with m = n^{1+o(1)} clauses, where two clauses may have unrestricted length, and • an O((n + m)^{k-ε}) algorithm for HornSat with k unrestricted length clauses. One may interpret our reductions as new attacks on the complexity of SAT, or sharp lower bounds conditional on exponential hardness of SAT.

263 citations

Proceedings ArticleDOI
13 Jun 2011
TL;DR: A randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n log n) plus the output size is given, which resolves two open problems: given a set of n axis-aligned rectangles in the plane, the authors can report all k enclosure pairs in O(n lg n + k) expected time; and given a set of n points in 4-d, they can find all maximal points (points not dominated by any other points) in O
Abstract: We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model. We present two data structures for 2-d orthogonal range emptiness. The first achieves O(n lg lg n) space and O(lg lg n) query time, assuming that the n given points are in rank space. This improves the previous results by Alstrup, Brodal, and Rauhe (FOCS'00), with O(n lg^ε n) space and O(lg lg n) query time, or with O(n lg lg n) space and O(lg^2 lg n) query time. Our second data structure uses O(n) space and answers queries in O(lg^ε n) time. The best previous O(n)-space data structure, due to Nekrich (WADS'07), answers queries in O(lg n / lg lg n) time. We give a data structure for 3-d orthogonal range reporting with O(n lg^{1+ε} n) space and O(lg lg n + k) query time for points in rank space, for any constant ε > 0. This improves the previous results by Afshani (ESA'08), Karpinski and Nekrich (COCOON'09), and Chan (SODA'11), with O(n lg^3 n) space and O(lg lg n + k) query time, or with O(n lg^{1+ε} n) space and O(lg^2 lg n + k) query time. Consequently, we obtain improved upper bounds for orthogonal range reporting in all constant dimensions above 3. Our approach also leads to a new data structure for 2-d orthogonal range minimum queries with O(n lg^ε n) space and O(lg lg n) query time for points in rank space. We give a randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n log n) plus the output size. This resolves two open problems (both appeared in Preparata and Shamos' seminal book): (a) given a set of n axis-aligned rectangles in the plane, we can report all k enclosure pairs (i.e., pairs (r1, r2) where rectangle r1 completely encloses rectangle r2) in O(n lg n + k) expected time; (b) given a set of n points in 4-d, we can find all maximal points (points not dominated by any other points) in O(n lg n) expected time. The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lg lg n) algorithm. The best previous result on (b) was an O(n lg n lg lg n) algorithm due to Gabow, Bentley, and Tarjan, from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.
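To make the maxima problem concrete, here is a brute-force sketch that checks every pair of points in O(n^2) time. It only illustrates the definition of a maximal point ("not dominated by any other point") and is not the O(n lg n) randomized algorithm of the paper; the function names `dominates` and `maximal_points` are illustrative assumptions.

```python
def dominates(p, q):
    """True if p dominates q: p >= q in every coordinate and p != q."""
    return p != q and all(a >= b for a, b in zip(p, q))


def maximal_points(points):
    """Return the points not dominated by any other point.

    Brute force over all pairs, so O(n^2) comparisons for n points.
    Exact duplicates do not dominate each other and are all kept.
    """
    return [q for q in points if not any(dominates(p, q) for p in points)]


if __name__ == "__main__":
    pts = [(1, 1, 1, 1), (2, 2, 2, 2), (3, 0, 0, 0)]
    # (1, 1, 1, 1) is dominated by (2, 2, 2, 2); the other two are maximal.
    print(maximal_points(pts))
```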

233 citations

Journal ArticleDOI
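TL;DR: A new technique for proving cell-probe lower bounds on dynamic data structures is developed, giving an amortized randomized Ω(lg n) lower bound per operation for several problems, as well as the first Ω(log_B n) lower bound in the external-memory model without assumptions on the data structure (such as the comparison model).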
Abstract: We develop a new technique for proving cell-probe lower bounds on dynamic data structures. This technique enables us to prove an amortized randomized $\Omega(\lg n)$ lower bound per operation for several data structural problems on $n$ elements, including partial sums, dynamic connectivity among disjoint paths (or a forest or a graph), and several other dynamic graph problems (by simple reductions). Such a lower bound breaks a long-standing barrier of $\Omega(\lg n\,/\lg\lg n)$ for any dynamic language membership problem. It also establishes the optimality of several existing data structures, such as Sleator and Tarjan's dynamic trees. We also prove the first $\Omega(\log_B n)$ lower bound in the external-memory model without assumptions on the data structure (such as the comparison model). Our lower bounds also give a query-update trade-off curve matched, e.g., by several data structures for dynamic connectivity in graphs. We also prove matching upper and lower bounds for partial sums when parameterized by the word size and the maximum additive change in an update.
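As an illustration of the partial sums problem named in the abstract, the sketch below uses a Fenwick (binary indexed) tree, a standard structure that supports both updates and prefix-sum queries in O(lg n) time, the per-operation cost that the Ω(lg n) lower bound addresses. This is a textbook structure chosen for illustration, not a construction from the paper, and the class and method names are assumptions.

```python
class FenwickTree:
    """Partial sums over n numbers with O(lg n) update and query."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internal array

    def update(self, i, delta):
        """Add delta to element i (0-indexed)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node covering index i

    def prefix_sum(self, i):
        """Return the sum of elements 0..i inclusive (0-indexed)."""
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total


if __name__ == "__main__":
    ft = FenwickTree(8)
    ft.update(0, 5)
    ft.update(3, 2)
    ft.update(7, 1)
    print(ft.prefix_sum(3), ft.prefix_sum(7))
```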

201 citations

Posted Content
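TL;DR: A new technique for proving cell-probe lower bounds for static data structures is developed; combined with new upper bound constructions, it yields tight bounds for searching predecessors among a static set of integers and gives the first separation between polynomial and near-linear space.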
Abstract: We develop a new technique for proving cell-probe lower bounds for static data structures. Previous lower bounds used a reduction to communication games, which was known not to be tight by counting arguments. We give the first lower bound for an explicit problem which breaks this communication complexity barrier. In addition, our bounds give the first separation between polynomial and near linear space. Such a separation is inherently impossible by communication complexity. Using our lower bound technique and new upper bound constructions, we obtain tight bounds for searching predecessors among a static set of integers. Given a set Y of n integers of l bits each, the goal is to efficiently find predecessor(x) = max{y in Y | y <= x}, by representing Y on a RAM using space S. In external memory, it follows that the optimal strategy is to use either standard B-trees, or a RAM algorithm ignoring the larger block size. In the important case of l = c*lg n, for c>1 (i.e. polynomial universes), and near linear space (such as S = n*poly(lg n)), the optimal search time is Theta(lg l). Thus, our lower bound implies the surprising conclusion that van Emde Boas' classic data structure from [FOCS'75] is optimal in this case. Note that for space n^{1+eps}, a running time of O(lg l / lglg l) was given by Beame and Fich [STOC'99].
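The query defined in the abstract, predecessor(x) = max{y in Y | y <= x}, can be illustrated with a plain binary search over a sorted list. This is only a baseline sketch with O(lg n) comparisons per query, not van Emde Boas' structure or any construction from the paper; the function name and the choice to return `None` when no predecessor exists are assumptions.

```python
from bisect import bisect_right


def predecessor(sorted_y, x):
    """Return max{y in Y | y <= x}, or None if every y exceeds x.

    sorted_y must be sorted in ascending order.
    """
    i = bisect_right(sorted_y, x)  # number of elements <= x
    return sorted_y[i - 1] if i > 0 else None


if __name__ == "__main__":
    Y = [3, 8, 15, 42]
    print(predecessor(Y, 10))  # largest element not exceeding 10
    print(predecessor(Y, 2))   # no element is <= 2
```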

161 citations


Cited by
Journal ArticleDOI
TL;DR: Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections.
Abstract: Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections. Mash reduces large sequences and sequence sets to small, representative sketches, from which global mutation distances can be rapidly estimated. We demonstrate several use cases, including the clustering of all 54,118 NCBI RefSeq genomes in 33 CPU h; real-time database search using assembled or unassembled Illumina, Pacific Biosciences, and Oxford Nanopore data; and the scalable clustering of hundreds of metagenomic samples by composition. Mash is freely released under a BSD license ( https://github.com/marbl/mash ).
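The idea of reducing a sequence to a small sketch and comparing sketches can be illustrated with a bottom-s MinHash estimate of the Jaccard index between two k-mer sets. This is a simplified sketch under stated assumptions, not Mash's implementation: SHA-1 is used only as a convenient stand-in hash, the values of `k` and `s` are arbitrary, all function names are hypothetical, and the mutation distance and P value described in the abstract are not computed.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(s):
    """Map a string to a 64-bit integer (SHA-1 as a stand-in hash)."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")


def sketch(seq, k=4, s=16):
    """Bottom-s sketch: the s smallest distinct k-mer hash values."""
    return sorted(hash64(m) for m in kmers(seq, k))[:s]


def jaccard_estimate(sk_a, sk_b, s=16):
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count the
    fraction that appear in both inputs.
    """
    merged = sorted(set(sk_a) | set(sk_b))[:s]
    if not merged:
        return 0.0
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    a = sketch("ACGTACGTTGCAACGTGGCATTACG")
    b = sketch("ACGTACGTTGCAACGTGGCATTACC")
    print(jaccard_estimate(a, b))
```

Because each sketch has at most `s` entries, the comparison cost depends on the sketch size rather than on the length of the original sequences.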

1,886 citations

Journal ArticleDOI
TL;DR: An algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2 + o(1)}) and space O(dn + n^{1 + 1/c^2 + o(1)}), which almost matches the lower bound for hashing-based algorithms recently obtained.
Abstract: In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The problem is of significant interest in a wide variety of areas.

1,759 citations

Book
27 Jul 2015
TL;DR: This comprehensive textbook presents a clean and coherent account of most fundamental tools and techniques in Parameterized Algorithms and is a self-contained guide to the area, providing a toolbox of algorithmic techniques.
Abstract: This comprehensive textbook presents a clean and coherent account of most fundamental tools and techniques in Parameterized Algorithms and is a self-contained guide to the area. The book covers many of the recent developments of the field, including application of important separators, branching based on linear programming, Cut & Count to obtain faster algorithms on tree decompositions, algorithms based on representative families of matroids, and use of the Strong Exponential Time Hypothesis. A number of older results are revisited and explained in a modern and didactic way. The book provides a toolbox of algorithmic techniques. Part I is an overview of basic techniques, each chapter discussing a certain algorithmic paradigm. The material covered in this part can be used for an introductory course on fixed-parameter tractability. Part II discusses more advanced and specialized algorithmic ideas, bringing the reader to the cutting edge of current research. Part III presents complexity results and lower bounds, giving negative evidence by way of W[1]-hardness, the Exponential Time Hypothesis, and kernelization lower bounds. All the results and concepts are introduced at a level accessible to graduate students and advanced undergraduate students. Every chapter is accompanied by exercises, many with hints, while the bibliographic notes point to original publications and related work.

1,544 citations

Proceedings ArticleDOI
21 Oct 2006
TL;DR: An algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2 + o(1)}) and space O(dn + n^{1 + 1/c^2 + o(1)}), which almost matches the lower bound for hashing-based algorithms recently obtained in [27].
Abstract: We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2 + o(1)}) and space O(dn + n^{1 + 1/c^2 + o(1)}). This almost matches the lower bound for hashing-based algorithms recently obtained in [27]. We also obtain a space-efficient version of the algorithm, which uses dn + n log^{O(1)} n space, with a query time of dn^{O(1/c^2)}. Finally, we discuss practical variants of the algorithms that utilize fast bounded-distance decoders for the Leech Lattice.

1,486 citations

Journal ArticleDOI
TL;DR: Two algorithms for the approximate nearest neighbor problem in high-dimensional spaces are presented for data sets of size n living in R^d, achieving query times that are sub-linear in n and polynomial in d.
Abstract: We present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces. For data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the approximate minimum spanning tree.

1,182 citations