Author

Kasper Green Larsen

Bio: Kasper Green Larsen is an academic researcher from Aarhus University. The author has contributed to research in topics: Upper and lower bounds & Computer science. The author has an h-index of 23 and has co-authored 111 publications receiving 1,733 citations.


Papers
Proceedings ArticleDOI
13 Jun 2011
TL;DR: A randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n lg n) plus the output size is given, resolving two open problems: given a set of n axis-aligned rectangles in the plane, all k enclosure pairs can be reported in O(n lg n + k) expected time; and given a set of n points in 4-d, all maximal points (points not dominated by any other point) can be found in O(n lg n) expected time.
Abstract: We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model. We present two data structures for 2-d orthogonal range emptiness. The first achieves O(n lg lg n) space and O(lg lg n) query time, assuming that the n given points are in rank space. This improves the previous results by Alstrup, Brodal, and Rauhe (FOCS'00), with O(n lg^ε n) space and O(lg lg n) query time, or with O(n lg lg n) space and O(lg² lg n) query time. Our second data structure uses O(n) space and answers queries in O(lg^ε n) time. The best previous O(n)-space data structure, due to Nekrich (WADS'07), answers queries in O(lg n/lg lg n) time. We give a data structure for 3-d orthogonal range reporting with O(n lg^{1+ε} n) space and O(lg lg n + k) query time for points in rank space, for any constant ε > 0. This improves the previous results by Afshani (ESA'08), Karpinski and Nekrich (COCOON'09), and Chan (SODA'11), with O(n lg³ n) space and O(lg lg n + k) query time, or with O(n lg^{1+ε} n) space and O(lg² lg n + k) query time. Consequently, we obtain improved upper bounds for orthogonal range reporting in all constant dimensions above 3. Our approach also leads to a new data structure for 2-d orthogonal range minimum queries with O(n lg^ε n) space and O(lg lg n) query time for points in rank space. We give a randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n lg n) plus the output size. This resolves two open problems (both appeared in Preparata and Shamos' seminal book): (a) given a set of n axis-aligned rectangles in the plane, we can report all k enclosure pairs (i.e., pairs (r1, r2) where rectangle r1 completely encloses rectangle r2) in O(n lg n + k) expected time; (b) given a set of n points in 4-d, we can find all maximal points (points not dominated by any other point) in O(n lg n) expected time. The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lg lg n) algorithm. The best previous result on (b) was an O(n lg n lg lg n) algorithm due to Gabow, Bentley, and Tarjan, from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.
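
To make the fourth result concrete, here is a minimal brute-force sketch of the maxima problem it solves; the paper's algorithm runs in O(n lg n) expected time in 4-d, whereas this reference version is O(n²) and serves only to pin down the definitions of dominance and maximality. All names are illustrative, not from the paper.

```python
def dominates(p, q):
    """p dominates q if p >= q coordinate-wise and p != q."""
    return all(pi >= qi for pi, qi in zip(p, q)) and p != q

def maxima(points):
    """Return the maximal points: those dominated by no other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Example with 4-d points, as in the paper's fourth result.
pts = [(1, 2, 3, 4), (2, 2, 3, 4), (0, 5, 1, 1), (2, 2, 3, 3)]
print(maxima(pts))  # (2,2,3,4) dominates (1,2,3,4) and (2,2,3,3)
```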

233 citations

Proceedings ArticleDOI
19 May 2012
TL;DR: This paper develops a new technique for proving lower bounds on the update time and query time of dynamic data structures in the cell probe model and proves the highest lower bound to date for any explicit problem, namely a lower bound of t_q = Ω((lg n/lg(w t_u))²).
Abstract: In this paper we develop a new technique for proving lower bounds on the update time and query time of dynamic data structures in the cell probe model. With this technique, we prove the highest lower bound to date for any explicit problem, namely a lower bound of t_q = Ω((lg n/lg(w t_u))²). Here n is the number of update operations, w the cell size, t_q the query time and t_u the update time. In the most natural setting of cell size w = Θ(lg n), this gives a lower bound of t_q = Ω((lg n/lg lg n)²) for any polylogarithmic update time. This bound is almost a quadratic improvement over the highest previous lower bound of Ω(lg n), due to Patrascu and Demaine [SICOMP'06]. We prove our lower bound for the fundamental problem of weighted orthogonal range counting. In this problem, we are to support insertions of two-dimensional points, each assigned a Θ(lg n)-bit integer weight. A query to this problem is specified by a point q = (x, y), and the goal is to report the sum of the weights assigned to the points dominated by q, where a point (x', y') is dominated by q if x' ≤ x and y' ≤ y. In addition to being the highest cell probe lower bound to date, our lower bound is also tight for data structures with update time t_u = Ω(lg^{2+ε} n), where ε > 0 is an arbitrarily small constant.
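
For concreteness, here is a naive Python sketch of the weighted orthogonal range counting problem the lower bound is proved for; it scans all points in O(n) per query, while the data structures the paper studies trade space and update time for faster queries. All names are illustrative rather than taken from the paper.

```python
points = []  # list of (x, y, weight)

def insert(x, y, w):
    points.append((x, y, w))

def query(qx, qy):
    """Sum the weights of points (x, y) dominated by q: x <= qx and y <= qy."""
    return sum(w for x, y, w in points if x <= qx and y <= qy)

insert(1, 1, 5)
insert(2, 3, 7)
insert(4, 1, 2)
print(query(2, 3))  # 12: (1,1) and (2,3) are dominated by (2,3)
```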

84 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: For any d, n ≥ 2 and 1/(min{n, d})^{0.4999} < ε < 1, there exists a set of n vectors X ⊂ R^d such that any embedding into R^m preserving all pairwise squared distances up to a factor 1 ± ε must have m = Ω(ε⁻² lg n), matching the Johnson-Lindenstrauss upper bound.
Abstract: For any integers $d, n \geq 2$ and $1/({\min\{n,d\}})^{0.4999} < \varepsilon<1$, we show the existence of a set of $n$ vectors $X\subset \mathbb{R}^d$ such that any embedding $f:X\rightarrow \mathbb{R}^m$ satisfying $$ \forall x,y\in X,\ (1-\varepsilon)\|x-y\|_2^2\le \|f(x)-f(y)\|_2^2 \le (1+\varepsilon)\|x-y\|_2^2 $$ must have $$ m = \Omega(\varepsilon^{-2} \lg n). $$ This lower bound matches the upper bound given by the Johnson-Lindenstrauss lemma [JL84]. Furthermore, our lower bound holds for nearly the full range of $\varepsilon$ of interest, since there is always an isometric embedding into dimension $\min\{d, n\}$ (either the identity map, or projection onto $\mathop{span}(X)$). Previously such a lower bound was only known to hold against linear maps $f$, and not for such a wide range of parameters $\varepsilon, n, d$ [LN16]. The best previously known lower bound for general $f$ was $m = \Omega(\varepsilon^{-2}\lg n/\lg(1/\varepsilon))$ [Wel74, Lev83, Alo03], which is suboptimal for any $\varepsilon = o(1)$.
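
As a hedged numerical companion to the theorem statement, the sketch below implements the matching upper bound side, a Gaussian random projection in the spirit of the JL lemma, and measures the largest ε in the displayed distortion condition over all pairs; the dimension choices and names are illustrative assumptions, not from the paper.

```python
import math
import random

def jl_project(X, m):
    """Project points in R^d to R^m via f(x) = Ax / sqrt(m), A Gaussian."""
    d = len(X[0])
    A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    return [tuple(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) / math.sqrt(m)
                  for row in A) for x in X]

def max_distortion(X, FX):
    """Largest eps with |f(x)-f(y)|^2 in (1 +- eps)|x-y|^2 over all pairs."""
    worst = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            orig = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            emb = sum((a - b) ** 2 for a, b in zip(FX[i], FX[j]))
            worst = max(worst, abs(emb / orig - 1))
    return worst

X = [tuple(random.gauss(0, 1) for _ in range(50)) for _ in range(20)]
print(max_distortion(X, jl_project(X, 200)))  # typically well below 1
```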

82 citations

Proceedings ArticleDOI
20 Oct 2012
TL;DR: The cell probe complexity of evaluating a degree-n polynomial P over a finite field F of size at least n^{1+Ω(1)} is studied; it is shown that any static data structure for evaluating P(x), where x ∈ F, must use Ω(lg |F| / lg(Sw/n lg |F|)) cell probes to answer a query, which is the highest static cell probe lower bound to date.
Abstract: In this paper, we study the cell probe complexity of evaluating an $n$-degree polynomial $P$ over a finite field $\mathbb{F}$ of size at least $n^{1+\Omega(1)}$. More specifically, we show that any static data structure for evaluating $P(x)$, where $x \in \mathbb{F}$, must use $\Omega(\lg |\mathbb{F}|/\lg(Sw/n\lg|\mathbb{F}|))$ cell probes to answer a query, where $S$ denotes the space of the data structure in number of cells and $w$ the cell size in bits. This bound holds in expectation for randomized data structures with any constant error probability $\delta < 1/2$.
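
To illustrate the trade-off the bound interpolates between, here is a hedged Python sketch of the two trivial extremes, assuming a prime field F_p: a full evaluation table (space |F| cells, one probe per query) versus storing only the coefficients and running Horner's rule (space n+1 cells, about n probes). All concrete values are illustrative.

```python
p = 1_000_003  # a prime; the field F_p
coeffs = [3, 0, 2, 5]  # P(x) = 3 + 2x^2 + 5x^3 over F_p (degree n = 3)

# Extreme 1: space S = |F| cells, one probe per query.
table = [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
         for x in range(100)]  # truncated to 100 entries for illustration

# Extreme 2: space S = n + 1 cells, n + 1 probes per query (Horner's rule).
def evaluate(x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

print(table[7], evaluate(7))  # both print P(7) mod p, i.e. 1816
```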

67 citations

Journal ArticleDOI
TL;DR: Strong evidence is presented that a query time significantly below $\sqrt{n}$ cannot be achieved by purely combinatorial techniques; it is shown that boolean matrix multiplication of two $\sqrt{n} \times \sqrt{n}$ matrices reduces to n range mode queries in an array of size O(n).
Abstract: A mode of a multiset S is an element $a \in S$ of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i,j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (Proceedings of the International Symposium on Algorithms and Computation (ISAAC), pp. 517–526, 2003), requires $\varTheta(\sqrt{n}\log\log n)$ query time in the worst case. We improve their result and present an O(n)-space data structure that supports range mode queries in $O(\sqrt{n/\log n})$ worst-case time. In the external memory model, we give a linear-space data structure that requires $O(\sqrt{n/B})$ I/Os per query, where B denotes the block size. Furthermore, we present strong evidence that a query time significantly below $\sqrt{n}$ cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two $\sqrt{n} \times \sqrt{n}$ matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for the dynamic problem (queries and updates in near $O(n^{3/4})$ time), for orthogonal range mode in d dimensions (queries in near $O(n^{1-1/2d})$ time), and for half-space range mode in d dimensions (queries in $O(n^{1-1/d^{2}})$ time). Finally, we complement our dynamic data structure with a reduction from the multiphase problem, again supporting that we cannot hope for much more efficient data structures.
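
A minimal Python sketch of the range mode query itself may help fix the definitions: this naive version answers each query by scanning the range in O(j - i) time with no preprocessing, while the paper's linear-space structure achieves $O(\sqrt{n/\log n})$ worst-case time. The one-indexed, inclusive-range convention below matches the abstract's A[i:j] and is otherwise an assumption.

```python
from collections import Counter

def range_mode(A, i, j):
    """Return a most frequent element of A[i:j], 1-indexed and inclusive."""
    counts = Counter(A[i - 1:j])
    return counts.most_common(1)[0][0]

A = [2, 7, 2, 9, 7, 7, 2]
print(range_mode(A, 1, 4))  # 2 occurs twice in A[1:4] = [2, 7, 2, 9]
print(range_mode(A, 4, 7))  # 7 occurs twice in A[4:7] = [9, 7, 7, 2]
```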

62 citations


Cited by
Book
27 Sep 2018
TL;DR: A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
Abstract: High-dimensional probability offers insight into the behavior of random vectors, random matrices, random subspaces, and objects used to quantify uncertainty in high dimensions. Drawing on ideas from probability, analysis, and geometry, it lends itself to applications in mathematics, statistics, theoretical computer science, signal processing, optimization, and more. It is the first to integrate theory, key tools, and modern applications of high-dimensional probability. Concentration inequalities form the core, and it covers both classical results such as Hoeffding's and Chernoff's inequalities and modern developments such as the matrix Bernstein inequality. It then introduces the powerful methods based on stochastic processes, including such tools as Slepian's, Sudakov's, and Dudley's inequalities, as well as generic chaining and bounds based on VC dimension. A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.

1,190 citations

Posted Content
TL;DR: This paper shows how to transform PIR schemes into SPIR schemes (with information-theoretic privacy), paying a constant factor in communication complexity, and introduces a new cryptographic primitive, called conditional disclosure of secrets, which may be a useful building block for the design of other cryptographic protocols.
Abstract: Private information retrieval (PIR) schemes allow a user to retrieve the i-th bit of an n-bit data string x, replicated in k ≥ 2 databases (in the information-theoretic setting) or in k ≥ 1 databases (in the computational setting), while keeping the value of i private. The main cost measure for such a scheme is its communication complexity. In this paper we introduce a model of symmetrically-private information retrieval (SPIR), where the privacy of the data, as well as the privacy of the user, is guaranteed. That is, in every invocation of a SPIR protocol, the user learns only a single physical bit of x and no other information about the data. Previously known PIR schemes severely fail to meet this goal. We show how to transform PIR schemes into SPIR schemes (with information-theoretic privacy), paying a constant factor in communication complexity. To this end, we introduce and utilize a new cryptographic primitive, called conditional disclosure of secrets, which we believe may be a useful building block for the design of other cryptographic protocols. In particular, we get a k-database SPIR scheme of complexity O(n^{1/(2k-1)}) for every constant k ≥ 2 and an O(log n)-database SPIR scheme of complexity O(log² n · log log n). All our schemes require only a single round of interaction, and are resilient to any dishonest behavior of the user. These results also yield the first implementation of a distributed version of 1-out-of-n oblivious transfer with information-theoretic security and sublinear communication complexity.
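
For intuition about why plain PIR fails the SPIR goal, here is a toy sketch of the classical 2-database information-theoretic PIR (in the spirit of the schemes the paper transforms; this basic variant has O(n) communication): each server alone sees a uniformly random subset, so i stays private, but the user also learns an extra XOR of x, exactly the data-privacy leakage SPIR eliminates. This is an illustrative sketch, not the paper's construction.

```python
import random

def pir_read(x, i):
    n = len(x)
    S1 = {j for j in range(n) if random.random() < 0.5}  # uniform subset
    S2 = S1 ^ {i}  # symmetric difference: flip membership of index i

    # Each server XORs together the bits its query set selects.
    a1 = 0
    for j in S1:
        a1 ^= x[j]
    a2 = 0
    for j in S2:
        a2 ^= x[j]

    return a1 ^ a2  # every bit except x[i] cancels

x = [1, 0, 1, 1, 0, 0, 1, 0]
print(pir_read(x, 3), x[3])  # both 1
```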

418 citations

Posted Content
TL;DR: A simple neural model is proposed that combines the efficiency of dual encoders with some of the expressiveness of more costly attentional architectures, and sparse-dense hybrids are explored to capitalize on the precision of sparse retrieval.
Abstract: Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query. We investigate the capacity of this architecture relative to sparse bag-of-words models and attentional neural networks. Using both theoretical and empirical analysis, we establish connections between the encoding dimension, the margin between gold and lower-ranked documents, and the document length, suggesting limitations in the capacity of fixed-length encodings to support precise retrieval of long documents. Building on these insights, we propose a simple neural model that combines the efficiency of dual encoders with some of the expressiveness of more costly attentional architectures, and explore sparse-dense hybrids to capitalize on the precision of sparse retrieval. These models outperform strong alternatives in large-scale retrieval.
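
Below is a minimal sketch of the dual-encoder scoring loop the abstract analyzes, with a deliberately crude placeholder encoder; the point is only the architecture (fixed-length vectors, inner-product scoring, documents encodable offline), not any particular encoder, and every name here is an illustrative assumption.

```python
def encode(text, dim=8):
    """Placeholder encoder: hashed bag-of-words into a dense vector."""
    v = [0.0] * dim
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    return v

def score(q_vec, d_vec):
    return sum(a * b for a, b in zip(q_vec, d_vec))  # inner product

docs = ["range mode queries", "private information retrieval", "r-tree index"]
doc_vecs = [encode(d) for d in docs]  # precomputed offline, as in retrieval
q = encode("information retrieval")
best = max(range(len(docs)), key=lambda i: score(q, doc_vecs[i]))
print(docs[best])  # most likely the second document, barring hash collisions
```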

227 citations

Journal ArticleDOI
TL;DR: This work designs a new crossover-based genetic algorithm that uses mutation with a higher-than-usual mutation probability to increase the exploration speed and crossover with the parent to repair losses incurred by the more aggressive mutation.
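
A hedged sketch of that mechanism in the style of the (1+(λ,λ)) GA on the OneMax benchmark: an aggressive mutation rate of λ/n for exploration, followed by a parent-biased crossover that repairs the collateral bit flips. Parameter choices are illustrative, not from the paper.

```python
import random

def onemax(x):
    return sum(x)

def ga_step(parent, lam=8):
    n = len(parent)
    p_mut = lam / n  # higher-than-usual mutation probability

    # Mutation phase: create lam offspring, keep the best mutant.
    mutants = [[b ^ (random.random() < p_mut) for b in parent]
               for _ in range(lam)]
    best_mutant = max(mutants, key=onemax)

    # Crossover phase: take each bit from the mutant with prob 1/lam,
    # otherwise from the parent, repairing the mutation's collateral flips.
    children = [[m if random.random() < 1 / lam else p
                 for p, m in zip(parent, best_mutant)]
                for _ in range(lam)]
    best_child = max(children, key=onemax)
    return best_child if onemax(best_child) >= onemax(parent) else parent

x = [0] * 100
for _ in range(2000):
    x = ga_step(x)
print(onemax(x))  # approaches 100
```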

208 citations

Proceedings Article
01 Jan 2005
TL;DR: This study shows that the PR-tree performs similarly to the best known R-tree variants on real-life and relatively nicely distributed data, but outperforms them significantly on more extreme data.
Abstract: The query efficiency of a data structure that stores a set of objects can normally be assessed by analysing the number of objects, pointers, etc. looked at when answering a query. However, if the data structure is too big to fit in main memory, data may need to be fetched from disk. In that case, the query efficiency is easily dominated by moving the disk head to the correct locations, rather than by reading the data itself. To reduce the number of disk accesses, one can group the data into blocks and strive to bound the number of different blocks accessed rather than the number of individual data objects read. An R-tree is a general-purpose data structure that stores a hierarchical grouping of geometric objects into blocks. Many heuristics have been designed to determine which objects should be grouped together, but none of these heuristics could give a guarantee on the resulting worst-case query time. We present the Priority R-tree, or PR-tree, which is the first R-tree variant that always answers a window query by accessing $O((N/B)^{1-1/d} + T/B)$ blocks, where $N$ is the number of $d$-dimensional objects stored, $B$ is the number of objects per block, and $T$ is the number of objects whose bounding boxes intersect the query window. This is provably asymptotically optimal. Experiments show that the PR-tree performs similarly to the best known heuristics on real-life and relatively nicely distributed data, but outperforms them significantly on more extreme data.
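
The sketch below illustrates the I/O cost model the abstract argues in: objects' bounding boxes are grouped into blocks of B, and a window query is charged one I/O per block whose bounding box intersects the window rather than one unit per object. The grouping here is naive (input order); the PR-tree's contribution is a grouping with the stated worst-case guarantee. All names are illustrative.

```python
def intersects(a, b):
    """Axis-aligned rectangles given as (xlo, ylo, xhi, yhi)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def bounding_box(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def window_query(rects, window, B=4):
    blocks = [rects[k:k + B] for k in range(0, len(rects), B)]
    ios, hits = 0, []
    for block in blocks:
        if intersects(bounding_box(block), window):
            ios += 1  # one disk access fetches the whole block
            hits += [r for r in block if intersects(r, window)]
    return hits, ios

rects = [(i, i, i + 1, i + 1) for i in range(16)]
print(window_query(rects, (2, 2, 5, 5)))  # 5 hits found with only 2 I/Os
```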

187 citations