Showing papers by "Alejandro López-Ortiz" published in 2001


Book ChapterDOI
05 Jan 2001
TL;DR: This paper presents experiments for searching 114 megabytes of text from the World Wide Web using 5,000 actual user queries from a commercial search engine, and studies several improvement techniques for the standard algorithms to find an algorithm that outperforms existing algorithms in most cases.
Abstract: In [3] we introduced an adaptive algorithm for computing the intersection of k sorted sets within a factor of at most 8k comparisons of the information-theoretic lower bound, under a model that deals with an encoding of the shortest proof of the answer. This adaptive algorithm performs better for "burstier" inputs than a straightforward worst-case optimal method. Indeed, we have shown that, subject to a reasonable measure of instance difficulty, the algorithm adapts optimally up to a constant factor. This paper explores how this algorithm behaves under actual data distributions, compared with standard algorithms. We present experiments for searching 114 megabytes of text from the World Wide Web using 5,000 actual user queries from a commercial search engine. From the experiments, we observe that the theoretically optimal adaptive algorithm is not always optimal in practice, given the distribution of WWW text data. We then proceed to study several improvement techniques for the standard algorithms. These techniques combine improvements suggested by the observed distribution of the data with the theoretical results from [3]. We perform controlled experiments on these techniques to determine which ones result in improved performance, arriving at an algorithm that outperforms existing algorithms in most cases.
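For readers unfamiliar with adaptive intersection, here is a minimal Python sketch of the galloping (doubling) search idea that underlies algorithms of this kind. It illustrates the general technique rather than the specific algorithm of [3], and all names in it are ours:

```python
from bisect import bisect_left

def gallop(arr, target, lo):
    """Return the first index i >= lo with arr[i] >= target.
    Doubles the step size to bracket the target, then binary-searches;
    cost is O(log g) where g is how far the index advances."""
    hi, step = lo + 1, 1
    while hi < len(arr) and arr[hi] < target:
        lo, hi = hi, hi + step
        step *= 2
    return bisect_left(arr, target, lo, min(hi + 1, len(arr)))

def intersect(lists):
    """Intersect k sorted lists. Candidates come from the smallest list;
    each other list is scanned left to right with galloping searches, so
    'bursty' inputs with long non-matching runs are skipped cheaply."""
    lists = sorted(lists, key=len)        # smallest list drives the loop
    pos = [0] * len(lists)                # resume point in each list
    out = []
    for x in lists[0]:
        ok = True
        for i in range(1, len(lists)):
            pos[i] = gallop(lists[i], x, pos[i])
            if pos[i] == len(lists[i]):   # some list exhausted: finished
                return out
            if lists[i][pos[i]] != x:
                ok = False
                break
        if ok:
            out.append(x)
    return out

print(intersect([[1, 3, 7, 9], [3, 4, 7, 100], [2, 3, 7, 50]]))  # [3, 7]
```

Galloping makes the cost of skipping a run of g non-matching elements O(log g) rather than O(g), which is why adaptive algorithms shine on bursty inputs where matching elements cluster.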

74 citations


Journal ArticleDOI
TL;DR: It is shown that even if an upper bound of D on the distance to the target is known in advance, then the competitive ratio of any search strategy is at least 1 + 2m^m/(m−1)^(m−1) − O(1/log² D).
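In display form, with its specialization to the real line (m = 2), which recovers the constant 9 that also appears in the streets paper below. This is our rendering of the bound, assuming the standard m-ray search setting:

```latex
% Lower bound for searching on m concurrent rays when an upper bound D
% on the distance to the target is known in advance:
\[
  \rho \;\ge\; 1 + 2\,\frac{m^{m}}{(m-1)^{m-1}} - O\!\left(\frac{1}{\log^{2} D}\right)
\]
% Specializing to the real line (m = 2): 1 + 2\cdot 2^2/1^1 = 9, matching
% the 9 - O(1/\log D) line-searching bound in the streets paper below.
```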

54 citations


Proceedings ArticleDOI
09 Jan 2001
TL;DR: This work proves a linear lower bound on the size of any index that reports the location (if any) of a substring in the text in time proportional to the length of the pattern.
Abstract: Most information-retrieval systems preprocess the data to produce an auxiliary index structure. Empirically, it has been observed that there is a tradeoff between query response time and the size of the index. When indexing a large corpus, such as the web, the size of the index is an important consideration. In this case it would be ideal to produce an index that is substantially smaller than the text. In this work we prove a linear lower bound on the size of any index that reports the location (if any) of a substring in the text in time proportional to the length of the pattern. In other words, an index supporting linear-time substring searches requires about as much space as the original text. Here “time” is measured in the number of bit probes to the text; an arbitrary amount of computation may be done on an arbitrary amount of the index. Our lower bound applies to inverted word indices as well.
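To see why the bound is natural, consider the classical suffix array: it locates a pattern with O(|p| log n) probes but stores n positions of about log₂ n bits each, already more bits than the text itself. A minimal sketch, ours rather than the paper's:

```python
def build_suffix_array(text):
    """Naive O(n^2 log n) construction; fine for illustration. The index
    stores n integer positions, roughly n*log2(n) bits, which already
    exceeds the ~8n bits of the text itself."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def locate(text, sa, pattern):
    """Binary search over suffixes; O(|pattern| * log n) character probes.
    Returns one occurrence position, or None."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sa) and text[sa[lo]:sa[lo] + len(pattern)] == pattern:
        return sa[lo]
    return None

text = "mississippi"
sa = build_suffix_array(text)
print(locate(text, sa, "ssi"))   # -> 5 (another occurrence is at index 2)
```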

48 citations


Proceedings ArticleDOI
09 Jan 2001
TL;DR: The natural question of whether all NP-complete problems have a common restriction under which they are polynomially solvable is explored and a polynomial-time algorithm is given to determine whether a regular language is universally easy.
Abstract: We explore the natural question of whether all NP-complete problems have a common restriction under which they are polynomially solvable. More precisely, we study what languages are universally easy in that their intersection with any NP-complete problem is in P. In particular, we give a polynomial-time algorithm to determine whether a regular language is universally easy. While our approach is language-theoretic, the results bear directly on finding polynomial-time solutions to very broad and useful classes of problems.
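As a concrete instance of the definition (our illustrative example, not from the paper): every finite language is universally easy, because its intersection with any language at all is a fixed finite set, decidable by table lookup.

```python
# Illustration (ours, not the paper's): any finite language L is
# universally easy, since L ∩ A is a fixed finite set for every
# language A, and a fixed finite set is decidable in polynomial time.

L = {"0", "01", "011"}               # a finite (hence regular) language

def in_A(w):
    """Stand-in for membership in an arbitrary NP-complete language A;
    only ever evaluated offline, on the finitely many strings of L."""
    return len(w) % 2 == 1           # hypothetical placeholder behavior

TABLE = {w for w in L if in_A(w)}    # L ∩ A, precomputed once

def in_intersection(w):
    """Decide w ∈ L ∩ A by table lookup, in time linear in |w|."""
    return w in TABLE

print(in_intersection("0"), in_intersection("01"))   # True False
```

The table is built once, offline, on finitely many strings, so the hardness of A never enters the query-time bound.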

6 citations


Journal ArticleDOI
TL;DR: Lower bounds for on-line searching problems in two special classes of simple polygons called streets and generalized streets are presented, including a lower bound of √2 on the competitive ratio of any deterministic search strategy in streets, which can be shown to be tight.
Abstract: We present lower bounds for on-line searching problems in two special classes of simple polygons called streets and generalized streets. In streets we assume that the location of the target is known to the robot in advance and prove a lower bound of √2 on the competitive ratio of any deterministic search strategy, which can be shown to be tight. For generalized streets we show that if the location of the target is not known, then there is a class of orthogonal generalized streets for which the competitive ratio of any search strategy is at least √82 in the L2-metric, again matching the competitive ratio of the best known algorithm. We also show that if the location of the target is known, then the competitive ratio for searching in generalized streets in the L1-metric is at least 9, which is tight as well. The former result is based on a lower bound on the average competitive ratio of searching on the real line when an upper bound of D on the distance to the target is given. We show that in this case the average competitive ratio is at least 9 − O(1/log D).
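The constant 9 here is the classical competitive ratio of the doubling strategy on the real line; a short derivation of the matching upper bound, standard material included for context rather than taken from this paper:

```latex
% Doubling strategy: explore distances 1, 2, 4, ... alternating sides.
% If the target sits at distance d, just beyond a turning point 2^k
% (so d > 2^k), the robot walks every excursion up to 2^{k+1} out and
% back before reaching it, so the total distance traveled is
\[
  2\sum_{i=0}^{k+1} 2^{i} + d \;=\; 2\bigl(2^{k+2}-1\bigr) + d
  \;<\; 8\cdot 2^{k} + d \;<\; 9d .
\]
% Hence the strategy is 9-competitive; the paper shows that even the
% average competitive ratio cannot beat 9 - O(1/\log D) when an upper
% bound D on the target distance is known.
```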

5 citations