Why is the term linear search used?

The terminology linear search comes from the fact that the boxes are linearly ordered, and must ideally be checked in that order.

Why is Acoord a distribution over deterministic search algorithms?

Because the authors allow coordination, any randomized search algorithm is centralized, and thus can be seen as a distribution over deterministic search algorithms.

What is the probability that a non-coordinating algorithm did not check box $B_x$?

In essence, the authors show that if a non-coordinating algorithm is c-competitive, then it is also c-competitive under disordering of the boxes.

What is the definition of a functional view of an algorithm?

Definition 3. Given a non-coordinating search algorithm $A$, the functional view of $A$ is the function $N : \mathbb{N}^{+} \times \mathbb{N} \to [0, 1]$ defined as $N(x, t) = \Pr[B_x \text{ was not yet checked by time } t \text{ by searcher } s_i]$, where $s_i$ is an arbitrary searcher performing $A$.

What is the function $f(x) = \alpha\,\rho(x)^{-1/(k-1)}$?

To calculate $\alpha$, the authors use what is known from Eq. (13) about the shape of $N$, and rely on the refinement of Eq. (11) given by the Presentation Lemma, i.e., that for all $t$, $\int_{1}^{\infty} (1 - N(x, t))\,dx = t$. Again, fix a $t$ and examine the function $f(x) = \alpha\,\rho(x)^{-1/(k-1)}$.

What is the probability that none of the k searchers checked x by time t?

The probability that none of the $k$ searchers checked $x$ by time $t$ is $N(x, t)^{k}$, and thus, by Eq. (6), $T(A, x) = \sum_{t=0}^{\infty} N(x, t)^{k}$.
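The formula $T(A, x) = \sum_{t \ge 0} N(x, t)^{k}$ can be evaluated numerically. The sketch below is only an illustration: the function names `expected_time` and `uniform_view` are hypothetical, and the functional view used in the demo, $N(x, t) = \max(0, 1 - t/M)$, is an assumption chosen for the example (a searcher that opens $M$ boxes in a uniformly random order without repetition), not a construction taken from the paper.

```python
from fractions import Fraction
from typing import Callable

View = Callable[[int, int], Fraction]


def expected_time(N: View, x: int, k: int, horizon: int) -> Fraction:
    """Sum N(x, t)^k for t = 0 .. horizon - 1.

    This truncates the infinite series; it is exact whenever
    N(x, t) = 0 for every t >= horizon.
    """
    return sum((N(x, t) ** k for t in range(horizon)), Fraction(0))


def uniform_view(M: int) -> View:
    """Assumed example view: M boxes opened in a uniformly random order,
    never repeating a box, so Pr[box x unchecked after t steps] = 1 - t/M."""
    def N(x: int, t: int) -> Fraction:
        return max(Fraction(0), 1 - Fraction(t, M))
    return N


if __name__ == "__main__":
    N = uniform_view(4)
    for k in (1, 2, 3):
        print(k, expected_time(N, x=1, k=k, horizon=4))
```

Exact rational arithmetic is used so that small cases can be checked by hand; for instance, with $M = 4$ and $k = 1$ the sum is $1 + 3/4 + 1/2 + 1/4 = 5/2$.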

(Open Access) Parallel Bayesian Search with No Coordination (2019) | Pierre Fraigniaud

Q: What are the contributions mentioned in the paper "Parallel Bayesian Search with no Coordination"?

In this paper, the authors investigate the “price of non-coordinating”, in terms of search performance, and they show that this price is actually quite small. Specifically, the authors consider a parallel version of a classical Bayesian search problem, where a set of $k \ge 1$ searchers are looking for a treasure placed in one of the boxes indexed by positive integers, according to some distribution $p$. The authors show that there is a very simple non-coordinating algorithm which has expected running time at most $4\left(1 - \frac{1}{k+1}\right)^{2} \mathrm{OPT} + 10$, where OPT is the expected running time of the best fully coordinated algorithm. This algorithm uses only the relative likelihood of the boxes, and the authors prove that, under this restriction, it has the best possible competitive ratio with respect to OPT. For the case where a complete description of the distribution $p$ is given to the search algorithm, the authors describe an optimal non-coordinating algorithm for Bayesian search.

Q: How does the algorithm perform in a probabilistic setting?

It turns out that the two settings (Bayesian search and linear search) are highly related, as far as order-invariant algorithms are concerned: an order-invariant algorithm that works well against a treasure placed arbitrarily would also work well in any probabilistic setting (under the assumption that the indices of the boxes are ordered according to their relative likelihood).

HAL Id: hal-01865469

https://hal.archives-ouvertes.fr/hal-01865469

Preprint submitted on 31 Aug 2018


Parallel Bayesian Search with no Coordination

Pierre Fraigniaud, Amos Korman, Yoav Rodeh

To cite this version:

Pierre Fraigniaud, Amos Korman, Yoav Rodeh. Parallel Bayesian Search with no Coordination. 2018. ⟨hal-01865469⟩


Abstract

Coordinating the actions of agents (e.g., volunteers analyzing radio signals in SETI@home) yields efficient search algorithms. However, such efficiency often comes at the cost of implementing complex coordination mechanisms, which may be expensive in terms of communication and/or computation overheads. In contrast, non-coordinating algorithms, in which each agent operates independently from the others, are typically very simple and easy to implement. They are also inherently robust to slight misbehaviors, or even crashes of agents. In this paper, we investigate the “price of non-coordinating”, in terms of search performance, and we show that this price is actually quite small. Specifically, we consider a parallel version of a classical Bayesian search problem, where a set of $k \ge 1$ searchers are looking for a treasure placed in one of the boxes indexed by positive integers, according to some distribution $p$. Each searcher can open a random box at each step, and the objective is to find the treasure in a minimum number of steps. We show that there is a very simple non-coordinating algorithm which has expected running time at most $4\left(1 - \frac{1}{k+1}\right)^{2} \mathrm{OPT} + 10$, where OPT is the expected running time of the best fully coordinated algorithm. Our algorithm does not even use the precise description of the distribution $p$, but only the relative likelihood of the boxes. We prove that, under this restriction, our algorithm has the best possible competitive ratio with respect to OPT. For the case where a complete description of the distribution $p$ is given to the search algorithm, we describe an optimal non-coordinating algorithm for Bayesian search. This latter algorithm can be twice as fast as our former algorithm in practical scenarios such as uniform distributions. All these results provide a complete characterization of non-coordinating Bayesian search. The take-away message is that, for their simplicity and robustness, non-coordinating algorithms are viable alternatives to complex coordinating mechanisms subject to significant overheads. Most of these results apply as well to linear search, in which the indices of the boxes reflect their relative importance, and where important boxes must be visited first.

Funding: This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 648032).

Pierre Fraigniaud: IRIF, CNRS and University Paris Diderot, Paris, France. E-mail: Pierre.Fraigniaud@irif.fr.

Amos Korman: IRIF, CNRS and University Paris Diderot, Paris, France. E-mail: Amos.Korman@irif.fr.

Yoav Rodeh: Weizmann Institute of Science. E-mail: yoav.rodeh@gmail.com.

1 Introduction

BOINC [ ] (Berkeley Open Infrastructure for Network Computing) is a platform for volunteer computing supporting dozens of projects, including the famous SETI@home, which analyzes radio signals to identify signs of extraterrestrial intelligence. Most projects maintained at BOINC use parallel search mechanisms in which a central server controls and distributes the work to volunteers. The framework in this paper is an abstraction for projects operated on platforms similar to BOINC, with hundreds of thousands of distributed searchers. We address the following question: how should the work be distributed among the searchers, given the amount of coordination between them that the central server provides? This paper leads to the conclusion that no coordination might actually be quite a viable strategy, both efficient and robust.

Specifically, we consider a parallel variant of the classical Bayesian search problem, typically attributed to Blackwell [ ]. A treasure is placed according to some distribution $p$ in one of infinitely many boxes, indexed by the positive integers. The search for the treasure is performed in parallel by $k \ge 1$ agents, also called searchers, which means that at each time step each searcher can “peek” into a box to check whether the treasure is present there. The goal is to minimize the expected time until the first searcher finds the treasure. We assume that the number $k$ of searchers is known to the algorithm. We consider two cases, depending on whether $p$ is given to the algorithm or not. In the latter case, however, we assume that the algorithm is aware of the relative likelihood of the boxes. In both cases, we can assume, w.l.o.g., that the boxes $B_x$, $x \ge 1$, are ordered so that $p$ is non-increasing, i.e., $p(x + 1) \le p(x)$ for every $x \ge 1$.

Let $s_1, \ldots, s_k$ be the $k$ searchers at hand. If coordination is allowed, let $A_{\mathrm{coord}}$ be the algorithm that lets searcher $s_i$ peek into box $B_{(t-1)k+i}$ at time $t$. A simple application of the rearrangement inequality shows that $A_{\mathrm{coord}}$ is an optimal algorithm, that is, it minimizes the expected time until one searcher finds the treasure. This time is $\sum_{x \ge 1} p(x) \lceil x/k \rceil$, since the box $B_x$ is opened at time $\lceil x/k \rceil$ in $A_{\mathrm{coord}}$, and this box has probability $p(x)$ to contain the treasure. In particular, the optimal expected time to find the treasure with a single searcher is $\sum_{x \ge 1} x\,p(x)$. Therefore, if coordination is allowed, $k$ searchers essentially allow finding the treasure $k$ times faster than one searcher alone, in expectation. (Specifically, the speedup resulting from using $k$ searchers approaches $k$ when the expectation of the distribution $p$ grows to infinity.) However, as simple as this algorithm is, $A_{\mathrm{coord}}$ is very sensitive to faults of all sorts. For example, if one searcher crashes at some point during the execution, then the searchers may completely miss the treasure, unless the protocol employs some mechanism for detecting such faults. Indeed, in $A_{\mathrm{coord}}$, each box is eventually opened by just one searcher. Namely, box $B_{(t-1)k+i}$ is opened only by searcher $s_i$, for every $t \ge 1$ and $1 \le i \le k$.
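The coordinated round-robin schedule and its expected running time $\sum_{x \ge 1} p(x) \lceil x/k \rceil$ can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function names are hypothetical, and the distribution is assumed to have finite support and to be passed as a list whose entry at position $x - 1$ is $p(x)$.

```python
from fractions import Fraction
from typing import Sequence


def coord_box(i: int, t: int, k: int) -> int:
    """Index of the box that searcher s_i (1 <= i <= k) opens at time t >= 1."""
    return (t - 1) * k + i


def coord_open_time(x: int, k: int) -> int:
    """Time at which box B_x is opened by the schedule: ceil(x / k)."""
    return -(-x // k)  # integer ceiling division, avoids floating point


def coord_expected_time(p: Sequence[Fraction], k: int) -> Fraction:
    """Expected time to find the treasure: sum over x of p(x) * ceil(x / k).

    Assumption: p has finite support and p[x - 1] is the probability of box B_x.
    """
    return sum((px * coord_open_time(x, k) for x, px in enumerate(p, start=1)),
               Fraction(0))


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
    print(coord_expected_time(p, k=1))  # single searcher
    print(coord_expected_time(p, k=2))  # two coordinated searchers
```

The two helper functions are consistent with each other: the box returned by `coord_box(i, t, k)` is opened at time `t` for every `i` between 1 and `k`.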

In this paper, we highlight the usefulness of a class of search algorithms, called non-coordinating, which is inherently robust. In such algorithms, all searchers operate independently, executing the same protocol, differing only in the outcome of the flips of their private random coins. A canonical example is the case of multiple random walkers that search a graph [ ]. Although many search problems cannot be efficiently parallelized without coordination, when such parallelization can be achieved, the benefit can potentially be high, not only in terms of savings in communication and computation overhead, but also in terms of robustness. To get some intuition, observe that when executing a non-coordinating algorithm, the correct operation as well as the running time can only improve if more searchers than planned are actually being used. Suppose for instance that an oblivious adversary is allowed to crash at most $k'$ out of the $k$ searchers at arbitrary times during the execution. To overcome the presence of $k'$ faults, one can simply run the non-coordinating algorithm that is designed for the case of $k - k'$ searchers. If the running time of the non-coordinating algorithm for $k$ searchers without crashes is $T(k)$, then the running time of the new robust (non-coordinating) algorithm would be at most $T(k - k')$. Note that even when coordination is allowed, one cannot expect to obtain robustness at a cost less than $T^{*}(k - k')$ in the worst case, where $T^{*}(k)$ denotes the cost of an optimal coordinating algorithm for $k$ searchers without crashes, since the number of searchers that remain alive is in the worst case $k - k'$. Hence, if $T(\cdot)$ and $T^{*}(\cdot)$ are close, we get robustness almost for free by using a non-coordinating algorithm.

In this paper, we are interested in computing how much we lose in terms of performance when using non-coordinating algorithms. Specifically, let $k \ge 1$, and let us denote by $T_k(A, x)$ the expected time for an algorithm $A$ to find the treasure with $k$ searchers running in parallel if this treasure is placed at box $B_x$. Further, given a distribution $p$ over the placement of the treasure in the boxes, let $T_{p,k}(A)$ denote the expected time for $A$ to find the treasure when it is placed in one of the boxes according to $p$. We have

$$T_{p,k}(A) = \sum_{x \ge 1} p(x)\, T_k(A, x). \qquad (1)$$

With this notation, the expected running time of the optimal coordinating algorithm is $T_{p,k}(A_{\mathrm{coord}}) = \sum_{x \ge 1} p(x) \lceil x/k \rceil$. We are interested in comparing these two terms, i.e., $T_{p,k}(A)$ for a non-coordinating algorithm $A$ versus $T_{p,k}(A_{\mathrm{coord}})$, the complexity of the best search algorithm with full coordination. For this purpose, we use competitive analysis, and say that an algorithm $A$ is $c$-competitive for $k$ searchers looking for a treasure placed according to $p$ if there is a constant $b$ such that $T_{p,k}(A) \le c\, T_{p,k}(A_{\mathrm{coord}}) + b$. We show that there is a non-coordinating algorithm with small competitive ratio, hence establishing that indeed one does not lose much in using non-coordinating algorithms.

Before going into the details of our results, let us observe that although the random placement of the treasure is the common setting in Bayesian search problems, yielding Eq. (1) for defining the complexity of a search algorithm, there is another abstract search setting which deserves to be investigated, which we call linear search. Indeed, when searching for a proper divisor of a given number $n$, the typical approach consists of enumerating the candidate divisors in increasing order, from 2 to $\sqrt{n}$, and checking them one by one. This is because the probability that a random number is divisible by a given prime is inversely proportional to this prime. Similarly, in cryptography, an attack proceeds better by systematically checking shorter keys before longer ones, because the time to check a key is typically exponential in its size. There are thus several contexts in which the search space can be ordered in such a way that, given that the previous trials were not successful, the next candidate according to the order is either the most preferable, or the most likely to be valid, or the easiest to check. This led us to consider another measure of complexity, comparing the search time of an algorithm $A$ to the search time of the algorithm with one searcher opening the boxes sequentially in order of their indices, namely

$$T_k(A) = \max_{x \ge 1} T_k(A, x)/x. \qquad (2)$$

Again, we say that a search algorithm $A$ is $c$-competitive for $k$ searchers looking for a treasure arbitrarily placed in one box if there is a constant $b$ such that $T_k(A) \le c\, T_k(A_{\mathrm{coord}}) + b$. In the linear search setting, the aforementioned algorithm $A_{\mathrm{coord}}$ is also optimal. We show that, as for Bayesian search, one does not lose much in using non-coordinating algorithms in linear search.
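The divisor example is perhaps the simplest instance of linear search: candidates are ordered, and they are checked one by one in that order. The sketch below illustrates it with a single searcher; the function name `smallest_proper_divisor` and the choice to return the number of candidates checked are assumptions made for this illustration.

```python
from math import isqrt
from typing import Optional, Tuple


def smallest_proper_divisor(n: int) -> Tuple[Optional[int], int]:
    """Check candidates 2, 3, ..., floor(sqrt(n)) in increasing order.

    Returns (divisor, checks), where divisor is the first candidate that
    divides n (or None if there is none) and checks is the number of
    candidates examined, i.e. the search time of one sequential searcher.
    """
    checks = 0
    for d in range(2, isqrt(n) + 1):
        checks += 1
        if n % d == 0:
            return d, checks
    return None, checks


if __name__ == "__main__":
    print(smallest_proper_divisor(91))  # 91 = 7 * 13
    print(smallest_proper_divisor(97))  # 97 is prime
```

In the vocabulary of the text, candidate $d$ plays the role of a box, and the number of checks is the time at which the "treasure" (a divisor) is found by a lone searcher that follows the index order.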

1.1 Our Results

First, we design and analyze an optimal non-coordinating algorithm for Bayesian search, where $p$ is given. Our algorithm has optimal expected running time among all non-coordinating algorithms: for every distribution $p$, every $k \ge 1$, and every non-coordinating algorithm $A$, its expected running time is at most $T_{p,k}(A)$. A remarkable property satisfied by our non-coordinating algorithm is that, at any time $t > 1$ during its execution, all boxes that received a positive probability to be checked at some time earlier than $t$ are going to be checked at time $t$ with equal probability. The design of this algorithm is complex. However, when $p$ is the uniform distribution over a finite domain, it becomes simple to describe: at each step, each searcher chooses a box uniformly among those it did not check at previous steps. This natural algorithm for the uniform setting is optimal among all non-coordinating algorithms, and is shown to be at most 2 times slower than $A_{\mathrm{coord}}$.
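The uniform case lends itself to an exact small-scale comparison. The sketch below assumes a uniform distribution over $M$ boxes and $k$ independent searchers, each choosing uniformly among the boxes it has not yet checked. Under these assumptions, a given searcher has missed the treasure after $t$ steps with probability $1 - t/M$, so the expected time is $\sum_{t=0}^{M-1} (1 - t/M)^{k}$; the coordinated round-robin schedule needs $\frac{1}{M}\sum_{x=1}^{M} \lceil x/k \rceil$. The function names are hypothetical, and the derivation is a back-of-the-envelope illustration rather than the paper's analysis.

```python
from fractions import Fraction


def coord_time_uniform(M: int, k: int) -> Fraction:
    """Expected time of the round-robin coordinated schedule when the
    treasure is uniform over boxes 1..M: average of ceil(x / k)."""
    return Fraction(sum(-(-x // k) for x in range(1, M + 1)), M)


def noncoord_time_uniform(M: int, k: int) -> Fraction:
    """Expected time when each of k searchers independently opens the M
    boxes in a uniformly random order without repetition."""
    # E[time] = sum over t >= 0 of Pr[no searcher found the treasure by time t],
    # and each searcher independently misses it with probability (M - t) / M.
    return sum((Fraction(M - t, M) ** k for t in range(M)), Fraction(0))


def slowdown(M: int, k: int) -> Fraction:
    """Ratio of the non-coordinating time to the coordinated time."""
    return noncoord_time_uniform(M, k) / coord_time_uniform(M, k)


if __name__ == "__main__":
    for k in (1, 2, 5, 20):
        print(k, float(slowdown(1000, k)))
```

For $M = 4$ and $k = 2$, for instance, the two expected times are $15/8$ and $3/2$, a slowdown of $5/4$. For a single searcher the two schedules have the same expected time, since a lone searcher gains nothing from coordination.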

Next, we focus on the notion of order-invariant algorithms, that is, algorithms assuming only the knowledge of the relative likelihood of the boxes (and not knowing the exact probability of finding the treasure in each box). Such algorithms are appealing because they are “universal”, in the sense that, once the boxes have been reordered such that $B_x$ is not less likely to contain the treasure than $B_{x+1}$, any order-invariant algorithm acts the same for all distributions. We present a very simple yet highly efficient non-coordinating order-invariant algorithm, called $A_{\mathrm{order}}$. In this algorithm, at step $t$, each searcher checks a box uniformly chosen among those it did not check yet in $\{B_1, \ldots, B_{\lceil [\ldots] (t + 1) \rceil}\}$. The expected running time of $A_{\mathrm{order}}$ is essentially at most 4 times the expected running time of the best fully coordinated algorithm $A_{\mathrm{coord}}$. Precisely, for every distribution $p$, and every $k \ge 1$,

$$T_{p,k}(A_{\mathrm{order}}) \le 4\left(1 - \frac{1}{k+1}\right)^{2} T_{p,k}(A_{\mathrm{coord}}) + 10. \qquad (3)$$

Ignoring the constant additive term, the aforementioned upper bound implies that the cost paid for not coordinating is at most $16/9$ for two searchers, $9/4$ for three searchers, and approaches 4 as the number of searchers goes to infinity. In fact, we show that these costs are tight in a very strong sense: for any given number of searchers, there is no non-coordinating order-invariant algorithm that achieves a better competitive ratio. Specifically, for every distribution $p$, every $k \ge 1$, and every order-invariant non-coordinating algorithm $A$, if there exist $b$ and $c$ such that $T_{p,k}(A) \le c\, T_{p,k}(A_{\mathrm{coord}}) + b$, then

$$c \ge 4\left(1 - \frac{1}{k+1}\right)^{2}. \qquad (4)$$
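The competitive ratio $4\left(1 - \frac{1}{k+1}\right)^{2}$ appearing in Eqs. (3) and (4) is easy to tabulate. The short sketch below evaluates it exactly for a few values of $k$; the function name `order_invariant_ratio` is hypothetical.

```python
from fractions import Fraction


def order_invariant_ratio(k: int) -> Fraction:
    """Multiplicative factor 4 * (1 - 1/(k + 1))^2 for k >= 1 searchers."""
    return 4 * (1 - Fraction(1, k + 1)) ** 2


if __name__ == "__main__":
    for k in (1, 2, 3, 10, 100, 1000):
        r = order_invariant_ratio(k)
        print(f"k = {k:4d}: ratio = {r} (about {float(r):.4f})")
```

For $k = 1$ the factor equals 1, as expected, since a single searcher has nobody to coordinate with; the factor then increases with $k$ and stays below 4.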

Algorithm $A_{\mathrm{order}}$ remembers all the boxes it checked, and so each searcher needs memory linear in the running time of the algorithm. We also consider $A_{\mathrm{obliv}}$, which at step $t$ chooses one box uniformly at random in $\{B_1, \ldots, B_{k \lceil [\ldots] \rceil}\}$, hence potentially choosing the same box many times at different time steps. This algorithm uses memory that is just logarithmic in its running time, but performs almost as well as $A_{\mathrm{order}}$ for a large number of searchers. Precisely, for every distribution $p$, and every $k \ge 1$,

$$T_{p,k}(A_{\mathrm{obliv}}) \le 4\, T_{p,k}(A_{\mathrm{coord}}) + 2. \qquad (5)$$

All the aforementioned upper bound results on order-invariant algorithms are actually established by considering the linear search setting, where the treasure is placed at an arbitrary box, and boxes are ordered by importance, that is, when focusing on the complexity $T_k(A) = \max_{x \ge 1} T_k(A, x)/x$ of any algorithm $A$ (cf. Eq. (2)). Indeed, it turns out that the two settings (Bayesian search and linear search) are highly related, as far as order-invariant algorithms are concerned: an order-invariant algorithm that works well against a treasure placed arbitrarily would also work well in any probabilistic setting (under the assumption that the indices of the boxes are ordered according to their relative likelihood). In fact, in the linear search setting, $A_{\mathrm{order}}$ and $A_{\mathrm{obliv}}$ have the same competitive ratios as those mentioned in Eqs. (3) and (5), respectively. Moreover, the lower bound of Eq. (4) also holds in the linear search setting, i.e., $A_{\mathrm{order}}$ also has optimal competitive ratio in this latter setting.
