Proceedings ArticleDOI

A simple approach for adapting continuous load balancing processes to discrete settings

TL;DR: The paper introduces a general method that converts a wide class of continuous neighborhood load balancing algorithms into discrete versions that achieve asymptotically lower discrepancies than previous methods for general graphs. It also presents a randomized version of the algorithm that balances the load, provided that the initial load on every node is large enough.
Abstract: We introduce a general method that converts a wide class of continuous neighborhood load balancing algorithms into a discrete version. Assume that initially the tasks are arbitrarily distributed among the nodes of a graph. In every round every node is allowed to communicate and exchange load with an arbitrary subset of its neighbors. The goal is to balance the load as evenly as possible. Continuous load balancing algorithms that are allowed to split tasks arbitrarily can balance the load perfectly, so that every node has exactly the same load. Discrete load balancing algorithms are not allowed to split tasks and therefore cannot balance the load perfectly. In this paper we consider the problem in a very general setting, where the tasks can have arbitrary weights and the nodes can have different speeds. Given a neighborhood load balancing algorithm that balances the load perfectly in t rounds, we convert the algorithm into a discrete version. This new algorithm is deterministic and balances the load in t rounds so that the difference between the average and the maximum load is at most 2d·w_max, where d is the maximum degree of the network and w_max is the maximum weight of any task. Compared to the previous methods that work for general graphs [12], our method achieves asymptotically lower discrepancies (e.g. O(1) vs. O(log n) for constant-degree expanders and O(r) vs. O(n^(1/r)) for r-dimensional tori) in the same number of rounds. For the case of uniform weights we present a randomized version of our algorithm balancing the load so that the difference between the minimum and the maximum load is at most O(√(d log n)) if the initial load on every node is large enough.

Summary (2 min read)

1 Introduction

  • In this paper the authors consider the problem of neighbourhood load balancing in arbitrary networks.
  • The tasks can have arbitrary weights. (Footnote: this paper is an extended version of [6].)
  • Neighbourhood load balancing algorithms usually work in synchronous rounds.
  • These matchings are then used periodically (periodic matching model).
  • Here all the nodes balance their load with all their neighbours.

1.1 New Results

  • In every round the discrete algorithm imitates the continuous algorithm as closely as possible by trying to send the same amount of load over every edge as the continuous algorithm.
  • That would incur communication overhead proportional to the number of dummy tokens.
  • Furthermore, let T be the time it takes for the continuous process to balance the load (more or less) completely (see Section 3 for details).
  • An additive algorithm, starting with a load distribution D = D1 + D2, transmits over every edge the same amount of load as the sum of the amounts it would transmit when started with D1 and with D2 separately. (Footnote 1: the discrete version of the algorithm has to know the continuous flow f^c_e(t) for every edge e = (u,v).)
  • Algorithm 1 achieves a final max-min discrepancy that is independent of n and of the graph expansion; in particular, it is the only algorithm achieving constant max-min discrepancy for all constant-degree graphs.

2 Existing Algorithms and Techniques

  • The authors give an overview of the results on continuous (Section 2.1) and discrete neighbourhood load balancing (Section 2.2) only.
  • The authors will not consider these models here any further.
  • When not stated otherwise, the results are for the uniform case without speeds and weights.
  • In the following the authors will consider the results both in the discrete and the continuous settings.

2.1 Continuous Load Balancing

  • The first diffusion algorithm (also called first order schedule, FOS) was independently introduced by Cybenko [15] and Boillat [12].
  • Their results were later generalized to the case of non-uniform speeds in [20].
  • To introduce the FOS process the authors first need some additional notation.
  • The SOS method is inspired by a numerical iterative method called successive over-relaxation.
  • The model was originally introduced in [30], together with a distributed edge-colouring algorithm (see also [35, 36]) that can be used to construct the matchings.
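The continuous FOS process mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formulation used in the paper: it assumes uniform speeds, a single diffusion parameter `alpha` for every edge (a hypothetical name; values such as 1/(d+1) are a common choice), and an adjacency-list graph representation.

```python
def fos_round(load, neighbors, alpha):
    """One synchronous round of continuous first order diffusion (FOS).

    load      -- list of real-valued loads, one per node
    neighbors -- adjacency lists; neighbors[u] lists the nodes adjacent to u
    alpha     -- diffusion parameter (assumed uniform over all edges)
    """
    new_load = list(load)
    for u, nbrs in enumerate(neighbors):
        for v in nbrs:
            # Net flow on edge (u, v) is alpha * (x_u - x_v); a negative
            # value means that u receives load from v.
            new_load[u] -= alpha * (load[u] - load[v])
    return new_load


# Example: a cycle on four nodes with all load initially on node 0.
cycle = [[1, 3], [0, 2], [1, 3], [2, 0]]
x = [8.0, 0.0, 0.0, 0.0]
for _ in range(50):
    x = fos_round(x, cycle, 0.25)
```

Because load is arbitrarily divisible here, the total load is conserved and every node approaches the average.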

2.2 Discrete Load Balancing

  • As far as the authors know, existing papers consider only discrete algorithms in the uniform task model.
  • For FOS schemes, [34] left it as an open question to analyze the potential drop when the potential is smaller than O(d^2 n^2).
  • All the edges are assigned weights proportional to their scheduled load transfer.
  • When the continuous flow is rounded down, the final discrepancy is Ω(d · diam(G)) for a discrete FOS process [26, 27] and Ω(diam(G)) for a discrete process in the matching model [27].
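To see why rounding the continuous flow down can leave a large discrepancy, consider the following toy Python sketch. It is an illustration only and not the construction from [26, 27]: it assumes unit-weight tokens, and the integer `divisor` (a hypothetical parameter standing in for 1/alpha, for instance d+1) is an assumption of this example.

```python
def discrete_fos_round(load, edges, divisor):
    """One round of a rounded-down discrete diffusion step.

    load    -- list of integer token counts, one per node
    edges   -- list of undirected edges (u, v)
    divisor -- integer d such that the continuous flow would be diff / d
    """
    new_load = list(load)
    for u, v in edges:
        # Flow goes from the more loaded endpoint to the less loaded one.
        hi, lo = (u, v) if load[u] >= load[v] else (v, u)
        tokens = (load[hi] - load[lo]) // divisor  # round the flow down
        new_load[hi] -= tokens
        new_load[lo] += tokens
    return new_load


# A path on six nodes whose loads form a staircase 0, 1, ..., 5.
path_edges = [(i, i + 1) for i in range(5)]
staircase = [0, 1, 2, 3, 4, 5]
after = discrete_fos_round(staircase, path_edges, 3)
```

Every pair of neighbours differs by exactly one token, so each rounded-down flow is zero and the load vector never changes, although the max-min discrepancy equals the length of the path.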

2.3 Improved Processes for Discrete Load Balancing

  • The next three subsections discuss three different approaches that were used in order to reduce the difference (caused by the rounding error) in the load distribution between discrete and continuous balancing processes.
  • The authors combine the approach of [37] with analysis techniques for randomized algorithms to show improved discrepancy bounds for general graphs.
  • Note that it is possible to get similar results if the excess tokens are sent to neighbours chosen randomly with replacement or if the neighbours are chosen in a round-robin fashion with a random starting point [5].
  • Note that this algorithm might also create negative load on some of the nodes.

3 Notation and Basic Facts

  • Initially there are in total m tasks which are assigned arbitrarily to the n nodes of the graph G. Tasks may be of different integer weights and the maximum task weight is denoted by wmax.
  • Consider a continuous process A. For the transformations introduced by Algorithm 1 and Algorithm 2, the authors require initial load vectors that do not lead to negative load in the continuous case; that is, they need to ensure that when executing A, the outgoing demand of a node never exceeds its available load.
  • Consider a load balancing process A. Let x′ and x′′ be nonnegative load vectors.
  • The next lemma shows that the class of additive terminating processes includes several well known existing processes.
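The additivity property can be checked numerically for a simple case. The sketch below is an assumption-laden illustration rather than the lemma's proof: it uses FOS with a uniform parameter `alpha` as the continuous process, exact rational arithmetic via `fractions.Fraction`, and a hypothetical helper `fos_flows` that records the net flow over every edge in every round.

```python
from fractions import Fraction


def fos_flows(load, edges, alpha, rounds):
    """Run FOS and return, per round, the net flow over each edge (u, v).

    A positive flow on (u, v) means load moves from u to v.
    """
    x = list(load)
    history = []
    for _ in range(rounds):
        # All flows of a round are computed from the loads at its start.
        flow = {(u, v): alpha * (x[u] - x[v]) for (u, v) in edges}
        for (u, v), f in flow.items():
            x[u] -= f
            x[v] += f
        history.append(flow)
    return history


edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
alpha = Fraction(1, 4)
d1 = [5, 0, 2, 1]
d2 = [0, 7, 1, 3]
combined = [a + b for a, b in zip(d1, d2)]

flows_1 = fos_flows(d1, edges, alpha, 6)
flows_2 = fos_flows(d2, edges, alpha, 6)
flows_sum = fos_flows(combined, edges, alpha, 6)

# Additivity: the flow for D1 + D2 equals the sum of the separate flows.
additive = all(
    flows_sum[t][e] == flows_1[t][e] + flows_2[t][e]
    for t in range(6)
    for e in edges
)
```

For FOS the check succeeds because each flow is a linear function of the load vector.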

4 Deterministic Flow Imitation

  • The authors present and analyze an algorithm that transforms a continuous process A into its discrete counterpart which they call D(A).
  • The authors also note that in actual implementation they do not need to create and transfer workload units and consume communication bandwidth for each dummy token.
  • For other algorithms, the result in part (1) of the above theorem automatically holds, and the condition in part (2) can be translated as having sufficient initial load.
  • Now, the result can be obtained using Observation 4.
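The idea of imitating the continuous flow can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 1: it assumes unit-weight tokens and uniform speeds, uses FOS with a hypothetical parameter `alpha` as the continuous process, and does not model dummy tokens, so it is only meaningful when every node holds enough load. The names `imitate_flow` and `owed` are inventions of this example.

```python
from fractions import Fraction
from math import floor


def imitate_flow(load, edges, alpha, rounds):
    """Run continuous FOS and a discrete imitation side by side.

    Returns (continuous_loads, discrete_loads) after `rounds` rounds.
    """
    x_c = [Fraction(v) for v in load]  # continuous loads
    x_d = list(load)                   # discrete loads (whole tokens)
    # owed[e]: cumulative continuous flow minus cumulative discrete flow
    owed = {e: Fraction(0) for e in edges}
    for _ in range(rounds):
        flow = {(u, v): alpha * (x_c[u] - x_c[v]) for (u, v) in edges}
        for (u, v), f in flow.items():
            x_c[u] -= f
            x_c[v] += f
            owed[(u, v)] += f
            # Send as many whole tokens as the discrete process is behind,
            # rounding towards zero; the remainder is carried over.
            gap = owed[(u, v)]
            tokens = floor(abs(gap))
            if gap < 0:
                tokens = -tokens
            x_d[u] -= tokens
            x_d[v] += tokens
            owed[(u, v)] -= tokens
    return x_c, x_d


ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
cont, disc = imitate_flow([41, 3, 7, 10], ring, Fraction(1, 4), 30)
```

In this simplified setting the carried-over remainder on every edge stays below one token, so the discrete load of a node differs from its continuous load by less than its degree.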

5 Randomized Flow Imitation

  • Instead of always rounding down the flow that has to be sent over an edge, Algorithm 2 uses randomized rounding.
  • Then each of the random variables Ei, j(t) can assume at most two different values and rounding up or down is independent of other edges (see part (3) of Observation 9).
  • The next lemma provides the two main ingredients for proving Theorem 8.
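The rounding step itself can be sketched as follows. This is a generic illustration of randomized rounding, not Algorithm 2: the function name `randomized_round` and the use of Python's `random.Random` as the source of randomness are assumptions of this example.

```python
import math
import random


def randomized_round(value, rng):
    """Round `value` up with probability equal to its fractional part,
    and down otherwise, so that the expected result equals `value`."""
    low = math.floor(value)
    frac = value - low
    return low + (1 if rng.random() < frac else 0)


rng = random.Random(0)
# Each edge draws its own independent rounding decision.
flows = {"e1": 2.3, "e2": 0.5, "e3": 4.0}
rounded = {e: randomized_round(f, rng) for e, f in flows.items()}
```

Each result takes one of at most two values, the floor or the ceiling of the continuous flow, and an integral flow is left unchanged.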


Citations
Journal ArticleDOI
TL;DR: An improved load-balancing algorithm is proposed that will be effectively executed within the constructed FSW, where nodes consider the capacity and calculate the average effective-load, and compared with two significant diffusion methods presented in the literature.

44 citations


Additional excerpts

  • ...Neighborhood load balancing algorithms (Akbari et al., 2012) are diffusion algorithms that have the advantage that they are very simple and that the vertices do not need any global information to base their balancing decisions on....


Proceedings ArticleDOI
23 Jul 2013
TL;DR: Viewing the parallel rotor walk as a load balancing process, it is proved that the rotor walk falls in the class of bounded-error diffusion processes introduced in [11], which gives discrepancy bounds of O(log^(3/2) n) and O(1) for hypercube and r-dimensional torus with r=O(1), respectively, which improve over the best existing bounds.
Abstract: We study the parallel rotor walk process, which works as follows: Consider a graph along with an arbitrary distribution of tokens over its nodes. Every node is equipped with a rotor that points to its neighbours in a fixed circular order. In each round, every node distributes all of its tokens using the rotor. One token is allocated to the neighbour pointed at by the rotor, then the rotor moves to the subsequent neighbour, and so on, until no token remains. The process can be considered as a deterministic analogue of a process in which tokens perform one independent random walk step in each round. We compare the distribution of tokens in the rotor walk process with the expected distribution in the random walk model. The similarity between the two processes is measured by their discrepancy, which is the maximum difference between the corresponding distribution entries over all rounds and nodes. We analyze a lazy variation of rotor walks that simulates a random walk with loop probability of 1/2 on each node, and each node sends not all its tokens, but every other token in each round. Viewing the rotor walk as a load balancing process, we prove that the rotor walk falls in the class of bounded-error diffusion processes introduced in [11]. This gives us discrepancy bounds of O(log^(3/2) n) and O(1) for hypercube and r-dimensional torus with r=O(1), respectively, which improve over the best existing bounds of O(log^2 n) and O(n^(1/r)). Also, as a result of switching to the load balancing view, we observe that the existing load balancing results can be translated to rotor walk discrepancy bounds not previously noticed in the rotor walk literature. We also use the idea of rotor walks to propose and analyze a randomized rounding discrete load balancing process that achieves the same balancing quality as similar protocols [11, 3], but uses fewer random bits compared to [3], and avoids the negative load problem of [11].
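One round of the (non-lazy) parallel rotor walk described in this abstract can be sketched in Python as follows. This is a minimal illustration under assumptions: the adjacency-list representation, the function name `rotor_walk_round`, and storing each rotor as an index into the node's neighbour list are choices of this example, and the lazy variant is not modelled.

```python
def rotor_walk_round(tokens, neighbors, rotor):
    """One synchronous round of the parallel rotor walk.

    tokens    -- list of token counts, one per node
    neighbors -- neighbors[u] is the fixed circular order of u's neighbours
    rotor     -- rotor[u] is the index of the neighbour u currently points at;
                 the list is updated in place
    """
    new_tokens = [0] * len(tokens)
    for u, count in enumerate(tokens):
        degree = len(neighbors[u])
        for _ in range(count):
            # Hand one token to the neighbour the rotor points at,
            # then advance the rotor to the next neighbour.
            v = neighbors[u][rotor[u]]
            new_tokens[v] += 1
            rotor[u] = (rotor[u] + 1) % degree
    return new_tokens


# Cycle on four nodes, five tokens on node 0, all rotors at position 0.
cycle = [[1, 3], [2, 0], [3, 1], [0, 2]]
rotors = [0, 0, 0, 0]
after_one_round = rotor_walk_round([5, 0, 0, 0], cycle, rotors)
```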

35 citations

Proceedings ArticleDOI
21 Jul 2015
TL;DR: In this article, the authors consider the problem of deterministic load balancing of tokens in the discrete model, where each node exchanges some of its tokens with each of its neighbors in the network.
Abstract: We consider the problem of deterministic load balancing of tokens in the discrete model. A set of n processors is connected into a d-regular undirected network. In every time step, each processor exchanges some of its tokens with each of its neighbors in the network. The goal is to minimize the discrepancy between the number of tokens on the most-loaded and the least-loaded processor as quickly as possible. Rabani et al. (1998) present a general technique for the analysis of a wide class of discrete load balancing algorithms. Their approach is to characterize the deviation between the actual loads of a discrete balancing algorithm with the distribution generated by a related Markov chain. The Markov chain can also be regarded as the underlying model of a continuous diffusion algorithm. Rabani et al. showed that after time T = O(log (Kn)/μ), any algorithm of their class achieves a discrepancy of O(d log n/μ), where μ is the spectral gap of the transition matrix of the graph, and K is the initial load discrepancy in the system. In this work we identify some natural additional conditions on deterministic balancing algorithms, resulting in a class of algorithms reaching a smaller discrepancy. This class contains well-known algorithms, e.g., the rotor-router. Specifically, we introduce the notion of cumulatively fair load-balancing algorithms where in any interval of consecutive time steps, the total number of tokens sent out over an edge by a node is the same (up to constants) for all adjacent edges. We prove that algorithms which are cumulatively fair and where every node retains a sufficient part of its load in each step, achieve a discrepancy of O(min{d√(log n/μ), d√n}) in time O(T). We also show that in general neither of these assumptions may be omitted without increasing discrepancy.
We then show by a combinatorial potential reduction argument that any cumulatively fair scheme satisfying some additional assumptions achieves a discrepancy of O(d) almost as quickly as the continuous diffusion process. This positive result applies to some of the simplest and most natural discrete load balancing schemes.

20 citations

Posted Content
TL;DR: In this paper, the authors consider the problem of balancing load items (tokens) in networks, and show that for any regular network in the matching model, all nodes have the same load up to an additive constant in (asymptotically) the same number of rounds as required in the continuous case.
Abstract: We consider the problem of balancing load items (tokens) in networks. Starting with an arbitrary load distribution, we allow nodes to exchange tokens with their neighbors in each round. The goal is to achieve a distribution where all nodes have nearly the same number of tokens. For the continuous case where tokens are arbitrarily divisible, most load balancing schemes correspond to Markov chains, whose convergence is fairly well-understood in terms of their spectral gap. However, in many applications, load items cannot be divided arbitrarily, and we need to deal with the discrete case where the load is composed of indivisible tokens. This discretization entails a non-linear behavior due to its rounding errors, which makes this analysis much harder than in the continuous case. We investigate several randomized protocols for different communication models in the discrete case. As our main result, we prove that for any regular network in the matching model, all nodes have the same load up to an additive constant in (asymptotically) the same number of rounds as required in the continuous case. This generalizes and tightens the previous best result, which only holds for expander graphs, and demonstrates that there is almost no difference between the discrete and continuous cases. Our results also provide a positive answer to the question of how well discrete load balancing can be approximated by (continuous) Markov chains, which has been posed by many researchers.

19 citations

Journal ArticleDOI
TL;DR: A deterministic and a randomized version of the algorithm are presented; the randomized version balances the load up to a discrepancy of O(√(d log n)) provided that the initial load on every node is large enough.
Abstract: We consider the neighbourhood load balancing problem. Given a network of processors and an arbitrary distribution of tasks over the network, the goal is to balance load by exchanging tasks between neighbours. In the continuous model, tasks can be arbitrarily divided and perfectly balanced state can always be reached. This is not possible in the discrete model where tasks are non-divisible. In this paper we consider the problem in a very general setting, where the tasks can have arbitrary weights and the nodes can have different speeds. Given a continuous load balancing algorithm that balances the load perfectly in T rounds, we convert the algorithm into a discrete version. This new algorithm is deterministic and balances the load in T rounds so that the difference between the average and the maximum load is at most 2d·w_max, where d is the maximum degree of the network and w_max is the maximum weight of any task. For general graphs, these bounds are asymptotically lower compared to the previous results. The proposed conversion scheme can be applied to a wide class of continuous processes, including first and second order diffusion, dimension exchange, and random matching processes. For the case of identical tasks, we present a randomized version of our algorithm that balances the load up to a discrepancy of O(√(d log n)) provided that the initial load on every node is large enough.

14 citations

References
Journal ArticleDOI
TL;DR: A modification of the protocol in 2006 that yields faster convergence to equilibrium, together with a matching lower bound, and a non-trivial extension to weighted tasks are made.
Abstract: We consider the problem of dynamically reallocating (or re-routing) m weighted tasks among a set of n uniform resources (one may think of the tasks as selfish players). We assume an arbitrary initial placement of tasks, and we study the performance of distributed, natural reallocation algorithms. We are interested in the time it takes the system to converge to an equilibrium (or get close to an equilibrium). Our main contributions are (i) a modification of the protocol in 2006 that yields faster convergence to equilibrium, together with a matching lower bound, and (ii) a non-trivial extension to weighted tasks.

22 citations

Proceedings ArticleDOI
23 Jan 2011
TL;DR: In this paper, several upper bounds on the load discrepancy for general networks are proved, which depend on some expansion properties of the network, that is, the second largest eigenvalue, and a novel measure which is referred to as refined local divergence.
Abstract: We present a new randomized diffusion-based algorithm for balancing indivisible tasks (tokens) on a network. Our aim is to minimize the discrepancy between the maximum and minimum load. The algorithm works as follows. Every vertex distributes its tokens as evenly as possible among its neighbors and itself. If this is not possible without splitting some tokens, the vertex redistributes its excess tokens among all its neighbors randomly (without replacement). In this paper we prove several upper bounds on the load discrepancy for general networks. These bounds depend on some expansion properties of the network, that is, the second largest eigenvalue, and a novel measure which we refer to as refined local divergence. We then apply these general bounds to obtain results for some specific networks. For constant-degree expanders and torus graphs, these yield exponential improvements on the discrepancy bounds compared to the algorithm of Rabani, Sinclair, and Wanka [14]. For hypercubes we obtain a polynomial improvement. In contrast to previous papers, our algorithm is vertex-based and not edge-based. This means excess tokens are assigned to vertices instead of to edges, and the vertex reallocates all of its excess tokens by itself. This approach avoids nodes having "negative loads" (like in [8, 10]), but causes additional dependencies for the analysis.

22 citations

Proceedings ArticleDOI
16 Jul 2012
TL;DR: In this article, the authors considered neighborhood load balancing in the context of selfish clients, where the objective of the user is to allocate his/her task to a processor with minimum load, defined as the weight of its tasks divided by its speed.
Abstract: In this paper we consider neighborhood load balancing in the context of selfish clients. We assume that a network of n processors is given, with m tasks assigned to the processors. The processors may have different speeds and the tasks may have different weights. Every task is controlled by a selfish user. The objective of the user is to allocate his/her task to a processor with minimum load, where the load of a processor is defined as the weight of its tasks divided by its speed. We investigate a concurrent probabilistic protocol which works in sequential rounds. In each round every task is allowed to query the load of one randomly chosen neighboring processor. If that load is smaller than the load of the task's current processor, the task will migrate to that processor with a suitably chosen probability. Using techniques from spectral graph theory we obtain upper bounds on the expected convergence time towards approximate and exact Nash equilibria that are significantly better than previous results for this protocol. We show results for uniform tasks on non-uniform processors and the general case where the tasks have different weights and the machines have speeds. To the best of our knowledge, these are the first results for this general setting.

21 citations

Proceedings ArticleDOI
25 Jul 2010
TL;DR: This paper presents a fully distributed, randomized algorithm for discrete load balancing that balances the load up to an additive constant error on any graph in time O(log (Kn)/(1-λ2)), where K is the initial imbalance and λ2 is the second largest eigenvalue of the diffusion matrix.
Abstract: We consider the problem of diffusion-based load balancing on a distributed network with n processors. If the load is arbitrarily divisible, then the convergence is fairly well captured in terms of the second largest eigenvalue of the diffusion matrix. As for many applications load can not be arbitrarily divided, we consider a model where load consists of indivisible, unit-size tokens. Quantifying by how much this integrality assumption worsens the efficiency of load balancing algorithms is a natural question which has been posed by many authors [9, 15, 16, 6, 19, 17]. In this paper we show essentially that discrete load balancing is almost as easy as continuous load balancing. More precisely, we present a fully distributed, randomized algorithm for discrete load balancing that balances the load up to an additive constant error on any graph in time O(log (Kn)/(1-λ2)), where K is the initial imbalance and λ2 is the second largest eigenvalue of the diffusion matrix. This improves and tightens a result of Elsasser, Monien, Schamberger (2006) who proved a runtime bound of O((log (K) + (log n)2) / (1-λ2)). We also develop a load balancing algorithm based on routing that achieves a runtime of O(D ⋅ log n), where D is the diameter of the graph.

21 citations

Journal ArticleDOI
TL;DR: The rotor router model is a popular deterministic analogue of a random walk on a graph that can be asymptotically faster, slower or equally fast as the classic random walk.
Abstract: The rotor router model is a popular deterministic analogue of a random walk on a graph. Instead of moving to a random neighbor, the neighbors are served in a fixed order. We examine how quickly this "deterministic random walk" covers all vertices (or all edges). We present general techniques to derive upper bounds for the vertex and edge cover time and derive matching lower bounds for several important graph classes. Depending on the topology, the deterministic random walk can be asymptotically faster, slower or equally fast as the classic random walk. We also examine the short term behavior of deterministic random walks, that is, the time to visit a fixed small number of vertices or edges.

20 citations