scispace - formally typeset
Search or ask a question
Author

Mikkel Thorup

Bio: Mikkel Thorup is an academic researcher from University of Copenhagen. The author has contributed to research in topics: Time complexity & Hash function. The author has an hindex of 63, co-authored 297 publications receiving 16344 citations. Previous affiliations of Mikkel Thorup include Max Planck Society & University of Copenhagen Faculty of Science.


Papers
More filters
Proceedings ArticleDOI
26 Mar 2000
TL;DR: Surprisingly it turned out that for the proposed AT&T WorldNet backbone, weight settings that performed within a few percent from that of the optimal general routing where the flow for each demand is optimally distributed over all paths between source and destination.
Abstract: Open shortest path first (OSPF) is the most commonly used intra-domain Internet routing protocol. Traffic flow is routed along shortest paths, splitting flow at nodes where several outgoing links are on shortest paths to the destination. The weights of the links, and thereby the shortest path routes, can be changed by the network operator. The weights could be set proportional to their physical distances, but often the main goal is to avoid congestion, i.e., overloading of links, and the standard heuristic recommended by Cisco is to make the weight of a link inversely proportional to its capacity. Our starting point was a proposed AT&T WorldNet backbone with demands projected from previous measurements. The desire was to optimize the weight setting based on the projected demands. We showed that optimizing the weight settings for a given set of demands is NP-hard, so we resorted to a local search heuristic. Surprisingly it turned out that for the proposed AT&T WorldNet backbone, we found weight settings that performed within a few percent from that of the optimal general routing where the flow for each demand is optimally distributed over all paths between source and destination. This contrasts the common belief that OSPF routing leads to congestion and it shows that for the network and demand matrix studied we cannot get a substantially better load balancing by switching to the proposed more flexible multi-protocol label switching (MPLS) technologies. Our techniques were also tested on synthetic internetworks, based on a model of Zegura et al., (1996), for which we did not always get quite as close to the optimal general routing.

1,200 citations

Journal ArticleDOI
TL;DR: A system of techniques is presented for optimizing open shortest path first (OSPF) or intermediate system-intermediate system (IS-IS) weights for intradomain routing in a changing world, the goal being to avoid overloaded links.
Abstract: A system of techniques is presented for optimizing open shortest path first (OSPF) or intermediate system-intermediate system (IS-IS) weights for intradomain routing in a changing world, the goal being to avoid overloaded links. We address predicted periodic changes in traffic as well as problems arising from link failures and emerging hot spots.

694 citations

Journal ArticleDOI
TL;DR: The most impressive feature of the data structure is its constant query time, hence the name "oracle", and it provides faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.
Abstract: Let G = (V,E) be an undirected weighted graph with vVv = n and vEv = m. Let k ≥ 1 be an integer. We show that G = (V,E) can be preprocessed in O(kmn1/k) expected time, constructing a data structure of size O(kn1p1/k), such that any subsequent distance query can be answered, approximately, in O(k) time. The approximate distance returned is of stretch at most 2k−1, that is, the quotient obtained by dividing the estimated distance by the actual distance lies between 1 and 2k−1. A 1963 girth conjecture of Erdos, implies that Ω(n1p1/k) space is needed in the worst case for any real stretch strictly smaller than 2kp1. The space requirement of our algorithm is, therefore, essentially optimal. The most impressive feature of our data structure is its constant query time, hence the name "oracle". Previously, data structures that used only O(n1p1/k) space had a query time of Ω(n1/k).Our algorithms are extremely simple and easy to implement efficiently. They also provide faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.

618 citations

Proceedings ArticleDOI
06 Jul 2001
TL;DR: The most impressive feature of the data structure is its constant query time, hence the name ``oracle', which provides faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.
Abstract: Let G=(V,E) be an undirected weighted graph with |V|=n and |E|=m. Let k\ge 1 be an integer. We show that G=(V,E) can be preprocessed in O(kmn^{1/k}) expected time, constructing a data structure of size O(kn^{1+1/k}), such that any subsequent distance query can be answered, approximately, in O(k) time. The approximate distance returned is of stretch at most 2k-1, i.e., the quotient obtained by dividing the estimated distance by the actual distance lies between 1 and 2k-1. We show that a 1963 girth conjecture of Erd{\H{o}}s, implies that ω(n^{1+1/k}) space is needed in the worst case for any real stretch strictly smaller than 2k+1. The space requirement of our algorithm is, therefore, essentially optimal. The most impressive feature of our data structure is its constant query time, hence the name oracle. Previously, data structures that used only O(n^{1+1/k}) space had a query time of ω(n^{1/k}) and a slightly larger, non-optimal, stretch. Our algorithms are extremely simple and easy to implement efficiently. They also provide faster constructions of sparse spanners of weighted graphs, and improved tree covers and distance labelings of weighted or unweighted graphs.}

563 citations

Proceedings ArticleDOI
03 Jul 2001
TL;DR: Several compact routing schemes for general weighted undirected networks are described, which achieve a near-optimal tradeoff between the size of the routing tables used and the resulting stretch.
Abstract: We describe several compact routing schemes for general weighted undirected networks. Our schemes are simple and easy to implement. The routing tables stored at the nodes of the network are all very small. The headers attached to the routed messages, including the name of the destination, are extremely short. The routing decision at each node takes constant time. Yet, the stretch of these routing schemes, i.e., the worst ratio between the cost of the path on which a packet is routed and the cost of the cheapest path from source to destination, is a small constant. Our schemes achieve a near-optimal tradeoff between the size of the routing tables used and the resulting stretch. More specifically, we obtain: A routing scheme that uses only O (n 1/2) bits of memory at each node of an n-node network that has stretch 3. The space is optimal, up to logarithmic factors, in the sense that every routing scheme with stretch n2), and every routing scheme with stretch n3/2). The headers used are only (1 + O(1)) log2> n-bits long and each routing decision takes constant time. A variant of this scheme with [log2 n] -bit headers makes routing decisions in O(log log n) time. Also, for every integer k > 2, a general handshaking based routing scheme that uses O (n1/k) bits of memory at each node that has stretch 2k - 1. A conjecture of Erdos from 1963, settled for k = 3, 5, implies that the routing tables are of near-optimal size relative to the stretch. The handshaking is similar in spirit to a DNS lookup in TCP/IP. Headers are O(log2 n) bits long and each routing decision takes constant time. Without handshaking, the stretch of the scheme increases to 4k - 5. One ingredient used to obtain the routing schemes mentioned above, may be of independent practical and theoretical interest: A shortest path routing scheme for trees of arbitrary degree and diameter that assigns each vertex of an n-node tree a (1 + O(1)) log2 n-bit label. Given the label of a source node and the label of a destination it is possible to compute, in constant time, the port number of the edge from the source that heads in the direction of the destination. The general scheme for k > 2 also uses a clustering technique introduced recently by the authors. The clusters obtained using this technique induce a sparse and low stretch tree cover of the network. This essentially reduces routing in general networks into routing problems in trees that could be solved using the above technique.

560 citations


Cited by
More filters
MonographDOI
01 Jan 2006
TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, into planning under differential constraints that arise when automating the motions of virtually any mechanical system.
Abstract: Planning algorithms are impacting technical disciplines and industries around the world, including robotics, computer-aided design, manufacturing, computer graphics, aerospace applications, drug design, and protein folding. This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms. The treatment is centered on robot motion planning but integrates material on planning in discrete spaces. A major part of the book is devoted to planning under uncertainty, including decision theory, Markov decision processes, and information spaces, which are the “configuration spaces” of all sensor-based planning problems. The last part of the book delves into planning under differential constraints that arise when automating the motions of virtually any mechanical system. Developed from courses taught by the author, the book is intended for students, engineers, and researchers in robotics, artificial intelligence, and control theory as well as computer graphics, algorithms, and computational biology.

6,340 citations

Proceedings ArticleDOI
06 Jun 2010
TL;DR: A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.

3,840 citations

Book
01 Jan 2001
TL;DR: The complexity class P is formally defined as the set of concrete decision problems that are polynomial-time solvable, and encodings are used to map abstract problems to concrete problems.
Abstract: problems To understand the class of polynomial-time solvable problems, we must first have a formal notion of what a "problem" is. We define an abstract problem Q to be a binary relation on a set I of problem instances and a set S of problem solutions. For example, an instance for SHORTEST-PATH is a triple consisting of a graph and two vertices. A solution is a sequence of vertices in the graph, with perhaps the empty sequence denoting that no path exists. The problem SHORTEST-PATH itself is the relation that associates each instance of a graph and two vertices with a shortest path in the graph that connects the two vertices. Since shortest paths are not necessarily unique, a given problem instance may have more than one solution. This formulation of an abstract problem is more general than is required for our purposes. As we saw above, the theory of NP-completeness restricts attention to decision problems: those having a yes/no solution. In this case, we can view an abstract decision problem as a function that maps the instance set I to the solution set {0, 1}. For example, a decision problem related to SHORTEST-PATH is the problem PATH that we saw earlier. If i = G, u, v, k is an instance of the decision problem PATH, then PATH(i) = 1 (yes) if a shortest path from u to v has at most k edges, and PATH(i) = 0 (no) otherwise. Many abstract problems are not decision problems, but rather optimization problems, in which some value must be minimized or maximized. As we saw above, however, it is usually a simple matter to recast an optimization problem as a decision problem that is no harder. Encodings If a computer program is to solve an abstract problem, problem instances must be represented in a way that the program understands. An encoding of a set S of abstract objects is a mapping e from S to the set of binary strings. For example, we are all familiar with encoding the natural numbers N = {0, 1, 2, 3, 4,...} as the strings {0, 1, 10, 11, 100,...}. Using this encoding, e(17) = 10001. Anyone who has looked at computer representations of keyboard characters is familiar with either the ASCII or EBCDIC codes. In the ASCII code, the encoding of A is 1000001. Even a compound object can be encoded as a binary string by combining the representations of its constituent parts. Polygons, graphs, functions, ordered pairs, programs-all can be encoded as binary strings. Thus, a computer algorithm that "solves" some abstract decision problem actually takes an encoding of a problem instance as input. We call a problem whose instance set is the set of binary strings a concrete problem. We say that an algorithm solves a concrete problem in time O(T (n)) if, when it is provided a problem instance i of length n = |i|, the algorithm can produce the solution in O(T (n)) time. A concrete problem is polynomial-time solvable, therefore, if there exists an algorithm to solve it in time O(n) for some constant k. We can now formally define the complexity class P as the set of concrete decision problems that are polynomial-time solvable. We can use encodings to map abstract problems to concrete problems. Given an abstract decision problem Q mapping an instance set I to {0, 1}, an encoding e : I → {0, 1}* can be used to induce a related concrete decision problem, which we denote by e(Q). If the solution to an abstract-problem instance i I is Q(i) {0, 1}, then the solution to the concreteproblem instance e(i) {0, 1}* is also Q(i). As a technicality, there may be some binary strings that represent no meaningful abstract-problem instance. For convenience, we shall assume that any such string is mapped arbitrarily to 0. Thus, the concrete problem produces the same solutions as the abstract problem on binary-string instances that represent the encodings of abstract-problem instances. We would like to extend the definition of polynomial-time solvability from concrete problems to abstract problems by using encodings as the bridge, but we would like the definition to be independent of any particular encoding. That is, the efficiency of solving a problem should not depend on how the problem is encoded. Unfortunately, it depends quite heavily on the encoding. For example, suppose that an integer k is to be provided as the sole input to an algorithm, and suppose that the running time of the algorithm is Θ(k). If the integer k is provided in unary-a string of k 1's-then the running time of the algorithm is O(n) on length-n inputs, which is polynomial time. If we use the more natural binary representation of the integer k, however, then the input length is n = ⌊lg k⌋ + 1. In this case, the running time of the algorithm is Θ (k) = Θ(2), which is exponential in the size of the input. Thus, depending on the encoding, the algorithm runs in either polynomial or superpolynomial time. The encoding of an abstract problem is therefore quite important to our under-standing of polynomial time. We cannot really talk about solving an abstract problem without first specifying an encoding. Nevertheless, in practice, if we rule out "expensive" encodings such as unary ones, the actual encoding of a problem makes little difference to whether the problem can be solved in polynomial time. For example, representing integers in base 3 instead of binary has no effect on whether a problem is solvable in polynomial time, since an integer represented in base 3 can be converted to an integer represented in base 2 in polynomial time. We say that a function f : {0, 1}* → {0,1}* is polynomial-time computable if there exists a polynomial-time algorithm A that, given any input x {0, 1}*, produces as output f (x). For some set I of problem instances, we say that two encodings e1 and e2 are polynomially related if there exist two polynomial-time computable functions f12 and f21 such that for any i I , we have f12(e1(i)) = e2(i) and f21(e2(i)) = e1(i). That is, the encoding e2(i) can be computed from the encoding e1(i) by a polynomial-time algorithm, and vice versa. If two encodings e1 and e2 of an abstract problem are polynomially related, whether the problem is polynomial-time solvable or not is independent of which encoding we use, as the following lemma shows. Lemma 34.1 Let Q be an abstract decision problem on an instance set I , and let e1 and e2 be polynomially related encodings on I . Then, e1(Q) P if and only if e2(Q) P. Proof We need only prove the forward direction, since the backward direction is symmetric. Suppose, therefore, that e1(Q) can be solved in time O(nk) for some constant k. Further, suppose that for any problem instance i, the encoding e1(i) can be computed from the encoding e2(i) in time O(n) for some constant c, where n = |e2(i)|. To solve problem e2(Q), on input e2(i), we first compute e1(i) and then run the algorithm for e1(Q) on e1(i). How long does this take? The conversion of encodings takes time O(n), and therefore |e1(i)| = O(n), since the output of a serial computer cannot be longer than its running time. Solving the problem on e1(i) takes time O(|e1(i)|) = O(n), which is polynomial since both c and k are constants. Thus, whether an abstract problem has its instances encoded in binary or base 3 does not affect its "complexity," that is, whether it is polynomial-time solvable or not, but if instances are encoded in unary, its complexity may change. In order to be able to converse in an encoding-independent fashion, we shall generally assume that problem instances are encoded in any reasonable, concise fashion, unless we specifically say otherwise. To be precise, we shall assume that the encoding of an integer is polynomially related to its binary representation, and that the encoding of a finite set is polynomially related to its encoding as a list of its elements, enclosed in braces and separated by commas. (ASCII is one such encoding scheme.) With such a "standard" encoding in hand, we can derive reasonable encodings of other mathematical objects, such as tuples, graphs, and formulas. To denote the standard encoding of an object, we shall enclose the object in angle braces. Thus, G denotes the standard encoding of a graph G. As long as we implicitly use an encoding that is polynomially related to this standard encoding, we can talk directly about abstract problems without reference to any particular encoding, knowing that the choice of encoding has no effect on whether the abstract problem is polynomial-time solvable. Henceforth, we shall generally assume that all problem instances are binary strings encoded using the standard encoding, unless we explicitly specify the contrary. We shall also typically neglect the distinction between abstract and concrete problems. The reader should watch out for problems that arise in practice, however, in which a standard encoding is not obvious and the encoding does make a difference. A formal-language framework One of the convenient aspects of focusing on decision problems is that they make it easy to use the machinery of formal-language theory. It is worthwhile at this point to review some definitions from that theory. An alphabet Σ is a finite set of symbols. A language L over Σ is any set of strings made up of symbols from Σ. For example, if Σ = {0, 1}, the set L = {10, 11, 101, 111, 1011, 1101, 10001,...} is the language of binary representations of prime numbers. We denote the empty string by ε, and the empty language by Ø. The language of all strings over Σ is denoted Σ*. For example, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000,...} is the set of all binary strings. Every language L over Σ is a subset of Σ*. There are a variety of operations on languages. Set-theoretic operations, such as union and intersection, follow directly from the set-theoretic definitions. We define the complement of L by . The concatenation of two languages L1 and L2 is the language L = {x1x2 : x1 L1 and x2 L2}. The closure or Kleene star of a language L is the language L*= {ε} L L L ···, where Lk is the language obtained by

2,817 citations

Book
03 Sep 2011
TL;DR: The question the authors are trying to ask is: how many units of water can they send from the source to the sink per unit of time?
Abstract: 1 Defining Network Flow A flow network is a directed graph G = (V,E) in which each edge (u, v) ∈ E has non-negative capacity c(u, v) ≥ 0. We require that if (u, v) ∈ E, then (v, u) / ∈ E. That is, if an edge exists, then the edge between the same vertices going the reverse direction does not exist. Every flow network has a source s and a sink t, and we assume that for every v ∈ V , there is some path s→ · · · → v → · · · → t. Note that this implies that flow networks are connected. Informally, the intuition behind network flow is to think of the edges as pipes and the weights on the edges as the capacity its corresponding pipe per unit of time. The question we are trying to ask is: how many units of water can we send from the source to the sink per unit of time? Formally, a flow in G is a function f : V × V → R that satisfies the following: • Capacity constraint. For all u, v ∈ V , we require 0 ≤ f(u, v) ≤ c(u, v). Our pipe cannot hold more than is allowed as dictated by its capacity. • Flow conservation. For u ∈ V − {s, t}, we require ∑

2,426 citations