Showing papers on "Distributed algorithm published in 1999"


Journal ArticleDOI
TL;DR: An optimization approach to flow control where the objective is to maximize the aggregate source utility over the sources' transmission rates; network links and sources act as processors of a distributed computation that solves the dual problem using a gradient projection algorithm.
Abstract: We propose an optimization approach to flow control where the objective is to maximize the aggregate source utility over their transmission rates. We view network links and sources as processors of a distributed computation system to solve the dual problem using a gradient projection algorithm. In this system, sources select transmission rates that maximize their own benefits, utility minus bandwidth cost, and network links adjust bandwidth prices to coordinate the sources' decisions. We allow feedback delays to be different, substantial, and time varying, and links and sources to update at different times and with different frequencies. We provide asynchronous distributed algorithms and prove their convergence in a static environment. We present measurements obtained from a preliminary prototype to illustrate the convergence of the algorithm in a slowly time-varying environment. We discuss its fairness property.
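
As a concrete illustration of the pricing mechanism described above, the following minimal synchronous sketch implements the dual gradient-projection update, assuming logarithmic utilities U_s(x) = w_s log x, a fixed single path per source, and a hypothetical step size GAMMA; the paper's algorithm additionally tolerates asynchronous updates and heterogeneous feedback delays.

```python
# A minimal synchronous sketch of dual gradient-projection flow control.
# Assumptions (not from the paper): log utilities, fixed paths, step GAMMA.
GAMMA = 0.01

def source_rate(w, path_price):
    # Source solves max_x w*log(x) - path_price*x, giving x = w / path_price.
    return w / max(path_price, 1e-9)

def simulate(routes, capacities, weights, iters=5000):
    # routes[s]: set of links on source s's path.
    prices = {l: 1.0 for l in capacities}
    rates = {}
    for _ in range(iters):
        # Each source reacts only to the sum of prices along its own path.
        rates = {s: source_rate(weights[s], sum(prices[l] for l in routes[s]))
                 for s in routes}
        # Each link adjusts its price toward balancing demand and capacity
        # (gradient projection on the dual variable, clipped at zero).
        for l in capacities:
            load = sum(r for s, r in rates.items() if l in routes[s])
            prices[l] = max(0.0, prices[l] + GAMMA * (load - capacities[l]))
    return rates, prices

rates, prices = simulate(routes={"A": {"l1"}, "B": {"l1", "l2"}},
                         capacities={"l1": 1.0, "l2": 1.0},
                         weights={"A": 1.0, "B": 1.0})
print(rates, prices)  # both sources converge to ~0.5 on the shared link
```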

2,101 citations


Journal ArticleDOI
TL;DR: A distributed heuristic algorithm that was inspired by the observation of the behavior of ant colonies is described and its use for the quadratic assignment problem is proposed.
Abstract: In recent years, there has been growing interest in algorithms inspired by the observation of natural phenomena to define computational procedures that can solve complex problems. We describe a distributed heuristic algorithm that was inspired by the observation of the behavior of ant colonies, and we propose its use for the quadratic assignment problem. The results obtained in solving several classical instances of the problem are compared with those obtained from other evolutionary heuristics to evaluate the quality of the proposed system.
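
For readers unfamiliar with the approach, here is a generic ant-system loop for the QAP in the spirit of the abstract; the construction rule, pheromone constants, and instance data are illustrative assumptions rather than the paper's exact algorithm.

```python
import random

def aco_qap(dist, flow, n_ants=10, iters=50, rho=0.9, q=1.0):
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]  # pheromone: facility i at location j

    def cost(perm):  # classic QAP objective
        return sum(flow[i][k] * dist[perm[i]][perm[k]]
                   for i in range(n) for k in range(n))

    best_cost, best = float("inf"), None
    for _ in range(iters):
        colony = []
        for _ in range(n_ants):
            free, perm = list(range(n)), []
            for i in range(n):
                # Pick a location for facility i, biased by pheromone.
                loc = random.choices(free, weights=[tau[i][j] for j in free])[0]
                perm.append(loc)
                free.remove(loc)
            colony.append((cost(perm), perm))
        # Evaporate, then reinforce assignments used by good solutions.
        tau = [[t * rho for t in row] for row in tau]
        for c, perm in colony:
            for i, j in enumerate(perm):
                tau[i][j] += q / (1 + c)
        if min(colony)[0] < best_cost:
            best_cost, best = min(colony)
    return best, best_cost

dist = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
print(aco_qap(dist, flow))
```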

787 citations


Journal ArticleDOI
TL;DR: The performance evaluations show that CEDAR is a robust and adaptive QoS routing algorithm that reacts quickly and effectively to the dynamics of the network while still approximating the performance of link-state routing for stable networks.
Abstract: We present CEDAR, a core-extraction distributed ad hoc routing algorithm for quality-of-service (QoS) routing in ad hoc network environments. CEDAR has three key components: (a) the establishment and maintenance of a self-organizing routing infrastructure called the core for performing route computations; (b) the propagation of the link-state of high-bandwidth and stable links in the core through increase/decrease waves; and (c) a QoS-route computation algorithm that is executed at the core nodes using only locally available state. The performance evaluations show that CEDAR is a robust and adaptive QoS routing algorithm that reacts quickly and effectively to the dynamics of the network while still approximating the performance of link-state routing for stable networks.

716 citations


Journal ArticleDOI
01 Dec 1999
TL;DR: The results of computer simulation under a more realistic model give convincing indication that the algorithm, if implemented on physical robots, will be robust against sensor and control error.
Abstract: We present a distributed algorithm for converging autonomous mobile robots with limited visibility toward a single point. Each robot is an omnidirectional mobile processor that repeatedly: 1) observes the relative positions of those robots that are visible; 2) computes its new position based on the observation using the given algorithm; 3) moves to that position. The robots' visibility is limited so that two robots can see each other if and only if they are within distance V of each other and there are no other robots between them. Our algorithm is memoryless in the sense that the next position of a robot is determined entirely from the positions of the robots that it can see at that moment. The correctness of the algorithm is proved formally under an abstract model of the robot system in which: 1) each robot is represented by a point that does not obstruct the view of other robots; 2) the robots' motion is instantaneous; 3) there is no sensor or control error; 4) the issue of collision is ignored. The results of computer simulation under a more realistic model give a convincing indication that the algorithm, if implemented on physical robots, will be robust against sensor and control error.
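
The following toy simulation conveys the flavor of the memoryless convergence rule; note that it substitutes a simple "move a bounded step toward the centroid of visible robots" rule and omits the occlusion model, whereas the paper's algorithm moves toward the center of the smallest enclosing circle under constraints that provably preserve mutual visibility.

```python
import math

V = 10.0  # visibility radius (the paper's V); occlusion is omitted here

def visible(i, pts):
    return [q for j, q in enumerate(pts)
            if j != i and math.dist(pts[i], q) <= V]

def next_position(i, pts, max_step=0.25):
    seen = visible(i, pts)
    if not seen:
        return pts[i]
    # Bounded step toward the centroid of the robots seen at this moment;
    # memoryless: no state other than the current observation is used.
    cx = sum(p[0] for p in seen) / len(seen)
    cy = sum(p[1] for p in seen) / len(seen)
    dx, dy = cx - pts[i][0], cy - pts[i][1]
    d = math.hypot(dx, dy)
    if d <= max_step:
        return (cx, cy)
    return (pts[i][0] + max_step * dx / d, pts[i][1] + max_step * dy / d)

pts = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
for _ in range(200):
    pts = [next_position(i, pts) for i in range(len(pts))]
print(pts)  # the three points end up clustered around a common location
```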

585 citations


Proceedings ArticleDOI
31 May 1999
TL;DR: A novel and efficient distributed algorithm, called "link matching," which performs just enough computation at each node to determine the subset of links to which an event should be forwarded and yields higher throughput than flooding when subscriptions are selective.
Abstract: The publish/subscribe (or pub/sub) paradigm is an increasingly popular model for interconnecting applications in a distributed environment. Many existing pub/sub systems are based on pre-defined subjects, and hence are able to exploit multicast technologies to provide scalability and availability. An emerging alternative to subject-based systems, known as content-based systems, allows information consumers to request events based on the content of published events. This model is considerably more flexible than subject-based pub/sub. However, it was previously not known how to efficiently multicast published events to interested content-based subscribers within a large and geographically distributed network of broker (or router) machines. We develop and evaluate a novel and efficient distributed algorithm for this purpose, called "link matching". Link matching performs just enough computation at each node to determine the subset of links to which an event should be forwarded. We show via simulations that link matching yields higher throughput than flooding when subscriptions are selective, and that the overall CPU utilization of link matching is comparable to that of centralized matching.
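
A toy sketch of the per-node decision that the abstract calls link matching: each broker forwards an event only on links that lead to at least one matching subscription. The subscription representation (attribute-equality predicates) and broker structure are illustrative assumptions, not the paper's data structures.

```python
def matches(event, subscription):
    # subscription: dict of attribute -> required value
    return all(event.get(k) == v for k, v in subscription.items())

class Broker:
    def __init__(self, name):
        self.name = name
        self.subs_by_link = {}  # link -> subscriptions reachable via that link

    def forward_links(self, event):
        # "Just enough computation": scanning a link's subscriptions stops as
        # soon as one match proves the event must be forwarded there.
        return [link for link, subs in self.subs_by_link.items()
                if any(matches(event, s) for s in subs)]

b = Broker("b1")
b.subs_by_link = {"link-east": [{"topic": "quotes", "symbol": "IBM"}],
                  "link-west": [{"topic": "news"}]}
print(b.forward_links({"topic": "quotes", "symbol": "IBM", "price": 120}))
# -> ['link-east']
```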

582 citations


Journal ArticleDOI
TL;DR: A simple randomized algorithm for accessing shared objects that tends to satisfy each access request with a nearby copy is designed, based on a novel mechanism to maintain and distribute information about object locations, and requires only a small amount of additional memory at each node.
Abstract: Consider a set of shared objects in a distributed network, where several copies of each object may exist at any given time. To ensure both fast access to the objects as well as efficient utilization of network resources, it is desirable that each access request be satisfied by a copy "close" to the requesting node. Unfortunately, it is not clear how to achieve this goal efficiently in a dynamic, distributed environment in which large numbers of objects are continuously being created, replicated, and destroyed. In this paper we design a simple randomized algorithm for accessing shared objects that tends to satisfy each access request with a nearby copy. The algorithm is based on a novel mechanism to maintain and distribute information about object locations, and requires only a small amount of additional memory at each node. We analyze our access scheme for a class of cost functions that captures the hierarchical nature of wide-area networks. We show that under the particular cost model considered (i) the expected cost of an individual access is asymptotically optimal, and (ii) if objects are sufficiently large, the memory used for objects dominates the additional memory used by our algorithm with high probability. We also address dynamic changes in both the network and the set of object copies.

557 citations


Proceedings ArticleDOI
30 Aug 1999
TL;DR: The key technique is called Dynamic Packet State (DPS), which provides a lightweight and robust mechanism for routers to coordinate actions and implement distributed algorithms; an implementation of the proposed algorithms that has minimum incompatibility with IPv4 is also presented.
Abstract: Existing approaches for providing guaranteed services require routers to manage per flow states and perform per flow operations [9, 21]. Such a stateful network architecture is less scalable and robust than stateless network architectures like the original IP and the recently proposed Diffserv [3]. However, services provided with current stateless solutions, Diffserv included, have lower flexibility, utilization, and/or assurance level as compared to the services that can be provided with per flow mechanisms. In this paper, we propose techniques that do not require per flow management (either control or data planes) at core routers, but can implement guaranteed services with levels of flexibility, utilization, and assurance similar to those that can be provided with per flow mechanisms. In this way we can simultaneously achieve high quality of service, high scalability and robustness. The key technique we use is called Dynamic Packet State (DPS), which provides a lightweight and robust mechanism for routers to coordinate actions and implement distributed algorithms. We present an implementation of the proposed algorithms that has minimum incompatibility with IPv4.
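
The following schematic sketch conveys the Dynamic Packet State idea: edge routers stamp per-flow state into packet headers so that core routers can act without per-flow tables. The field layout and the admission check are illustrative assumptions, not the paper's actual header encoding.

```python
class Packet:
    def __init__(self, payload):
        self.payload = payload
        self.dps = {}  # in the paper, bits carried inside the IP header

def edge_ingress(packet, flow_rate):
    # Only the edge router tracks the flow; it ships the state in the packet.
    packet.dps["rate"] = flow_rate
    return packet

def core_admit(packet, link_capacity, aggregate_reserved):
    # The core keeps only aggregate state; per-flow info rides in the packet.
    return aggregate_reserved + packet.dps["rate"] <= link_capacity

p = edge_ingress(Packet(b"data"), flow_rate=1.5)
print(core_admit(p, link_capacity=10.0, aggregate_reserved=7.0))  # True
```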

534 citations


Journal ArticleDOI
TL;DR: This paper presents and studies black-burst (BB) contention, which is a distributed MAC scheme that provides QoS real-time access to ad hoc CSMA wireless networks and provides conditions for the scheme to be stable.
Abstract: Carrier sense multiple access (CSMA) is one of the most pervasive medium access control (MAC) schemes in ad hoc, wireless networks. However, CSMA and its current variants do not provide quality-of-service (QoS) guarantees for real-time traffic support. This paper presents and studies black-burst (BB) contention, which is a distributed MAC scheme that provides QoS real-time access to ad hoc CSMA wireless networks. With this scheme, real-time nodes contend for access to the channel with pulses of energy, so-called BBs, the durations of which are a function of the delay incurred by the nodes until the channel becomes idle. It is shown that real-time packets are not subject to collisions and that they have access priority over data packets. When operated in an ad hoc wireless LAN, BB contention further guarantees bounded and typically very small real-time delays. The performance of the network can approach that attained under ideal time division multiplexing (TDM) via a distributed algorithm that groups real-time packet transmissions into chains. A general analysis of BB contention is given, contemplating several modes of operation. The analysis provides conditions for the scheme to be stable. Its results are complemented with simulations that evaluate the performance of an ad hoc wireless LAN with a mixed population of data and real-time nodes.
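
A schematic sketch of the black-burst contention rule: each real-time node jams the channel with a pulse whose length grows with the delay it has already incurred, so the longest-waiting node wins uniquely. The slot quantum and the delay-to-length mapping below are assumptions for illustration.

```python
SLOT = 1  # black-burst length quantum (assumed)

def bb_length(wait_time):
    # Monotone mapping: longer accumulated delay -> longer black burst.
    return (wait_time // SLOT + 1) * SLOT

def contend(waiting_times):
    bursts = {node: bb_length(w) for node, w in waiting_times.items()}
    longest = max(bursts.values())
    # Nodes whose burst ends early observe a longer ongoing burst and defer;
    # with distinct waiting times the winner is unique, so no collision.
    return [n for n, b in bursts.items() if b == longest]

print(contend({"n1": 3, "n2": 7, "n3": 5}))  # -> ['n2']
```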

525 citations


Proceedings ArticleDOI
21 Mar 1999
TL;DR: It is proved that fixed size window control can achieve fair bandwidth sharing according to any of these criteria, provided scheduling at each link is performed in an appropriate manner.
Abstract: This paper concerns the design of distributed algorithms for sharing network bandwidth resources among contending flows. The classical fairness notion is the so-called max-min fairness; Kelly (see Europ. Trans. Telecom. vol.8 p.33-37, 1997) has previously introduced the alternative proportional fairness criterion; we introduce a third criterion, which is naturally interpreted in terms of the delays experienced by ongoing transfers. We prove that fixed size window control can achieve fair bandwidth sharing according to any of these criteria, provided scheduling at each link is performed in an appropriate manner. We next consider a distributed random scheme where each traffic source varies its sending rate randomly, based on binary feedback information from the network. We show how to select the source behaviour so as to achieve an equilibrium distribution concentrated around the considered fair rate allocations. This stochastic analysis is then used to assess the asymptotic behaviour of deterministic rate adaptation procedures.
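
To make the contrast between the fairness criteria concrete, the following worked example evaluates them on the classic two-link line network (an illustrative instance, not taken from the paper): one long flow crosses both unit-capacity links, and one short flow crosses each link.

```python
import math

# Max-min fairness: each link is shared by two flows, so every flow gets 1/2.
# Proportional fairness maximizes log(x0) + log(x1) + log(x2) subject to
# x0 + x1 <= 1 and x0 + x2 <= 1; by symmetry x1 = x2 = 1 - x0.
def pf_objective(x0):
    return math.log(x0) + 2.0 * math.log(1.0 - x0)

x0 = max((i / 10000.0 for i in range(1, 10000)), key=pf_objective)
print(round(x0, 3), round(1.0 - x0, 3))  # ~0.333 long flow, ~0.667 short flows
```

Proportional fairness thus penalizes the long flow for consuming capacity on two links, which is exactly the kind of trade-off among criteria the paper studies.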

355 citations


Journal ArticleDOI
Bharat T. Doshi, Subrahmanyam Dravida, P. Harshavardhana, Oded Hauser, Yufei Wang
TL;DR: This paper reports test results for large carrier-scale networks that indicate that subsecond restoration, high capacity efficiency, and scalability can be achieved without fault isolation and with moderate processing.
Abstract: The explosion of data traffic and the availability of enormous bandwidth via dense wavelength division multiplexing (DWDM) and optical amplifier (OA) technologies make it important to study optical layer networking and restoration. This paper is concerned with fast distributed restoration and provisioning for generic mesh-based optical networks. We consider two problems of practical importance: determining the best restoration route for each wavelength demand, given the network topology and the capacities and primary routes of all demands, and determining primary and restoration routes for each wavelength demand to minimize network capacity and cost. The approach we propose for both problems is based on precomputing. For each problem, we describe specific algorithms used for computing routes. We also describe endpoint-based failure detection, message flows, and cross-connect actions for execution of fast restorations. Finally, we report test results for large carrier-scale networks that include both the computational performance of the optimization algorithms and the restoration speed obtained by simulation. Our results indicate that subsecond restoration, high capacity efficiency, and scalability can be achieved without fault isolation and with moderate processing. We also discuss methods for scaling algorithms to problems with very large numbers of demands. The wavelength routing and restoration algorithms, the failure detection, and the message exchange and activation architectures we propose are collectively known as WaveStar™ advanced routing platform.

312 citations


Journal ArticleDOI
TL;DR: This work formalizes parallel genetic algorithms, presents a timely and topical survey of their most important traditional and recent technical issues, and provides useful summaries of their main applications.
Abstract: In this work we review the most important existing developments and future trends in the class of Parallel Genetic Algorithms (PGAs). PGAs are mainly subdivided into coarse-grain and fine-grain PGAs, the coarse-grain models being the most popular ones. An exceptional characteristic of PGAs is that they are not just the parallel version of a sequential algorithm intended to provide speed gains. Instead, they represent a new kind of meta-heuristic of higher efficiency and efficacy thanks to their structured population and parallel execution. The good robustness of these algorithms on problems of high complexity has led to an increasing number of applications in the fields of artificial intelligence, numeric and combinatorial optimization, business, engineering, etc. We formalize these algorithms and present a timely and topical survey of their most important traditional and recent technical issues. Besides that, useful summaries of their main applications plus Internet pointers to important web sites are included in order to help new researchers access this growing area.
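
As a minimal illustration of the coarse-grain (island) model that the survey identifies as the most popular, the sketch below evolves several subpopulations independently and periodically migrates the best individuals around a ring; the OneMax fitness function, operators, and migration interval are illustrative assumptions.

```python
import random

def evolve_island(pop, fitness, mut=0.1):
    # One generation: binary tournament selection + bit-flip mutation.
    new = []
    for _ in pop:
        a, b = random.sample(pop, 2)
        child = list(max(a, b, key=fitness))
        for i in range(len(child)):
            if random.random() < mut:
                child[i] ^= 1
        new.append(child)
    return new

def island_pga(n_islands=4, pop_size=20, length=16, gens=50, migrate_every=10):
    fitness = lambda ind: sum(ind)  # toy OneMax problem
    islands = [[[random.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for g in range(gens):
        islands = [evolve_island(p, fitness) for p in islands]
        if g % migrate_every == 0:
            # Ring migration: each island sends its best to the next one.
            bests = [max(p, key=fitness) for p in islands]
            for i, p in enumerate(islands):
                p[random.randrange(pop_size)] = bests[i - 1]
    return max((max(p, key=fitness) for p in islands), key=fitness)

print(sum(island_pga()))  # approaches 16 (the OneMax optimum)
```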

Journal ArticleDOI
TL;DR: The dR*-tree is introduced, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer in the ‘shared-nothing’ architecture with multiple computers interconnected through a network.
Abstract: The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the ‘shared-nothing’ architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.

Proceedings ArticleDOI
22 Feb 1999
TL;DR: It is asserted that system software, not the programmer, should manage the task of distributed decomposition, and Coign, an automatic distributed partitioning system that significantly eases the development of distributed applications, is presented.
Abstract: Although successive generations of middleware (such as RPC, CORBA, and DCOM) have made it easier to connect distributed programs, the process of distributed application decomposition has changed little: programmers manually divide applications into sub-programs and manually assign those sub-programs to machines. Often the techniques used to choose a distribution are ad hoc and create one-time solutions biased to a specific combination of users, machines, and networks. We assert that system software, not the programmer, should manage the task of distributed decomposition. To validate our assertion we present Coign, an automatic distributed partitioning system that significantly eases the development of distributed applications. Given an application (in binary form) built from distributable COM components, Coign constructs a graph model of the application's inter-component communication through scenario-based profiling. Later, Coign applies a graph-cutting algorithm to partition the application across a network and minimize execution delay due to network communication. Using Coign, even an end user (without access to source code) can transform a non-distributed application into an optimized, distributed application. Coign has automatically distributed binaries from over 2 million lines of application code, including Microsoft's PhotoDraw 2000 image processor. To our knowledge, Coign is the first system to automatically partition and distribute binary applications.
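
The final graph-cutting step can be pictured with the toy sketch below: given measured inter-component communication costs, pick the client/server assignment that minimizes network-crossing traffic. Brute-force enumeration stands in for Coign's actual graph-cutting algorithm, and the component names and costs are invented.

```python
from itertools import product

def best_partition(components, comm, pinned):
    # pinned: components fixed to one side (e.g., UI on client, DB on server)
    free = [c for c in components if c not in pinned]
    best, best_cost = None, float("inf")
    for bits in product(("client", "server"), repeat=len(free)):
        side = dict(pinned, **dict(zip(free, bits)))
        # Cost = total communication crossing the client/server boundary.
        cost = sum(w for (a, b), w in comm.items() if side[a] != side[b])
        if cost < best_cost:
            best, best_cost = side, cost
    return best, best_cost

comm = {("ui", "logic"): 5, ("logic", "db"): 50, ("ui", "db"): 1}
print(best_partition(["ui", "logic", "db"], comm,
                     {"ui": "client", "db": "server"}))
# -> logic is placed on the server, keeping the chatty logic<->db edge local
```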

Proceedings ArticleDOI
03 Aug 1999
TL;DR: This work presents mechanisms that allow an application to guide resource selection during the co-allocation process and describes the implementation of co-allocators based on these mechanisms and the results of microbenchmark studies and large-scale application experiments that provide insights into the costs and practical utility of the techniques.
Abstract: Applications designed to execute on "computational grids" frequently require the simultaneous co-allocation of multiple resources in order to meet performance requirements. For example, several computers and network elements may be required in order to achieve real-time reconstruction of experimental data, while a large numerical simulation may require simultaneous access to multiple supercomputers. Motivated by these concerns, we have developed a general resource management architecture for Grid environments, in which resource co-allocation is an integral component. We examine the co-allocation problem in detail and present mechanisms that allow an application to guide resource selection during the co-allocation process; these mechanisms address issues relating to the allocation, monitoring, control, and configuration of distributed computations. We describe the implementation of co-allocators based on these mechanisms and present the results of microbenchmark studies and large-scale application experiments that provide insights into the costs and practical utility of our techniques.

Proceedings ArticleDOI
19 Sep 1999
TL;DR: A distributed algorithm is presented that partitions the nodes of a fully mobile network (multi-hop network) into clusters, thus giving the network a hierarchical organization and is proven to be adaptive to changes in the network topology due to nodes' mobility and to nodes addition/removal.
Abstract: A distributed algorithm is presented that partitions the nodes of a fully mobile network (multi-hop network) into clusters, thus giving the network a hierarchical organization. The algorithm is proven to be adaptive to changes in the network topology due to nodes' mobility and to node addition/removal. A new weight-based mechanism, not available in previous solutions, is introduced for efficient cluster formation/maintenance; it allows the cluster organization to be configured for specific applications and to adapt to changes in the network status. Specifically, new and flexible criteria are defined that allow the choice of the nodes that coordinate the clustering process based on mobility parameters and/or their current status. Simulation results are provided that demonstrate up to an 85% reduction in the communication overhead associated with cluster maintenance with respect to techniques used in previously proposed clustering algorithms.
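
A simplified, synchronous rendition of the weight-based clusterhead election the abstract describes: a node becomes a clusterhead if its weight beats every still-undecided neighbor, and otherwise joins the heaviest clusterhead it hears. Distinct weights and the centralized round loop are simulation conveniences; the paper's algorithm is fully distributed and also handles mobility events.

```python
def cluster(neighbors, weight):
    # neighbors: node -> set of adjacent nodes; weight: node -> distinct value
    role = {}  # node -> ("head", None) or ("member", chosen_head)
    undecided = set(neighbors)
    while undecided:
        for v in sorted(undecided, key=lambda u: -weight[u]):
            heads = [u for u in neighbors[v]
                     if role.get(u, ("",))[0] == "head"]
            if heads:
                role[v] = ("member", max(heads, key=lambda u: weight[u]))
            elif all(weight[v] > weight[u]
                     for u in neighbors[v] if u in undecided):
                role[v] = ("head", None)
        undecided -= set(role)
    return role

g = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(cluster(g, {"a": 3, "b": 5, "c": 1, "d": 2}))
# -> b and d become heads; a and c join b (the heavier head c can hear)
```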

Proceedings ArticleDOI
01 May 1999
TL;DR: This paper proposes a very simple algorithm called Name-Dropper whereby all machines learn about each other within O(log² n) rounds with high probability, where n is the number of machines in the network.
Abstract: In large distributed networks of computers, it is often the case that a subset of machines wants to cooperate to perform a task. Before they can do so, these machines need to learn of the existence of each other. In this paper we are interested in distributed algorithms whereby machines in a network learn of other machines in the network by making queries to machines they already know. The algorithms should be efficient both in terms of the time required and in terms of the total network communication required until all machines have discovered all other machines. We propose a very simple algorithm called Name-Dropper whereby all machines learn about each other within O(log² n) rounds with high probability, where n is the number of machines in the network. The total number of connections required is O(n log² n) and the total number of pointers which must be communicated is O(n² log² n), with high probability. Each of the preceding bounds is optimal to within polylogarithmic factors.
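
A compact simulation of the Name-Dropper round described above: in each round every machine sends its entire pointer list to one machine it already knows, chosen at random. The synchronous loop, the initial line topology, and the global termination test are simulation conveniences, not part of the protocol.

```python
import random

def name_dropper_round(known):
    # known: machine -> set of machines it has discovered (including itself)
    incoming = {m: set() for m in known}
    for m, contacts in known.items():
        others = list(contacts - {m})
        if others:
            target = random.choice(others)
            incoming[target] |= contacts  # m "drops names" to target
    for m in known:
        known[m] |= incoming[m]

def rounds_to_discover_all(n):
    # Initial knowledge forms a line: machine i knows itself and i+1.
    known = {i: {i, min(i + 1, n - 1)} for i in range(n)}
    rounds = 0
    while any(len(s) < n for s in known.values()):
        name_dropper_round(known)
        rounds += 1
    return rounds  # O(log^2 n) with high probability, per the paper

print(rounds_to_discover_all(64))
```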

Proceedings ArticleDOI
03 Aug 1999
TL;DR: The overall goal is to provide the NASA scientific and engineering communities a substantial increase in their ability to solve problems that depend on use of large-scale and/or dispersed resources: aggregated computing, diverse data archives, laboratory instruments and engineering test facilities, and human collaborators.
Abstract: Information Power Grid (IPG) is the name of NASA's project to build a fully distributed computing and data management environment, a Grid. The IPG project has near, medium, and long-term goals that represent a continuum of engineering, development, and research topics. The overall goal is to provide the NASA scientific and engineering communities a substantial increase in their ability to solve problems that depend on use of large-scale and/or dispersed resources: aggregated computing, diverse data archives, laboratory instruments and engineering test facilities, and human collaborators. The approach involves infrastructure and services that can locate, aggregate, integrate, and manage resources from across the NASA enterprise. An important aspect of IPG is to produce a common view of these resources, and at the same time provide for distributed management and local control. In addition to addressing the overall goal of enhanced science and engineering, there is a potentially important side effect. With a large collection of resources that have common use interfaces and a common management approach, the potential exists for a considerable pool of computing capability that could relatively easily, for example, be called on in extraordinary situations such as crisis response.

Book ChapterDOI
21 Sep 1999
TL;DR: This paper explores the possibility of exploiting a distributed-memory execution environment, such as a network of workstations interconnected by a standard LAN, to extend the size of the verification problems that can be successfully handled by SPIN.
Abstract: The main limiting factor of the model checker SPIN is currently the amount of available physical memory. This paper explores the possibility of exploiting a distributed-memory execution environment, such as a network of workstations interconnected by a standard LAN, to extend the size of the verification problems that can be successfully handled by SPIN. A distributed version of the algorithm used by SPIN to verify safety properties is presented, and its compatibility with the main memory and complexity reduction mechanisms of SPIN is discussed. Finally, some preliminary experimental results are presented.

Proceedings ArticleDOI
29 Aug 1999
TL;DR: In this paper, a new technique based on genetic algorithms for the optimal sizing and siting of distributed generation resources in MV distribution networks is presented. Tests on two networks of 43 and 93 buses are also provided to show the efficiency of the proposed technique.
Abstract: In this paper, a new technique, based on genetic algorithms, for the optimal sizing and siting of distributed generation resources in MV distribution networks is presented. Tests on two networks of 43 and 93 buses are also provided to show the efficiency of the proposed technique.

Journal ArticleDOI
TL;DR: This paper introduces a new model and methodological approach for dealing with the probabilistic nature of mobile networks based on the theory of random graphs, and shows that it is possible to construct a randomized distributed algorithm which provides connectivity with high probability, requiring exponentially fewer connections than the number of connections needed for an algorithm with a worst case deterministic guarantee.
Abstract: This paper introduces a new model and methodological approach for dealing with the probabilistic nature of mobile networks based on the theory of random graphs. Probabilistic dependence between the random links prevents the direct application of the theory of random graphs to communication networks. The new model, termed Random Network Model, generalizes conventional random graph models to allow for the inclusion of link dependencies in a mobile network. The new Random Network Model is obtained through the superposition of Kolmogorov complexity and random graph theory, making in this way random graph theory applicable to mobile networks. To the best of the authors' knowledge, it is the first application of random graphs to the field of mobile networks and a first general modeling framework for dealing with ad-hoc network mobility. The application of this methodology makes it possible to derive results with proven properties. The theory is demonstrated by addressing the issue of the establishment of a connected virtual backbone among mobile clusterheads in a peer-to-peer mobile wireless network. Using the Random Network Model, we show that it is possible to construct a randomized distributed algorithm which provides connectivity with high probability, requiring exponentially fewer connections (peer-to-peer logical links) per clusterhead than the number of connections needed for an algorithm with a worst case deterministic guarantee.

Journal ArticleDOI
TL;DR: A very natural randomized algorithm for distributed vertex coloring of graphs is presented; under the assumption that the random choices of processors are mutually independent, the execution time is O(log n) rounds almost always.
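
The TL;DR is terse, so the following standard sketch shows the kind of randomized distributed coloring it refers to: each round, every uncolored vertex draws a tentative color from its remaining palette and keeps it only if no neighbor drew the same one. The palette size deg(v)+1 is the usual choice; the paper's exact rule and analysis may differ.

```python
import random

def randomized_coloring(adj):
    # adj: vertex -> list of neighbors (symmetric adjacency)
    color, uncolored, rounds = {}, set(adj), 0
    while uncolored:
        tentative = {}
        for v in uncolored:
            # Palette of deg(v)+1 colors minus colors fixed by neighbors.
            palette = set(range(len(adj[v]) + 1)) - \
                      {color[u] for u in adj[v] if u in color}
            tentative[v] = random.choice(sorted(palette))
        for v in list(uncolored):
            # Keep the color only if no neighbor picked the same one.
            if all(tentative.get(u) != tentative[v] for u in adj[v]):
                color[v] = tentative[v]
                uncolored.discard(v)
        rounds += 1
    return color, rounds

adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(randomized_coloring(adj))  # a proper coloring plus the rounds used
```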

Proceedings ArticleDOI
03 Oct 1999
TL;DR: An algorithm is proposed that guarantees delivery to highly mobile agents using a technique similar to a distributed snapshot, and the very structure of the algorithm makes it amenable not only to guarantee message delivery to a specific mobile agent, but also to provide multicast communication to a group of agents.
Abstract: The provision of a reliable communication infrastructure for mobile agents is still an open research issue. The challenge to reliability we address in this work does not come from the possibility of faults, but rather from the mere presence of mobility, which slightly complicates the problem of ensuring the delivery of information, even in a fault-free network. For instance, the asynchronous nature of message passing and agent migration may cause situations where messages forever chase a mobile agent that moves frequently from one host to another. Current solutions rely on conventional technologies that either do not provide a solution for the aforementioned problem, because they were not designed with mobility in mind, or enforce continuous connectivity with the message source, which in many cases defeats the very purpose of using mobile agents. In this paper, we propose an algorithm that guarantees delivery to highly mobile agents using a technique that is similar to a distributed snapshot. A number of enhancements to this basic idea are discussed, which limit the scope of message delivery by allowing dynamic creation of the connectivity graph. Notably, the very structure of our algorithm makes it amenable not only to guaranteeing message delivery to a given mobile agent, but also to providing multicast communication to a group of agents, another open problem in research on mobile agents. After presenting our algorithm and its properties, we discuss its implementability by analyzing the requirements on the underlying mobile agent platform, and we argue about its applicability.


Journal ArticleDOI
TL;DR: The deterministic broadcast protocols introduced in this paper overcome the limitations of randomized and conflict-free schedule-based solutions by using a novel mobility-transparent schedule, thus providing a delivery (time) guarantee without the need to recompute schedules when the topology changes.
Abstract: Broadcast (distributing a message from a source node to all other nodes) is a fundamental problem in distributed computing. Several solutions for solving this problem in mobile wireless networks are available, in which mobility is dealt with either by the use of randomized retransmissions or, in the case of deterministic delivery protocols, by using conflict-free transmission schedules. Randomized solutions can be used only when unbounded delays can be tolerated. Deterministic conflict-free solutions require schedule recomputation when the topology changes, thus becoming unstable when the topology rate of change exceeds the schedule recomputation rate. The deterministic broadcast protocols we introduce in this paper overcome the above limitations by using a novel mobility-transparent schedule, thus providing a delivery (time) guarantee without the need to recompute the schedules when the topology changes. We show that the proposed protocol is simple and easy to implement, and that it is optimal in networks in which assumptions on the maximum number of neighbors of a node can be made.
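
The simplest schedule with the mobility-transparent property described above assigns every node a fixed slot derived from its ID alone, so nothing is ever recomputed when the topology changes. The sketch below shows this trivial frame-of-n baseline; the paper achieves much shorter frames when a bound on the number of neighbors is known.

```python
def transmits(node_id, t, frame_len):
    # A node transmits exactly in its own slot of every frame; the schedule
    # depends only on the node's ID, never on the (changing) topology.
    return t % frame_len == node_id % frame_len

frame_len = 8  # = number of nodes in this toy network
schedule = [[n for n in range(frame_len) if transmits(n, t, frame_len)]
            for t in range(frame_len)]
print(schedule)  # one transmitter per slot, conflict-free for any topology
```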

Proceedings ArticleDOI
01 Jan 1999
TL;DR: This paper proposes a Web cluster architecture in which the Domain Name System (DNS) server, which dispatches the user requests among the servers through the URL-name-to-IP-address mapping mechanism, is integrated with a redirection request mechanism based on HTTP.
Abstract: Replication of information among multiple World Wide Web servers is necessary to support high request rates to popular Web sites. A clustered Web server organization is preferable to multiple independent mirrored servers because it maintains a single interface to the users and has the potential to be more scalable, fault-tolerant and better load-balanced. In this paper, we propose a Web cluster architecture in which the Domain Name System (DNS) server, which dispatches the user requests among the servers through the URL-name-to-IP-address mapping mechanism, is integrated with a redirection request mechanism based on HTTP. This should alleviate the side-effect of caching the IP address mapping at intermediate name servers. We compare many alternative mechanisms, including synchronous vs. asynchronous activation and centralized vs. distributed decisions on redirection. Moreover, we analyze the reassignment of entire domains or individual client requests, different types of status information and different server selection policies for redirecting requests. Our results show that the combination of centralized and distributed dispatching policies allows the Web server cluster to handle high load skews in the WWW environment.
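
The two-level dispatching described above can be pictured with this toy sketch: the cluster DNS hands out server addresses coarsely (and cacheably), while an overloaded server corrects individual requests with an HTTP redirect. The round-robin DNS policy, load numbers, and least-loaded redirect target are illustrative assumptions.

```python
import itertools

servers = {"s1": 0.2, "s2": 0.9, "s3": 0.4}  # server -> current load
rr = itertools.cycle(sorted(servers))

def dns_resolve():
    # Coarse-grained: this answer may be cached by intermediate name servers.
    return next(rr)

def handle(server, threshold=0.8):
    # Fine-grained correction: an overloaded server sheds the request.
    if servers[server] > threshold:
        return 302, min(servers, key=servers.get)  # redirect to least loaded
    return 200, server

print(handle(dns_resolve()))  # s1 serves; a request landing on s2 is shed
```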

Book ChapterDOI
04 Mar 1999
TL;DR: Algorithmic problems are considered in a distributed setting where the participants cannot be assumed to follow the algorithm but rather to pursue their own self-interest.
Abstract: This paper considers algorithmic problems in a distributed setting where the participants cannot be assumed to follow the algorithm but rather their own self-interest. Such scenarios arise, in particular, when computers or users aim to cooperate or trade over the Internet. As such participants, termed agents, are capable of manipulating the algorithm, the algorithm designer should ensure in advance that the agents' interests are best served by behaving correctly.

Journal ArticleDOI
TL;DR: A new comparison-based model for distributed fault diagnosis in multicomputer systems with a weak reliable broadcast capability and a polynomial-time diagnosis algorithm is described, which diagnoses all fault situations with low latency and very little overhead.
Abstract: This paper describes a new comparison-based model for distributed fault diagnosis in multicomputer systems with a weak reliable broadcast capability. The classical problems of diagnosability and diagnosis are both considered under this broadcast comparison model. A characterization of diagnosable systems is given, which leads to a polynomial-time diagnosability algorithm. A polynomial-time diagnosis algorithm for t-diagnosable systems is also given. A variation of this algorithm, which allows dynamic fault occurrence and incomplete diagnostic information, has been implemented in the COmmon Spaceborne Multicomputer Operating System (COSMOS). Results produced using a simulator for the JPL MAX multicomputer system running COSMOS show that the algorithm diagnoses all fault situations with low latency and very little overhead. These simulations demonstrate the practicality of the proposed diagnosis model and algorithm for multicomputer systems having weak reliable broadcast. This includes systems with fault-tolerant hardware for broadcast, as well as those where reliable broadcast is implemented in software.
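
A toy rendition of comparison-based diagnosis in the spirit of the model above: processors run common tasks in pairs, outcomes are broadcast, and any disagreement implicates at least one member of the pair. The brute-force search for the smallest consistent fault set below stands in for the paper's polynomial-time algorithm.

```python
from itertools import combinations

def diagnose(nodes, comparisons, t):
    # comparisons: {(u, v): True if u and v produced identical results}.
    # A candidate fault set is consistent if every disagreement involves a
    # faulty node, since two fault-free processors always agree.
    for k in range(t + 1):
        for faulty in map(set, combinations(nodes, k)):
            if all(agree or u in faulty or v in faulty
                   for (u, v), agree in comparisons.items()):
                return faulty
    return None  # more than t faults: not diagnosable under this bound

print(diagnose(["a", "b", "c", "d"],
               {("a", "b"): True, ("b", "c"): False, ("c", "d"): False},
               t=1))  # -> {'c'}
```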

Journal ArticleDOI
TL;DR: The proposed multichannel topology-transparent algorithm has the flexibility to allow the growth of the network, i.e., the network can add more mobile nodes without recomputation of transmission schedules for existing nodes and a minimum throughput is guaranteed.
Abstract: Many transmission scheduling algorithms have been proposed to maximize spatial reuse and minimize the time division multiple access (TDMA) frame length in multihop packet radio networks. Almost all existing algorithms assume exact network topology information and require recomputations when the network topology changes. In addition, existing work focuses on single channel TDMA systems. In this paper, we propose a multichannel topology-transparent algorithm based on latin squares. The proposed algorithm has the flexibility to allow the growth of the network, i.e., the network can add more mobile nodes without recomputation of transmission schedules for existing nodes. At the same time, a minimum throughput is guaranteed. We analyze the efficiency of this algorithm and examine the topology-transparent characteristics and the sensitivity on design parameters by analytical and simulation techniques.
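
A toy sketch of a latin-square schedule in the spirit of the abstract: node k takes row k of the cyclic latin square L[i][j] = (i + j) mod n, read as "in slot j, use channel L[i][j]". Two distinct rows never coincide in any slot, and a new node simply takes an unused row without disturbing existing schedules; the paper's actual construction and throughput guarantees are more elaborate.

```python
def latin_row(i, n):
    # Row i of the cyclic latin square over n channels/slots.
    return [(i + j) % n for j in range(n)]

n = 5
for node in range(3):
    print(node, latin_row(node, n))
# Adding node 3 later: assign row 3; rows 0-2 need no recomputation. And
# (i + j) % n == (i' + j) % n forces i == i', so distinct rows never collide.
```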

Proceedings ArticleDOI
M.S. Eby, W.E. Kelly
07 Mar 1999
TL;DR: The results of this work suggest that potential field algorithms are an extremely robust solution to the problem of CD and R, and show that these algorithms can be adapted to a situation requiring distributed computation and resolution.
Abstract: Many of the nation's airspace users desire more freedom in selecting and modifying their routes. This desire has been expressed in the free flight concept, which has gained increased attention in the last few years. Free flight offers the potential for more efficient routes, decreased fuel costs, and less dependence on air traffic control. The greatest challenge, however, is maintaining safe separation between aircraft. This problem is often referred to as conflict detection and resolution (CD and R). This paper describes a technique by which aircraft may simultaneously and independently determine collision-free paths in a free flight operational environment. The technique, derived from potential fields, has demonstrated tremendous robustness in scenarios ranging from simple two-aircraft conflicts and contrived geometric formations to complex, randomized multi-aircraft conflicts. Communication failures and restrictive maneuverability constraints have also been considered. The results of this work suggest that potential field algorithms are an extremely robust solution to the problem of CD and R. The results also show that these algorithms can be adapted to a situation requiring distributed computation and resolution. The advantage of a distributed approach is the decreased reliance on a central command authority.
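
A minimal sketch of the potential-field computation each aircraft would run locally: an attractive force toward its destination plus repulsive forces from nearby traffic, summed into a steering vector. The gains and safety radius are illustrative assumptions.

```python
import math

K_ATT, K_REP, R_SAFE = 1.0, 50.0, 5.0  # assumed gains and safety radius

def steering_force(pos, goal, traffic):
    # Attraction toward the goal.
    fx = K_ATT * (goal[0] - pos[0])
    fy = K_ATT * (goal[1] - pos[1])
    for ox, oy in traffic:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < R_SAFE:
            # Repulsion grows sharply as separation shrinks.
            mag = K_REP * (1.0 / d - 1.0 / R_SAFE) / d**2
            fx += mag * dx
            fy += mag * dy
    return fx, fy

# Each aircraft evaluates the force from its own local surveillance picture,
# which is what makes the resolution scheme distributed.
print(steering_force((0.0, 0.0), (100.0, 0.0), [(3.0, 1.0)]))
```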

Journal ArticleDOI
TL;DR: This work designs and implements the Matchmaking resource management framework and describes the use of matchmaking in Condor, presenting several examples that illustrate its flexibility and expressiveness.
Abstract: Federated distributed systems present new challenges to resource management. Conventional resource managers are based on a relatively static resource model and a centralized allocator that assigns resources to customers. Distributed environments, particularly those built to support high-throughput computing (HTC), are often characterized by distributed management and distributed ownership. Distributed management introduces resource heterogeneity: not only the set of available resources, but even the set of resource types is constantly changing. Distributed ownership introduces policy heterogeneity: each resource may have its own idiosyncratic allocation policy. To address these problems, we designed and implemented the Matchmaking resource management framework. Customers and resources are all described by classified advertisements (classads) written in a simple but powerful formal language that describes their attributes and allocation policies. A Matchmaker server uses a policy-independent matching operation to discover pairings. It notifies the parties to the match, which use a separate, bilateral claiming protocol to confirm the allocation. The resulting framework is robust, scalable and flexible, and can evolve with changing resources. Matchmaking is the core of the Condor High Throughput Computing System developed at the University of Wisconsin – Madison. Condor is a production-quality system used by scientists and engineers at sites around the world. Condor derives much of its flexibility, robustness and efficiency from the matchmaking architecture. We describe the use of matchmaking in Condor, presenting several examples that illustrate its flexibility and expressiveness.
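
A toy rendition of the matchmaking cycle: jobs and machines each publish an ad with attributes and a requirements predicate over the other party's attributes, and the matchmaker pairs ads whose requirements are mutually satisfied (claiming then happens bilaterally between the parties). The attribute names and lambda-based predicates are simple stand-ins for the much richer classad language.

```python
def matchmake(jobs, machines):
    matches = []
    for jname, job in jobs.items():
        for mname, machine in machines.items():
            # Both sides' Requirements must hold for a match.
            if job["req"](machine["attrs"]) and machine["req"](job["attrs"]):
                matches.append((jname, mname))
    return matches

jobs = {"job1": {"attrs": {"owner": "ann", "mem_needed": 512},
                 "req": lambda m: m["memory"] >= 512 and m["arch"] == "x86"}}
machines = {"m1": {"attrs": {"memory": 1024, "arch": "x86"},
                   "req": lambda j: j["owner"] != "banned"}}
print(matchmake(jobs, machines))  # -> [('job1', 'm1')]
```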