scispace - formally typeset
Search or ask a question
Author

N.B. Beck

Bio: N.B. Beck is an academic researcher from Purdue University. The author has contributed to research in topics: Heuristics & Data management. The author has an hindex of 5, co-authored 6 publications receiving 2114 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: It is shown that for the cases studied here, the relatively simple Min?min heuristic performs well in comparison to the other techniques, and one even basis for comparison and insights into circumstances where one technique will out-perform another.

1,757 citations

Proceedings ArticleDOI
12 Apr 1999
TL;DR: A collection of eleven heuristics from the literature has been selected, implemented, and analyzed under one set of common assumptions and provides one even basis for comparison and insights into circumstances where one technique will outperform another.
Abstract: Heterogeneous computing (HC) environments are well suited to meet the computational demands of large, diverse groups of tasks (i.e., a meta-task). The problem of mapping (defined as matching and scheduling) these tasks onto the machines of an HC environment has been shown, in general, to be NP-complete, requiring the development of heuristic techniques. Selecting the best heuristic to use in a given environment, however, remains a difficult problem, because comparisons are often clouded by different underlying assumptions in the original studies of each heuristic. Therefore, a collection of eleven heuristics from the literature has been selected, implemented, and analyzed under one set of common assumptions. The eleven heuristics examined are opportunistic load balancing, user-directed assignment, fast greedy, min-min, max-min, greedy, genetic algorithm, simulated annealing, genetic simulated annealing, tabu, and A*. This study provides one even basis for comparison and insights into circumstances where one technique will outperform another. The evaluation procedure is specified, the heuristics are defined, and then selected results are compared.

259 citations

Proceedings ArticleDOI
20 Oct 1998
TL;DR: A new taxonomy for classifying mapping heuristics for HC environments is proposed (Purdue HC Taxonomy), defined in three major parts: the models used for applications and communication requests; the models use for target hardware platforms; and the characteristics of mappingHeuristics.
Abstract: The problem of mapping (defined as matching and scheduling) tasks and communications onto multiple machines and networks in a heterogeneous computing (HC) environment has been shown to be NP-complete, in general, requiring the development of heuristic techniques. Many different types of mapping heuristics have been developed in recent years. However, selecting the best heuristic to use in any given scenario remains a difficult problem. Factors making this selection difficult are discussed. Motivated by these difficulties, a new taxonomy for classifying mapping heuristics for HC environments is proposed (Purdue HC Taxonomy). The taxonomy is defined in three major parts: the models used for applications and communication requests; the models used for target hardware platforms; and the characteristics of mapping heuristics, Each part of the taxonomy is described, with examples given to help clarify the taxonomy. The benefits and uses of this taxonomy are also discussed.

102 citations

Journal ArticleDOI
TL;DR: Three multiple-source shortest-path algorithm-based heuristics for finding a near-optimal schedule of the communication steps for staging the data are presented and are shown to perform well with respect to upper and lower bounds.
Abstract: Providing up-to-date input to users' applications is an important data management problem for a distributed computing environment, where each data storage location and intermediate node may have specific data available, storage limitations, and communication links available. Sites in the network request data items and each request has an associated deadline and priority. In a military situation, the data staging problem involves positioning data for facilitating a faster access time when it is needed by programs that will aid in decision making. This work concentrates on solving a basic version of the data staging problem in which all parameter values for the communication system and the data request information represent the best known information collected so far and stay fixed throughout the scheduling process. The network is assumed to be oversubscribed and not all requests for data items can be satisfied. A mathematical model for the basic data staging problem is introduced. Then, three multiple-source shortest-path algorithm-based heuristics for finding a near-optimal schedule of the communication steps for staging the data are presented. Each heuristic can be used with each of four cost criteria developed. Thus, 12 implementations are examined. In addition, two different weightings for the relative importance of different priority levels are considered. The performance of the proposed heuristics are evaluated and compared by simulations. The proposed heuristics are shown to perform well with respect to upper and lower bounds. Furthermore, the heuristics and a complex cost criterion allow more highest priority messages to be received than a simple-cost-based heuristic that schedules all highest priority messages first.

37 citations

Proceedings ArticleDOI
30 Mar 1998
TL;DR: This research, based on the simplified static model, serves as a necessary step toward solving the more realistic and complicated version of the data staging problem involving dynamic scheduling, fault tolerance, and determining where to stage data.
Abstract: Data staging is an important data management problem for a distributed heterogeneous networking environment, where each data storage location and intermediate node may have specific data available, storage limitations, and communication links. Sites in the network request data items and each item is associated with a specific deadline and priority. It is assumed that not all requests can be satisfied by their deadline. The work concentrates on solving a basic version of the data staging problem in which all parameter values for the communication system and the data request information represent the best known information collected so far and stay fixed throughout the scheduling process. A mathematical model for the basic data staging problem is introduced. Then, a multiple-source shortest-path algorithm based heuristic for finding a suboptimal schedule of the communication steps for data staging is presented. A simulation study is provided, which evaluates the performance of the proposed heuristic. The results show the advantages of the proposed heuristic over two random based scheduling techniques. This research, based on the simplified static model, serves as a necessary step toward solving the more realistic and complicated version of the data staging problem involving dynamic scheduling, fault tolerance, and determining where to stage data.

31 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Two novel scheduling algorithms for a bounded number of heterogeneous processors with an objective to simultaneously meet high performance and fast scheduling time are presented, called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm.
Abstract: Efficient application scheduling is critical for achieving high performance in heterogeneous computing environments. The application scheduling problem has been shown to be NP-complete in general cases as well as in several restricted cases. Because of its key importance, this problem has been extensively studied and various algorithms have been proposed in the literature which are mainly for systems with homogeneous processors. Although there are a few algorithms in the literature for heterogeneous processors, they usually require significantly high scheduling costs and they may not deliver good quality schedules with lower costs. In this paper, we present two novel scheduling algorithms for a bounded number of heterogeneous processors with an objective to simultaneously meet high performance and fast scheduling time, which are called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm. The HEFT algorithm selects the task with the highest upward rank value at each step and assigns the selected task to the processor, which minimizes its earliest finish time with an insertion-based approach. On the other hand, the CPOP algorithm uses the summation of upward and downward rank values for prioritizing tasks. Another difference is in the processor selection phase, which schedules the critical tasks onto the processor that minimizes the total execution time of the critical tasks. In order to provide a robust and unbiased comparison with the related work, a parametric graph generator was designed to generate weighted directed acyclic graphs with various characteristics. The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithms significantly surpass previous approaches in terms of both quality and cost of schedules, which are mainly presented with schedule length ratio, speedup, frequency of best results, and average scheduling time metrics.

2,961 citations

Journal ArticleDOI
TL;DR: It is shown that for the cases studied here, the relatively simple Min?min heuristic performs well in comparison to the other techniques, and one even basis for comparison and insights into circumstances where one technique will out-perform another.

1,757 citations

Journal ArticleDOI
TL;DR: In this article, an abstract model and a comprehensive taxonomy for describing resource management architectures is developed, which is used to identify approaches followed in the implementation of existing resource management systems for very large-scale network computing systems known as Grids.
Abstract: The resource management system is the central component of distributed network computing systems. There have been many projects focused on network computing that have designed and implemented resource management systems with a variety of architectures and services. In this paper, an abstract model and a comprehensive taxonomy for describing resource management architectures is developed. The taxonomy is used to identify approaches followed in the implementation of existing resource management systems for very large-scale network computing systems known as Grids. The taxonomy and the survey results are used to identify architectural approaches and issues that have not been fully explored in the research. Copyright © 2001 John Wiley & Sons, Ltd.

993 citations

Journal ArticleDOI
TL;DR: The taxonomy provides end users with a mechanism by which they can assess the suitability of workflow in general and how they might use these features to make an informed choice about which workflow system would be a good choice for their particular application.

903 citations

Posted Content
TL;DR: A taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids is proposed that highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.
Abstract: With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.

851 citations