
Showing papers on "Scheduling (computing)" published in 1991


Book
01 Apr 1991
TL;DR: General topics in the theory of probability metrics; relations between compound, simple and primary distances; applications of minimal p.
Abstract: General topics in the theory of probability metrics; relations between compound, simple and primary distances; applications of minimal p. distances; ideal metrics.

842 citations


Journal ArticleDOI
TL;DR: A queuing-theoretical formulation of the imprecise scheduling problem is presented and workload models that quantify the tradeoff between result quality and computation time are reviewed.
Abstract: The imprecise computation technique, which prevents timing faults and achieves graceful degradation by giving the user an approximate result of acceptable quality whenever the system cannot produce the exact result in time, is considered. Different approaches for scheduling imprecise computations in hard real-time environments are discussed. Workload models that quantify the tradeoff between result quality and computation time are reviewed. Scheduling algorithms that exploit this tradeoff are described. These include algorithms for scheduling to minimize total error, scheduling periodic jobs, and scheduling parallelizable tasks. A queuing-theoretical formulation of the imprecise scheduling problem is presented.

582 citations


Journal ArticleDOI
TL;DR: Investigation of schedulability tests for sets of periodic processes whose deadlines are permitted to be less than their period finds that such a relaxation enables sporadic processes to be directly incorporated without alteration to the process model.

535 citations


Journal ArticleDOI
TL;DR: The model is motivated by applications in which the objective is to minimize the wait for service in a stochastic and dynamically changing environment, a departure from classical vehicle routing problems where one seeks to minimize total travel time in a static, deterministic environment.
Abstract: We propose and analyze a generic mathematical model for dynamic, stochastic vehicle routing problems, the dynamic traveling repairman problem (DTRP). The model is motivated by applications in which the objective is to minimize the wait for service in a stochastic and dynamically changing environment. This is a departure from classical vehicle routing problems where one seeks to minimize total travel time in a static, deterministic environment. Potential areas of application include repair, inventory, emergency service and scheduling problems. The DTRP is defined as follows: Demands for service arrive in time according to a Poisson process, are independent and uniformly distributed in a Euclidean service region, and require an independent and identically distributed amount of on-site service by a vehicle. The problem is to find a policy for routing the service vehicle that minimizes the average time demands spent in the system. We propose and analyze several policies for the DTRP. We find a provably optima...

447 citations


Journal ArticleDOI
TL;DR: An integer linear programming (ILP) model for the scheduling problem in high-level synthesis is presented and a scheduling problem called feasible scheduling, which provides a paradigm for exploring the solution space, is constructed.
Abstract: An integer linear programming (ILP) model for the scheduling problem in high-level synthesis is presented. In addition to time-constrained scheduling and resource-constrained scheduling, a scheduling problem called feasible scheduling, which provides a paradigm for exploring the solution space, is constructed. Extensive consideration is given to the following applications: scheduling with chaining, multicycle operations by nonpipelined function units, and multicycle operations by pipelined function units; functional pipelining; loop folding; mutually exclusive operations; scheduling under bus constraint; and minimizing lifetimes of variables. The complexity of the number of variables in the formulation is O(s*n) where s and n are the number of control steps and operations, respectively. Since the as soon as possible (ASAP), as late as possible (ALAP), and list scheduling techniques are used to reduce the solution space, the formulation becomes very efficient. A solution to a practical problem, such as the fifth-order filter, can be found optimally in a few seconds.
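The role of the ASAP and ALAP bounds mentioned in this abstract can be illustrated with a small sketch. It is not the paper's formulation: the function name `asap_alap`, the dictionary-based dependence graph, and the assumption that every operation takes exactly one control step are all hypothetical simplifications. The idea shown is only that each operation needs decision variables for the control steps between its ASAP and ALAP step, which is how such bounds can shrink a search space.

```python
def asap_alap(deps, num_steps):
    """Return (asap, alap) control steps for unit-latency operations.

    deps maps each operation name to the list of operations it depends on.
    Control steps are numbered 1..num_steps (an assumed convention).
    """
    asap = {}

    def earliest(op):
        # An operation can start one step after its latest predecessor.
        if op not in asap:
            asap[op] = 1 + max((earliest(p) for p in deps[op]), default=0)
        return asap[op]

    for op in deps:
        earliest(op)

    # Invert the dependence graph to find each operation's successors.
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    alap = {}

    def latest(op):
        # An operation must finish one step before its earliest successor.
        if op not in alap:
            alap[op] = min((latest(s) for s in succs[op]), default=num_steps + 1) - 1
        return alap[op]

    for op in deps:
        latest(op)
    return asap, alap


# Hypothetical five-operation graph scheduled into three control steps.
deps = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": []}
asap, alap = asap_alap(deps, 3)
for op in deps:
    print(op, "may be placed in steps", asap[op], "to", alap[op])
```

In this example only the independent operation `e` has any freedom (steps 1 to 3); the others lie on the critical path and are fixed.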

434 citations


Patent
Frank H. Levinson1
01 Nov 1991
TL;DR: In this article, an information broadcasting system provides a large number of subscribers access to a large amount of information using one or more satellite transmission channels, which can also use cable television transmission channels.
Abstract: An information broadcasting system provides a large number of subscribers access to a large amount of information using one or more satellite transmission channels. The system can also use cable television transmission channels. A program supplier station stores an information database and tags all the information in the database with indices so as to form a single hierarchical structure which encompasses the entire information database. Portions of the information database are transmitted often, at least once per day, in order to provide the basic subscriber with information needed to access the remainder of the database. The information provided by the basic subscriber service, which will typically include at least 50 gigabytes of data, is available to all subscribers without requiring two-way communications between the subscribers and the program supplier station. Using a tiered system for scheduling transmission of the 50 gigabytes or so of information included in the basic subscriber service, as well as an intelligent subscriber request anticipation scheme for retrieving information before the subscriber asks for it, the present invention provides subscribers with reasonably quick access to all the contents of the large database while using only a modest amount of bandwidth. Furthermore, by reserving a portion of the system's bandwidth for satisfying requests for access to information not provided with the basic subscriber service, timely access to a virtually unlimited amount of information can be provided, using the same modest transmission bandwidth, to those subscribers willing to pay additional fees for that service.

372 citations


Journal ArticleDOI
E.L. Hahne1
TL;DR: For round-robin link scheduling with window flow control, session throughput rates are shown to approach limits that are perfectly fair in the max-min sense as the window size increases, and the results suggest that the transmission capacity not used by the small window session will be approximately fairly divided among the large window sessions.
Abstract: The author studies a simple strategy, proposed independently by E.L. Hahne and R.G. Gallager (1986) and M.G.H. Katevenis (1987), for fairly allocating link capacity in a point-to-point packet network with virtual circuit routing. Each link offers its packet transmission slots to its user sessions by polling them in round-robin order. In addition, window flow control is used to prevent excessive packet queues at the network nodes. As the window size increases, the session throughput rates are shown to approach limits that are perfectly fair in the max-min sense. If each session has periodic input (perhaps with jitter) or has such heavy demand that packets are always waiting to enter the network, then a finite window size suffices to produce perfectly fair throughput rates. The results suggest that the transmission capacity not used by the small window session will be approximately fairly divided among the large window sessions. The focus is on the worst-case performance of round-robin scheduling with windows.
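For readers unfamiliar with the max-min fairness criterion this abstract refers to, the following sketch computes a max-min fair allocation on a single link by progressive filling. It is a standard textbook procedure, not the round-robin-with-windows scheme studied in the paper, and the function name `max_min_fair` and the single-link setting are assumptions made for illustration.

```python
def max_min_fair(capacity, demands):
    """Max-min fair split of one link's capacity among sessions.

    Sessions that want less than an equal share get their full demand;
    the capacity they leave unused is divided equally among the rest.
    """
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    # Visit sessions from smallest demand to largest.
    unsatisfied = sorted(range(len(demands)), key=lambda i: demands[i])
    while unsatisfied:
        share = remaining / len(unsatisfied)
        smallest = unsatisfied[0]
        if demands[smallest] <= share:
            # This session is fully satisfied; release its leftover capacity.
            allocation[smallest] = float(demands[smallest])
            remaining -= demands[smallest]
            unsatisfied.pop(0)
        else:
            # Every remaining session wants more than the equal share.
            for i in unsatisfied:
                allocation[i] = share
            break
    return allocation


# One light session and two heavy sessions on a link of capacity 10.
print(max_min_fair(10, [2, 8, 8]))
```

Here the light session receives its full demand of 2, and the 8 units it does not use are split equally between the two heavy sessions, which mirrors the abstract's remark about capacity left over by a small-window session.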

337 citations


Journal ArticleDOI
Raul Camposano1
TL;DR: A novel path-based scheduling algorithm that yields solutions with the minimum number of control steps, taking into account arbitrary constraints that limit the number of operations in each control step, is presented.
Abstract: A novel path-based scheduling algorithm is presented. It yields solutions with the minimum number of control steps, taking into account arbitrary constraints that limit the number of operations in each control step. The result is a finite state machine that implements the control. Although the complexity of the algorithm is proportional to the number of paths in the control-flow graph, it is shown to be practical for large examples with thousands of nodes.

320 citations


Journal ArticleDOI
Alan Burns1
TL;DR: Recent results in the application of scheduling theory to hard real-time systems are reviewed in this paper and problems presented by different application requirements and characteristics are analyzed.
Abstract: Recent results in the application of scheduling theory to hard real-time systems are reviewed in this paper. The review takes the form of an analysis of the problems presented by different application requirements and characteristics. Issues covered include uniprocessor and multiprocessor systems, periodic and aperiodic processes, static and dynamic algorithms, transient overloads and resource usage. Protocols that limit and reduce blocking are discussed. Consideration is also given to scheduling Ada tasks.

268 citations


Book ChapterDOI
03 Jun 1991
TL;DR: The scope of applicability for the abstract model of timed transition systems is explored and it is demonstrated that the model can represent a wide variety of phenomena that routinely occur in conjunction with the timed execution of concurrent processes.
Abstract: We incorporate time into an interleaving model of concurrency. In timed transition systems, the qualitative fairness requirements of traditional transition system are replaced (and superseded) by quantitative lower-bound and upperbound timing constraints on transitions. The purpose of this paper is to explore the scope of applicability for the abstract model of timed transition systems. We demonstrate that the model can represent a wide variety of phenomena that routinely occur in conjunction with the timed execution of concurrent processes. Our treatment covers both processes that are executed in parallel on separate processors and communicate either through shared variables or by message passing, and processes that time-share a limited number of processors under a given scheduling policy. Often it is this scheduling policy that determines if a system meets its real-time requirements. Thus we explicitly address such questions as time-outs, interrupts, static and dynamic priorities.

265 citations


Journal ArticleDOI
TL;DR: Experimental results indicate that no performance improvements can be obtained over the scheduler versions using a one-dimensional workload descriptor, and the best single workload descriptor is the number of tasks in the run queue.
Abstract: A task scheduler based on the concept of a stochastic learning automaton, implemented on a network of Unix workstations, is described. Creating an artificial, executable workload, a number of experiments were conducted to determine the effect of different workload descriptions. These workload descriptions characterize the load at one host and determine whether a newly created task is to be executed locally or remotely. Six one-dimensional workload descriptors are examined. Two workload descriptions that are more complex are also considered. It is shown that the best single workload descriptor is the number of tasks in the run queue. The use of the worst workload descriptor, the 1-min load average, resulted in an increase of the mean response time of over 32%, compared to the best descriptor. The two best workload descriptors, the number of tasks in the run queue and the system call rate, are combined to measure a host's load. Experimental results indicate that no performance improvements over the scheduler versions using a one-dimensional workload descriptor can be obtained.

Proceedings ArticleDOI
04 Dec 1991
TL;DR: The authors present and evaluate an extension of the AED algorithm called hierarchical earliest deadline (HED), which is designed to handle applications that assign different values to transactions and where the goal is to maximize the total value of the in-time transactions.
Abstract: A new priority assignment algorithm called adaptive earliest deadline (AED) is given that stabilizes the overload performance of the earliest deadline policy in a real-time database system (RTDBS) environment. The AED algorithm uses a feedback control mechanism to achieve this objective and does not require knowledge of transaction characteristics. Using a detailed simulation model, the authors compare the performance of AED with respect to earliest deadline and other fixed priority schemes. They also present and evaluate an extension of the AED algorithm called hierarchical earliest deadline (HED), which is designed to handle applications that assign different values to transactions and where the goal is to maximize the total value of the in-time transactions.

Patent
04 Apr 1991
TL;DR: In this article, a critical path for executing a task such as evaluating a database query is determined, and the minimum time to execute the task assuming infinite resources such as processors and memory buffers is calculated.
Abstract: A method of controlling the allocation of resources in a parallel processor computer. A critical path for executing a task such as evaluating a database query is determined. The minimum time to execute the task assuming infinite resources such as processors and memory buffers is calculated. Resources are scheduled against subtasks so as to execute the task in the calculated minimum time. The number of resources that would be required to meet the schedule is determined and if the computer has that many resources the schedule is carried out. Otherwise a revised execution time is calculated, preferably by using as a scaling factor the ratio between the number of required resources and the number of available resources. Then the schedule is adjusted so that the task can be executed in the revised time and the number of resources that would be required to meet the adjusted schedule is determined. If the computer has that many resources the schedule is carried out, otherwise the process is repeated. Preferably the process is halted if two iterations result in the same number of resources being needed.
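The iteration described in this abstract can be sketched roughly as follows. This is a toy model, not the patented method: the function name `plan_execution_time` is hypothetical, and the way required resources are estimated (total work divided by target time, rounded up) is an assumption standing in for the real step of scheduling resources against subtasks.

```python
import math


def plan_execution_time(total_work, critical_path, available, max_iters=20):
    """Stretch the target time until the needed resources fit the machine.

    total_work    -- sum of all subtask times (assumed cost model)
    critical_path -- minimum time with unlimited resources
    available     -- number of resources the computer actually has
    Returns (target_time, resources_needed), or None if the loop halts
    without finding a schedule that fits.
    """
    target = critical_path
    previous_required = None
    for _ in range(max_iters):
        # Assumed estimate of how many resources the schedule would need.
        required = math.ceil(total_work / target)
        if required <= available:
            return target, required
        if required == previous_required:
            # Two iterations needed the same number of resources: halt.
            break
        previous_required = required
        # Scale the execution time by required / available resources.
        target *= required / available
    return None


# 100 units of work, critical path of 10, but only 4 processors.
print(plan_execution_time(100, 10, 4))
```

With these assumed numbers the first pass would need 10 resources, so the target time is scaled by 10/4 to 25, after which 4 resources suffice.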

Journal ArticleDOI
TL;DR: An efficient real-time scheduling algorithm is introduced that substantially increases the schedulable region without incurring prohibitive complexity costs, and its schedulable region is compared with the ones generated by the static priority scheduling algorithm and a variant of the minimum laxity threshold algorithm.
Abstract: Whether or not the introduction of traffic classes improves upon the performance of ATM networks is discussed within the framework provided by a class of networks that guarantees quality of service. To provide a meaningful comparison the authors define the concept of a schedulable region, a region in the space of loads for which the quality of service is guaranteed. The authors show the dependence of the schedulable region on the scheduling algorithm employed, quality of service parameters, and traffic statistics. An efficient real-time scheduling algorithm is introduced that substantially increases the schedulable region without incurring prohibitive complexity costs. The schedulable region associated with this algorithm is compared with the ones generated by the static priority scheduling algorithm and a variant of the minimum laxity threshold algorithm. The size and shape of the schedulable region are explored by means of simulations.

Proceedings ArticleDOI
02 Apr 1991
TL;DR: This paper uses detailed simulation studies to evaluate the performance of several different scheduling strategies, and shows that in situations where the number of processes exceeds the number of processors, regular priority-based scheduling in conjunction with busy-waiting synchronization primitives results in extremely poor processor utilization.
Abstract: Shared-memory multiprocessors are frequently used as compute servers with multiple parallel applications executing at the same time. In such environments, the efficiency of a parallel application can be significantly affected by the operating system scheduling policy. In this paper, we use detailed simulation studies to evaluate the performance of several different scheduling strategies. These include regular priority scheduling, coscheduling or gang scheduling, process control with processor partitioning, handoff scheduling, and affinity-based scheduling. We also explore tradeoffs between the use of busy-waiting and blocking synchronization primitives and their interactions with the scheduling strategies. Since effective use of caches is essential to achieving high performance, a key focus is on the impact of the scheduling strategies on the caching behavior of the applications. Our results show that in situations where the number of processes exceeds the number of processors, regular priority-based scheduling in conjunction with busy-waiting synchronization primitives results in extremely poor processor utilization. In such situations, use of blocking synchronization primitives can significantly improve performance. Process control and gang scheduling strategies are shown to offer the highest performance, and their performance is relatively independent of the synchronization method used. However, for applications that have sizable working sets that fit into the cache, process control performs better than gang scheduling. For the applications considered, the performance gains due to handoff scheduling and processor affinity are shown to be small.

Journal ArticleDOI
TL;DR: It is shown that this protocol leads to freedom from mutual deadlock, and that its properties can be used by schedulability analysis to guarantee that a set of periodic transactions using this protocol can always meet its deadlines.
Abstract: The authors examine a priority-driven two-phase lock protocol called the read/write priority ceiling protocol. It is shown that this protocol leads to freedom from mutual deadlock. In addition, a high-priority transaction can be blocked by lower-priority transactions for at most the duration of a single embedded transaction. These properties can be used by schedulability analysis to guarantee that a set of periodic transactions using this protocol can always meet its deadlines. Finally, the performance of this protocol is examined for randomly arriving transactions using simulation studies.

Proceedings ArticleDOI
04 Dec 1991
TL;DR: The authors present an approximation algorithm for the period assignment problem for which some encouraging experimental results are included and an efficient algorithm to calculate the bound is provided.
Abstract: A framework is given for discussing how to adjust load in order to handle periodic processes whose timing parameters vary with time. The schedulability of adjustable periodic processes by a preemptive fixed priority scheduler is formulated in terms of a configuration selection problem. Specifically, two process transformations are introduced for the purpose of deriving a bound for the achievable utilization factor of processes whose periods are related by harmonics. This result is then generalized so that the bound is applicable to any process set and an efficient algorithm to calculate the bound is provided. When the list of allowable configurations is implicitly given by a set of scalable periodic processes, the corresponding period assignment problem is shown to be NP-complete. The authors present an approximation algorithm for the period assignment problem for which some encouraging experimental results are included.

Proceedings ArticleDOI
01 Jun 1991
TL;DR: The probabilistic analysis of the performance of the load balancing scheme proves that each task in the system receives its fair share of computation time.
Abstract: A collection of local workpiles (task queues) and a simple load balancing scheme are well suited for scheduling tasks in shared memory parallel machines. Task scheduling on such machines has usually been done through a single, globally accessible, workpile. The scheme introduced in this paper achieves a balancing comparable to that of a global workpile, while minimizing the overheads. In many parallel computer architectures, each processor has some memory that it can access more efficiently, and so it is desirable that tasks do not migrate frequently. The load balancing is simple and distributed: Whenever a processor accesses its local workpile, it performs a balancing operation with probability inversely proportional to the size of its workpile. The balancing operation consists of examining the workpile of a random processor and exchanging tasks so as to equalize the size of the two workpiles. The probabilistic analysis of the performance of the load balancing scheme proves that each task in the system receives its fair share of computation time. Specifically, the expected size of each local task queue is within a small constant factor of the average, i.e. total number of tasks in the system divided by the number of processors.
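The balancing rule in this abstract lends itself to a short simulation sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `balance`, `maybe_balance`, and `simulate` are hypothetical, workpiles are modelled as Python deques, and the balancing probability is taken to be `1 / (size + 1)` so that an empty workpile always tries to balance (the abstract says only "inversely proportional").

```python
import random
from collections import deque


def balance(a, b):
    """Move tasks from the longer workpile to the shorter one until
    their sizes differ by at most one."""
    longer, shorter = (a, b) if len(a) >= len(b) else (b, a)
    while len(longer) - len(shorter) > 1:
        shorter.append(longer.pop())


def maybe_balance(workpiles, i, rng):
    """Called when processor i accesses its workpile. With probability
    1 / (size + 1) (an assumed form), equalize with a random peer."""
    size = len(workpiles[i])
    if rng.random() < 1.0 / (size + 1):
        j = rng.randrange(len(workpiles) - 1)
        if j >= i:
            j += 1  # skip i so the peer is always a different processor
        balance(workpiles[i], workpiles[j])
        return True
    return False


def simulate(num_procs=4, num_tasks=40, steps=200, seed=0):
    """Start with every task on processor 0 and let random accesses
    trigger balancing; return the final workpile sizes."""
    rng = random.Random(seed)
    workpiles = [deque() for _ in range(num_procs)]
    workpiles[0].extend(range(num_tasks))
    for _ in range(steps):
        maybe_balance(workpiles, rng.randrange(num_procs), rng)
    return [len(w) for w in workpiles]


print(simulate())
```

The sketch only models balancing, not task execution, so the total number of tasks is conserved; how evenly the piles end up depends on the seed and the number of steps.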

Journal ArticleDOI
TL;DR: In this article, the authors consider the rescheduling of operations with release dates and multiple resources when disruptions prevent the use of a preplanned schedule, and the overall strategy is to follow the preschedule until a disruption occurs.
Abstract: This paper considers the rescheduling of operations with release dates and multiple resources when disruptions prevent the use of a preplanned schedule. The overall strategy is to follow the preschedule until a disruption occurs. After a disruption, part of the schedule is reconstructed to match up with the preschedule at some future time. Conditions are given for the optimality of this approach. A practical implementation is compared with the alternatives of preplanned static scheduling and myopic dynamic scheduling. A set of practical test problems demonstrates the advantages of the matchup approach. We also explore the solution of the matchup scheduling problem and show the advantages of an integer programming approach for allocating resources to jobs.

Journal ArticleDOI
TL;DR: A method of generating parallel target code with explicit communication for massively parallel distributed-memory machines is presented and an explicit communication metric is used to guide the selection of data layout strategies.
Abstract: A method of generating parallel target code with explicit communication for massively parallel distributed-memory machines is presented. The source programs are shared-memory parallel programs with explicit control structures. The method extracts syntactic reference patterns from a program with shared address space, selects appropriate communication routines, places these routines in appropriate locations in the target program text and sets up correct conditions for invoking these routines. An explicit communication metric is used to guide the selection of data layout strategies.

Proceedings ArticleDOI
01 Sep 1991
TL;DR: A set of kernel mechanisms and conventions designed to accord first-class status to user-level threads are described, allowing them to be used in any reasonable way that traditional kernel-provided processes can be used, while leaving the details of their implementation to user-level code.
Abstract: It is often desirable, for reasons of clarity, portability, and efficiency, to write parallel programs in which the number of processes is independent of the number of available processors. Several modern operating systems support more than one process in an address space, but the overhead of creating and synchronizing kernel processes can be high. Many runtime environments implement lightweight processes (threads) in user space, but this approach usually results in second-class status for threads, making it difficult or impossible to perform scheduling operations at appropriate times (e.g. when the current thread blocks in the kernel). In addition, a lack of common assumptions may also make it difficult for parallel programs or library routines that use dissimilar thread packages to communicate with each other, or to synchronize access to shared data. We describe a set of kernel mechanisms and conventions designed to accord first-class status to user-level threads, allowing them to be used in any reasonable way that traditional kernel-provided processes can be used, while leaving the details of their implementation to user-level code. The key features of our approach are (1) shared memory for asynchronous communication between the kernel and the user, (2) software interrupts for events that might require action on the part of a user-level scheduler, and (3) a scheduler interface convention that facilitates interactions in user space between dissimilar kinds of threads. We have incorporated these mechanisms in the Psyche parallel operating system, and have used them to implement several different kinds of user-level threads. We argue for our approach in terms of both flexibility and performance.

Proceedings ArticleDOI
04 Dec 1991
TL;DR: This method can be used to analyze the schedulability of complex task sets which involve interrupts, certain synchronization protocols, nonpreemptible sections and, in general, any mechanism that contributes to a complex priority structure.
Abstract: The problem of fixed priority scheduling of periodic tasks where each task's execution priority may vary is considered. Periodic tasks are decomposed into serially executed subtasks, where each subtask is characterized by an execution time and a fixed priority and is permitted to have a deadline. A method for determining the schedulability of each task is presented along with its theoretical underpinnings. This method can be used to analyze the schedulability of complex task sets which involve interrupts, certain synchronization protocols, nonpreemptible sections and, in general, any mechanism that contributes to a complex priority structure. The authors introduce a simple but realistic real-time robotics application and illustrate how one uses the schedulability equations presented.

Proceedings ArticleDOI
01 May 1991
TL;DR: A scheme for global (intra-loop) scheduling is proposed, which uses the control and data dependence information summarized in a Program Dependence Graph, to move instructions well beyond basic block boundaries.
Abstract: To improve the utilization of machine resources in superscalar processors, the instructions have to be carefully scheduled by the compiler. As internal parallelism and pipelining increase, it becomes evident that scheduling should be done beyond the basic block level. A scheme for global (intra-loop) scheduling is proposed, which uses the control and data dependence information summarized in a Program Dependence Graph, to move instructions well beyond basic block boundaries. This novel scheduling framework is based on the parametric description of the machine architecture, which spans a range of superscalar and VLIW machines, and exploits speculative execution of instructions to further enhance the performance of the general code. We have implemented our algorithms in the IBM XL family of compilers and have evaluated them on the IBM RISC System/6000 machines.

Proceedings Article
24 Aug 1991
TL;DR: The composition of anytime algorithms can be mechanized as part of a compiler for a LISP-like programming language for real-time systems that separates the arrangement of the performance components from the optimization of their scheduling, and automates the latter task.
Abstract: We present a method to construct real-time systems using as components anytime algorithms whose quality of results degrades gracefully as computation time decreases. Introducing computation time as a degree of freedom defines a scheduling problem involving the activation and interruption of the anytime components. This scheduling problem is especially complicated when trying to construct interruptible algorithms, whose total run-time is unknown in advance. We introduce a framework to measure the performance of anytime algorithms and solve the problem of constructing interruptible algorithms by a mathematical reduction to the problem of constructing contract algorithms, which require the determination of the total run-time when activated. We show how the composition of anytime algorithms can be mechanized as part of a compiler for a LISP-like programming language for real-time systems. The result is a new approach to the construction of complex real-time systems that separates the arrangement of the performance components from the optimization of their scheduling, and automates the latter task.

Proceedings ArticleDOI
01 Sep 1991
TL;DR: This work proposes split-level CPU scheduling of lightweight processes in multiple address spaces, and memory-mapped streams for data movement between address spaces that can reduce scheduling and I/O overhead by a factor of 4 to 6.
Abstract: Next-generation workstations will have hardware support for digital "continuous media" (CM) such as audio and video. CM applications handle data at high rates, with strict timing requirements, and often in small "chunks". If such applications are to run efficiently and predictably as user-level programs, an operating system must provide scheduling and IPC mechanisms that reflect these needs. We propose two such mechanisms: split-level CPU scheduling of lightweight processes in multiple address spaces, and memory-mapped streams for data movement between address spaces. These techniques reduce the number of user/kernel interactions (system calls, signals, and preemptions). Compared with existing mechanisms, they can reduce scheduling and I/O overhead by a factor of 4 to 6.

Journal ArticleDOI
Andreas Drexl1
TL;DR: This paper considers the nonpreemptive variant of a resource-constrained project job-assignment problem, where job durations as well as costs depend upon the assigned resource, and presents a hybrid branch and bound/dynamic programming algorithm with a rather efficient Monte Carlo type heuristic upper bounding technique.
Abstract: A recurring problem in project management involves the allocation of scarce resources to the individual jobs comprising the project. In many situations such as audit scheduling, the resources correspond to individuals (skilled labour). This naturally leads to an assignment type project scheduling problem, i.e. a project has to be processed by assigning one of several individuals (resources) to each job. In this paper we consider the nonpreemptive variant of a resource-constrained project job-assignment problem, where job durations as well as costs depend upon the assigned resource. Regarding precedence relations as well as release dates and deadlines, the question arises, to which jobs resources should be assigned in order to minimize overall costs. For solving this time-resource-cost-tradeoff problem we present a hybrid branch and bound/dynamic programming algorithm with a rather efficient Monte Carlo type heuristic upper bounding technique as well as various relaxation procedures for determining lower bounds. Computational results are presented as well.

Proceedings ArticleDOI
01 Sep 1991
TL;DR: The preemptive scheduling of sporadic tasks on a uniprocessor is considered and upper bounds on the best performance guarantee obtainable by an online algorithm in a variety of settings are derived.
Abstract: The preemptive scheduling of sporadic tasks on a uniprocessor is considered. A task may arrive at any time, and is characterized by a value that reflects its importance, an execution time that is the amount of processor time needed to completely execute the task, and a deadline by which the task is to complete execution. The goal is to maximize the sum of the values of the completed tasks. An online scheduling algorithm that achieves optimal performance when the system is underloaded and provides a nontrivial performance guarantee when the system is overloaded is designed. The algorithm is implemented using simple data structures to run at a cost of O(log n) time per task, where n bounds the number of tasks in the system at any instant. Upper bounds on the best performance guarantee obtainable by an online algorithm in a variety of settings are derived.

Journal ArticleDOI
TL;DR: This survey is the first one that attempts to compile a large number of mathematical programming formulations for scheduling into a single paper to ease the task of model building and testing scheduling formulations.

Journal ArticleDOI
TL;DR: Analytical and numerical evidence is presented that confirms the applicability of a heuristic solution procedure for this problem and indicates that a pacing approach, compared with the traditional dispatching approach, is an efficient and potentially cost-effective method for the control of train movements.
Abstract: Recent developments in location systems technology for railroads provide a train dispatcher with the capability to improve the operations of a rail line by pacing trains over a territory; i.e., to permit trains to travel at less than maximum velocity to minimize fuel consumption while maintaining a given level of performance. Traditional railroad dispatching models assume that the velocities of the trains moving over a dispatcher's territory are fixed at their maximum value and, thus, are incapable of dealing with a pacing situation. This paper presents a mathematical programming model for the pacing problem and describes alternative solution procedures for this model. Analytical and numerical evidence is presented that confirms the applicability of a heuristic solution procedure for this problem and indicates that a pacing approach, compared with the traditional dispatching approach, is an efficient and potentially cost-effective method for the control of train movements.

Proceedings ArticleDOI
01 Sep 1991
TL;DR: It is concluded that on current machines processor affinity has only a very weak influence on the choice of scheduling discipline, and that the benefits of frequent processor reallocation (in response to the changing parallelism of jobs) outweigh the penalties imposed by such reallocation.
Abstract: In a shared memory multiprocessor with caches, executing tasks develop "affinity" to processors by filling their caches with data and instructions during execution. A scheduling policy that ignores this affinity may waste processing power by causing excessive cache refilling. Our work focuses on quantifying the effect of processor reallocation on the performance of various parallel applications multiprogrammed on a shared memory multiprocessor, and on evaluating how the magnitude of this cost affects the choice of scheduling policy. We first identify the components of application response time, including processor reallocation costs. Next, we measure the impact of reallocation on the cache behavior of several parallel applications executing on a Sequent Symmetry multiprocessor. We also measure the performance of these applications under a number of alternative allocation policies. These experiments lead us to conclude that on current machines processor affinity has only a very weak influence on the choice of scheduling discipline, and that the benefits of frequent processor reallocation (in response to the changing parallelism of jobs) outweigh the penalties imposed by such reallocation. Finally, we use this experimental data to parameterize a simple analytic model, allowing us to evaluate the effect of processor affinity on future machines, those containing faster processors and larger caches.