Showing papers on "Scheduling (computing)" published in 2011


Journal ArticleDOI
01 Feb 2011
TL;DR: StarPU is a runtime system that provides a high-level, unified execution model, giving numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware and to easily develop and tune powerful scheduling algorithms.
Abstract: In the field of HPC, the current hardware trend is to design multiprocessor architectures featuring heterogeneous technologies such as specialized coprocessors (e.g. Cell/BE) or data-parallel accelerators (e.g. GPUs). Approaching the theoretical performance of these architectures is a complex issue. Indeed, substantial efforts have already been devoted to efficiently offload parts of the computations. However, designing an execution model that unifies all computing units and associated embedded memory remains a main challenge. We therefore designed StarPU, an original runtime system providing a high-level, unified execution model tightly coupled with an expressive data management library. The main goal of StarPU is to provide numerical kernel designers with a convenient way to generate parallel tasks over heterogeneous hardware on the one hand, and easily develop and tune powerful scheduling algorithms on the other hand. We have developed several strategies that can be selected seamlessly at run-time, and we have analyzed their efficiency on several algorithms running simultaneously over multiple cores and a GPU. In addition to substantial improvements regarding execution times, we have obtained consistent superlinear parallelism by actually exploiting the heterogeneous nature of the machine. We eventually show that our dynamic approach competes with the highly optimized MAGMA library and overcomes the limitations of the corresponding static scheduling in a portable way. Copyright © 2010 John Wiley & Sons, Ltd.

1,116 citations
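
To make the scheduling idea concrete, here is a minimal Python sketch of heterogeneity-aware dispatch: each task goes to whichever worker is estimated to finish it earliest. This is an illustrative toy, not StarPU's API; the cost table, worker list, and greedy policy are assumptions for the example.

```python
# Toy heterogeneity-aware dispatch (not StarPU's API): per-device cost
# estimates drive a greedy earliest-finish-time assignment.

# Hypothetical cost model: seconds per task kind on each device type.
COST = {
    ("cpu", "dgemm"): 4.0,
    ("gpu", "dgemm"): 0.5,   # GPUs excel at dense kernels...
    ("cpu", "sparse"): 1.0,
    ("gpu", "sparse"): 2.5,  # ...but can lose on irregular ones.
}

def schedule(tasks, workers):
    """Assign each task to the worker with the earliest estimated finish."""
    ready_at = {w: 0.0 for w in workers}   # next free time per worker
    plan = []
    for kind in tasks:
        best = min(workers, key=lambda w: ready_at[w] + COST[(w[0], kind)])
        start = ready_at[best]
        ready_at[best] = start + COST[(best[0], kind)]
        plan.append((kind, best, start))
    return plan, max(ready_at.values())

workers = [("cpu", 0), ("cpu", 1), ("gpu", 0)]
tasks = ["dgemm"] * 4 + ["sparse"] * 4
plan, makespan = schedule(tasks, workers)
print(f"makespan: {makespan:.1f}s")  # dense work lands on the GPU, sparse on CPUs
```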


Journal ArticleDOI
TL;DR: The open-source framework LTE-Sim is presented to provide a complete performance verification of LTE networks and has been conceived to simulate uplink and downlink scheduling strategies in multicell/multiuser environments, taking into account user mobility, radio resource optimization, frequency reuse techniques, the adaptive modulation and coding module, and other aspects that are very relevant to the industrial and scientific communities.
Abstract: Long-term evolution (LTE) represents an emerging and promising technology for providing broadband ubiquitous Internet access. For this reason, several research groups are trying to optimize its performance. Unfortunately, at present, to the best of our knowledge, no open-source simulation platforms, which the scientific community can use to evaluate the performance of the entire LTE system, are freely available. The lack of a common reference simulator does not help the work of researchers and poses limitations on the comparison of results claimed by different research groups. To bridge this gap, herein, the open-source framework LTE-Sim is presented to provide a complete performance verification of LTE networks. LTE-Sim has been conceived to simulate uplink and downlink scheduling strategies in multicell/multiuser environments, taking into account user mobility, radio resource optimization, frequency reuse techniques, the adaptive modulation and coding module, and other aspects that are very relevant to the industrial and scientific communities. The effectiveness of the proposed simulator has been tested and verified considering 1) the software scalability test, which analyzes both memory and simulation time requirements; and 2) the performance evaluation of a realistic LTE network providing a comparison among well-known scheduling strategies.

685 citations


Proceedings ArticleDOI
15 Aug 2011
TL;DR: This work proposes a global management architecture and a set of algorithms that improve the transfer times of common communication patterns, such as broadcast and shuffle, and allow scheduling policies at the transfer level, such as prioritizing a transfer over other transfers.
Abstract: Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers, with networking researchers traditionally focusing on per-flow traffic management. We address this limitation by proposing a global management architecture and a set of algorithms that (1) improve the transfer times of common communication patterns, such as broadcast and shuffle, and (2) allow scheduling policies at the transfer level, such as prioritizing a transfer over other transfers. Using a prototype implementation, we show that our solution improves broadcast completion times by up to 4.5X compared to the status quo in Hadoop. We also show that transfer-level scheduling can reduce the completion time of high-priority transfers by 1.7X.

612 citations
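
The transfer-level idea, treating a whole broadcast or shuffle as the schedulable unit rather than individual flows, can be illustrated on a single bottleneck link. A toy sketch: the sizes, priorities, and run-to-completion policy below are assumptions, not the paper's mechanism.

```python
# Toy transfer-level scheduler on one bottleneck link: urgent transfers
# get the full link before others, instead of per-flow fair sharing.

def finish_times(transfers, link_gbps):
    """transfers: (name, size_gbits, priority); lower value = more urgent."""
    t, done = 0.0, {}
    for name, size, _prio in sorted(transfers, key=lambda x: x[2]):
        t += size / link_gbps      # each transfer runs alone at full rate
        done[name] = t
    return done

jobs = [("shuffle-A", 80, 2), ("broadcast-B", 10, 1), ("shuffle-C", 40, 2)]
print(finish_times(jobs, link_gbps=10))
# broadcast-B finishes at 1.0s instead of competing flow-by-flow with the shuffles.
```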


Proceedings ArticleDOI
12 Nov 2011
TL;DR: This paper presents an approach whereby the basic computing elements are virtual machines (VMs) of various sizes/costs, jobs are specified as workflows, users specify performance requirements by assigning (soft) deadlines to jobs, and the goal is to ensure all jobs are finished within their deadlines at minimum financial cost.
Abstract: A goal in cloud computing is to allocate (and thus pay for) only those cloud resources that are truly needed. To date, cloud practitioners have pursued schedule-based (e.g., time-of-day) and rule-based mechanisms to attempt to automate this matching between computing requirements and computing resources. However, most of these "auto-scaling" mechanisms only support simple resource utilization indicators and do not specifically consider both user performance requirements and budget concerns. In this paper, we present an approach whereby the basic computing elements are virtual machines (VMs) of various sizes/costs, jobs are specified as workflows, users specify performance requirements by assigning (soft) deadlines to jobs, and the goal is to ensure all jobs are finished within their deadlines at minimum financial cost. We accomplish our goal by dynamically allocating/deallocating VMs and scheduling tasks on the most cost-efficient instances. We evaluate our approach in four representative cloud workload patterns and show cost savings from 9.8% to 40.4% compared to other approaches.

556 citations
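
The core allocation decision the abstract describes can be sketched as deadline-aware instance selection. The VM catalog and the linear speed model below are assumptions for illustration, not the paper's algorithm.

```python
# Deadline-aware instance selection sketch (VM catalog and linear speed
# model are assumed): pick the cheapest type that still meets the deadline.

VM_TYPES = [           # (name, relative speed, $ per hour), illustrative
    ("small", 1.0, 0.10),
    ("medium", 2.0, 0.22),
    ("large", 4.0, 0.50),
]

def cheapest_meeting_deadline(work_hours_on_small, deadline_hours):
    feasible = []
    for name, speed, price in VM_TYPES:
        runtime = work_hours_on_small / speed
        if runtime <= deadline_hours:
            feasible.append((runtime * price, name, runtime))
    return min(feasible) if feasible else None   # else: scale out instead

print(cheapest_meeting_deadline(work_hours_on_small=8, deadline_hours=3))
# -> (1.0, 'large', 2.0): 'medium' would miss the 3h soft deadline, so the
# scheduler pays for the faster instance, mirroring the paper's trade-off.
```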


Proceedings ArticleDOI
14 Jun 2011
TL;DR: This work designs a MapReduce performance model and implements a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines, and validates the approach using a set of realistic applications.
Abstract: MapReduce and Hadoop represent an economically compelling alternative for efficient large scale data processing and advanced analytics in the enterprise. A key challenge in shared MapReduce clusters is the ability to automatically tailor and control resource allocations to different applications for achieving their performance goals. Currently, there is no job scheduler for MapReduce environments that, given a job completion deadline, could allocate the appropriate amount of resources to the job so that it meets the required Service Level Objective (SLO). In this work, we propose a framework, called ARIA, to address this problem. It comprises three inter-related components. First, for a production job that is routinely executed on a new dataset, we build a job profile that compactly summarizes critical performance characteristics of the underlying application during the map and reduce stages. Second, we design a MapReduce performance model that, for a given job (with a known profile) and its SLO (soft deadline), estimates the amount of resources required for job completion within the deadline. Finally, we implement a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines. We validate our approach using a set of realistic applications. The new scheduler effectively meets the jobs' SLOs until the job demands exceed the cluster resources. The results of the extensive simulation study are validated through detailed experiments on a 66-node Hadoop cluster.

494 citations
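
The resource-estimation step can be approximated with the classic greedy-scheduling makespan bounds that profile-based models of this kind build on. In the sketch below, the profile numbers are assumed, and sharing one slot pool across the map and reduce stages is a simplification of ARIA's actual model.

```python
# Sketch of a profile-driven resource estimate in the spirit of the
# abstract (numbers and profile fields are assumed). Uses the classic
# greedy makespan bounds for n independent tasks on k slots:
#   lower ~ n*avg/k,  upper ~ (n-1)*avg/k + max.

def stage_upper_bound(n_tasks, avg_dur, max_dur, slots):
    return (n_tasks - 1) * avg_dur / slots + max_dur

def min_slots_for_deadline(profile, deadline):
    """Smallest slot count whose upper-bound makespan meets the soft deadline."""
    for k in range(1, profile["n_map"] + profile["n_red"] + 1):
        total = (stage_upper_bound(profile["n_map"], profile["map_avg"],
                                   profile["map_max"], k)
                 + stage_upper_bound(profile["n_red"], profile["red_avg"],
                                     profile["red_max"], k))
        if total <= deadline:
            return k
    return None  # infeasible even with a slot per task: deadline too tight

job = {"n_map": 64, "map_avg": 30, "map_max": 45,   # seconds, assumed profile
       "n_red": 16, "red_avg": 60, "red_max": 90}
print(min_slots_for_deadline(job, deadline=600))     # -> 6 slots
```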


Journal ArticleDOI
TL;DR: This work proposes a new parallel bi-objective hybrid genetic algorithm that takes into account not only makespan but also energy consumption, focusing on the island parallel model and the multi-start parallel model.

327 citations


Proceedings ArticleDOI
12 Nov 2011
TL;DR: It is concluded that green datacenters and green-energy-aware scheduling can have a significant role in building a more sustainable IT ecosystem.
Abstract: In this paper, we propose GreenSlot, a parallel batch job scheduler for a datacenter powered by a photovoltaic solar array and the electrical grid (as a backup). GreenSlot predicts the amount of solar energy that will be available in the near future, and schedules the workload to maximize the green energy consumption while meeting the jobs' deadlines. If grid energy must be used to avoid deadline violations, the scheduler selects times when it is cheap. Our results for production scientific workloads demonstrate that GreenSlot can increase green energy consumption by up to 117% and decrease energy cost by up to 39%, compared to a conventional scheduler. Based on these positive results, we conclude that green datacenters and green-energy-aware scheduling can have a significant role in building a more sustainable IT ecosystem.

319 citations


Journal ArticleDOI
TL;DR: This work addresses the problem of scheduling precedence-constrained parallel applications on multiprocessor computer systems and presents two energy-conscious scheduling algorithms using dynamic voltage scaling (DVS), built on a novel objective function and a variant of it.
Abstract: Traditionally, the primary performance goal of computer systems has focused on reducing the execution time of applications while increasing throughput. This performance goal has been mostly achieved by the development of high-density computer systems. As witnessed recently, these systems provide very powerful processing capability and capacity. They often consist of tens or hundreds of thousands of processors and other resource-hungry devices. The energy consumption of these systems has become a major concern. In this paper, we address the problem of scheduling precedence-constrained parallel applications on multiprocessor computer systems and present two energy-conscious scheduling algorithms using dynamic voltage scaling (DVS). A number of recent commodity processors are capable of DVS, which enables processors to operate at different voltage supply levels at the expense of sacrificing clock frequencies. In the context of scheduling, this multiple voltage facility implies that there is a trade-off between the quality of schedules and energy consumption. To effectively balance these two performance goals, we have devised a novel objective function and a variant of it. The main difference between the two algorithms is in their measurement of energy consumption. The extensive comparative evaluations conducted as part of this work show that the performance of our algorithms is very compelling in terms of both application completion time and energy consumption.

306 citations
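
The underlying DVS trade-off is easy to see numerically: dynamic power scales roughly as C·V²·f, and attainable frequency falls with voltage, so lower levels stretch execution while cutting energy. A sketch with assumed voltage/frequency levels (not the paper's objective function):

```python
# Illustrative DVS arithmetic (not the paper's objective function):
# dynamic power ~ C * V^2 * f, and attainable frequency drops with V,
# so lower levels trade longer runtime for less energy.

LEVELS = [(1.2, 1.0), (1.0, 0.8), (0.8, 0.6)]  # (volts, relative freq), assumed

def evaluate(cycles, capacitance=1.0):
    for volts, rel_f in LEVELS:
        time = cycles / rel_f                        # slower clock -> longer run
        energy = capacitance * volts ** 2 * cycles   # energy per cycle ~ C*V^2
        print(f"V={volts:.1f}  time={time:6.1f}  energy={energy:6.1f}")

evaluate(cycles=100.0)
# A scheduler can then pick a level per task to balance schedule length
# against total energy, e.g. through a weighted objective.
```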


Journal ArticleDOI
TL;DR: Numerical results suggest that incorporating the statistical knowledge into the scheduling policies can result in significant savings, especially for short tasks, and it is demonstrated with real price data from Commonwealth Edison that scheduling with mismatched modeling and online parameter estimation can still provide significant economic advantages to consumers.
Abstract: The problem of causally scheduling power consumption to minimize the expected cost at the consumer side is considered. The price of electricity is assumed to be time-varying. The scheduler has access to past and current prices, but only statistical knowledge about future prices, which it uses to make an optimal decision in each time period. The scheduling problem is naturally cast as a Markov decision process. Algorithms to find decision thresholds for both noninterruptible and interruptible loads under a deadline constraint are then developed. Numerical results suggest that incorporating the statistical knowledge into the scheduling policies can result in significant savings, especially for short tasks. It is demonstrated with real price data from Commonwealth Edison that scheduling with mismatched modeling and online parameter estimation can still provide significant economic advantages to consumers.

304 citations
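
A toy version of the threshold structure, assuming i.i.d. prices on a small discrete support and a one-slot task (the paper's Markov price model and interruptible-load case are richer): backward induction yields a per-slot price threshold, and the task runs the first time the observed price falls below it.

```python
# Toy threshold policy (assumes i.i.d. prices on a discrete support and a
# one-slot task; the paper's Markov model and interruptible case are richer).

PRICES = [2, 4, 6, 8, 10]   # assumed price support, uniform

def thresholds(horizon):
    """Run the task at slot t iff the observed price <= theta[t]."""
    theta = [float("inf")] * horizon           # deadline slot: run regardless
    cost_to_go = sum(PRICES) / len(PRICES)     # expected cost if forced to run
    for t in range(horizon - 2, -1, -1):
        theta[t] = cost_to_go                  # run now iff price beats waiting
        cost_to_go = sum(min(p, cost_to_go) for p in PRICES) / len(PRICES)
    return theta

print(thresholds(5))   # -> [3.648, 4.08, 4.8, 6.0, inf]
# Thresholds tighten as more slack remains: with time to spare, only a
# genuinely cheap price is worth taking, which is where the savings come from.
```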


Proceedings ArticleDOI
10 Apr 2011
TL;DR: A taxonomy for future CPU/GPU comparisons is suggested, and it is argued that this is not only germane for reporting performance, but is important to heterogeneous scheduling research in general.
Abstract: General purpose GPU Computing (GPGPU) has taken off in the past few years, with great promises for increased desktop processing power due to the large number of fast computing cores on high-end graphics cards. Many publications have demonstrated phenomenal performance and have reported speedups of as much as 1000× over code running on multi-core CPUs. Other studies have claimed that well-tuned CPU code reduces the performance gap significantly. We demonstrate that this important discussion is missing a key aspect, specifically the question of where in the system data resides, and the overhead to move the data to where it will be used, and back again if necessary. We have benchmarked a broad set of GPU kernels on a number of platforms with different GPUs, and our results show that when memory transfer times are included, it can easily take between 2× and 50× longer to run a kernel than the GPU processing time alone. Therefore, it is necessary to either include memory transfer overhead when reporting GPU performance, or to explain why this is not relevant for the application in question. We suggest a taxonomy for future CPU/GPU comparisons, and we argue that this is not only germane for reporting performance, but is important to heterogeneous scheduling research in general.

303 citations
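
The paper's point reduces to simple arithmetic; the numbers below are illustrative, not from the study.

```python
# The paper's point as arithmetic (illustrative numbers): report speedup
# end-to-end, including transfer time, not kernel time alone.

def speedups(t_cpu, t_kernel, t_transfer):
    kernel_only = t_cpu / t_kernel
    end_to_end = t_cpu / (t_kernel + t_transfer)
    return kernel_only, end_to_end

raw, real = speedups(t_cpu=100.0, t_kernel=2.0, t_transfer=18.0)
print(f"kernel-only: {raw:.0f}x, end-to-end: {real:.0f}x")   # 50x vs 5x
# Here kernel+transfer takes 10x the kernel time alone, inside the 2-50x
# inflation range the study reports across platforms.
```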


Journal ArticleDOI
TL;DR: Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both task scheduling and execution.
Abstract: In recent years, ad hoc parallel data processing has emerged as one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. Major cloud computing companies have started to integrate frameworks for parallel data processing in their product portfolios, making it easy for customers to access these services and to deploy their programs. However, the processing frameworks which are currently used have been designed for static, homogeneous cluster setups and disregard the particular nature of a cloud. Consequently, the allocated compute resources may be inadequate for big parts of the submitted job and unnecessarily increase processing time and cost. In this paper, we discuss the opportunities and challenges for efficient parallel data processing in clouds and present our research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both task scheduling and execution. Particular tasks of a processing job can be assigned to different types of virtual machines which are automatically instantiated and terminated during the job execution. Based on this new framework, we perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.

Journal ArticleDOI
TL;DR: This paper surveys single-project, single-objective, deterministic project scheduling problems in which activities can be processed using a finite or infinite number of modes concerning resources of various categories and types.

Proceedings Article
15 Jun 2011
TL;DR: TimeGraph, a real-time GPU scheduler at the device-driver level for protecting important GPU workloads from performance interference, is presented; it supports two priority-based scheduling policies to address the tradeoff between response times and throughput introduced by the asynchronous and non-preemptive nature of GPU processing.
Abstract: The Graphics Processing Unit (GPU) is now commonly used for graphics and data-parallel computing. As more and more applications tend to accelerate on the GPU in multi-tasking environments where multiple tasks access the GPU concurrently, operating systems must provide prioritization and isolation capabilities in GPU resource management, particularly in real-time setups. We present TimeGraph, a real-time GPU scheduler at the device-driver level for protecting important GPU workloads from performance interference. TimeGraph adopts a new event-driven model that synchronizes the GPU with the CPU to monitor GPU commands issued from the user space and control GPU resource usage in a responsive manner. TimeGraph supports two priority-based scheduling policies in order to address the tradeoff between response times and throughput introduced by the asynchronous and non-preemptive nature of GPU processing. Resource reservation mechanisms are also employed to account and enforce GPU resource usage, which prevent misbehaving tasks from exhausting GPU resources. Prediction of GPU command execution costs is further provided to enhance isolation. Our experiments using OpenGL graphics benchmarks demonstrate that TimeGraph maintains the frame-rates of primary GPU tasks at the desired level even in the face of extreme GPU workloads, whereas these tasks become nearly unresponsive without TimeGraph support. Our findings also include that the performance overhead imposed by TimeGraph can be limited to 4-10%, and its event-driven scheduler improves throughput by about 30 times over the existing tick-driven scheduler.

Proceedings Article
15 Jun 2011
TL;DR: The effects on performance imposed by resource contention and remote access latency are quantified, and a new contention management algorithm is proposed and evaluated that significantly outperforms a previously proposed NUMA-unaware algorithm as well as the default Linux scheduler.
Abstract: On multicore systems, contention for shared resources occurs when memory-intensive threads are co-scheduled on cores that share parts of the memory hierarchy, such as last-level caches and memory controllers. Previous work investigated how contention could be addressed via scheduling. A contention-aware scheduler separates competing threads onto separate memory hierarchy domains to eliminate resource sharing and, as a consequence, to mitigate contention. However, all previous work on contention-aware scheduling assumed that the underlying system is UMA (uniform memory access latencies, single memory controller). Modern multicore systems, however, are NUMA, which means that they feature non-uniform memory access latencies and multiple memory controllers. We discovered that state-of-the-art contention management algorithms fail to be effective on NUMA systems and may even hurt performance relative to a default OS scheduler. In this paper we investigate the causes for this behavior and design the first contention-aware algorithm for NUMA systems.

Journal ArticleDOI
TL;DR: This paper presents HCOC: The Hybrid Cloud Optimized Cost scheduling algorithm, which decides which resources should be leased from the public cloud and aggregated to the private cloud to provide sufficient processing power to execute a workflow within a given execution time.
Abstract: Workflows have been used to represent a variety of applications involving high processing and storage demands. As a solution to supply this necessity, the cloud computing paradigm has emerged as an on-demand resource provider. While public clouds charge users on a per-use basis, private clouds are owned by users and can be utilized with no charge. When a public cloud and a private cloud are merged, we have what we call a hybrid cloud. In a hybrid cloud, the user has elasticity provided by public cloud resources that can be aggregated to the private resources pool as necessary. One question faced by the users in such systems is: which are the best resources to request from a public cloud, based on the current demand and on resource costs? In this paper we deal with this problem, presenting HCOC: the Hybrid Cloud Optimized Cost scheduling algorithm. HCOC decides which resources should be leased from the public cloud and aggregated to the private cloud to provide sufficient processing power to execute a workflow within a given execution time. We present extensive experimental and simulation results which show that HCOC can reduce costs while achieving the established desired execution time.

Journal ArticleDOI
TL;DR: The design of a quality-of-service (QoS) aware packet scheduler for real-time downlink communications is considered, and a novel two-level scheduling algorithm is conceived based on discrete-time linear control theory.
Abstract: Long-term evolution represents an emerging technology that promises broadband and ubiquitous Internet access. But several aspects have to be considered for providing effective multimedia services to mobile users. In particular, in this work, we consider the design of a quality-of-service (QoS) aware packet scheduler for real-time downlink communications. To this aim, a novel two-level scheduling algorithm is conceived. The upper level exploits an innovative approach based on discrete-time linear control theory. At the lower level, a proportional fair scheduler has been properly tailored to our purposes. The performance and the complexity of the proposed scheme have been evaluated both theoretically and by using simulations. A comparison with recently proposed scheduling strategies has also been presented, considering several network conditions and real-time multimedia flows. Particular attention has been devoted to the evaluation of the quality of experience (QoE) provided to end users. Results have clearly shown that the proposed approach is able to greatly outperform the existing ones, especially in the presence of real-time video flows.
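The upper-level idea, driving each flow's queue toward a target with a discrete-time feedback law, can be sketched with a simple proportional controller. The gain, target, and queue model below are assumptions for illustration, not the paper's actual control design.

```python
# Toy discrete-time queue controller in the spirit of the upper level
# (gain, target, and queue model are assumptions, not the paper's design):
# each interval, the drain rate is set proportionally to the queue error.

target, gain = 50.0, 0.5          # target queue length and gain, assumed

queue = 120.0
for arrivals in [30, 35, 25, 40, 30, 20]:
    drain = max(0.0, gain * (queue - target) + arrivals)  # feedforward arrivals
    queue = max(0.0, queue + arrivals - drain)
    print(f"drain {drain:6.2f} -> queue {queue:6.2f}")
# The queue settles toward the 50-packet target within a few intervals,
# which is what lets the scheduler bound queueing delay for real-time flows.
```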

Proceedings ArticleDOI
01 Dec 2011
TL;DR: This paper considers the minimum electricity cost scheduling problem of smart home appliances, in which the optimal power profile signal minimizes cost while satisfying technical operation constraints and consumer preferences.
Abstract: This paper considers the minimum electricity cost scheduling problem of smart home appliances. Operation characteristics, such as expected duration and peak power consumption of the smart appliances, can be adjusted through a power profile signal. The optimal power profile signal minimizes cost, while satisfying technical operation constraints and consumer preferences. Constraints such as enforcing uninterruptible and sequential operations are modeled in the proposed framework using mixed integer linear programming (MILP). Several realistic scenarios based on actual spot price are considered, and the numerical results provide insight into tariff design. Computational issues and extensions of the proposed scheduling framework are also discussed.
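For the uninterruptible-operation case, a brute-force search over start slots is an illustrative stand-in for the MILP formulation; the hourly prices below are assumed, not the paper's spot-price data.

```python
# Toy uninterruptible-appliance scheduling (the paper uses MILP; this
# exhaustive search over start slots is an illustrative stand-in).

PRICES = [0.12, 0.10, 0.08, 0.07, 0.09, 0.15, 0.22, 0.25]  # $/kWh, assumed

def best_start(duration, kw, earliest, latest):
    """Cheapest contiguous run of `duration` slots finishing by `latest`."""
    best = None
    for s in range(earliest, latest - duration + 2):
        cost = kw * sum(PRICES[s:s + duration])
        if best is None or cost < best[0]:
            best = (cost, s)
    return best

cost, start = best_start(duration=3, kw=2.0, earliest=0, latest=7)
print(f"start at slot {start}, cost ${cost:.2f}")  # slots 2-4: cheapest window
```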

01 Jan 2011
TL;DR: The experiments show that partitioned earliest-deadline first (EDF) scheduling is generally preferable in a hard real-time setting, whereas global and clustered EDF scheduling are effective in a soft real-time setting.
Abstract: With the widespread adoption of multicore architectures, multiprocessors are now a standard deployment platform for (soft) real-time applications. This dissertation addresses two questions fundamental to the design of multicore-ready real-time operating systems: (1) Which scheduling policies offer the greatest flexibility in satisfying temporal constraints; and (2) which locking algorithms should be used to avoid unpredictable delays? With regard to Question 1, LITMUSRT, a real-time extension of the Linux kernel, is presented and its design is discussed in detail. Notably, LITMUSRT implements link-based scheduling, a novel approach to controlling blocking due to non-preemptive sections. Each implemented scheduler (22 configurations in total) is evaluated under consideration of overheads on a 24-core Intel Xeon platform. The experiments show that partitioned earliest-deadline first (EDF) scheduling is generally preferable in a hard real-time setting, whereas global and clustered EDF scheduling are effective in a soft real-time setting. With regard to Question 2, real-time locking protocols are required to ensure that the maximum delay due to priority inversion can be bounded a priori. Several spinlock- and semaphore-based multiprocessor real-time locking protocols for mutual exclusion (mutex), reader-writer (RW) exclusion, and k-exclusion are proposed and analyzed. A new category of RW locks suited to worst-case analysis, termed phase-fair locks, is proposed and three efficient phase-fair spinlock implementations are provided (one with few atomic operations, one with low space requirements, and one with constant RMR complexity). Maximum priority-inversion blocking is proposed as a natural complexity measure for semaphore protocols. It is shown that there are two classes of schedulability analysis, namely suspension-oblivious and suspension-aware analysis, that yield two different lower bounds on blocking. Five asymptotically optimal locking protocols are designed and analyzed: a family of mutex, RW, and k-exclusion protocols for global, partitioned, and clustered scheduling that are asymptotically optimal in the suspension-oblivious case, and a mutex protocol for partitioned scheduling that is asymptotically optimal in the suspension-aware case. A LITMUSRT-based empirical evaluation is presented that shows these protocols to be practical.
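At admission time, the partitioned-EDF configuration favored for hard real-time reduces to packing task utilizations onto cores so each core can run uniprocessor EDF. A minimal first-fit-decreasing sketch with assumed utilizations (not the dissertation's code):

```python
# Admission-time sketch of partitioned EDF (utilizations assumed): pack
# task utilizations onto cores first-fit-decreasing; each core then runs
# plain uniprocessor EDF, which is optimal up to utilization 1.0.

def partition_first_fit(tasks, n_cores):
    cores = [[] for _ in range(n_cores)]
    load = [0.0] * n_cores
    for name, util in sorted(tasks, key=lambda t: -t[1]):   # heaviest first
        for c in range(n_cores):
            if load[c] + util <= 1.0:
                cores[c].append(name)
                load[c] += util
                break
        else:
            return None   # packing failed; global/clustered EDF may still work
    return cores

tasks = [("a", 0.6), ("b", 0.5), ("c", 0.4), ("d", 0.3)]
print(partition_first_fit(tasks, n_cores=2))
# -> [['a', 'c'], ['b', 'd']]: core 0 packed to exactly 1.0, core 1 to 0.8
```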

Proceedings ArticleDOI
01 Dec 2011
TL;DR: A distributed suboptimal joint mode selection and resource allocation scheme is proposed that performs close to the optimal scheme both in terms of resource efficiency and user fairness and is benchmarked with respect to the centralized optimal solution.
Abstract: Device-to-device (D2D) communications underlaying a cellular infrastructure has recently been proposed as a means of increasing the cellular capacity, improving the user throughput and extending the battery lifetime of user equipments by facilitating the reuse of spectrum resources between D2D and cellular links. In network assisted D2D communications, when two devices are in the proximity of each other, the network can not only help the devices to set the appropriate transmit power and schedule time and frequency resources but also to determine whether communication should take place via the direct D2D link (D2D mode) or via the cellular base station (cellular mode). In this paper we formulate the joint mode selection, scheduling and power control task as an optimization problem that we first solve assuming the availability of a central entity. We also propose a distributed suboptimal joint mode selection and resource allocation scheme that we benchmark with respect to the centralized optimal solution. We find that the distributed scheme performs close to the optimal scheme both in terms of resource efficiency and user fairness.

Proceedings ArticleDOI
09 Oct 2011
TL;DR: Empirical evaluation shows that RT-Xen can provide effective real-time scheduling to guest Linux operating systems at a 1ms quantum, while incurring only moderate overhead for all the fixed-priority server algorithms.
Abstract: As system integration becomes an increasingly important challenge for complex real-time systems, there has been a significant demand for supporting real-time systems in virtualized environments. This paper presents RT-Xen, the first real-time hypervisor scheduling framework for Xen, the most widely used open-source virtual machine monitor (VMM). RT-Xen bridges the gap between real-time scheduling theory and Xen, whose wide-spread adoption makes it an attractive platform for integrating a broad range of real-time and embedded systems. Moreover, RT-Xen provides an open-source platform for researchers and integrators to develop and evaluate real-time scheduling techniques, which to date have been studied predominantly via analysis and simulations. Extensive experimental results demonstrate the feasibility, efficiency, and efficacy of fixed-priority hierarchical real-time scheduling in RT-Xen. RT-Xen instantiates a suite of fixed-priority servers (Deferrable Server, Periodic Server, Polling Server, and Sporadic Server). While the server algorithms are not new, this empirical study represents the first comprehensive experimental comparison of these algorithms within the same virtualization platform. Our empirical evaluation shows that RT-Xen can provide effective real-time scheduling to guest Linux operating systems at a 1ms quantum, while incurring only moderate overhead for all the fixed-priority server algorithms. While more complex algorithms such as Sporadic Server do incur higher overhead, none of the overhead differences among different server algorithms are significant. Deferrable Server generally delivers better soft real-time performance than the other server algorithms, while Periodic Server incurs high deadline miss ratios in overloaded situations.

Journal ArticleDOI
TL;DR: In this paper, the minimization of transmission completion time for a given number of bits per user in an energy harvesting communication system, where energy harvesting instants are known in an offline manner is considered.
Abstract: The minimization of transmission completion time for a given number of bits per user in an energy harvesting communication system, where energy harvesting instants are known in an offline manner is considered. An achievable rate region with structural properties satisfied by the 2-user AWGN Broadcast Channel capacity region is assumed. It is shown that even though all data are available at the beginning, a non-negative amount of energy from each energy harvest is deferred for later use such that the transmit power starts at its lowest value and rises as time progresses. The optimal scheduler ends the transmission to both users at the same time. Exploiting the special structure in the problem, the iterative offline algorithm, FlowRight, from earlier literature, is adapted and proved to solve this problem. The solution has polynomial complexity in the number of harvests used, and is observed to converge quickly on numerical examples.

Book ChapterDOI
01 Jan 2011
TL;DR: The Intelligent Randomization In Scheduling (IRIS) system, a software scheduling assistant for the Federal Air Marshals who provide law enforcement aboard U.S. commercial flights, is implemented; it models the problem as a Stackelberg game, with FAMS as leaders that commit to a flight coverage schedule and terrorists as followers that attempt to attack a flight.
Abstract: Security is a concern of major importance to governments and companies throughout the world. With limited resources, complete coverage of potential points of attack is not possible. Deterministic allocation of available law enforcement agents introduces predictable vulnerabilities that can be exploited by adversaries. Strategic randomization is a game-theoretic alternative that we implement in the Intelligent Randomization In Scheduling (IRIS) system, a software scheduling assistant for the Federal Air Marshals (FAMs) who provide law enforcement aboard U.S. commercial flights. In IRIS, we model the problem as a Stackelberg game, with FAMS as leaders that commit to a flight coverage schedule and terrorists as followers that attempt to attack a flight. The FAMS domain presents three challenges unique to transportation network security that we address in the implementation of IRIS. First, with tens of thousands of commercial flights per day, the size of the Stackelberg game we need to solve is tremendous. We use ERASER-C, the fastest known algorithm for solving this class of Stackelberg games. Second, creating the game itself becomes a challenge due to the number of payoffs we must enter for these large games. To address this, we create an attribute-based preference elicitation system to determine reward values. Third, the complex scheduling constraints in transportation networks make it computationally prohibitive to model the game by explicitly modeling all combinations of valid schedules. Instead, we model the leader's strategy space by incorporating a representation of the underlying scheduling constraints. The scheduling assistant has been delivered to the FAMS and is currently undergoing testing and review for possible incorporation into their scheduling practices. In this paper, we discuss the design choices and challenges encountered during the implementation of IRIS.

Proceedings ArticleDOI
14 Jun 2011
TL;DR: This paper rethinks resource allocation and job scheduling for a data analytics system in the cloud, embracing the heterogeneity of the underlying platforms and workloads, and proposes a metric of share in a heterogeneous cluster to realize a scheduling scheme that achieves both high performance and fairness.
Abstract: Data analytics are key applications running in the cloud computing environment. To improve performance and cost-effectiveness of a data analytics cluster in the cloud, the data analytics system should account for heterogeneity of the environment and workloads. In addition, it also needs to provide fairness among jobs when multiple jobs share the cluster. In this paper, we rethink resource allocation and job scheduling on a data analytics system in the cloud to embrace the heterogeneity of the underlying platforms and workloads. To that end, we suggest an architecture to allocate resources to a data analytics cluster in the cloud, and propose a metric of share in a heterogeneous cluster to realize a scheduling scheme that achieves high performance and fairness.

Journal ArticleDOI
TL;DR: An efficient distributed algorithm is proposed that produces a collision-free schedule for data aggregation in WSNs and it is theoretically proved that the delay of the aggregation schedule generated by the algorithm is at most 16R + Δ - 14 time slots.
Abstract: Data aggregation is a key functionality in wireless sensor networks (WSNs). This paper focuses on the data aggregation scheduling problem to minimize the delay (or latency). We propose an efficient distributed algorithm that produces a collision-free schedule for data aggregation in WSNs. We theoretically prove that the delay of the aggregation schedule generated by our algorithm is at most 16R + Δ - 14 time slots. Here, R is the network radius and Δ is the maximum node degree in the communication graph of the original network. Our algorithm significantly improves the previously known best data aggregation algorithm with an upper bound of delay of 24D + 6Δ + 16 time slots, where D is the network diameter (note that D can be as large as 2R). We conduct extensive simulations to study the practical performance of our proposed data aggregation algorithm. Our simulation results corroborate our theoretical results and show that our algorithms perform better in practice. We prove that the overall lower bound of delay for data aggregation under any interference model is max{log n, R}, where n is the network size. We provide an example to show that the lower bound is (approximately) tight under the protocol interference model when rI = r, where rI is the interference range and r is the transmission range. We also derive the lower bound of delay under the protocol interference model when r < rI < 3r and when rI ≥ 3r.
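Plugging sample values into the two bounds quoted above shows the size of the improvement; R and Δ are assumed, and D is set to its worst case of 2R, as the abstract notes.

```python
# Sample arithmetic on the two delay bounds (R and Δ assumed; D = 2R
# is the worst case relative to the radius).

R, delta = 10, 8
D = 2 * R

new_bound = 16 * R + delta - 14       # this paper's schedule
old_bound = 24 * D + 6 * delta + 16   # previously best known
print(new_bound, old_bound)           # -> 154 vs 544 time slots
```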

Proceedings ArticleDOI
29 Nov 2011
TL;DR: A new task decomposition method is proposed that decomposes each parallel task into a set of sequential tasks and achieves resource augmentation bounds of 2.62 and 3.42 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively.
Abstract: Multi-core processors offer a significant performance increase over single-core processors. Therefore, they have the potential to enable computation-intensive real-time applications with stringent timing constraints that cannot be met on traditional single-core processors. However, most results in traditional multiprocessor real-time scheduling are limited to sequential programming models and ignore intra-task parallelism. In this paper, we address the problem of scheduling periodic parallel tasks with implicit deadlines on multi-core processors. We first consider a synchronous task model where each task consists of segments, each segment having an arbitrary number of parallel threads that synchronize at the end of the segment. We propose a new task decomposition method that decomposes each parallel task into a set of sequential tasks. We prove that our task decomposition achieves a resource augmentation bound of 2.62 and 3.42 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively. Finally, we extend our analysis to directed acyclic graph (DAG) tasks. We show how these tasks can be converted into synchronous tasks such that the same transformation can be applied and the same augmentation bounds hold.

Journal ArticleDOI
TL;DR: A new routing/scheduling back-pressure algorithm that not only guarantees network stability (throughput optimality), but also adaptively selects a set of optimal routes based on shortest-path information in order to minimize average path lengths between each source and destination pair is proposed.
Abstract: Back-pressure-type algorithms based on the algorithm by Tassiulas and Ephremides have recently received much attention for jointly routing and scheduling over multihop wireless networks. However, this approach has a significant weakness in routing because the traditional back-pressure algorithm explores and exploits all feasible paths between each source and destination. While this extensive exploration is essential in order to maintain stability when the network is heavily loaded, under light or moderate loads, packets may be sent over unnecessarily long routes, and the algorithm could be very inefficient in terms of end-to-end delay and routing convergence times. This paper proposes a new routing/scheduling back-pressure algorithm that not only guarantees network stability (throughput optimality), but also adaptively selects a set of optimal routes based on shortest-path information in order to minimize average path lengths between each source and destination pair. Our results indicate that under the traditional back-pressure algorithm, the end-to-end packet delay first decreases and then increases as a function of the network load (arrival rate). This surprising low-load behavior is explained by the fact that the traditional back-pressure algorithm exploits all paths (including very long ones) even when the traffic load is light. On the other hand, the proposed algorithm adaptively selects a set of routes according to the traffic load, so that long paths are used only when necessary, thus resulting in much smaller end-to-end packet delays as compared to the traditional back-pressure algorithm.
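The baseline rule the paper refines is easy to state: each link serves the commodity with the largest queue-backlog differential, so packets flow "downhill" along queue gradients. A sketch with assumed queue values (the paper's contribution, biasing this choice with shortest-path information, is not reproduced here):

```python
# Classic back-pressure weight computation (queue backlogs are assumed).
# queues[node][commodity] = backlog; each link picks the commodity with
# the largest positive backlog differential.

queues = {"s": {"d1": 9, "d2": 4}, "a": {"d1": 5, "d2": 1},
          "b": {"d1": 2, "d2": 6}, "d1": {"d1": 0}, "d2": {"d2": 0}}
links = [("s", "a"), ("s", "b"), ("a", "d1"), ("b", "d2")]

def backpressure_weights():
    for u, v in links:
        best = max(queues[u],
                   key=lambda c: queues[u][c] - queues[v].get(c, 0))
        w = max(0, queues[u][best] - queues[v].get(best, 0))
        print(f"{u}->{v}: serve {best}, weight {w}")

backpressure_weights()
# Scheduling then activates a non-interfering link set maximizing total
# weight; with no path-length bias, long detours can carry traffic too.
```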

Journal ArticleDOI
TL;DR: A hybrid Pareto-based discrete artificial bee colony algorithm is presented for solving the multi-objective flexible job shop scheduling problem; comparisons with other recently published algorithms show its efficiency and effectiveness.
Abstract: This paper presents a hybrid Pareto-based discrete artificial bee colony algorithm for solving the multi-objective flexible job shop scheduling problem. In the hybrid algorithm, each solution corresponds to a food source, which is composed of two components, i.e., the routing component and the scheduling component. Each component is filled with discrete values. A crossover operator is developed for the employed bees to learn valuable information from each other. An external Pareto archive set is designed to record the non-dominated solutions found so far. A fast Pareto set update function is introduced in the algorithm. Several local search approaches are designed to balance the exploration and exploitation capability of the algorithm. Experimental results on the well-known benchmark instances and comparisons with other recently published algorithms show the efficiency and effectiveness of the proposed algorithm.
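The Pareto-archive bookkeeping the abstract mentions rests on the standard dominance test. A minimal sketch with assumed two-objective values (e.g., makespan and total workload, both minimized):

```python
# Sketch of Pareto-archive maintenance (objective vectors assumed; both
# objectives minimized, e.g. makespan and total workload).

def dominates(a, b):
    """a dominates b iff a is no worse in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, candidate):
    if candidate in archive or any(dominates(k, candidate) for k in archive):
        return archive                    # duplicate or dominated: discard
    archive = [s for s in archive if not dominates(candidate, s)]
    archive.append(candidate)             # archive stays mutually non-dominated
    return archive

archive = []
for sol in [(10, 7), (9, 9), (8, 8), (12, 5), (8, 8)]:
    archive = update_archive(archive, sol)
print(archive)   # -> [(10, 7), (8, 8), (12, 5)]: the non-dominated front so far
```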

Proceedings ArticleDOI
26 Oct 2011
TL;DR: This paper develops methodologies for incorporating task placement constraints and machine properties into performance benchmarks of large compute clusters, and provides a simple model of the performance impact of constraints in which task scheduling delays increase with the Utilization Multiplier (UM).
Abstract: Evaluating the performance of large compute clusters requires benchmarks with representative workloads. At Google, performance benchmarks are used to obtain performance metrics such as task scheduling delays and machine resource utilizations to assess changes in application codes, machine configurations, and scheduling algorithms. Existing approaches to workload characterization for high performance computing and grids focus on task resource requirements for CPU, memory, disk, I/O, network, etc. Such resource requirements address how much resource is consumed by a task. However, in addition to resource requirements, Google workloads commonly include task placement constraints that determine which machine resources are consumed by tasks. Task placement constraints arise because of task dependencies such as those related to hardware architecture and kernel version. This paper develops methodologies for incorporating task placement constraints and machine properties into performance benchmarks of large compute clusters. Our studies of Google compute clusters show that constraints increase average task scheduling delays by a factor of 2 to 6, which often results in tens of minutes of additional task wait time. To understand why, we extend the concept of resource utilization to include constraints by introducing a new metric, the Utilization Multiplier (UM). UM is the ratio of the resource utilization seen by tasks with a constraint to the average utilization of the resource. UM provides a simple model of the performance impact of constraints in that task scheduling delays increase with UM. Last, we describe how to synthesize representative task constraints and machine properties, and how to incorporate this synthesis into existing performance benchmarks. Using synthetic task constraints and machine properties generated by our methodology, we accurately reproduce performance metrics for benchmarks of Google compute clusters with a discrepancy of only 13% in task scheduling delay and 5% in resource utilization.
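The UM metric itself is a one-line computation, per the abstract's definition; the machine fleet and the "needs SSD" constraint below are assumed for illustration.

```python
# Utilization Multiplier per the abstract: UM = utilization of a resource
# as seen by tasks with a constraint / average utilization of the resource.
# Fleet data below is hypothetical.

machines = [  # (cpu_utilization, has_ssd)
    (0.90, True), (0.85, True), (0.40, False), (0.35, False), (0.50, False),
]

avg_util = sum(u for u, _ in machines) / len(machines)
eligible = [u for u, ssd in machines if ssd]   # machines satisfying "needs SSD"
um = (sum(eligible) / len(eligible)) / avg_util

print(f"average utilization {avg_util:.2f}, UM for 'needs SSD' = {um:.2f}")
# UM ~ 1.46 here: constrained tasks only see the hotter machines, so their
# scheduling delay grows with UM, matching the paper's observation.
```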

Proceedings ArticleDOI
10 Apr 2011
TL;DR: System-level simulation results show that coordination at the transmission strategy and resource allocation level can already significantly improve the overall network throughput as compared to a conventional network design with fixed transmit power and per-cell zero-forcing beamforming.
Abstract: The mitigation of intercell interference is a central issue for future-generation wireless cellular networks, where frequencies are reused aggressively and where hierarchical cellular structures may heavily overlap. The paper examines the benefit of coordinating transmission strategies and resource allocation schemes across multiple cells for interference mitigation. For a multicell network serving multiple users per cell sector and where both the base stations and the remote users are equipped with multiple antennas, this paper proposes a joint proportionally fair scheduling, spatial multiplexing, and power spectrum adaptation method that coordinates multiple base stations with an objective of optimizing the overall network utility. The proposed scheme optimizes the user schedule, transmit and receive beamforming vectors, and transmit power spectra jointly, while taking into consideration both the intercell and intracell interference and the fairness among the users. The proposed system is shown to significantly improve the overall network throughput while maintaining fairness, as compared to a conventional network with per-cell zero-forcing beamforming and a fixed transmit power spectrum. The proposed system goes toward the vision of a fully coordinated multicell network, whereby transmission strategies and resource allocation schemes (rather than transmit signals) are coordinated across the base stations as a first step.

Journal Article
TL;DR: A Load Balanced Min-Min (LBMM) algorithm is proposed that reduces the makespan and increases resource utilization in grid computing; the proposed method has two phases.
Abstract: Grid computing has become a real alternative to traditional supercomputing environments for developing parallel applications that harness massive computational resources. However, the complexity incurred in building such parallel Grid-aware applications is higher than in traditional parallel computing environments, raising issues such as resource discovery, heterogeneity, fault tolerance, and task scheduling. Load-balanced task scheduling is a very important problem in complex grid environments, and task scheduling, which is an NP-complete problem, has become a focus of research in the grid computing area. The traditional Min-Min algorithm is a simple algorithm that produces schedules with a smaller makespan than the other traditional algorithms in the literature, but it fails to produce a load-balanced schedule. In this paper, a Load Balanced Min-Min (LBMM) algorithm is proposed that reduces the makespan and increases resource utilization. The proposed method has two phases: in the first phase, the traditional Min-Min algorithm is executed, and in the second phase, the tasks are rescheduled to use the unutilized resources effectively.
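
A simplified sketch of the two phases; the ETC matrix and the single-move rebalancing rule are illustrative assumptions, not the paper's exact pseudocode.

```python
# Toy LBMM: phase 1 is classic Min-Min; phase 2 moves a task off the most
# loaded resource when that strictly lowers the makespan.

etc = [[2, 4], [2, 4], [2, 4], [2, 4], [10, 12]]  # etc[task][resource], assumed
n_res = 2

def min_min(etc):
    """Phase 1: classic Min-Min on expected completion times."""
    ready = [0.0] * n_res
    assign, todo = {}, set(range(len(etc)))
    while todo:
        # Pick the unscheduled task with the smallest minimum completion time.
        t, r, ct = min(((t, r, ready[r] + etc[t][r])
                        for t in todo for r in range(n_res)),
                       key=lambda x: x[2])
        assign[t], ready[r] = r, ct
        todo.remove(t)
    return assign, ready

def rebalance(etc, assign, ready):
    """Phase 2 (simplified): one improving move off the heaviest resource."""
    heavy = max(range(n_res), key=lambda r: ready[r])
    for t in [t for t, r in assign.items() if r == heavy]:
        for r in range(n_res):
            if r != heavy and ready[r] + etc[t][r] < ready[heavy]:
                ready[heavy] -= etc[t][heavy]
                ready[r] += etc[t][r]
                assign[t] = r
                return assign, ready
    return assign, ready

assign, ready = min_min(etc)
print("Min-Min makespan:", max(ready))   # 16: small tasks pile onto resource 0
assign, ready = rebalance(etc, assign, ready)
print("LBMM makespan   :", max(ready))   # 14: one small task moved to the idle one
```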