Journal Article
Improving utilization of heterogeneous clusters
TL;DR: This article presents a novel way of expressing computational capacity, more adequate for heterogeneous clusters, and also advocates for task migration in order to further improve the utilization.
Abstract:
This work has been supported by the Spanish Science and Technology Commission under contracts TIN2016-76635-C2-2-R and TIN2016-81840-REDT (CAPAP-H6 network) and the
European HiPEAC Network of Excellence
Citations
Journal Article
Performance and energy task migration model for heterogeneous clusters
TL;DR: This article presents a set of linear regression models that predict the impact of task migration on different objectives, such as performance and energy consumption, using a small set of easily measurable parameters.
Proceedings Article
A Simulator for Intelligent Workload Managers in Heterogeneous Clusters
TL;DR: This paper proposes a simulation framework for studying workload managers based on deep reinforcement learning (DRL) techniques. The framework simulates heterogeneous clusters based on multicore architectures, taking into account contention in shared memory access and energy consumption.
Book Chapter
Task Scheduler for Heterogeneous Data Centres Based on Deep Reinforcement Learning
TL;DR: The authors leverage machine learning techniques to develop a workload manager intended to improve the efficiency of modern data centres. The manager can not only choose the most adequate job for scheduling, but also determine the best compute resources for its execution.
References
Journal Article
Exploiting process lifetime distributions for dynamic load balancing
TL;DR: The measurements indicate that the distribution of lifetimes for a UNIX process is Pareto (heavy-tailed), with a consistent functional form over a variety of workloads, and it is shown how to apply this distribution to derive a preemptive migration policy that requires no hand-tuned parameters.
Proceedings Article
DMTCP: Transparent checkpointing for cluster computations and the desktop
TL;DR: DMTCP is a transparent user-level checkpointing package for distributed applications. It is used for the runCMS experiment of the Large Hadron Collider at CERN, and it can be incorporated and distributed as a checkpoint-restart module within some larger package.
Proceedings Article
Managing cost, performance, and reliability tradeoffs for energy-aware server provisioning
TL;DR: ACES's energy savings are close to optimal, and it delivers power proportionality while balancing the tradeoff between energy savings and reliability costs.
Journal Article
A Survey of Task Allocation and Load Balancing in Distributed Systems
TL;DR: This survey categorizes and reviews representative studies on task allocation and load balancing according to the general characteristics of different distributed systems, and builds a comprehensive taxonomy of them.
Proceedings Article
Lifetime or energy: Consolidating servers with reliability control in virtualized cloud datacenters
TL;DR: This paper proposes a Reliability-Aware server Consolidation stratEgy, named RACE, to address when and how to perform energy-efficient server consolidation in a reliability-friendly and profitable way. It also develops a utility model that unifies multiple constraints on performance SLAs, reliability factors, and energy costs in a holistic manner.
Related Papers (5)
Foundations for the integration of scheduling techniques into compilers for parallel languages
Wolf Zimmermann, Welf Löwe +1 more