Improving utilization of heterogeneous clusters

doi:10.1007/S11227-020-03175-4

Journal Article

Esteban Stafford, +1 more

- 27 Jan 2020 -

The Journal of Supercomputing

- Vol. 76, Iss: 11, pp 8787-8800

TLDR

This article presents a novel way of expressing computational capacity that is better suited to heterogeneous clusters, and advocates task migration to further improve utilization.

Abstract:

This work has been supported by the Spanish Science and Technology Commission under contracts TIN2016-76635-C2-2-R and TIN2016-81840-REDT (CAPAP-H6 network) and by the European HiPEAC Network of Excellence.

Citations

Journal Article

Performance and energy task migration model for heterogeneous clusters

Esteban Stafford, +1 more

- 23 Feb 2021 -

The Journal of Supercomputing

TL;DR: A set of linear regression models is presented that predicts the impact of task migration on different objectives, such as performance and energy consumption, using a small set of easily measurable parameters.

Proceedings Article

A Simulator for Intelligent Workload Managers in Heterogeneous Clusters

Adrian Herrera, +3 more

TL;DR: This paper proposes a simulation framework for studying workload managers based on deep reinforcement learning (DRL) techniques. It simulates heterogeneous clusters built on multicore architectures, taking into account contention in shared memory access and energy consumption.

Book Chapter

Task Scheduler for Heterogeneous Data Centres Based on Deep Reinforcement Learning

Mario A. Ibanez, +2 more

TL;DR: The authors leverage machine learning techniques to develop a workload manager that improves the efficiency of modern data centres. The manager not only chooses the most adequate job to schedule, but also determines the best compute resources for its execution.

References

Journal Article

Exploiting process lifetime distributions for dynamic load balancing

Mor Harchol-Balter, +1 more

- 01 Aug 1997 -

ACM Transactions on Computer Systems

TL;DR: The measurements indicate that the distribution of lifetimes for a UNIX process is Pareto (heavy-tailed), with a consistent functional form over a variety of workloads, and it is shown how to apply this distribution to derive a preemptive migration policy that requires no hand-tuned parameters.

Proceedings Article

DMTCP: Transparent checkpointing for cluster computations and the desktop

Jason Ansel, +2 more

TL;DR: DMTCP is a transparent user-level checkpointing package for distributed applications. It is used for the runCMS experiment of the Large Hadron Collider at CERN, and it can be incorporated and distributed as a checkpoint-restart module within a larger package.

Proceedings Article

Managing cost, performance, and reliability tradeoffs for energy-aware server provisioning

Brian Guenter, +2 more

TL;DR: ACES's energy savings are close to the optimal and it delivers power proportionality while balancing the tradeoff between energy savings and reliability costs.

Journal Article

A Survey of Task Allocation and Load Balancing in Distributed Systems

Yichuan Jiang

- 01 Feb 2016 -

IEEE Transactions on Parallel and Distri...

TL;DR: This survey categorizes and reviews representative studies on task allocation and load balancing according to the general characteristics of different distributed systems, and builds a comprehensive taxonomy of them.

Proceedings Article

Lifetime or energy: Consolidating servers with reliability control in virtualized cloud datacenters

Wei Deng, +5 more

TL;DR: This paper proposes a Reliability-Aware server Consolidation stratEgy, named RACE, to address when and how to perform energy-efficient server consolidation in a reliability-friendly and profitable way. It develops a utility model that unifies multiple constraints on performance SLAs, reliability factors, and energy costs in a holistic manner.

IEEE Transactions on Computers

Study of Scheduling in Programming Languages of Multi-Core Processor

Mina Hosseini-Rad, +2 more

Models and Languages for High-Performance Computing

Domenico Talia

Related Papers (5)

Linguistic support for heterogeneous parallel processing: a survey and an approach

Foundations for the integration of scheduling techniques into compilers for parallel languages

A Survey of Proposed Architectures for the Execution of Functional Languages

Study of Scheduling in Programming Languages of Multi-Core Processor

Models and Languages for High-Performance Computing