Showing papers by "Richard Harper" published in 2022

Open Access

Journal Article

Horus: Interference-Aware and Prediction-Based Scheduling in Deep Learning Systems

Gingfung Yeung¹, Damian Borowiec¹, Renyu Yang², Adrian Friday¹, Richard Harper¹, Peter Garraghan¹

Lancaster University¹, University of Leeds²

01 Jan 2022 - IEEE Transactions on Parallel and Distributed Systems

TL;DR: In this article, an interference-aware and prediction-based resource manager for DL systems is proposed, which proactively predicts GPU utilization of heterogeneous DL jobs extrapolated from the DL model's computation graph features, removing the need for online profiling and isolated reserved GPUs.

Abstract: To accelerate the training of Deep Learning (DL) models, clusters of machines equipped with hardware accelerators such as GPUs are leveraged to reduce execution time. State-of-the-art resource managers are needed to increase GPU utilization and maximize throughput. While co-locating DL jobs on the same GPU has been shown to be effective, this can incur interference causing slowdown. In this article, we propose Horus: an interference-aware and prediction-based resource manager for DL systems. Horus proactively predicts GPU utilization of heterogeneous DL jobs extrapolated from the DL model’s computation graph features, removing the need for online profiling and isolated reserved GPUs. Through micro-benchmarks and job co-location combinations across heterogeneous GPU hardware, we identify GPU utilization as a general proxy metric to determine good placement decisions, in contrast to current approaches which reserve isolated GPUs to perform online profiling and directly measure GPU utilization for each unique submitted job. Our approach promotes high resource utilization and makespan reduction; via real-world experimentation and large-scale trace-driven simulation, we demonstrate that Horus outperforms other DL resource managers by up to 61.5 percent for GPU resource utilization, 23.7–30.7 percent for makespan reduction and 68.3 percent in job wait time reduction.
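
The idea of using predicted GPU utilization as a proxy for co-location decisions can be illustrated with a minimal sketch. This is not the Horus scheduler: the greedy policy, the `Gpu` class, the `place` function, and the `cap` threshold are all hypothetical names chosen for illustration, and the per-job utilization values are assumed to come from some predictor that is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    name: str
    jobs: list = field(default_factory=list)
    load: float = 0.0  # sum of predicted utilization of co-located jobs


def place(jobs, gpus, cap=1.0):
    """Greedy placement (illustrative assumption, not the paper's algorithm).

    jobs maps a job name to its *predicted* GPU utilization in [0, 1].
    Each job goes to the least-loaded GPU whose summed predicted
    utilization stays at or below cap. Returns the names of jobs that
    could not be placed and must wait.
    """
    waiting = []
    # Consider the heaviest jobs first so they are not squeezed out later.
    for job, util in sorted(jobs.items(), key=lambda kv: kv[1], reverse=True):
        candidates = [g for g in gpus if g.load + util <= cap]
        if not candidates:
            waiting.append(job)
            continue
        target = min(candidates, key=lambda g: g.load)
        target.jobs.append(job)
        target.load += util
    return waiting
```

Calling `place({"bert": 0.9, "resnet": 0.6}, [Gpu("gpu0"), Gpu("gpu1")])` would assign each job to its own GPU, since co-locating them would exceed the assumed cap of 1.0. A real resource manager would also need to model interference between specific job pairs, which a simple sum of utilizations does not capture.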

32 citations

Book Chapter

Deep Soil Carbon: Characteristics and Measurement with Particular Bearing on Kaolinitic Profiles

Richard Harper¹

Lancaster University¹

01 Jan 2022

2 citations