Yungang Bao

Researcher at Chinese Academy of Sciences

Publications -  94
Citations -  1544

Yungang Bao is an academic researcher from the Chinese Academy of Sciences. The author has contributed to research in topics including computer science and cloud computing. The author has an h-index of 18 and has co-authored 82 publications receiving 1079 citations. Previous affiliations of Yungang Bao include Huawei.

Papers
Proceedings Article

A software memory partition approach for eliminating bank-level interference in multicore systems

TL;DR: The main memory system is a shared resource in modern multicore machines, and the resulting interference degrades performance, both as throughput slowdown and as unfairness.
Proceedings Article

Who limits the resource efficiency of my datacenter: an analysis of Alibaba datacenter traces

TL;DR: A straightforward way to improve resource efficiency is to co-locate different workloads on the same hardware. To quantify resource efficiency and understand the key characteristics of workloads in a co-located cluster, an 8-day Alibaba production trace is analyzed.
Proceedings Article

BestConfig: tapping the performance potential of systems via automatic configuration tuning

TL;DR: BestConfig is a system that automatically finds a best configuration setting, within a resource limit, for a deployed system under a given application workload. Its extensible architecture is designed to automate configuration tuning for general systems.
Proceedings Article

SketchLearn: relieving user burdens in approximate measurement with automated statistical inference

TL;DR: SketchLearn is a novel sketch-based measurement framework that resolves resource conflicts by learning their statistical properties to eliminate conflicting traffic components, and it effectively supports network-wide measurement with limited resources.
Proceedings Article

Fast implementation of DGEMM on Fermi GPU

TL;DR: This paper presents a thorough account of tuning double-precision matrix-matrix multiplication (DGEMM) on the Fermi GPU architecture and chooses an optimal algorithm with blocking in both shared memory and registers to satisfy the constraints of the Fermi memory hierarchy.