
Gu-Yeon Wei

Researcher at Harvard University

Publications: 16
Citations: 438

Gu-Yeon Wei is an academic researcher at Harvard University. The author has contributed to research in topics including Hardware acceleration and Multi-core processor. The author has an h-index of 8 and has co-authored 16 publications receiving 296 citations.

Papers
Proceedings ArticleDOI

DeepRecSys: a system for optimizing end-to-end at-scale neural recommendation inference

TL;DR: This paper presents DeepRecSched, a recommendation inference scheduler that maximizes latency-bounded throughput by taking into account the characteristics of inference query sizes and arrival patterns, model architectures, and the underlying hardware systems.
Journal ArticleDOI

ReVIVaL: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency

TL;DR: ReVIVaL is presented, which combines two fine-grained post-fabrication tuning techniques, voltage interpolation (VI) and variable latency (VL), and shows that the frequency variation between chips, between cores on one chip, and between functional units within cores can be reduced to a very small range.
Journal ArticleDOI

ReVIVaL: A Variation-Tolerant Architecture Using Voltage Interpolation and Variable Latency

TL;DR: The ReVIVaL technique combines the post-fabrication tuning techniques voltage interpolation (VI) and variable latency (VL) to reduce frequency variations caused by process variations.
Posted Content

DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference.

TL;DR: DeepRecSched is proposed, a recommendation inference scheduler that maximizes latency-bounded throughput by taking into account characteristics of inference query size and arrival patterns, model architectures, and underlying hardware systems.
Journal ArticleDOI

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads

TL;DR: This paper presents SMAUG, a DNN framework purpose-built for simulating end-to-end deep learning applications, which can be used to evaluate the performance and energy efficiency of deep learning workloads.