
Chengyong Wu

Researcher at Chinese Academy of Sciences

Publications: 36
Citations: 2574

Chengyong Wu is an academic researcher from the Chinese Academy of Sciences. The author has contributed to research in the topics of Compiler and Cache, has an h-index of 17, and has co-authored 36 publications receiving 2277 citations.
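
For reference, the h-index cited above is the largest h such that the author has at least h papers with h or more citations each. A minimal sketch of the computation, using made-up citation counts rather than this author's actual per-paper data:

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank        # still have `rank` papers with >= rank citations
        else:
            break
    return h

# Illustrative citation counts only, not this author's actual data.
example = [120, 95, 60, 40, 33, 25, 19, 18, 17, 17,
           16, 15, 12, 11, 10, 9, 9, 5, 3, 1]
print(h_index(example))  # -> 12
```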

Papers
Proceedings Article

DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning

TL;DR: This study designs an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance, and energy, and shows that it is possible to build a high-throughput accelerator capable of performing 452 GOP/s in a small footprint.
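
The accelerator's datapath is organized as parallel multipliers feeding an adder tree and a nonlinearity stage. Below is a minimal functional sketch of that pipeline; the 16-lane tile width and the sigmoid nonlinearity are illustrative assumptions, not necessarily the paper's exact configuration:

```python
import math

TN = 16  # number of parallel multiplier lanes (assumed tile width)

def nfu_cycle(inputs, weights, partial_sum=0.0):
    """One cycle: TN parallel multiplies, then a pairwise adder tree."""
    products = [x * w for x, w in zip(inputs, weights)]   # multiplier stage
    level = products
    while len(level) > 1:                                 # adder-tree stage
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return partial_sum + level[0]

def neuron(xs, ws):
    """Accumulate over input tiles, then apply the nonlinearity stage.

    Assumes len(xs) is a multiple of TN.
    """
    acc = 0.0
    for i in range(0, len(xs), TN):
        acc = nfu_cycle(xs[i:i + TN], ws[i:i + TN], acc)
    return 1.0 / (1.0 + math.exp(-acc))                   # sigmoid

print(neuron([0.1] * 64, [0.05] * 64))  # ~0.579
```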
Proceedings Article

A software memory partition approach for eliminating bank-level interference in multicore systems

TL;DR: The main memory system is a shared resource in modern multicore machines; the resulting bank-level interference causes performance degradation in the form of throughput slowdown and unfairness, which this work addresses with a software memory-partitioning approach.
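
The "software" in the title typically means OS-level page coloring: free physical pages are grouped by the bank-index bits of their addresses, and each core is served only pages from its own group, so cores stop contending for the same DRAM banks. A minimal sketch under an assumed, simplified address mapping (the page size and bit positions below are made up, not any real memory controller's layout):

```python
PAGE_SHIFT = 12   # 4 KB pages (assumption)
BANK_SHIFT = 13   # lowest bank-index bit of the physical address (assumption)
BANK_BITS = 3     # 8 banks (assumption)

def bank_of(phys_addr):
    """Extract the DRAM bank index encoded in a physical address."""
    return (phys_addr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)

def allocate_page(free_pages, core_color):
    """Pick a free physical page frame that maps to the core's own banks."""
    for pfn in free_pages:
        if bank_of(pfn << PAGE_SHIFT) == core_color:
            free_pages.remove(pfn)
            return pfn
    raise MemoryError("no free page left in this core's bank partition")

free = list(range(1024))
p0 = allocate_page(free, core_color=0)   # core 0 only gets bank-0 pages
p1 = allocate_page(free, core_color=1)   # core 1 only gets bank-1 pages
print(p0, p1, bank_of(p0 << PAGE_SHIFT), bank_of(p1 << PAGE_SHIFT))
```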
Proceedings Article

Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators

TL;DR: This paper proposes to expand the application scope, error tolerance, and energy savings of inexact computing systems through neural network architectures, and demonstrates that the proposed inexact neural network accelerator can achieve 43.91%-62.49% savings in energy consumption.
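
The underlying premise, that machine-learning computations tolerate arithmetic error, can be illustrated with a toy experiment: inject relative error into every multiply of a small classifier and check how often its predictions change. The perceptron and the noise model below are illustrative stand-ins, not the paper's accelerator or its error model:

```python
import random

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(8)] for _ in range(3)]  # 3 classes

def classify(x, rel_err=0.0):
    """Dot-product classifier; rel_err injects per-multiply inexactness."""
    scores = []
    for w_row in W:
        s = 0.0
        for wi, xi in zip(w_row, x):
            noise = 1.0 + random.uniform(-rel_err, rel_err)
            s += wi * xi * noise                 # inexact multiply
        scores.append(s)
    return scores.index(max(scores))

inputs = [[random.uniform(0, 1) for _ in range(8)] for _ in range(200)]
exact = [classify(x) for x in inputs]
noisy = [classify(x, rel_err=0.05) for x in inputs]
agree = sum(e == n for e, n in zip(exact, noisy)) / len(inputs)
print(f"predictions unchanged under 5% multiply error: {agree:.0%}")
```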
Proceedings Article

Evaluating iterative optimization across 1000 datasets

TL;DR: This study demonstrates that a robust iterative optimization strategy can be derived across data sets: at least one combination of compiler optimizations achieves 86% or more of the best possible speedup across all data sets using Intel's ICC (83% for GNU's GCC).
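
In sketch form, the evaluation amounts to: measure the speedup of many optimization combinations on every data set, normalize each to the per-data-set best, and keep the combination whose worst-case fraction is highest. The flag names and the measure() stub below are placeholders, not the actual ICC/GCC experiment:

```python
import random

random.seed(1)
FLAGS = ["-funroll-loops", "-fvectorize", "-finline", "-ftile", "-fprefetch"]

def measure(combo, dataset):
    """Placeholder: compile with `combo`, run on `dataset`, return a speedup."""
    return 1.0 + 0.1 * len(combo) + random.uniform(-0.3, 0.3)

def robust_combo(datasets, trials=100):
    """Pick the flag combination with the best worst-case speedup fraction."""
    combos = [tuple(f for f in FLAGS if random.random() < 0.5)
              for _ in range(trials)]
    speed = {(c, d): measure(c, d) for c in combos for d in datasets}
    best = {d: max(speed[(c, d)] for c in combos) for d in datasets}

    def worst_fraction(c):  # lowest share of the per-data-set optimum
        return min(speed[(c, d)] / best[d] for d in datasets)

    winner = max(combos, key=worst_fraction)
    return winner, worst_fraction(winner)

combo, frac = robust_combo(datasets=list(range(20)))
print(combo, f"achieves at least {frac:.0%} of the best speedup everywhere")
```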
Proceedings Article

Neuromorphic accelerators: a comparison between neuroscience and machine-learning approaches

TL;DR: This study identifies the key sources of inaccuracy in SNN+STDP, which relate less to the loss of information from spike coding than to the nature of the STDP learning algorithm, and concludes that for the category of applications requiring permanent online learning and moderate accuracy, SNN+STDP hardware accelerators could be a very cost-efficient solution.
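
For context, STDP adjusts a synapse according to the relative timing of pre- and postsynaptic spikes: potentiation when the presynaptic spike arrives first, depression otherwise, with exponentially decaying magnitude. A minimal sketch of the rule using common textbook constants, not the paper's parameters:

```python
import math

A_PLUS, A_MINUS = 0.01, 0.012     # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # decay time constants in ms

def stdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair (spike times in ms)."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post -> long-term potentiation
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    else:        # post before (or with) pre -> long-term depression
        return -A_MINUS * math.exp(dt / TAU_MINUS)

w = 0.5
for t_pre, t_post in [(10, 15), (40, 38), (60, 61)]:
    w = min(1.0, max(0.0, w + stdp_dw(t_pre, t_post)))  # clamp to [0, 1]
print(f"weight after three spike pairs: {w:.4f}")
```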