Guangyu Sun
Researcher at Peking University
Publications - 151
Citations - 7250
Guangyu Sun is an academic researcher at Peking University. The author has contributed to research in topics: Cache & Non-volatile memory. The author has an h-index of 35 and has co-authored 144 publications that have received 5930 citations. Previous affiliations of Guangyu Sun include Tsinghua University & IBM.
Papers
Proceedings Article
Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks
TL;DR: This work implements a CNN accelerator on a VC707 FPGA board and compares it to previous approaches, achieving a peak performance of 61.62 GFLOPS at a 100 MHz working frequency, which significantly outperforms previous approaches.
Proceedings Article
A novel architecture of the 3D stacked MRAM L2 cache for CMPs
TL;DR: This paper stacks MRAM-based L2 caches directly atop CMPs, compares them against SRAM counterparts in terms of performance and energy, and proposes two architectural techniques: a read-preemptive write buffer and an SRAM-MRAM hybrid L2 cache.
Proceedings Article
Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement
TL;DR: The experimental results show that MRAM stacking offers competitive IPC performance with a large reduction in power consumption compared to SRAM and DRAM counterparts.
Proceedings Article
FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates
Yijin Guan, Hao Liang, Ningyi Xu, Wenqiang Wang, Shaoshuai Shi, Xi Chen, Guangyu Sun, Wei Zhang, Jason Cong
TL;DR: FP-DNN (Field Programmable DNN), an end-to-end framework that takes TensorFlow-described DNNs as input, and automatically generates the hardware implementations on FPGA boards with RTL-HLS hybrid templates, is proposed.
Journal Article
Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks
TL;DR: This paper designs and implements Caffeine, a hardware/software co-designed library to efficiently accelerate the entire CNN and FCN on FPGAs, and integrates it into the industry-standard software deep learning framework Caffe.