Kaiyuan Guo
Researcher at Tsinghua University
Publications - 35
Citations - 2290
Kaiyuan Guo is an academic researcher at Tsinghua University. His research focuses on field-programmable gate arrays and convolutional neural networks. He has an h-index of 11 and has co-authored 27 publications that have received 1,683 citations.
Papers
Proceedings ArticleDOI
Going Deeper with Embedded FPGA Platform for Convolutional Neural Network
Jiantao Qiu,Jie Wang,Song Yao,Kaiyuan Guo,Boxun Li,Erjin Zhou,Jincheng Yu,Tianqi Tang,Ningyi Xu,Sen Song,Yu Wang,Huazhong Yang +11 more
TL;DR: This paper presents an in-depth analysis of state-of-the-art CNN models, showing that convolutional layers are computation-centric while fully-connected layers are memory-centric, and proposes a CNN accelerator design on an embedded FPGA for ImageNet large-scale image classification.
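The computation-centric vs. memory-centric distinction can be made concrete by comparing how often each weight is reused. The sketch below uses hypothetical VGG-style layer shapes (not figures from the paper): every convolutional weight is applied at each output position, while a fully-connected weight is used once per input.

```python
# Illustrative sketch with assumed layer shapes: ops-per-weight ratio
# of a convolutional layer vs. a fully-connected layer.

def conv_stats(c_in, c_out, k, h_out, w_out):
    """MACs and weight count for a k x k convolution over an h_out x w_out output map."""
    macs = c_in * c_out * k * k * h_out * w_out
    weights = c_in * c_out * k * k
    return macs, weights

def fc_stats(n_in, n_out):
    """MACs and weight count for a fully-connected layer."""
    macs = n_in * n_out
    weights = n_in * n_out
    return macs, weights

conv_macs, conv_w = conv_stats(c_in=256, c_out=256, k=3, h_out=28, w_out=28)
fc_macs, fc_w = fc_stats(n_in=4096, n_out=4096)

# Each conv weight is reused h_out * w_out = 784 times per image,
# so throughput is bounded by compute, not weight bandwidth.
print(conv_macs // conv_w)  # → 784
# Each FC weight is used exactly once per image, so throughput is
# bounded by how fast weights can be streamed from memory.
print(fc_macs // fc_w)      # → 1
```

This ratio (arithmetic intensity with respect to weights) is why, on a roofline-style analysis, convolutional layers tend to be compute-bound and fully-connected layers memory-bound.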
Journal ArticleDOI
Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA
Kaiyuan Guo,Lingzhi Sui,Jiantao Qiu,Jincheng Yu,Junbin Wang,Song Yao,Song Han,Yu Wang,Huazhong Yang +8 more
TL;DR: This paper proposes Angel-Eye, a programmable and flexible CNN accelerator architecture, together with a data quantization strategy and a compilation tool; it achieves comparable performance and better energy efficiency than peer FPGA implementations on the same platform.
Journal ArticleDOI
[DL] A Survey of FPGA-based Neural Network Inference Accelerators
TL;DR: Neural networks are now widely adopted and show a significant advantage in machine learning over traditional algorithms based on handcrafted features and models; this survey reviews FPGA-based accelerators for neural network inference.
Posted Content
A Survey of FPGA Based Neural Network Accelerator
TL;DR: An investigation from software to hardware, and from the circuit level to the system level, is carried out to give a complete analysis of FPGA-based neural network inference accelerator design and to serve as a guide for future work.
Journal ArticleDOI
Software-Hardware Codesign for Efficient Neural Network Acceleration
TL;DR: An overview of DeePhi's technology flow, covering compression, compilation, and hardware acceleration, which aims to achieve extremely high energy efficiency for both client and datacenter applications with convolutional and recurrent neural networks.