Geng Yuan
Researcher at Northeastern University
Publications - 81
Citations - 1174
Geng Yuan is an academic researcher at Northeastern University. The author has contributed to research on the topics Computer science and Pruning (decision trees). The author has an h-index of 13 and has co-authored 58 publications receiving 623 citations. Previous affiliations of Geng Yuan include Syracuse University.
Papers
Proceedings Article
CirCNN: accelerating and compressing deep neural networks using block-circulant weight matrices
Caiwen Ding,Siyu Liao,Yanzhi Wang,Zhe Li,Ning Liu,Youwei Zhuo,Chao Wang,Xuehai Qian,Yu Bai,Geng Yuan,Xiaolong Ma,Yipeng Zhang,Jian Tang,Qinru Qiu,Xue Lin,Bo Yuan
TL;DR: This work proposes the CirCNN architecture, a universal DNN inference engine that can be implemented on various hardware/software platforms with a configurable network architecture (e.g., layer type, size, and scale). FFT serves as the key computing kernel, which ensures universal and small-footprint implementations.
Proceedings Article
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Caiwen Ding,Siyu Liao,Yanzhi Wang,Zhe Li,Ning Liu,Youwei Zhuo,Chao Wang,Xuehai Qian,Yu Bai,Geng Yuan,Xiaolong Ma,Yipeng Zhang,Jian Tang,Qinru Qiu,Xue Lin,Bo Yuan
TL;DR: CirCNN uses Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (in both inference and training) from O(n^2) to O(n log n) and the storage complexity from O(n^2) to O(n), with negligible accuracy loss.
Proceedings Article
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li,Geng Yuan,Yang Wen,Eric Hu,Georgios Evangelidis,Sergey Tulyakov,Yanzhi Wang,Jiansong Ren
TL;DR: This work shows that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance. A latency-driven analysis of ViT-based architectures and the experimental results support the claim that powerful vision transformers can achieve ultra-fast inference on the edge.
Posted Content
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
TL;DR: This work proposes the YOLObile framework for real-time object detection on mobile devices via compression-compilation co-design, together with a novel block-punched pruning scheme that applies to any kernel size.
Posted Content
Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?
Xiaolong Ma,Sheng Lin,Shaokai Ye,Zhezhi He,Linfeng Zhang,Geng Yuan,Sia Huat Tan,Zhengang Li,Deliang Fan,Xuehai Qian,Xue Lin,Kaisheng Ma,Yanzhi Wang
TL;DR: The authors conclude that structured pruning has greater potential than non-structured pruning, and report the first fully binarized (for all layers) DNNs, which can be lossless in accuracy in many cases.