En-Yu Yang
Researcher at Harvard University
Publications: 14
Citations: 535
En-Yu Yang is an academic researcher at Harvard University. The author has contributed to research in topics including deep learning and computer science, has an h-index of 5, and has co-authored 9 publications receiving 295 citations. Previous affiliations of En-Yu Yang include National Tsing Hua University.
Papers
Proceedings Article
A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors
Wei-Hao Chen, K. C. Li, Wei-Yu Lin, K. C. Hsu, Pin-Yi Li, Cheng-Han Yang, Cheng-Xin Xue, En-Yu Yang, Yen-Kai Chen, Yun-Sheng Chang, Tzu-Hsiang Hsu, Ya-Chin King, Chorng-Jung Lin, Ren-Shuo Liu, Chih-Cheng Hsieh, Kea-Tiong Tang, Meng-Fan Chang +16 more
TL;DR: Many artificial intelligence (AI) edge devices use nonvolatile memory (NVM) to store the weights for the neural network (trained off-line on an AI server), and require low-energy and fast I/O accesses.
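The multiply-and-accumulate named in the title is the binarized kind: with ±1 weights and activations, each multiply reduces to an XNOR on the bit encodings and the accumulate to a popcount, which is what makes nonvolatile computing-in-memory arrays a good fit. A minimal NumPy sketch of that arithmetic (illustrative only, not the macro's circuit):

```python
import numpy as np

def binary_mac(inputs, weights):
    """Multiply-and-accumulate for a binary DNN layer.

    Illustrative sketch: inputs and weights are {-1, +1}, so the
    multiply becomes an XNOR on the {0, 1} encodings and the
    accumulate becomes a popcount.
    """
    in_bits = (inputs > 0).astype(np.uint8)   # map {-1, +1} -> {0, 1}
    w_bits = (weights > 0).astype(np.uint8)
    matches = np.logical_not(np.logical_xor(in_bits, w_bits))  # XNOR
    popcount = np.count_nonzero(matches)
    # Recover the signed dot product: matches add +1, mismatches add -1.
    return 2 * popcount - len(inputs)

x = np.random.choice([-1, 1], size=128)
w = np.random.choice([-1, 1], size=128)
assert binary_mac(x, w) == int(np.dot(x, w))
```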
Proceedings Article
A 65nm 4Kb algorithm-dependent computing-in-memory SRAM unit-macro with 2.3ns and 55.8TOPS/W fully parallel product-sum operation for binary DNN edge processors
Win-San Khwa, Jia-Jing Chen, Jiafang Li, Xin Si, En-Yu Yang, Xiaoyu Sun, Rui Liu, Pai-Yu Chen, Qiang Li, Shimeng Yu, Meng-Fan Chang +10 more
TL;DR: This work implemented a 65nm 4Kb algorithm-dependent CIM-SRAM unit-macro together with an in-house binary DNN structure for cost-aware DNN AI edge processors, resulting in the first binary-based CIM-SRAM macro with the fastest product-sum (PS) operation and the highest energy efficiency among reported CIM macros.
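What "fully parallel" means here is that all rows of the array are activated at once, so every column produces its product-sum in a single operation rather than over many sequential reads. A behavioral sketch, assuming ±1 encodings and illustrative array dimensions (the real unit-macro is 4Kb):

```python
import numpy as np

# Illustrative dimensions only; the actual macro stores 4Kb.
ROWS, COLS = 64, 64

rng = np.random.default_rng(0)
weights = rng.choice([-1, 1], size=(ROWS, COLS))  # stored in the array cells
inputs = rng.choice([-1, 1], size=ROWS)           # applied on the wordlines

# In a CIM macro every row is driven simultaneously and each column
# (bitline) accumulates its products at once; the digital equivalent is
# one matrix-vector product instead of ROWS sequential row reads.
product_sums = inputs @ weights   # shape (COLS,), one PS result per column
```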
Proceedings Article
EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference
Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang, Marco Donato, Victor Sanh, Paul N. Whatmough, Alexander M. Rush, David Brooks, Gu-Yeon Wei +10 more
TL;DR: EdgeBERT as discussed by the authors employs entropy-based early exit predication in order to perform dynamic voltage-frequency scaling (DVFS), at a sentence granularity, for minimal energy consumption while adhering to a prescribed target latency.
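A rough sketch of the entropy-based early-exit decision, with an assumed per-exit logits interface and an illustrative threshold (in EdgeBERT the predicted exit also drives the DVFS operating point; that control loop is not modeled here):

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(probs):
    p = probs[probs > 0]
    return float(-(p * np.log(p)).sum())

def early_exit(per_layer_logits, threshold=0.4):
    """Return (exit_layer, probs) for the first exit classifier whose
    prediction entropy falls below the confidence threshold; otherwise
    fall through to the final layer. Threshold value is illustrative."""
    for i, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        if entropy(probs) < threshold:
            return i, probs   # confident enough: stop computing layers
    return len(per_layer_logits) - 1, probs
```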
Proceedings Article
Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference
Thierry Tambe, En-Yu Yang, Zishen Wan, Yuntian Deng, Vijay Janapa Reddi, Alexander M. Rush, David Brooks, Gu-Yeon Wei +7 more
TL;DR: This work presents an algorithm-hardware co-design centered around a novel floating-point-inspired number format, AdaptivFloat, which dynamically maximizes and optimally clips its available dynamic range at a layer granularity in order to create faithful encodings of neural network parameters.
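A sketch of an AdaptivFloat-style quantizer: the exponent bias is chosen per tensor from the largest magnitude present, which is the "dynamically maximizes and optimally clips its dynamic range" step. Bit widths, rounding, and the flush-to-zero rule here are assumptions for illustration, not the paper's exact bit-level spec:

```python
import numpy as np

def adaptivfloat_quantize(tensor, n_exp=3, n_man=4):
    """Illustrative AdaptivFloat-style quantizer (assumes a nonzero tensor)."""
    max_abs = np.abs(tensor).max()
    # Shift the exponent range so its top aligns with the data's maximum.
    exp_max = int(np.floor(np.log2(max_abs)))
    exp_min = exp_max - (2 ** n_exp - 1)

    sign = np.sign(tensor)
    mag = np.clip(np.abs(tensor), 2.0 ** exp_min, 2.0 ** (exp_max + 1))
    exp = np.floor(np.log2(mag))
    # Quantize the mantissa to n_man fractional bits.
    man = np.round(mag / 2.0 ** exp * 2 ** n_man) / 2 ** n_man
    out = sign * man * 2.0 ** exp
    # Flush values below half the smallest representable magnitude to zero
    # (the real format reserves a dedicated zero encoding).
    out = np.where(np.abs(tensor) < 2.0 ** exp_min / 2, 0.0, out)
    # Clip to the largest representable value after rounding.
    max_val = (2 - 2.0 ** -n_man) * 2.0 ** exp_max
    return np.clip(out, -max_val, max_val)
```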
Proceedings Article
9.8 A 25mm² SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET
Thierry Tambe, En-Yu Yang, Glenn G. Ko, Yuji Chai, Coleman Hooper, Marco Donato, Paul N. Whatmough, Alexander M. Rush, David Brooks, Gu-Yeon Wei +9 more
TL;DR: In this paper, the encoder sequence is treated as a soft-addressable memory whose positions are weighted based on the state of the decoder RNN; the encoder itself captures past and future temporal information by concatenating forward and backward time steps.
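A minimal NumPy sketch of the soft-addressable-memory view of attention described above; the dot-product scoring function and the dimensions are assumptions for illustration, not the SoC's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(encoder_states, decoder_state):
    """Treat the encoder sequence as a soft-addressable memory: score each
    position against the decoder state, normalize the scores into addressing
    weights, and read out a weighted sum of the memory contents."""
    scores = encoder_states @ decoder_state   # (T,) per-position scores
    weights = softmax(scores)                 # soft addresses over time steps
    context = weights @ encoder_states        # weighted read-out vector
    return context, weights

# Bidirectional encoding: concatenating forward and backward passes gives
# every position both past and future temporal information.
T, d = 10, 8
fwd = np.random.randn(T, d)
bwd = np.random.randn(T, d)
encoder_states = np.concatenate([fwd, bwd], axis=1)  # shape (T, 2d)
context, weights = attend(encoder_states, np.random.randn(2 * d))
```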