Yujun Lin
Researcher at Massachusetts Institute of Technology
Publications - 31
Citations - 3869
Yujun Lin is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Deep learning & Hardware acceleration. The author has an h-index of 15 and has co-authored 29 publications receiving 2232 citations. Previous affiliations of Yujun Lin include Yale University & Tsinghua University.
Papers
Proceedings ArticleDOI
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
TL;DR: APQ is a novel design methodology for efficient deep learning deployment that jointly optimizes the neural network architecture, pruning policy, and quantization policy, and uses a predictor-transfer technique to obtain the quantization-aware accuracy predictor.
Proceedings ArticleDOI
A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM
TL;DR: This work presents a configurable multi-precision CNN computing framework based on single-bit RRAM, consisting of an RRAM-computing-overhead-aware network quantization algorithm and a configurable multi-precision CNN computing architecture based on one-bit RRAM.
Proceedings Article
Lite Transformer with Long-Short Range Attention
TL;DR: Lite Transformer proposes Long-Short Range Attention (LSRA), in which one group of heads specializes in local context modeling (by convolution) while another group captures long-distance relationships (by attention).
Journal ArticleDOI
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
TL;DR: This article provides an overview of efficient deep learning methods, systems, and applications by introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design.
Proceedings ArticleDOI
Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification
TL;DR: This work proposes an effective framework, SGS-ARS, comprising Structured Gradient Sparsification (SGS) and an Aging-aware Row Swapping (ARS) scheme, to guarantee write balance across whole RRAM crossbars and prolong the lifetime of TIME.