
Yujun Lin

Researcher at Massachusetts Institute of Technology

Publications -  31
Citations -  3869

Yujun Lin is an academic researcher at the Massachusetts Institute of Technology. The author has contributed to research on topics including deep learning and hardware acceleration, has an h-index of 15, and has co-authored 29 publications receiving 2,232 citations. Previous affiliations of Yujun Lin include Yale University and Tsinghua University.

Papers
Proceedings ArticleDOI

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

TL;DR: APQ is a design methodology for efficient deep learning deployment that jointly optimizes the neural network architecture, pruning policy, and quantization policy, and uses a predictor-transfer technique to obtain a quantization-aware accuracy predictor.
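The joint-search idea can be illustrated with a toy sketch: enumerate candidate (width, pruning ratio, bitwidth) tuples together, score each with an accuracy predictor, and keep the best one under a latency budget. The predictor and latency model below are crude stand-ins, not the paper's learned, transferred predictor; all numbers and function names are illustrative.

```python
import itertools

# Toy stand-in accuracy predictor (NOT the paper's learned predictor):
# rewards wider, less-pruned, higher-precision configurations.
def predict_accuracy(width, prune_ratio, bits):
    return 0.9 * (width / 64) * (1 - prune_ratio) * (bits / 8)

# Toy latency model: cost grows with width and precision, shrinks with pruning.
def predict_latency(width, prune_ratio, bits):
    return width * (1 - prune_ratio) * bits / 64

def joint_search(latency_budget):
    """Search architecture, pruning and quantization in one joint loop."""
    best, best_acc = None, -1.0
    for width, prune_ratio, bits in itertools.product(
            [16, 32, 64], [0.0, 0.3, 0.5], [4, 8]):
        if predict_latency(width, prune_ratio, bits) > latency_budget:
            continue  # violates the deployment constraint
        acc = predict_accuracy(width, prune_ratio, bits)
        if acc > best_acc:
            best, best_acc = (width, prune_ratio, bits), acc
    return best, best_acc

config, acc = joint_search(latency_budget=4.0)
```

Searching the three axes jointly, rather than one after another, is what lets a cheaper configuration on one axis buy headroom on another.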
Proceedings ArticleDOI

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM

TL;DR: A configurable multi-precision CNN computing framework based on single-bit RRAM, consisting of an RRAM computing-overhead-aware network quantization algorithm and a configurable multi-precision CNN computing architecture built on single-bit RRAM cells.
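How single-bit cells can support configurable precision can be sketched with bit-serial computation: each bit plane of a multi-bit weight is stored on single-bit cells, a dot product is computed per plane, and the partial sums are combined by shift-and-add. This is a generic bit-serial sketch under assumed unsigned integer weights, not the paper's specific architecture.

```python
def dot_multiprecision(inputs, weights, bits):
    """Bit-serial dot product: each weight bit plane maps onto single-bit
    cells; per-plane partial sums are combined digitally by shift-and-add."""
    total = 0
    for b in range(bits):
        plane = [(w >> b) & 1 for w in weights]              # single-bit cell contents
        partial = sum(x * c for x, c in zip(inputs, plane))  # one crossbar pass
        total += partial << b                                # weight the plane by 2**b
    return total

# Sanity check against a direct full-precision dot product.
xs, ws = [3, 1, 2], [5, 6, 7]
result = dot_multiprecision(xs, ws, bits=3)
```

Because precision is just the number of bit planes iterated over, the same single-bit array serves 2-, 4-, or 8-bit layers by changing `bits`.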
Proceedings Article

Lite Transformer with Long-Short Range Attention

TL;DR: Lite Transformer proposes Long-Short Range Attention (LSRA), in which one group of heads specializes in local context modeling (by convolution) while another group captures long-distance relationships (by attention).
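The two-branch split can be sketched as follows: half of the feature channels go through a convolutional (local) branch, the other half through a single-head attention (global) branch, and the outputs are re-concatenated. This is a minimal, hypothetical pure-Python version; the dimensions, kernel, and padding are illustrative, not the paper's implementation.

```python
import math

def lsra_block(x, kernel=3):
    """LSRA sketch: x is a list of tokens, each a list of features.
    First half of channels -> convolution (local), second half -> attention (global)."""
    d = len(x[0]) // 2
    local_in = [t[:d] for t in x]
    globl_in = [t[d:] for t in x]

    # Local branch: 1D convolution with a uniform kernel and zero padding.
    pad = kernel // 2
    local_out = []
    for i in range(len(x)):
        out = []
        for c in range(d):
            acc = 0.0
            for k in range(-pad, pad + 1):
                if 0 <= i + k < len(x):
                    acc += local_in[i + k][c] / kernel
            out.append(acc)
        local_out.append(out)

    # Global branch: single-head self-attention with scaled dot-product scores.
    def dot(a, b):
        return sum(u * v for u, v in zip(a, b))

    globl_out = []
    for q in globl_in:
        scores = [dot(q, k) / math.sqrt(d) for k in globl_in]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]       # numerically stable softmax
        z = sum(w)
        globl_out.append([sum(wi * kv[c] for wi, kv in zip(w, globl_in)) / z
                          for c in range(d)])

    # Re-concatenate the local and global halves per token.
    return [lo + go for lo, go in zip(local_out, globl_out)]

seq = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [1.0, 1.0, 0.3, 0.4]]
out = lsra_block(seq)
```

The specialization is the point: the convolution branch no longer wastes attention capacity on neighbors it can see directly, so the attention branch can focus on long-range dependencies.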
Journal ArticleDOI

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

TL;DR: This article provides an overview of efficient deep learning methods, systems, and applications, introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design.
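Two of the compression methods surveyed, pruning and quantization, can be sketched in a few lines: magnitude pruning zeroes the smallest weights, and uniform symmetric quantization snaps the rest to a coarse grid. These are textbook baseline versions for illustration, not any specific method from the survey.

```python
def magnitude_prune(weights, ratio):
    """Zero out the smallest-magnitude fraction `ratio` of the weights."""
    k = int(len(weights) * ratio)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

def uniform_quantize(weights, bits):
    """Uniform symmetric quantization to `bits` bits (sketch)."""
    levels = 2 ** (bits - 1) - 1                 # e.g. 7 levels per sign for 4 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

w = [0.1, -0.5, 0.05, 0.9]
pruned = magnitude_prune(w, ratio=0.5)   # half the weights zeroed
quantized = uniform_quantize(pruned, bits=4)
```

In practice these are combined with fine-tuning to recover accuracy, and factorization and compact architecture design attack the same size/latency budget from other directions.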
Proceedings ArticleDOI

Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification

TL;DR: This work proposes an effective framework, SGS-ARS, combining Structured Gradient Sparsification (SGS) and an Aging-aware Row Swapping (ARS) scheme, to balance writes across whole RRAM crossbars and prolong the lifetime of training-in-memory engines (TIME).
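The two components can be sketched separately: structured sparsification keeps only the gradient rows with the largest magnitude so whole crossbar rows are skipped during the write, and aging-aware swapping maps the rows that will be written onto the least-worn physical rows. Both functions below are hypothetical toy versions of the two ideas, not the paper's scheme.

```python
def structured_sparsify(grad_rows, keep):
    """SGS sketch: keep the `keep` rows with the largest L1 norm, zero the
    rest, so entire crossbar rows need no write this step."""
    order = sorted(range(len(grad_rows)),
                   key=lambda i: sum(abs(g) for g in grad_rows[i]),
                   reverse=True)
    selected = set(order[:keep])
    return [row if i in selected else [0.0] * len(row)
            for i, row in enumerate(grad_rows)], selected

def aging_aware_swap(write_counts, selected_rows):
    """ARS sketch: map the rows about to be written onto the physical rows
    with the fewest accumulated writes, balancing wear across the crossbar."""
    by_wear = sorted(range(len(write_counts)), key=lambda i: write_counts[i])
    return {logical: by_wear[j]
            for j, logical in enumerate(sorted(selected_rows))}

grads = [[0.1, 0.1], [1.0, 2.0], [0.0, 0.1], [0.5, 0.5]]
sparse, selected = structured_sparsify(grads, keep=2)
mapping = aging_aware_swap(write_counts=[5, 0, 3, 1], selected_rows=selected)
```

Skipping whole rows (rather than scattered elements) matters because RRAM writes happen row by row; the swap then spreads the remaining writes so no single row wears out first.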