
Ligeng Zhu

Researcher at Massachusetts Institute of Technology

Publications - 29
Citations - 3236

Ligeng Zhu is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Deep learning & Overhead (computing). The author has an h-index of 12 and has co-authored 26 publications receiving 1723 citations. Previous affiliations of Ligeng Zhu include Simon Fraser University.

Papers
Proceedings Article

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

TL;DR: ProxylessNAS is presented, which can directly learn architectures for large-scale target tasks and target hardware platforms; it is applied to specialize neural architectures for hardware using direct hardware metrics (e.g. latency) and provides insights for efficient CNN architecture design.
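As a rough illustration of the idea (not the authors' implementation), the PyTorch sketch below shows how a differentiable latency estimate for each searchable layer can be folded into the search objective alongside the task loss; the `MixedOp` module, the per-op latency table, and the `lat_weight` coefficient are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One searchable layer: candidate ops with learnable architecture logits."""
    def __init__(self, ops, op_latencies):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        # Measured (or predicted) latency of each candidate op on the target hardware.
        self.register_buffer("latency", torch.tensor(op_latencies))
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # architecture parameters

    def forward(self, x):
        probs = F.softmax(self.alpha, dim=0)
        # Soft mixture of candidate outputs; ProxylessNAS binarizes paths to save
        # memory, but the weighted sum is the simplest differentiable stand-in.
        return sum(p * op(x) for p, op in zip(probs, self.ops))

    def expected_latency(self):
        # Differentiable latency estimate: probability-weighted per-op latencies.
        return (F.softmax(self.alpha, dim=0) * self.latency).sum()

def search_loss(logits, targets, mixed_ops, lat_weight=0.1):
    """Task loss plus a direct hardware-latency penalty (illustrative weighting)."""
    ce = F.cross_entropy(logits, targets)
    latency = sum(m.expected_latency() for m in mixed_ops)
    return ce + lat_weight * latency
```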

Deep Leakage from Gradients.

Ligeng Zhu, +1 more
TL;DR: This work shows that it is possible to obtain the private training data from the publicly shared gradients, names this leakage Deep Leakage from Gradients, and empirically validates its effectiveness on both computer vision and natural language processing tasks.
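For intuition, here is a minimal PyTorch sketch of the gradient-matching attack the TL;DR describes (a paraphrase of the published algorithm, not the authors' released code): random "dummy" inputs and labels are optimized until the gradients they induce on the shared model match the leaked gradients. The function name, step count, and batch-of-one setup are illustrative.

```python
import torch
import torch.nn.functional as F

def deep_leakage_from_gradients(model, leaked_grads, data_shape, num_classes, steps=300):
    # Dummy data and soft labels to be recovered by optimization.
    dummy_x = torch.randn(data_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        optimizer.zero_grad()
        pred = model(dummy_x)
        # Cross entropy against the (optimizable) soft label distribution.
        loss = torch.mean(torch.sum(-F.softmax(dummy_y, dim=-1) * F.log_softmax(pred, dim=-1), dim=-1))
        dummy_grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        # Match the gradients induced by the dummy data to the leaked gradients.
        grad_diff = sum(((dg - lg) ** 2).sum() for dg, lg in zip(dummy_grads, leaked_grads))
        grad_diff.backward()
        return grad_diff

    for _ in range(steps):
        optimizer.step(closure)
    # When the match is close, dummy_x and dummy_y approximate the private sample.
    return dummy_x.detach(), dummy_y.detach()
```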
Book Chapter

Deep Leakage from Gradients

TL;DR: In this paper, the authors show that they can obtain the private training data from the publicly shared gradients, a leakage they call Deep Leakage from Gradients, and practically validate the effectiveness of their algorithm on both computer vision and natural language processing tasks.
Posted Content

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

TL;DR: ProxylessNAS, as mentioned in this paper, directly learns architectures for large-scale target tasks and target hardware platforms, without relying on proxy tasks such as training on a smaller dataset, learning with only a few blocks, or training just for a few epochs.
Posted Content

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

TL;DR: This work designs Hardware-Aware Transformers with neural architecture search: it trains a SuperTransformer that covers all candidates in the design space, efficiently produces many SubTransformers with weight sharing, and performs an evolutionary search under a hardware latency constraint.
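To make the search step concrete, here is a small Python sketch (an assumption-laden illustration, not the HAT codebase) of an evolutionary search that keeps only SubTransformer configurations satisfying a latency limit and scores them with weights inherited from the SuperTransformer; the design space, the callback names `measure_latency` and `evaluate_subtransformer`, and the population sizes are hypothetical.

```python
import random

# Illustrative design space for a SubTransformer configuration; the real HAT space
# covers per-layer embedding dims, head numbers, FFN dims, and layer counts.
DESIGN_SPACE = {
    "num_layers": [4, 5, 6],
    "embed_dim": [384, 512, 640],
    "ffn_dim": [1024, 2048, 3072],
    "num_heads": [4, 8],
}

def sample_config():
    return {k: random.choice(v) for k, v in DESIGN_SPACE.items()}

def mutate(cfg, prob=0.3):
    # Re-sample each dimension with some probability to produce a child config.
    return {k: (random.choice(DESIGN_SPACE[k]) if random.random() < prob else v)
            for k, v in cfg.items()}

def evolutionary_search(measure_latency, evaluate_subtransformer,
                        latency_limit, generations=10, population=32, parents=8):
    """Find a good SubTransformer under a hardware latency constraint.

    `measure_latency(cfg)` and `evaluate_subtransformer(cfg)` are assumed callbacks:
    the first measures or predicts latency on the target device, the second scores
    the config (e.g. validation loss) using weights shared from the SuperTransformer.
    """
    pop = [sample_config() for _ in range(population)]
    best = None
    for _ in range(generations):
        # Keep only configurations that satisfy the latency constraint.
        feasible = [c for c in pop if measure_latency(c) <= latency_limit]
        scored = sorted(feasible, key=evaluate_subtransformer)  # lower loss is better
        if scored:
            best = scored[0]
        top = scored[:parents] or [sample_config()]
        # Next generation: mutations of the best parents.
        pop = [mutate(random.choice(top)) for _ in range(population)]
    return best
```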