Shiwei Liu
Researcher at Fudan University
Publications - 5
Citations - 14
Shiwei Liu is an academic researcher from Fudan University who has contributed to research on clock rate and convolutional neural networks. The author has an h-index of 1 and has co-authored 3 publications receiving 4 citations.
Papers
Journal ArticleDOI
A Communication-Aware DNN Accelerator on ImageNet Using In-Memory Entry-Counting Based Algorithm-Circuit-Architecture Co-Design in 65-nm CMOS
Haozhe Zhu, Chixiao Chen, Shiwei Liu, Qiaosha Zou, Mingyu Wang, Lihua Zhang, Xiaoyang Zeng, C.-J. Richard Shi, and 7 others
TL;DR: This article presents a communication-aware processing-in-memory deep neural network accelerator, which implements an in-memory entry-counting scheme for low bit-width quantized multiply-and-accumulate operations (MACs) to maintain good accuracy on ImageNet.
Proceedings ArticleDOI
Systolic-Array Deep-Learning Acceleration Exploring Pattern-Indexed Coordinate-Assisted Sparsity for Real-Time On-Device Speech Processing
TL;DR: In this paper, a hardware-software co-design for efficient sparse deep neural networks (DNNs) implementation in a regular systolic array for real-time on-device speech processing is presented.
A Scalable Die-to-Die Interconnect with Replay and Repair Schemes for 2.5D/3D Integration
Bo Jiao, Jinshan Zhang, Shiwei Liu, Hao Jiang, Jun Tao, Wenning Jiang, Qi Liu, Haozhe Zhu, Chixiao Chen, and 8 others
TL;DR: In this article, a scalable die-to-die (D2D) interconnect with replay and repair schemes is presented for high-efficiency 2.5D/3D integration; it can be configured for power consumption as low as 0.55 pJ/bit while delivering 38.40 Gb/s throughput.
Proceedings ArticleDOI
XNORAM: An Efficient Computing-in-Memory Architecture for Binary Convolutional Neural Networks with Flexible Dataflow Mapping
TL;DR: An energy-efficient computing-in-memory architecture for binary convolutional neural networks, called XNORAM, is proposed; it achieves 18.86 TOPS/W and 4.63 GOPS/KB utilization with only 1.3% accuracy loss compared with the original XNOR-Net result on GPUs.
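The core operation that binary-CNN accelerators such as XNORAM map into memory arrays is the XNOR-popcount dot product: with weights and activations restricted to {-1, +1} and packed as bit vectors, a MAC collapses to an XNOR followed by a bit count. A minimal software sketch of that identity (the function name and bit encoding are illustrative, not from the paper):

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors of length n, each packed as an
    n-bit integer where bit value 1 encodes +1 and bit value 0 encodes -1.
    XNOR marks positions where the operands agree; if m bits agree, the
    signed dot product is m - (n - m) = 2*m - n."""
    matches = bin(~(a_bits ^ w_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n

# a = 0b1011 -> (+1, -1, +1, +1), w = 0b1101 -> (+1, +1, -1, +1) (MSB first)
# elementwise products: +1, -1, -1, +1, so the dot product is 0
print(binary_dot(0b1011, 0b1101, 4))  # -> 0
```

Replacing each multiply-accumulate with one XNOR gate and a popcount is what lets the scheme execute entirely inside an SRAM-like array.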
Proceedings ArticleDOI
A 200M-Query-Vector/s Computing-in-RRAM ADC-less k-Nearest-Neighbor Accelerator with Time-Domain Winner-Takes-All Circuits
TL;DR: This paper proposes a computing-in-RRAM ADC-less k-nearest-neighbor accelerator with time-domain winner-takes-all circuits, which processes up to 200 million query vectors per second while consuming 0.75 mW, a 24.5× energy-efficiency improvement over prior works.
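Functionally, a time-domain winner-takes-all circuit lets the entries with the smallest (analog) distance signals fire first, so the accelerator returns the k database vectors closest to the query without digitizing every distance. A hedged software model of that behavior, using squared-L2 distance and a heap in place of the analog race (all names here are illustrative, not from the paper):

```python
import heapq

def knn_wta(query, database, k):
    """Software model of k-nearest-neighbor search: rank database vectors
    by squared-L2 distance to the query and return the indices of the k
    smallest, mimicking a winner-takes-all race where the closest entries
    'win' first."""
    dists = [(sum((q - d) ** 2 for q, d in zip(query, vec)), idx)
             for idx, vec in enumerate(database)]
    return [idx for _, idx in heapq.nsmallest(k, dists)]

db = [[0, 0], [1, 1], [5, 5], [1, 0]]
print(knn_wta([0.2, 0.1], db, k=2))  # -> [0, 3], the two closest vectors
```

The hardware avoids the heap (and the ADCs) entirely: distance magnitudes are encoded as firing times, so selection is implicit in which circuits trigger first.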