Yubin Li

Researcher at Tsinghua University

Publications -  7
Citations -  935

Yubin Li is an academic researcher from Tsinghua University. The author has contributed to research on the topics of speedup and hardware acceleration. The author has an h-index of 4 and has co-authored 7 publications receiving 790 citations.

Papers
Proceedings Article

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA

TL;DR: This work on the Efficient Speech Recognition Engine (ESE) proposes a load-balance-aware pruning method that can compress the LSTM model size by 20x (10x from pruning and 2x from quantization).
Posted Content

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA

TL;DR: This work proposes a load-balance-aware pruning method that can compress the LSTM model size by 20x (10x from pruning and 2x from quantization) with negligible loss of prediction accuracy, and proposes a scheduler that encodes and partitions the compressed model across multiple PEs for parallelism and schedules the complicated LSTM data flow.
Posted Content

ESE: Efficient Speech Recognition Engine with Compressed LSTM on FPGA

TL;DR: This work proposes a load-balance-aware pruning method that can compress the LSTM model size by 20× (10× from pruning and 2× from quantization) with negligible loss of prediction accuracy, and designs a hardware architecture, named Efficient Speech Recognition Engine (ESE), that works directly on the compressed model.
Patent

Efficient data access control device for neural network hardware acceleration system

Yubin Li, +2 more
TL;DR: The authors propose an overall design of a device that handles data receiving, bit-width transformation, and data storing. With this design, a neural network hardware acceleration system can prevent the data access process from becoming the bottleneck in neural network computation.
Proceedings Article

Streaming sorting network based BWT acceleration on FPGA for lossless compression

TL;DR: This work presents a novel BWT accelerator based on a streaming sorting network, which achieves a 14.3x speedup over the state-of-the-art work when the data block size is 4 KB, and a lossless data compression system built on this accelerator.