Xuehai Qian

Researcher at University of Southern California

Publications -  121
Citations -  4172

Xuehai Qian is an academic researcher from the University of Southern California. The author has contributed to research in topics: Computer science & Speedup. The author has an h-index of 25 and has co-authored 107 publications receiving 2537 citations. Previous affiliations of Xuehai Qian include Rutgers University & the Chinese Academy of Sciences.

Papers
Proceedings ArticleDOI

Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System

TL;DR: Evaluation results show that Wonderland achieves a drastic speedup over other state-of-the-art systems, up to two orders of magnitude in certain cases.
Proceedings ArticleDOI

TIE: energy-efficient tensor train-based inference engine for deep neural network

TL;DR: A computation-efficient inference scheme for TT-format DNNs, which has two key merits: 1) it achieves the theoretical limit on the number of multiplications, thus eliminating all redundant computations; and 2) its multi-stage processing scheme reduces the intensive memory access to all tensor cores, bringing significant energy savings.
Posted Content

Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?

TL;DR: It is concluded that structured pruning has greater potential than non-structured pruning, and that the first fully binarized (for all layers) DNNs can be lossless in accuracy in many cases.
Proceedings ArticleDOI

Neu-NoC: a high-efficient interconnection network for accelerated neuromorphic systems

TL;DR: This paper proposes Neu-NoC, a highly efficient interconnection network that reduces redundant data traffic in neuromorphic acceleration systems, and explores the data transfer ability between adjacent layers of fully-connected NNs.
Proceedings ArticleDOI

Prague: High-Performance Heterogeneity-Aware Asynchronous Decentralized Training

TL;DR: The proposed Prague, a high-performance heterogeneity-aware asynchronous decentralized training approach, achieves the above goal with intensive synchronization optimization by exploring the interplay between algorithm and system implementation, or statistical and hardware efficiency.