Xuehai Qian

Researcher at University of Southern California

Publications: 121
Citations: 4,172

Xuehai Qian is an academic researcher at the University of Southern California. He has contributed to research in topics including Computer science and Speedup. He has an h-index of 25 and has co-authored 107 publications receiving 2,537 citations. His previous affiliations include Rutgers University and the Chinese Academy of Sciences.

Papers
Proceedings ArticleDOI

Exploring the hidden dimension in graph processing

TL;DR: 3D partitioning is presented, a novel category of task-partitioning algorithms that significantly reduces network traffic for certain MLDM (machine learning and data mining) applications; a distributed graph engine, CUBE, is built on it, which outperforms the state-of-the-art graph-parallel system PowerLyra by up to a 4.7× speedup.
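For context, here is a minimal sketch of the general idea of partitioning along the vertex-property ("hidden") dimension, with hypothetical variable names; it illustrates the concept only, not CUBE's actual implementation. In MLDM workloads each vertex carries a property vector, and slicing that vector across machines adds a third partitioning axis on top of conventional 2D edge partitioning.

```python
# Illustrative sketch of 3D graph partitioning (hypothetical, not CUBE's code).
# On top of a 2D edge partition, the per-vertex property vector (the "hidden
# dimension" in MLDM workloads) is itself sliced across LAYERS machine groups,
# so each group holds only a slice of every property vector it touches.
import numpy as np

NUM_VERTICES = 8
PROPERTY_DIM = 6   # length of each vertex's property vector
LAYERS = 3         # third partitioning axis: slices of the property vector

# Full per-vertex property vectors (e.g., latent factors in matrix factorization).
properties = np.random.rand(NUM_VERTICES, PROPERTY_DIM)

# Slice the property dimension across layers.
slices = np.array_split(properties, LAYERS, axis=1)

for layer_id, prop_slice in enumerate(slices):
    # Each layer runs the same edge-parallel computation on its slice only,
    # so a message along an edge carries PROPERTY_DIM / LAYERS values instead
    # of PROPERTY_DIM -- the source of the network-traffic reduction.
    print(f"layer {layer_id}: holds slice of shape {prop_slice.shape}")
```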
Proceedings ArticleDOI

Datasize-Aware High Dimensional Configurations Auto-Tuning of In-Memory Cluster Computing

TL;DR: DAC is a significant advance over the state of the art because it can take the input dataset size and 41 configuration parameters as inputs to the performance model for a given IMC (in-memory computing) program, which is unprecedented in previous work.
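As a rough illustration of datasize-aware, model-based tuning, the sketch below trains a regression model on sampled (configuration, datasize) → runtime pairs and then searches the model for a promising configuration. The parameter names and the use of a random forest are assumptions for illustration; DAC's actual performance model and search strategy differ in detail.

```python
# Generic sketch of datasize-aware configuration auto-tuning (hypothetical;
# not DAC's actual model or search algorithm).
import random
from sklearn.ensemble import RandomForestRegressor

def run_job(config, datasize):
    """Stand-in for actually running the IMC job and timing it."""
    mem_gb, parallelism = config
    return datasize / (mem_gb * parallelism) + random.random()  # fake runtime

# 1. Sample (config, datasize) points and measure runtimes.
samples, runtimes = [], []
for _ in range(50):
    config = (random.uniform(1, 16), random.randint(1, 32))  # memory GB, cores
    datasize = random.choice([1, 10, 100])                   # GB of input
    samples.append([*config, datasize])
    runtimes.append(run_job(config, datasize))

# 2. Fit a performance model that takes datasize as a feature alongside the
#    configuration parameters (the key idea highlighted in the TL;DR).
model = RandomForestRegressor(n_estimators=100).fit(samples, runtimes)

# 3. Search the model (here: plain random search) for the best config
#    at a new, unseen input size.
target_size = 50
candidates = [[random.uniform(1, 16), random.randint(1, 32), target_size]
              for _ in range(1000)]
best = min(candidates, key=lambda c: model.predict([c])[0])
print("predicted-best config:", best[:2])
```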
Journal ArticleDOI

HEIF: Highly Efficient Stochastic Computing-Based Inference Framework for Deep Neural Networks

TL;DR: HEIF is presented, a highly efficient stochastic computing (SC)-based inference framework for large-scale DCNNs with broad applications including (but not limited to) LeNet-5 and AlexNet, achieving high energy efficiency and low area/hardware cost.
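For readers unfamiliar with stochastic computing, here is a minimal sketch of the principle SC-based inference builds on, not HEIF's implementation: a value in [0, 1] is encoded as the probability of a 1 in a random bitstream, which makes multiplication as cheap as a bitwise AND (a single AND gate in hardware).

```python
# Minimal illustration of unipolar stochastic computing (SC): a value
# x in [0, 1] is encoded as a bitstream whose bits are 1 with probability x,
# so multiplying two values reduces to AND-ing their streams bit by bit.
import random

STREAM_LEN = 10_000  # longer streams -> lower variance, higher latency

def encode(x, n=STREAM_LEN):
    """Encode x in [0, 1] as a random bitstream with P(bit = 1) = x."""
    return [1 if random.random() < x else 0 for _ in range(n)]

def decode(stream):
    """Decode a bitstream back to a value: the fraction of 1s."""
    return sum(stream) / len(stream)

a, b = encode(0.6), encode(0.5)
product = [bit_a & bit_b for bit_a, bit_b in zip(a, b)]
print(decode(product))  # ~0.30, i.e. 0.6 * 0.5 plus stochastic error
```

The accuracy/latency trade-off is visible in `STREAM_LEN`: longer streams shrink the random error but take longer to process, which is why efficient SC-based frameworks focus on keeping streams short without losing accuracy.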
Proceedings Article

Squeezing out all the value of loaded data: an out-of-core graph processing system with reduced disk I/O

TL;DR: It is shown that out-of-core graph processing systems uniquely provide the opportunity to lift the restrictions of the programming and execution model in a feasible manner, enabling efficient algorithms that require drastically fewer iterations.
Proceedings ArticleDOI

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

TL;DR: The Efficient RNN (E-RNN) framework is presented; the alternating direction method of multipliers (ADMM) technique is used for more accurate block-circulant training, and two design explorations are presented that provide guidance on block size and reduce the number of RNN training trials.
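To make the block-circulant idea concrete, the sketch below shows the standard FFT-based circulant multiply that block-circulant weight compression relies on, not E-RNN's FPGA implementation: each weight block is a circulant matrix defined by a single length-k vector, so a block-vector product becomes an elementwise multiply in the FFT domain.

```python
# Sketch of why block-circulant weights are cheap: a k x k circulant block is
# defined by one length-k vector, and multiplying it by a vector is a circular
# convolution, computable in O(k log k) via FFT. This illustrates the standard
# technique the paper builds on, not E-RNN's FPGA design.
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by vector x."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

k = 4                      # block size (the design parameter the paper explores)
rows, cols = 2, 3          # weight matrix = (rows x cols) grid of k x k blocks
blocks = np.random.randn(rows, cols, k)   # one defining vector per block
x = np.random.randn(cols * k)

# y_i = sum_j C_ij @ x_j, where each C_ij is circulant.
y = np.zeros(rows * k)
for i in range(rows):
    for j in range(cols):
        y[i*k:(i+1)*k] += circulant_matvec(blocks[i, j], x[j*k:(j+1)*k])

print(y)  # matches the dense matvec while storing k x fewer weight parameters
```

Larger block sizes give more compression but constrain the weights more, which is why guidance on choosing the block size matters.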