Lansong Diao
Researcher at Alibaba Group
Publications - 12
Citations - 147
Lansong Diao is an academic researcher at Alibaba Group. The author has contributed to research on the topics Computer science and Speedup. The author has an h-index of 3 and has co-authored 7 publications receiving 37 citations.
Papers
Posted Content
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin
TL;DR: This paper proposes DAPPLE, a synchronous training framework that combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner that solves the partition and placement problems and explores the optimal hybrid strategies of data and pipeline parallelism.
Proceedings Article
DAPPLE: a pipelined data parallel approach for training large models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin
TL;DR: DAPPLE is a synchronous training framework that combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner that solves the partition and placement problems.
Posted Content
FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads
Zhen Zheng, Pengzhan Zhao, Guoping Long, Feiwen Zhu, Kai Zhu, Wenyi Zhao, Lansong Diao, Jun Yang, Wei Lin
TL;DR: This work proposes FusionStitching, a deep learning compiler that automatically fuses memory-intensive operators, with varied data dependencies and non-homogeneous parallelism, into large GPU kernels to reduce global memory access and operation scheduling overhead. It efficiently tunes the optimal stitching scheme just-in-time with a domain-specific cost model.
Proceedings Article
DISC: A Dynamic Shape Compiler for Machine Learning Workloads
Kai Zhu, Wenyi Zhao, Zhen Zheng, Tianyou Guo, Pengzhan Zhao, Junjie Bai, Jun Yang, Xiaoyong Liu, Lansong Diao, Wei Lin
TL;DR: DISC enriches a set of IR to form a fully dynamic shape representation and generates the runtime flow at compile time to support logic that depends on dynamic shapes. This avoids interpretation overhead at runtime and enlarges the opportunity for host-device co-optimization.
Proceedings Article
PAI-FCNN: FPGA Based Inference System for Complex CNN Models
Lixue Xia, Lansong Diao, Zhao Jiang, Hao Liang, Kai Chen, Li Ding, Shunli Dou, Zibin Su, Meng Sun, Jiansong Zhang, Wei Lin
TL;DR: This paper presents the design of PAI-FCNN, an FPGA-based CNN inference system that supports modern complex CNN models and achieves better throughput and power efficiency than GPU solutions.