Zhen Zheng
Researcher at Alibaba Group
Publications - 15
Citations - 287
Zhen Zheng is an academic researcher at Alibaba Group. The author has contributed to research on the topics of deep learning and speedup. The author has an h-index of 6 and has co-authored 15 publications receiving 101 citations. Previous affiliations of Zhen Zheng include Tsinghua University.
Papers
Posted Content
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin
TL;DR: DAPPLE is a synchronous training framework that combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner that solves the partition and placement problems and explores the optimal hybrid strategies of data and pipeline parallelism.
Proceedings ArticleDOI
Understanding and bridging the gaps in current GNN performance optimizations
TL;DR: An in-depth examination of the state-of-the-art GNN frameworks is provided, revealing five major gaps in the current frameworks in optimizing GNN performance, especially in handling the special complexities of GNN over traditional graph or DNN operations.
Proceedings ArticleDOI
DAPPLE: a pipelined data parallel approach for training large models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin
TL;DR: DAPPLE is a synchronous training framework that combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner to solve the partition and placement problems.
Proceedings ArticleDOI
Refactoring and optimizing the community atmosphere model (CAM) on the Sunway TaihuLight supercomputer
Haohuan Fu, Junfeng Liao, Wei Xue, Lanning Wang, Dexun Chen, Long Gu, Jinxiu Xu, Nan Ding, Xinliang Wang, Conghui He, Shizhen Xu, Yishuang Liang, Jiarui Fang, Yuanchao Xu, Weijie Zheng, Jingheng Xu, Zhen Zheng, Wanjing Wei, Xu Ji, He Zhang, Bingwei Chen, Kaiwei Li, Xiaomeng Huang, Wenguang Chen, Guangwen Yang
TL;DR: To map the large code base of CAM to the millions of cores on the Sunway system, OpenACC-based refactoring is taken as the major approach, and source-to-source translator tools are applied to exploit the most suitable parallelism for the CPE cluster.
Proceedings ArticleDOI
VersaPipe: a versatile programming framework for pipelined computing on GPU
TL;DR: This paper proposes three new execution models with much improved controllability, including a hybrid model that combines the strengths of the others, and develops a software programming framework named VersaPipe.