Qingda Lu
Researcher at Ohio State University
Publications - 18
Citations - 1162
Qingda Lu is an academic researcher from Ohio State University. The author has contributed to research in topics: Cache & Tensor contraction. The author has an h-index of 13 and has co-authored 17 publications receiving 1100 citations. Previous affiliations of Qingda Lu include Intel & Louisiana State University.
Papers
Proceedings ArticleDOI
Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems
TL;DR: This paper comprehensively evaluates several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS), and provides new insights into dynamic behaviors and interaction effects.
Journal ArticleDOI
Synthesis of High-Performance Parallel Programs for a Class of ab Initio Quantum Chemistry Models
Gerald Baumgartner, Alexander A. Auer, David E. Bernholdt, Alina Bibireata, Venkatesh Choppella, Daniel Cociorva, Xiaoyang Gao, Robert W. Harrison, So Hirata, Sriram Krishnamoorthy, Sandhya Krishnan, Chi-Chung Lam, Qingda Lu, Marcel Nooijen, Russell M. Pitzer, J. Ramanujam, P. Sadayappan, A. Sibiryakov
TL;DR: This paper provides an overview of a program synthesis system for a class of quantum chemistry computations that are expressible as a set of tensor contractions and arise in electronic structure modeling.
Journal ArticleDOI
Automatic code generation for many-body electronic structure methods: the tensor contraction engine
Alexander A. Auer, Gerald Baumgartner, David E. Bernholdt, Alina Bibireata, Venkatesh Choppella, Daniel Cociorva, Xiaoyang Gao, Robert W. Harrison, Sriram Krishnamoorthy, Sandhya Krishnan, Chi-Chung Lam, Qingda Lu, Marcel Nooijen, Russell M. Pitzer, J. Ramanujam, P. Sadayappan, Alexander Sibiryakov
TL;DR: This paper gives an overview of the Tensor Contraction Engine (TCE), a unique effort to address issues of both productivity and performance through automatic code generation that acts like an optimizing compiler.
Proceedings ArticleDOI
Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Qingda Lu, Christophe Alias, Uday Bondhugula, Thomas Henretty, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan, Yongjian Chen, Haibo Lin, Tin-Fook Ngai
TL;DR: This paper develops a compile-time framework for data locality optimization via data layout transformation using a polyhedral model and demonstrates the effectiveness of the approach on a 16-core 2D tiled CMP.
Proceedings ArticleDOI
PARDA: A Fast Parallel Reuse Distance Analysis Algorithm
TL;DR: This paper presents the first parallel algorithm to compute accurate reuse distances by analysis of memory address traces, using a tunable parameter that enables faster analysis when the maximum needed reuse distance is limited by a cache size upper bound.