Author

P. Sadayappan

Bio: P. Sadayappan is an academic researcher from the University of Utah. The author has contributed to research in topics including Scheduling (computing) and Compiler. The author has an h-index of 60 and has co-authored 406 publications receiving 13,239 citations. Previous affiliations of P. Sadayappan include Ohio State University and Louisiana State University.


Papers
Proceedings ArticleDOI
07 Jun 2008
TL;DR: An automatic polyhedral source-to-source transformation framework that optimizes regular programs for parallelism and locality simultaneously, implemented as a tool that automatically generates OpenMP parallel code from C program sections.
Abstract: We present the design and implementation of an automatic polyhedral source-to-source transformation framework that can optimize regular programs (sequences of possibly imperfectly nested loops) for parallelism and locality simultaneously. Through this work, we show the practicality of analytical model-driven automatic transformation in the polyhedral model -- far beyond what is possible by current production compilers. Unlike previous works, our approach is an end-to-end fully automatic one driven by an integer linear optimization framework that takes an explicit view of finding good ways of tiling for parallelism and locality using affine transformations. The framework has been implemented into a tool to automatically generate OpenMP parallel code from C program sections. Experimental results from the tool show very high speedups for local and parallel execution on multi-cores over state-of-the-art compiler frameworks from the research community as well as the best native production compilers. The system also enables the easy use of powerful empirical/iterative optimization for general arbitrarily nested loop sequences.

930 citations
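
The abstract above centers on tiling loop nests for locality. As a rough illustration only, the sketch below tiles a matrix multiplication by hand; it is not the paper's framework, which derives such transformations automatically with affine scheduling and emits OpenMP C code. The function names and the `tile` parameter are hypothetical.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: c = a * b for n x n lists of ints."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile=4):
    """Same computation with all three loops tiled (blocked).

    Each (ii, jj, kk) block touches only a tile-sized piece of a, b and c,
    which is the locality benefit tiling aims for. The min() bounds handle
    sizes that are not a multiple of the tile size.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        acc = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c
```

In this sketch the blocks for different `(ii, jj)` pairs write disjoint parts of `c`, so the two outer tile loops are the natural candidates for parallel execution.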

Proceedings ArticleDOI
24 Oct 2008
TL;DR: This paper has comprehensively evaluated several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS) and provides new insights into dynamic behaviors and interaction effects.
Abstract: Cache partitioning and sharing is critical to the effective utilization of multicore processors. However, almost all existing studies have been evaluated by simulation that often has several limitations, such as excessive simulation time, absence of OS activities and proneness to simulation inaccuracy. To address these issues, we have taken an efficient software approach to supporting both static and dynamic cache partitioning in OS through memory address mapping. We have comprehensively evaluated several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS). Our software approach makes it possible to run the SPEC CPU2006 benchmark suite to completion. Besides confirming important conclusions from previous work, we are able to gain several insights from whole-program executions, which are infeasible from simulation. For example, giving up some cache space in one program to help another one may improve the performance of both programs for certain workloads due to reduced contention for memory bandwidth. Our evaluation of previously proposed fairness metrics is also significantly different from a simulation-based study. The contributions of this study are threefold. (1) To the best of our knowledge, this is a highly comprehensive execution- and measurement-based study on multicore cache partitioning. This paper not only confirms important conclusions from simulation-based studies, but also provides new insights into dynamic behaviors and interaction effects. (2) Our approach provides a unique and efficient option for evaluating multicore cache partitioning. The implemented software layer can be used as a tool in multicore performance evaluation and hardware design. (3) The proposed schemes can be further refined for OS kernels to improve performance.

382 citations
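
The abstract above describes cache partitioning in the OS "through memory address mapping". One common way to do this is page coloring, which is assumed here for illustration; the abstract does not name the technique. In this sketch, the cache geometry, the 4 KiB page size, and all function names are hypothetical.

```python
def num_colors(cache_bytes, associativity, page_bytes=4096):
    """Number of page colors for a physically indexed cache.

    One way of the cache spans cache_bytes / associativity bytes; dividing
    that by the page size gives how many distinct cache regions a page
    can map to.
    """
    return cache_bytes // (associativity * page_bytes)


def page_color(phys_addr, colors, page_bytes=4096):
    """Color of the physical page containing phys_addr."""
    return (phys_addr // page_bytes) % colors


def frames_for_partition(frames, allowed_colors, colors):
    """Keep only the page frames whose color belongs to the partition."""
    return [f for f in frames if f % colors in allowed_colors]
```

Under these assumptions, an OS that hands a process only frames from `frames_for_partition` confines that process to the matching slice of the shared cache, and changing `allowed_colors` over time would correspond to dynamic partitioning.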

Journal ArticleDOI
Edoardo Aprà1, Eric J. Bylaska1, W. A. de Jong2, Niranjan Govind1, Karol Kowalski1, T. P. Straatsma3, Marat Valiev1, H. J. J. van Dam4, Yuri Alexeev5, J. Anchell6, V. Anisimov5, Fredy W. Aquino, Raymond Atta-Fynn7, Jochen Autschbach8, Nicholas P. Bauman1, Jeffrey C. Becca9, David E. Bernholdt10, K. Bhaskaran-Nair11, Stuart Bogatko12, Piotr Borowski13, Jeffery S. Boschen14, Jiří Brabec15, Adam Bruner16, Emilie Cauet17, Y. Chen18, Gennady N. Chuev19, Christopher J. Cramer20, Jeff Daily1, M. J. O. Deegan, Thom H. Dunning21, Michel Dupuis8, Kenneth G. Dyall, George I. Fann10, Sean A. Fischer22, Alexandr Fonari23, Herbert A. Früchtl24, Laura Gagliardi20, Jorge Garza25, Nitin A. Gawande1, Soumen Ghosh20, Kurt R. Glaesemann1, Andreas W. Götz26, Jeff R. Hammond6, Volkhard Helms27, Eric D. Hermes28, Kimihiko Hirao, So Hirata29, Mathias Jacquelin2, Lasse Jensen9, Benny G. Johnson, Hannes Jónsson30, Ricky A. Kendall10, Michael Klemm6, Rika Kobayashi31, V. Konkov32, Sriram Krishnamoorthy1, M. Krishnan18, Zijing Lin33, Roberto D. Lins34, Rik J. Littlefield, Andrew J. Logsdail35, Kenneth Lopata36, Wan Yong Ma37, Aleksandr V. Marenich20, J. Martin del Campo38, Daniel Mejía-Rodríguez39, Justin E. Moore6, Jonathan M. Mullin, Takahito Nakajima, Daniel R. Nascimento1, Jeffrey A. Nichols10, P. J. Nichols40, J. Nieplocha1, Alberto Otero-de-la-Roza41, Bruce J. Palmer1, Ajay Panyala1, T. Pirojsirikul42, Bo Peng1, Roberto Peverati32, Jiri Pittner15, L. Pollack, Ryan M. Richard43, P. Sadayappan44, George C. Schatz45, William A. Shelton36, Daniel W. Silverstein46, D. M. A. Smith6, Thereza A. Soares47, Duo Song1, Marcel Swart, H. L. Taylor48, G. S. Thomas1, Vinod Tipparaju49, Donald G. Truhlar20, Kiril Tsemekhman, T. Van Voorhis50, Álvaro Vázquez-Mayagoitia5, Prakash Verma, Oreste Villa51, Abhinav Vishnu1, Konstantinos D. Vogiatzis52, Dunyou Wang53, John H. Weare26, Mark J. Williamson54, Theresa L. Windus14, Krzysztof Wolinski13, A. T. Wong, Qin Wu4, Chan-Shan Yang2, Q. Yu55, Martin Zacharias56, Zhiyong Zhang57, Yan Zhao58, Robert W. Harrison59
Pacific Northwest National Laboratory1, Lawrence Berkeley National Laboratory2, National Center for Computational Sciences3, Brookhaven National Laboratory4, Argonne National Laboratory5, Intel6, University of Texas at Arlington7, State University of New York System8, Pennsylvania State University9, Oak Ridge National Laboratory10, Washington University in St. Louis11, Wellesley College12, Maria Curie-Skłodowska University13, Iowa State University14, Academy of Sciences of the Czech Republic15, University of Tennessee at Martin16, Université libre de Bruxelles17, Facebook18, Russian Academy of Sciences19, University of Minnesota20, University of Washington21, United States Naval Research Laboratory22, Georgia Institute of Technology23, University of St Andrews24, Universidad Autónoma Metropolitana25, University of California, San Diego26, Saarland University27, Sandia National Laboratories28, University of Illinois at Urbana–Champaign29, University of Iceland30, Australian National University31, Florida Institute of Technology32, University of Science and Technology of China33, Oswaldo Cruz Foundation34, Cardiff University35, Louisiana State University36, Chinese Academy of Sciences37, National Autonomous University of Mexico38, University of Florida39, Los Alamos National Laboratory40, University of Oviedo41, Prince of Songkla University42, Ames Laboratory43, University of Utah44, Northwestern University45, Universal Display Corporation46, Federal University of Pernambuco47, CD-adapco48, Cray49, Massachusetts Institute of Technology50, Nvidia51, University of Tennessee52, Shandong Normal University53, University of Cambridge54, Advanced Micro Devices55, Technische Universität München56, Stanford University57, Wuhan University of Technology58, Stony Brook University59
TL;DR: The NWChem computational chemistry suite is reviewed, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.
Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the past few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.

342 citations

Journal ArticleDOI
Edoardo Aprà, Eric J. Bylaska, W. A. de Jong, Niranjan Govind, Karol Kowalski, T. P. Straatsma, Marat Valiev, H. J. J. van Dam, Yuri Alexeev, James L. Anchell, Victor M. Anisimov, Fredy W. Aquino, Raymond Atta-Fynn, Jochen Autschbach, Nicholas P. Bauman, Jeffrey C. Becca, David E. Bernholdt, Kiran Bhaskaran-Nair, Stuart Bogatko, Piotr Borowski, Jeffrey Scott Boschen, Jiří Brabec, Adam Bruner, Emilie Cauet, Y. Chen, Gennady N. Chuev, Christopher J. Cramer, Jeff Daily, M. J. O. Deegan, Thomas Dunning, Michel Dupuis, Kenneth G. Dyall, George I. Fann, Sean A. Fischer, Alexandr Fonari, H. Früchtl, Laura Gagliardi, Jorge Garza, Nitin A. Gawande, Sayan Ghosh, Kurt R. Glaesemann, Andreas W. Götz, Jeff R. Hammond, Volkhard Helms, Eric D. Hermes, Kimihiko Hirao, So Hirata, Mathias Jacquelin, Lasse Jensen, Benny G. Johnson, Hannes Jónsson, Ricky A. Kendall, Michael Klemm, Rika Kobayashi, V. Konkov, Sriram Krishnamoorthy, Manojkumar Krishnan, Zijing Lin, Roberto D. Lins, Rik J. Littlefield, Andrew J. Logsdail, Kenneth Lopata, Wan Yong Ma, Aleksandr V. Marenich, J. Martin del Campo, Daniel Mejía-Rodríguez, Justin E. Moore, Jonathan M. Mullin, Takahito Nakajima, Daniel R. Nascimento, Jeffrey A. Nichols, Patrick Nichols, J. Nieplocha, A. Otero de la Roza, Bruce J. Palmer, Ajay Panyala, T. Pirojsirikul, Bo Peng, Roberto Peverati, Jiri Pittner, L. Pollack, Ryan M. Richard, P. Sadayappan, George C. Schatz, William A. Shelton, Daniel W. Silverstein, Dayle M. A. Smith, Thereza A. Soares, Duo Song, Marcel Swart, H. L. Taylor, G. S. Thomas, Vinod Tipparaju, Donald G. Truhlar, Kiril Tsemekhman, T. Van Voorhis, Álvaro Vázquez-Mayagoitia, Prakash Verma, Oreste Villa, Abhinav Vishnu, Konstantinos D. Vogiatzis, Dunyou Wang, John H. Weare, Mark J. Williamson, T. L. Windus, Krzysztof Wolinski, A. T. Wong, Qin Wu, Chan-Shan Yang, Q. Yu, Martin Zacharias, Zhiyong Zhang, Yan Zhao, Robert W. Harrison
TL;DR: The NWChem computational chemistry suite provides tools to support and guide experimental efforts and to predict atomistic and electronic properties, using first-principle-driven methodologies to model complex chemical and materials processes.
Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the last few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.

314 citations

Proceedings ArticleDOI
14 Nov 2009
TL;DR: This work investigates the design and scalability of work stealing on modern distributed memory systems and demonstrates high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
Abstract: Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. Scalable dynamic load balancing on large clusters is a challenging problem which can be addressed with distributed dynamic load balancing systems. Work stealing is a popular approach to distributed dynamic load balancing; however its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.

286 citations
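
The abstract above concerns work stealing for dynamic load balancing. The sketch below is a deterministic, single-process simulation of the basic idea only: each worker takes tasks from the tail of its own queue, and an idle worker steals the oldest task from another worker's queue. It does not model the paper's distributed-memory design; the function name, the round-robin schedule, and the victim selection order are all hypothetical.

```python
from collections import deque


def run_work_stealing(initial_tasks, num_workers, expand):
    """Simulate work stealing; returns (tasks done per worker, steal count).

    expand(task) returns the list of child tasks a task spawns.
    All initial tasks start on worker 0, so other workers must steal.
    """
    queues = [deque() for _ in range(num_workers)]
    queues[0].extend(initial_tasks)
    done = [0] * num_workers
    steals = 0
    while any(queues):  # an empty deque is falsy
        for w in range(num_workers):
            if not queues[w]:
                # Idle worker: scan the other workers and steal one task
                # from the head (oldest end) of the first non-empty queue.
                for offset in range(1, num_workers):
                    victim = (w + offset) % num_workers
                    if queues[victim]:
                        queues[w].append(queues[victim].popleft())
                        steals += 1
                        break
            if queues[w]:
                # Owner works on the newest task (tail of its own queue).
                task = queues[w].pop()
                queues[w].extend(expand(task))
                done[w] += 1
    return done, steals
```

For example, calling `run_work_stealing([3], 4, lambda d: [d - 1, d - 1] if d > 0 else [])` processes a full binary task tree of depth 3 across four simulated workers. Stealing from the head tends to move older, larger subtrees, which is one common design choice rather than the only one.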


Cited by
MonographDOI
01 Jan 2006
TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, into planning under differential constraints that arise when automating the motions of virtually any mechanical system.
Abstract: Planning algorithms are impacting technical disciplines and industries around the world, including robotics, computer-aided design, manufacturing, computer graphics, aerospace applications, drug design, and protein folding. This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms. The treatment is centered on robot motion planning but integrates material on planning in discrete spaces. A major part of the book is devoted to planning under uncertainty, including decision theory, Markov decision processes, and information spaces, which are the “configuration spaces” of all sensor-based planning problems. The last part of the book delves into planning under differential constraints that arise when automating the motions of virtually any mechanical system. Developed from courses taught by the author, the book is intended for students, engineers, and researchers in robotics, artificial intelligence, and control theory as well as computer graphics, algorithms, and computational biology.

6,340 citations

Journal ArticleDOI
24 Jan 2005
TL;DR: It is shown that such an approach can yield an implementation of the discrete Fourier transform that is competitive with hand-optimized libraries, and the software structure that makes the current FFTW3 version flexible and adaptive is described.
Abstract: FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm.

5,172 citations

Journal ArticleDOI
Tamar Frankel1
TL;DR: The Essay concludes that practitioners theorize and theorists practice, but that they use these intellectual tools differently because the goals and orientations of theorists and practitioners, and the constraints under which they act, differ.
Abstract: Much has been written about theory and practice in the law, and the tension between practitioners and theorists. Judges do not cite theoretical articles often; they rarely "apply" theories to particular cases. These arguments are not revisited. Instead the Essay explores the working and interaction of theory and practice, practitioners and theorists. The Essay starts with a story about solving a legal issue using our intellectual tools - theory, practice, and their progenies: experience and "gut." Next the Essay elaborates on the nature of theory, practice, experience and "gut." The third part of the Essay discusses theories that are helpful to practitioners and those that are less helpful. The Essay concludes that practitioners theorize, and theorists practice. They use these intellectual tools differently because the goals and orientations of theorists and practitioners, and the constraints under which they act, differ. Theory, practice, experience and "gut" help us think, remember, decide and create. They complement each other like the two sides of the same coin: distinct but inseparable.

2,077 citations