Author

Justus A. Calvin

Bio: Justus A. Calvin is an academic researcher from Virginia Tech. The author has contributed to research in topics: Massively parallel & Matrix (mathematics). The author has an h-index of 8 and has co-authored 11 publications receiving 300 citations.

Papers
Journal ArticleDOI
TL;DR: The features and capabilities of MADNESS are described and some current applications in chemistry and several areas of physics are discussed.
Abstract: MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.

77 citations

Journal ArticleDOI
TL;DR: MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions; it uses adaptive and fast harmonic analysis methods with guaranteed precision, based on multiresolution analysis and separated representations.
Abstract: MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision that are based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.

74 citations

Journal ArticleDOI
TL;DR: A new distributed-memory massively parallel implementation of standard and explicitly correlated (F12) coupled-cluster singles and doubles (CCSD) with canonical O(N^6) computational complexity is described, based on the TiledArray tensor framework.
Abstract: A new distributed-memory massively parallel implementation of standard and explicitly correlated (F12) coupled-cluster singles and doubles (CCSD) with canonical O(N^6) computational complexity is described. The implementation is based on the TiledArray tensor framework. Novel features of the implementation include (a) all data greater than O(N) is distributed in memory and (b) the mixed use of density fitting and integral-driven formulations that optionally allows one to avoid storage of tensors with three and four unoccupied indices. Excellent strong scaling is demonstrated on a multicore shared-memory computer, a commodity distributed-memory computer, and a national-scale supercomputer. The performance on a shared-memory computer is competitive with the popular CCSD implementations in ORCA and Psi4. Moreover, the CCSD performance on a commodity-size cluster significantly improves on the state-of-the-art package NWChem. The large-scale parallel explicitly correlated coupled-cluster implementation makes routin...

51 citations

Proceedings ArticleDOI
15 Nov 2015
TL;DR: A task-based formulation of Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication, is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC).
Abstract: A task-based formulation of Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework).
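As a rough illustration of the SUMMA idea named in the abstract above, the sketch below computes C = A·B as a sum of panel products: at each step, one block column of A and the matching block row of B are used to update all of C. This is a serial, pure-Python emulation under simplifying assumptions (dense matrices stored as lists of rows, no process grid, no task scheduling); it is not the authors' task-based formulation, and the function names `matmul` and `summa` are hypothetical.

```python
def matmul(a, b):
    """Reference product of two dense matrices stored as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def summa(a, b, block):
    """Serial emulation of SUMMA: accumulate one panel product per step.

    `block` is the panel width along the contracted dimension
    (an illustrative parameter, not taken from the paper).
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for k0 in range(0, m, block):
        k1 = min(k0 + block, m)
        # In a distributed run, the panel A[:, k0:k1] would be broadcast
        # along process rows and B[k0:k1, :] along process columns; each
        # process would then update only its local block of C.
        for i in range(n):
            for j in range(p):
                c[i][j] += sum(a[i][k] * b[k][j] for k in range(k0, k1))
    return c
```

Because each step only adds a panel product into C, the steps are independent apart from the accumulation, which is what makes it plausible to schedule several of them concurrently, as the abstract describes.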

35 citations


Cited by
Journal ArticleDOI
TL;DR: A discussion of many of the recently implemented features of GAMESS (General Atomic and Molecular Electronic Structure System) and LibCChem (the C++ CPU/GPU library associated with GAMESS) is presented, which include fragmentation methods, hybrid MPI/OpenMP approaches to Hartree-Fock, and resolution of the identity second order perturbation theory.
Abstract: A discussion of many of the recently implemented features of GAMESS (General Atomic and Molecular Electronic Structure System) and LibCChem (the C++ CPU/GPU library associated with GAMESS) is presented. These features include fragmentation methods such as the fragment molecular orbital, effective fragment potential and effective fragment molecular orbital methods, hybrid MPI/OpenMP approaches to Hartree-Fock, and resolution of the identity second order perturbation theory. Many new coupled cluster theory methods have been implemented in GAMESS, as have multiple levels of density functional/tight binding theory. The role of accelerators, especially graphical processing units, is discussed in the context of the new features of LibCChem, as is the associated problem of power consumption as the power of computers increases dramatically. The process by which a complex program suite such as GAMESS is maintained and developed is considered. Future developments are briefly summarized.

575 citations

Journal ArticleDOI
Edoardo Aprà1, Eric J. Bylaska1, W. A. de Jong2, Niranjan Govind1, Karol Kowalski1, T. P. Straatsma3, Marat Valiev1, H. J. J. van Dam4, Yuri Alexeev5, J. Anchell6, V. Anisimov5, Fredy W. Aquino, Raymond Atta-Fynn7, Jochen Autschbach8, Nicholas P. Bauman1, Jeffrey C. Becca9, David E. Bernholdt10, K. Bhaskaran-Nair11, Stuart Bogatko12, Piotr Borowski13, Jeffery S. Boschen14, Jiří Brabec15, Adam Bruner16, Emilie Cauet17, Y. Chen18, Gennady N. Chuev19, Christopher J. Cramer20, Jeff Daily1, M. J. O. Deegan, Thom H. Dunning21, Michel Dupuis8, Kenneth G. Dyall, George I. Fann10, Sean A. Fischer22, Alexandr Fonari23, Herbert A. Früchtl24, Laura Gagliardi20, Jorge Garza25, Nitin A. Gawande1, Soumen Ghosh20, Kurt R. Glaesemann1, Andreas W. Götz26, Jeff R. Hammond6, Volkhard Helms27, Eric D. Hermes28, Kimihiko Hirao, So Hirata29, Mathias Jacquelin2, Lasse Jensen9, Benny G. Johnson, Hannes Jónsson30, Ricky A. Kendall10, Michael Klemm6, Rika Kobayashi31, V. Konkov32, Sriram Krishnamoorthy1, M. Krishnan18, Zijing Lin33, Roberto D. Lins34, Rik J. Littlefield, Andrew J. Logsdail35, Kenneth Lopata36, Wan Yong Ma37, Aleksandr V. Marenich20, J. Martin del Campo38, Daniel Mejía-Rodríguez39, Justin E. Moore6, Jonathan M. Mullin, Takahito Nakajima, Daniel R. Nascimento1, Jeffrey A. Nichols10, P. J. Nichols40, J. Nieplocha1, Alberto Otero-de-la-Roza41, Bruce J. Palmer1, Ajay Panyala1, T. Pirojsirikul42, Bo Peng1, Roberto Peverati32, Jiri Pittner15, L. Pollack, Ryan M. Richard43, P. Sadayappan44, George C. Schatz45, William A. Shelton36, Daniel W. Silverstein46, D. M. A. Smith6, Thereza A. Soares47, Duo Song1, Marcel Swart, H. L. Taylor48, G. S. Thomas1, Vinod Tipparaju49, Donald G. Truhlar20, Kiril Tsemekhman, T. Van Voorhis50, Álvaro Vázquez-Mayagoitia5, Prakash Verma, Oreste Villa51, Abhinav Vishnu1, Konstantinos D. Vogiatzis52, Dunyou Wang53, John H. Weare26, Mark J. Williamson54, Theresa L. Windus14, Krzysztof Wolinski13, A. T. Wong, Qin Wu4, Chan-Shan Yang2, Q. Yu55, Martin Zacharias56, Zhiyong Zhang57, Yan Zhao58, Robert W. Harrison59
Pacific Northwest National Laboratory1, Lawrence Berkeley National Laboratory2, National Center for Computational Sciences3, Brookhaven National Laboratory4, Argonne National Laboratory5, Intel6, University of Texas at Arlington7, State University of New York System8, Pennsylvania State University9, Oak Ridge National Laboratory10, Washington University in St. Louis11, Wellesley College12, Maria Curie-Skłodowska University13, Iowa State University14, Academy of Sciences of the Czech Republic15, University of Tennessee at Martin16, Université libre de Bruxelles17, Facebook18, Russian Academy of Sciences19, University of Minnesota20, University of Washington21, United States Naval Research Laboratory22, Georgia Institute of Technology23, University of St Andrews24, Universidad Autónoma Metropolitana25, University of California, San Diego26, Saarland University27, Sandia National Laboratories28, University of Illinois at Urbana–Champaign29, University of Iceland30, Australian National University31, Florida Institute of Technology32, University of Science and Technology of China33, Oswaldo Cruz Foundation34, Cardiff University35, Louisiana State University36, Chinese Academy of Sciences37, National Autonomous University of Mexico38, University of Florida39, Los Alamos National Laboratory40, University of Oviedo41, Prince of Songkla University42, Ames Laboratory43, University of Utah44, Northwestern University45, Universal Display Corporation46, Federal University of Pernambuco47, CD-adapco48, Cray49, Massachusetts Institute of Technology50, Nvidia51, University of Tennessee52, Shandong Normal University53, University of Cambridge54, Advanced Micro Devices55, Technische Universität München56, Stanford University57, Wuhan University of Technology58, Stony Brook University59
TL;DR: The NWChem computational chemistry suite is reviewed, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.
Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the past few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach, and outlook.

342 citations

Journal ArticleDOI
Edoardo Aprà, Eric J. Bylaska, W. A. de Jong, Niranjan Govind, Karol Kowalski, T. P. Straatsma, Marat Valiev, H. J. J. van Dam, Yuri Alexeev, James L. Anchell, Victor M. Anisimov, Fredy W. Aquino, Raymond Atta-Fynn, Jochen Autschbach, Nicholas P. Bauman, Jeffrey C. Becca, David E. Bernholdt, Kiran Bhaskaran-Nair, Stuart Bogatko, Piotr Borowski, Jeffrey Scott Boschen, Jiří Brabec, Adam Bruner, Emilie Cauet, Y. Chen, Gennady N. Chuev, Christopher J. Cramer, Jeff Daily, M. J. O. Deegan, Thomas Dunning, Michel Dupuis, Kenneth G. Dyall, George I. Fann, Sean A. Fischer, Alexandr Fonari, H. Früchtl, Laura Gagliardi, Jorge Garza, Nitin A. Gawande, Sayan Ghosh, Kurt R. Glaesemann, Andreas W. Götz, Jeff R. Hammond, Volkhard Helms, Eric D. Hermes, Kimihiko Hirao, So Hirata, Mathias Jacquelin, Lasse Jensen, Benny G. Johnson, Hannes Jónsson, Ricky A. Kendall, Michael Klemm, Rika Kobayashi, V. Konkov, Sriram Krishnamoorthy, Manojkumar Krishnan, Zijing Lin, Roberto D. Lins, Rik J. Littlefield, Andrew J. Logsdail, Kenneth Lopata, Wan Yong Ma, Aleksandr V. Marenich, J. Martin del Campo, Daniel Mejía-Rodríguez, Justin E. Moore, Jonathan M. Mullin, Takahito Nakajima, Daniel R. Nascimento, Jeffrey A. Nichols, Patrick Nichols, J. Nieplocha, A. Otero de la Roza, Bruce J. Palmer, Ajay Panyala, T. Pirojsirikul, Bo Peng, Roberto Peverati, Jiri Pittner, L. Pollack, Ryan M. Richard, P. Sadayappan, George C. Schatz, William A. Shelton, Daniel W. Silverstein, Dayle M. A. Smith, Thereza A. Soares, Duo Song, Marcel Swart, H. L. Taylor, G. S. Thomas, Vinod Tipparaju, Donald G. Truhlar, Kiril Tsemekhman, T. Van Voorhis, Álvaro Vázquez-Mayagoitia, Prakash Verma, Oreste Villa, Abhinav Vishnu, Konstantinos D. Vogiatzis, Dunyou Wang, John H. Weare, Mark J. Williamson, T. L. Windus, Krzysztof Wolinski, A. T. Wong, Qin Wu, Chan-Shan Yang, Q. Yu, Martin Zacharias, Zhiyong Zhang, Yan Zhao, Robert W. Harrison
TL;DR: The NWChem computational chemistry suite provides tools to support and guide experimental efforts and to predict atomistic and electronic properties, using first-principle-driven methodologies to model complex chemical and materials processes.
Abstract: Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the last few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated and predictive many-body techniques that describe correlated behavior of electrons in molecular and condensed phase systems at different levels of theory. In enabling these simulations, novel parallel algorithms have been able to take advantage of computational resources to address the polynomial scaling of electronic structure methods. In this paper, we briefly review the NWChem computational chemistry suite, including its history, design principles, parallel tools, current capabilities, outreach and outlook.

314 citations

Posted Content
TL;DR: Mesh-TensorFlow is introduced, a language for specifying a general class of distributed tensor computations, and is used to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model, surpassing state-of-the-art results on the WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark.
Abstract: Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and to implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the "batch" dimension, in Mesh-TensorFlow, the user can specify any tensor-dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into a SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce. We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model. Using TPU meshes of up to 512 cores, we train Transformer models with up to 5 billion parameters, surpassing state of the art results on WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark. Mesh-Tensorflow is available at this https URL .
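To make the batch-splitting idea in the abstract above concrete, here is a minimal sketch, assuming a toy one-parameter model y = w·x with squared-error loss. It splits the batch into shards, computes a local gradient per shard, and combines them with an emulated Allreduce, so the result matches the full-batch gradient. It does not use the Mesh-TensorFlow API; every function name below is hypothetical and the "workers" are just list entries in one process.

```python
def grad_sum(w, xs, ys):
    """Sum over examples of d/dw (w*x - y)^2 = 2*x*(w*x - y)."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys))


def split(seq, parts):
    """Split a sequence into at most `parts` contiguous shards."""
    size = -(-len(seq) // parts)  # ceiling division
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def allreduce_sum(values):
    """Emulated Allreduce: every worker ends up holding the global sum."""
    total = sum(values)
    return [total] * len(values)


def data_parallel_grad(w, xs, ys, workers):
    """Mean gradient computed by splitting along the batch dimension."""
    x_shards, y_shards = split(xs, workers), split(ys, workers)
    local = [grad_sum(w, xs_i, ys_i)
             for xs_i, ys_i in zip(x_shards, y_shards)]
    # After the Allreduce, each worker holds the same summed gradient.
    return allreduce_sum(local)[0] / len(xs)
```

Model parallelism, in the abstract's terms, would instead split a non-batch tensor dimension; the sketch covers only the batch-splitting case.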

121 citations

Journal ArticleDOI
TL;DR: The reduced scaling explicitly correlated CCSD(T) method is used to examine the binding energies of several systems in the L7 benchmark data set of noncovalent interactions.
Abstract: In this work, we present a linear scaling formulation of the coupled-cluster singles and doubles with perturbative inclusion of triples (CCSD(T)) and explicitly correlated geminals. The linear scaling implementation of all post-mean-field steps utilizes the SparseMaps formalism [P. Pinski et al., J. Chem. Phys. 143, 034108 (2015)]. Even for conservative truncation levels, the method rapidly reaches near-linear complexity in realistic basis sets, e.g., an effective scaling exponent of 1.49 was obtained for n-alkanes with up to 200 carbon atoms in a def2-TZVP basis set. The robustness of the method is benchmarked against the massively parallel implementation of the conventional explicitly correlated coupled-cluster for a 20-water cluster; the total dissociation energy of the cluster (∼186 kcal/mol) is affected by the reduced scaling approximations by only ∼0.4 kcal/mol. The reduced scaling explicitly correlated CCSD(T) method is used to examine the binding energies of several systems in the L7 benchmark data set of noncovalent interactions.
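For readers unfamiliar with the term, an effective scaling exponent like the one quoted in the abstract above is commonly estimated as the slope of log(time) against log(system size). The sketch below shows one such estimate via a least-squares fit; this is an assumption about the general technique, not the procedure used in the paper, and the function name is hypothetical.

```python
import math


def effective_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size).

    If time ~ c * size**p, the returned value approximates p.
    """
    lx = [math.log(n) for n in sizes]
    ly = [math.log(t) for t in times]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    sxx = sum((x - mx) ** 2 for x in lx)
    return sxy / sxx
```

Fed with timings for a series of system sizes, a value near 1 indicates near-linear cost growth, while a canonical CCSD(T) calculation would be expected to give a much larger exponent.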

118 citations