
Showing papers by "Gregory D. Peterson" published in 2018


Journal ArticleDOI
TL;DR: Novel methods are proposed which allow iterative refinement to utilize variable precision arithmetic dynamically in a loop (i.e., a trans-precision approach) and restructure a numeric algorithm dynamically according to runtime numeric behavior and remove unnecessary accuracy checks.
Abstract: Mixed precision is a promising approach to save energy in iterative refinement algorithms since it obtains speed-up without necessitating additional cores and parallelization. However, conventional mixed precision methods utilize statically defined precision in a loop, thus hindering further speed-up and energy savings. We overcome this problem by proposing novel methods which allow iterative refinement to utilize variable precision arithmetic dynamically in a loop (i.e., a trans-precision approach). Our methods restructure a numeric algorithm dynamically according to runtime numeric behavior and remove unnecessary accuracy checks. We implemented our methods by extending one conventional mixed precision iterative refinement algorithm on an Intel Xeon E5-2650 2GHz core with MKL 2017 and XBLAS 1.0. Our dynamic precision approach demonstrates 2.0–2.6× speed-up and 1.8–2.4× energy savings compared with mixed precision iterative refinement when double precision solution accuracy is required for the forward error and with matrix dimensions ranging from 4K to 32K.

10 citations
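
For readers unfamiliar with the baseline that this paper extends, the sketch below shows the general mixed precision iterative refinement pattern in NumPy/SciPy: factorize and solve in single precision, accumulate residuals and corrections in double precision. It uses a fixed low precision and a single accuracy check, so it does not implement the paper's dynamic trans-precision restructuring; the function name, tolerance, and test matrix are illustrative.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def ir_mixed_precision(A, b, tol=1e-12, max_iter=30):
    """Solve Ax = b by iterative refinement with a single-precision factorization.

    The LU factorization and triangular solves run in float32; residuals and the
    accumulated solution are kept in float64. A trans-precision variant would
    additionally vary the working precision from iteration to iteration.
    """
    A32 = A.astype(np.float32)
    lu, piv = lu_factor(A32)                            # one low-precision factorization
    x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x                                   # residual in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                                       # accuracy check on the residual
        d = lu_solve((lu, piv), r.astype(np.float32))   # low-precision correction
        x += d.astype(np.float64)
    return x

# Example: a diagonally dominant system where refinement recovers double accuracy
rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)
x = ir_mixed_precision(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```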


Journal ArticleDOI
TL;DR: An optimized GPU implementation for the induced dimension reduction algorithm is presented, which improves data locality, combines it with an efficient sparse matrix vector kernel, and investigates the potential of overlapping computation with communication as well as the possibility of concurrent kernel execution.
Abstract: In this paper, we present an optimized GPU implementation for the induced dimension reduction algorithm. We improve data locality, combine it with an efficient sparse matrix vector kernel, and investigate the potential of overlapping computation with communication as well as the possibility of concurrent kernel execution. A comprehensive performance evaluation is conducted using a suitable performance model. The analysis reveals efficiency of up to 90%, which indicates that the implementation achieves performance close to the theoretically attainable bound.

8 citations
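
The abstract does not spell out the performance model, so the sketch below assumes a simple bandwidth-bound model for a memory-bound kernel such as sparse matrix-vector multiplication: runtime is bounded below by the bytes moved divided by peak memory bandwidth, and efficiency is the ratio of that bound to the measured time. The traffic counts, bandwidth, and measured time are placeholders, not figures from the paper.

```python
def spmv_bandwidth_bound(nnz, n_rows, bytes_per_val=8, bytes_per_idx=4,
                         peak_bw_gb_s=700.0):
    """Lower bound on SpMV time from data volume and peak memory bandwidth.

    Counts one read per stored value and column index, plus reading x and
    reading/writing y (a CSR-like traffic model). All values are illustrative.
    """
    bytes_moved = nnz * (bytes_per_val + bytes_per_idx) + 3 * n_rows * bytes_per_val
    return bytes_moved / (peak_bw_gb_s * 1e9)           # seconds

# Efficiency = attainable (modeled) time / measured time, mirroring the
# "close to the theoretically attainable bound" comparison in the abstract.
t_model = spmv_bandwidth_bound(nnz=50_000_000, n_rows=2_000_000)
t_measured = 1.1e-3                                     # hypothetical measurement
print(f"efficiency ≈ {t_model / t_measured:.0%}")
```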



Proceedings ArticleDOI
22 Jul 2018
TL;DR: The PaPaS framework offers a simple method for defining and managing parameter studies, while increasing resource utilization, and is being developed in Python 3 with support for distributed parallelization using SSH, batch systems, and C++ MPI.
Abstract: The current landscape of scientific research is widely based on modeling and simulation, typically with complexity in the simulation's flow of execution and parameterization properties. Execution flows are not necessarily straightforward since they may need multiple processing tasks and iterations. Furthermore, parameter and performance studies are common approaches used to characterize a simulation, often requiring traversal of a large parameter space. High-performance computers offer practical resources at the expense of users handling the setup, submission, and management of jobs. This work presents the design of PaPaS, a portable, lightweight, and generic workflow framework for conducting parallel parameter and performance studies. Workflows are defined using parameter files based on a keyword-value pair syntax, thus relieving the user of the overhead of creating complex scripts to manage the workflow. A parameter set consists of any combination of environment variables, files, partial file contents, and command line arguments. PaPaS is being developed in Python 3 with support for distributed parallelization using SSH, batch systems, and C++ MPI. The PaPaS framework runs as user processes and can be used in single-node, multi-node, and multi-tenant computing systems. An example simulation using the BehaviorSpace tool from NetLogo and a matrix multiplication using OpenMP are presented as parameter and performance studies, respectively. The results demonstrate that the PaPaS framework offers a simple method for defining and managing parameter studies, while increasing resource utilization.

1 citation
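
The abstract describes parameter files as keyword-value pairs that expand into parameter sets, and the sketch below illustrates only that idea. The file syntax, the keys, and the matmul command are hypothetical, not PaPaS's actual format; PaPaS itself dispatches such runs via SSH, batch systems, or C++ MPI rather than merely printing commands.

```python
import itertools

# Hypothetical keyword-value parameter file contents; PaPaS's actual file
# syntax is not reproduced here, only the keyword-value idea.
PARAM_TEXT = """
threads = 1, 2, 4, 8
size = 1024, 2048
"""

def parse_params(text):
    """Parse 'key = v1, v2, ...' lines into {key: [values]}."""
    params = {}
    for line in text.strip().splitlines():
        key, _, values = line.partition("=")
        params[key.strip()] = [v.strip() for v in values.split(",")]
    return params

def expand(params):
    """Yield one dict per point in the Cartesian product of all parameter values."""
    keys = list(params)
    for combo in itertools.product(*(params[k] for k in keys)):
        yield dict(zip(keys, combo))

# Each expanded set maps to environment variables and command-line arguments
# for one run of the study (here modeled on the OpenMP matrix-multiply example).
for p in expand(parse_params(PARAM_TEXT)):
    print(f"OMP_NUM_THREADS={p['threads']} ./matmul --size {p['size']}")
```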

