Open Access Book Chapter

EXAHD: A massively parallel fault tolerant sparse grid approach for high-dimensional turbulent plasma simulations

TLDR
A sparse grid approach based on the sparse grid combination technique is proposed; it splits the simulation grid into multiple smaller grids of varying resolution to increase both the maximum resolution and the parallel efficiency of the current solvers.
Abstract
Plasma fusion is one of the promising candidates for an emission-free energy source and is heavily investigated with high-resolution numerical simulations. Unfortunately, these simulations suffer from the curse of dimensionality due to the five-plus-one-dimensional nature of the underlying equations. Hence, we propose a sparse grid approach based on the sparse grid combination technique, which splits the simulation grid into multiple smaller grids of varying resolution. This enables us to increase both the maximum resolution and the parallel efficiency of the current solvers. At the same time, we introduce fault tolerance within the algorithmic design and increase the resilience of the application code. We base our implementation on a manager-worker approach that computes multiple solver runs in parallel by distributing tasks to different process groups. Our results demonstrate good convergence for linear fusion runs and show high parallel efficiency on up to 180k cores. In addition, our framework achieves accurate results with low overhead in faulty environments. Moreover, for nonlinear fusion runs, we show the effectiveness of the combination technique and discuss existing shortcomings that are still under investigation.
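To make the grid-splitting idea concrete, the following minimal sketch (an illustration of the textbook formulation of the classical combination technique, not the EXAHD implementation; function names are illustrative) enumerates the anisotropic component grids and their combination coefficients for a target level n in d dimensions. Each multi-index l = (l_1, ..., l_d) stands for a grid with roughly 2^{l_i} points in dimension i, and the coefficient-weighted sum of the partial solutions approximates the full-grid solution.

    from itertools import product
    from math import comb

    def combination_scheme(n, d):
        """Return {level multi-index: combination coefficient} for the
        classical combination technique with target level n in d dimensions."""
        scheme = {}
        for q in range(d):                                  # q = 0, ..., d-1
            coeff = (-1) ** q * comb(d - 1, q)
            for level in product(range(1, n + 1), repeat=d):
                if sum(level) == n + (d - 1) - q:           # diagonal |l|_1 = n+d-1-q
                    scheme[level] = coeff
        return scheme

    # Example: in 2-D at level 4, grids on the diagonal |l|_1 = 5 enter with
    # coefficient +1 and grids on |l|_1 = 4 with coefficient -1.
    print(combination_scheme(4, 2))

Because every component grid is far smaller than the corresponding full grid and the runs are independent, they can be farmed out to separate process groups, which is the parallelism the manager-worker layer exploits.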



Citations

Electron-temperature-gradient-driven turbulence

Frank Jenko
TL;DR: In this article, collisionless electron-temperature-gradient-driven (ETG) turbulence in toroidal geometry is studied via nonlinear numerical simulations; two massively parallel, fully gyrokinetic Vlasov codes are used, both including electromagnetic effects.
Proceedings Article

Advances in Parallel Computing

TL;DR: The philosophy behind the SpiNNaker machine and the future prospects for systems with increased cognitive capabilities, based on an increasing understanding of how biological brains process information, are presented.
Posted Content

A fault-tolerant domain decomposition method based on space-filling curves.

TL;DR: In this paper, a simple domain decomposition method for higher-dimensional elliptic PDEs is proposed, which involves an overlapping decomposition into local subdomain problems and a global coarse problem.
Posted Content

A dimension-oblivious domain decomposition method based on space-filling curves.

TL;DR: In this paper, a space-filling curve based domain decomposition solver for discretizations of elliptic partial differential equations is proposed, which allows for the effective use of arbitrary processor numbers independent of the dimension of the underlying partial differential equation while maintaining optimal convergence behavior.
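The two space-filling-curve papers above rest on the same partitioning idea, sketched below under simplifying assumptions (a Morton/Z-order curve on a regular cell grid; the function and parameter names are illustrative and not taken from the cited codes): cells are ordered along the curve and the resulting one-dimensional sequence is cut into contiguous chunks, one per processor, independent of the spatial dimension.

    from itertools import product

    def morton_key(cell, bits=16):
        """Interleave the bits of a d-dimensional integer cell index (Z-order)."""
        d = len(cell)
        key = 0
        for b in range(bits):
            for i, c in enumerate(cell):
                key |= ((c >> b) & 1) << (b * d + i)
        return key

    def sfc_partition(cells_per_dim, d, num_procs):
        """Order the cells of a regular grid along the Morton curve and split
        the sequence into contiguous, roughly equal pieces."""
        cells = sorted(product(range(cells_per_dim), repeat=d), key=morton_key)
        chunk = -(-len(cells) // num_procs)                 # ceiling division
        return [cells[i * chunk:(i + 1) * chunk] for i in range(num_procs)]

    # Example: an 8x8 grid distributed to 4 processes along the Z-order curve.
    print([len(part) for part in sfc_partition(8, 2, 4)])   # -> [16, 16, 16, 16]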
Journal Article

A stable and mass-conserving sparse grid combination technique with biorthogonal hierarchical basis functions for kinetic simulations

TL;DR: In this paper, two new variants of hierarchical multiscale basis functions for use with the combination technique are introduced: the biorthogonal and full weighting bases, which conserve the total mass and significantly increase accuracy for a finite-volume solution of constant advection.
References
Journal Article

Electron temperature gradient driven turbulence

TL;DR: In this article, collisionless electron-temperature-gradient-driven (ETG) turbulence in toroidal geometry is studied via nonlinear numerical simulations using two massively parallel, fully gyrokinetic Vlasov codes.
Journal Article

Dimension-adaptive tensor-product quadrature

TL;DR: The dimension-adaptive quadrature method, based on the sparse grid method, is developed and presented; it tries to find important dimensions and adaptively refines in this respect, guided by suitable error estimators, leading to an approach based on generalized sparse grid index sets.
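A minimal sketch of this kind of dimension-adaptive refinement of a generalized sparse grid index set is given below; the error estimator local_error and the budget parameter are placeholder assumptions, and the admissibility test simply requires all backward neighbours of a candidate index to have been accepted already.

    def dimension_adaptive_refine(d, local_error, budget):
        """Grow a downward-closed (admissible) index set by repeatedly refining
        the active index with the largest estimated error contribution."""
        old, active = set(), {(0,) * d}
        errors = {(0,) * d: local_error((0,) * d)}
        while active and len(old) + len(active) < budget:
            k = max(active, key=lambda idx: errors[idx])    # most important index
            active.remove(k)
            old.add(k)
            for dim in range(d):                            # forward neighbours of k
                j = tuple(k[m] + (1 if m == dim else 0) for m in range(d))
                admissible = all(
                    tuple(j[m] - (1 if m == n else 0) for m in range(d)) in old
                    for n in range(d) if j[n] > 0
                )
                if admissible and j not in old and j not in active:
                    active.add(j)
                    errors[j] = local_error(j)
        return old | active

    # Example with a made-up error indicator that decays faster in dimension 1,
    # so the refinement favours dimension 0:
    print(sorted(dimension_adaptive_refine(
        2, lambda idx: 2.0 ** (-idx[0] - 3 * idx[1]), budget=8)))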
Journal Article

A Large-Scale Study of Failures in High-Performance Computing Systems

TL;DR: Analysis of failure data collected at two large high-performance computing sites finds that average failure rates differ wildly across systems, ranging from 20 to 1000 failures per year, and that the time between failures is modeled well by a Weibull distribution with decreasing hazard rate.
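As a small illustration of the last point (using the textbook Weibull parametrisation with made-up numbers, not the fitted values from the cited study), a shape parameter k < 1 means the hazard rate h(t) = (k/lambda) * (t/lambda)^(k-1) falls with the time elapsed since the last failure:

    def weibull_hazard(t, k, lam):
        """Instantaneous failure rate h(t) = (k/lam) * (t/lam)**(k-1), t > 0."""
        return (k / lam) * (t / lam) ** (k - 1)

    k, lam = 0.7, 100.0                  # shape < 1, scale in hours (illustrative)
    for t in (1.0, 10.0, 100.0, 1000.0):
        print(f"t = {t:7.1f} h   hazard = {weibull_hazard(t, k, lam):.5f}")
    # The printed hazard decreases monotonically: a node that has already
    # survived for a while is less likely to fail in the next instant.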