
Showing papers by "Tuomas Koskela" published in 2017


Journal Article
X. Litaudon, S. Abduallev, Mitul Abhangi, P. Abreu, +1225 more authors (69 institutions)
TL;DR: In this paper, the authors review the 2014-2016 JET results in the light of their significance for optimising the ITER research plan for the active and non-active operation, stressing the importance of the magnetic configurations and the recent measurements of fine-scale structures in the edge radial electric field.
Abstract: The 2014-2016 JET results are reviewed in the light of their significance for optimising the ITER research plan for the active and non-active operation. More than 60 h of plasma operation with ITER first wall materials successfully took place since its installation in 2011. New multi-machine scaling of the type I-ELM divertor energy flux density to ITER is supported by first principle modelling. ITER relevant disruption experiments and first principle modelling are reported with a set of three disruption mitigation valves mimicking the ITER setup. Insights into the L-H power threshold in Deuterium and Hydrogen are given, stressing the importance of the magnetic configurations and the recent measurements of fine-scale structures in the edge radial electric field. Dimensionless scans of the core and pedestal confinement provide new information to elucidate the importance of the first wall material on the fusion performance. H-mode plasmas at ITER triangularity (H = 1 at β_N ∼ 1.8 and n/n_GW ∼ 0.6) have been sustained at 2 MA during 5 s. The ITER neutronics codes have been validated on high performance experiments. Prospects for the coming D-T campaign and 14 MeV neutron calibration strategy are reviewed.

162 citations


Journal Article
TL;DR: In this paper, the authors demonstrate the measurement of a 2D MeV-range ion velocity distribution function by velocity-space tomography at JET, where deuterium ions were accelerated into the MeV range by third harmonic ion cyc...
Abstract: We demonstrate the measurement of a 2D MeV-range ion velocity distribution function by velocity-space tomography at JET. Deuterium ions were accelerated into the MeV-range by third harmonic ion cyc ...

50 citations


Journal Article
TL;DR: In this article, the dynamics of the transition from L-mode to a stationary high QDT H-mode regime in ITER are studied; they are expected to be qualitatively different from present experiments, and the differences may be caused by a low fuell ...
Abstract: The dynamics for the transition from L-mode to a stationary high QDT H-mode regime in ITER is expected to be qualitatively different to present experiments. Differences may be caused by a low fuell ...

26 citations


Journal Article
Abstract: The measured D-D neutron rate of neutral beam heated JET baseline and hybrid H-modes in deuterium is found to be between approximately 50% and 100% of the neutron rate expected from the TRANSP code, depending on the plasma parameters. A number of candidate explanations for the shortfall, such as fuel dilution, errors in beam penetration and effectively available beam power have been excluded. As the neutron rate in JET is dominated by beam-plasma interactions, the 'neutron deficit' may be caused by a yet unidentified form of fast particle redistribution. Modelling, which assumes fast particle transport to be responsible for the deficit, indicates that such redistribution would have to happen at time scales faster than both the slowing down time and the energy confinement time. Sawteeth and edge localised modes are found to make no significant contribution to the deficit. There is also no obvious correlation with magnetohydrodynamic activity measured using magnetic probes at the tokamak vessel walls. Modelling of fast particle orbits in the 3D fields of neoclassical tearing modes shows that realistically sized islands can only contribute a few percent to the deficit. In view of these results it appears unlikely that the neutron deficit results from a single physical process in the plasma.

24 citations


Journal Article
TL;DR: The ASCOT Fusion Source Integrator (AFSI) has been used to calculate neutron production rates and spectra corresponding to the JET 19-channel neutron camera (KN3) and the time-of-flight spectrometer (TOFOR) as ideal diagnostics, without detector-related effects as discussed by the authors.
Abstract: The ASCOT Fusion Source Integrator (AFSI) has been used to calculate neutron production rates and spectra corresponding to the JET 19-channel neutron camera (KN3) and the time-of-flight spectrometer (TOFOR) as ideal diagnostics, without detector-related effects. AFSI calculates fusion product distributions in 4D, based on Monte Carlo integration from arbitrary reactant distribution functions. The distribution functions were calculated by the ASCOT Monte Carlo particle orbit following code for thermal, NBI and ICRH particle reactions. Fusion cross-sections were defined based on the Bosch-Hale model and both DD and DT reactions have been included. Neutrons generated by AFSI-ASCOT simulations have already been applied as a neutron source of the Serpent neutron transport code in ITER studies. Additionally, AFSI has been selected as a main tool, serving as the fusion product generator, in the complete analysis calculation chain: ASCOT - AFSI - SERPENT (neutron and gamma transport Monte Carlo code) - APROS (system and power plant modelling code), which encompasses the plasma as an energy source, heat deposition in plant structures as well as cooling and balance-of-plant in DEMO applications and other reactor relevant analyses. This conference paper presents the first results and validation of the AFSI DD fusion model for different auxiliary heating scenarios (NBI, ICRH) with very different fast particle distribution functions. Both calculated quantities (production rates and spectra) have been compared with experimental data from KN3 and synthetic spectrometer data from the ControlRoom code. No unexplained differences have been observed. In future work, AFSI will be extended for synthetic gamma diagnostics and additionally, AFSI will be used as part of the neutron transport calculation chain to model real diagnostics instead of ideal synthetic diagnostics for quantitative benchmarking.
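The idea of Monte Carlo integration over reactant distribution functions, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not AFSI's algorithm or API: the function names, the isotropic Maxwellian reactants, and the constant toy cross-section (used in place of the Bosch-Hale parameterisation) are all assumptions chosen so that the estimate can be checked against a closed-form value.

```python
import math
import random


def sample_maxwellian_velocity(rng, thermal_speed):
    """Draw one 3D velocity with independent Gaussian components (hypothetical helper)."""
    return tuple(rng.gauss(0.0, thermal_speed) for _ in range(3))


def toy_cross_section(relative_speed):
    """Hypothetical constant cross-section in arbitrary units; NOT the Bosch-Hale model."""
    return 1.0


def mc_reactivity(n_samples, thermal_speed=1.0,
                  cross_section=toy_cross_section, seed=0):
    """Monte Carlo estimate of <sigma * v_rel> over two reactant distributions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        v1 = sample_maxwellian_velocity(rng, thermal_speed)
        v2 = sample_maxwellian_velocity(rng, thermal_speed)
        v_rel = math.dist(v1, v2)  # magnitude of the relative velocity
        total += cross_section(v_rel) * v_rel
    return total / n_samples


estimate = mc_reactivity(50_000)
# With a unit cross-section and unit thermal speed, the exact mean
# relative speed of two such Maxwellians is 4 / sqrt(pi).
exact = 4.0 / math.sqrt(math.pi)
print(f"Monte Carlo estimate: {estimate:.3f}, closed form: {exact:.3f}")
```

Replacing `toy_cross_section` with an energy-dependent function and the samplers with draws from simulated distributions would follow the same pattern, at the cost of losing the closed-form check.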

10 citations


Journal Article
TL;DR: Inverted 3He and D ion cyclotron minority heating scenarios were recently tested in JET-ILW as discussed by the authors, and direct and indirect evidence of the existence of fast particle subpopulations was found in both cases.
Abstract: Inverted 3He and D ion cyclotron minority heating scenarios were recently tested in JET-ILW. They confirm the good heating efficiency at low concentrations of ∼3%. The 3He minority heating scheme is only modestly affected by the change from a carbon (JET-C) to a Beryllium (JET-ILW) wall but, unlike what was the case in JET-C, the intrinsic Be ions (D-like particles in terms of charge-over-mass ratio) do not prevent the D (or 4He) minority regime from being exploited. Direct and indirect evidence of the existence of fast particle subpopulations was found in both cases.

6 citations


Book Chapter
18 Jun 2017
TL;DR: This paper analyzes the overall performance improvements of these codes quantifying impacts of both Xeon Phi™ architecture features as well as code optimization on application performance and shows that the architectural advantage is less than the speedup obtained through application optimization.
Abstract: NERSC has partnered with over 20 representative application developer teams to evaluate and optimize their workloads on the Intel® Xeon Phi™ Knights Landing processor. In this paper, we present a summary of this two-year effort and the lessons we learned in that process. We analyze the overall performance improvements of these codes, quantifying impacts of both Xeon Phi™ architectural features as well as code optimization on application performance. We show that the architectural advantage, i.e. the average speedup of optimized code on KNL vs. optimized code on Haswell, is about 1.1×. The average speedup obtained through application optimization, i.e. comparing optimized vs. original codes on KNL, is about 5×.

5 citations


01 Jan 2017
TL;DR: This work studies the attainable performance of Particle-In-Cell codes on the Cori KNL system by analyzing a miniature particle push application based on the fusion PIC code XGC1. Achieving good vectorization is shown to be the most beneficial optimization path, with a theoretical yield of up to 8x speedup on KNL.
Authors: Koskela, T; Deslippe, J; Friesen, B; Karthik, R
Abstract: We study the attainable performance of Particle-In-Cell codes on the Cori KNL system by analyzing a miniature particle push application based on the fusion PIC code XGC1. We start from the most basic building blocks of a PIC code and build up the complexity to identify the kernels that cost the most in performance and focus optimization efforts there. Particle push kernels operate at high arithmetic intensity (AI) and are not likely to be memory bandwidth or even cache bandwidth bound on KNL. Therefore, we see only minor benefits from the high bandwidth memory available on KNL. Achieving good vectorization is the most beneficial optimization path and can theoretically yield up to 8x speedup on KNL, but is in practice limited by the data layout to 4x.
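Why data layout can limit vectorization is easier to see with a small sketch. The abstract does not say which layouts the mini-application uses, so the following is a generic illustration only: the record fields, `to_soa` and `push` are hypothetical names, and Python itself does not emit vector instructions. The point is that a structure-of-arrays layout keeps each field contiguous, which is the access pattern vector units prefer.

```python
from array import array


def to_soa(particles):
    """Convert a list of per-particle records (array of structures)
    into one contiguous array per field (structure of arrays)."""
    return {
        "x": array("d", (p["x"] for p in particles)),
        "v": array("d", (p["v"] for p in particles)),
    }


def push(soa, dt):
    """Advance positions by one step; each field is read with unit stride."""
    x, v = soa["x"], soa["v"]
    for i in range(len(x)):
        x[i] += v[i] * dt


# Hypothetical two-particle example.
particles = [{"x": 0.0, "v": 1.0}, {"x": 1.0, "v": -2.0}]
soa = to_soa(particles)
push(soa, 0.5)
print(list(soa["x"]))
```

In a compiled language the same loop over contiguous `x` and `v` arrays is a straightforward candidate for compiler vectorization, whereas interleaved records require strided or gathered loads.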

2 citations


Book Chapter
18 Jun 2017
TL;DR: The results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system are presented and 2× speedups in single node performance are obtained due to enabling vectorization and performing memory layout optimizations.
Abstract: In this paper we present the results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system. The code has undergone substantial development to enable the use of vector instructions in its most expensive kernels within the NERSC Exascale Science Applications Program. We study the single-node performance of the code on an absolute scale using the roofline methodology to guide optimization efforts. We have obtained 2× speedups in single node performance due to enabling vectorization and performing memory layout optimizations. On multiple nodes, the code is shown to scale well up to 4000 nodes, nearly half the size of the machine. We discuss some communication bottlenecks that were identified and resolved during the work.

2 citations


01 Jan 2017
TL;DR: In this article, the authors investigate if D majority ion cyclotron resonance heating (ICRH) scenarios exist that can simultaneously ensure a high ion heating efficiency needed for reaching fusion relevant temperatures and impurity chase-out.
Abstract: Since 2011 JET has been equipped with a Beryllium "ITER-like" wall (ILW) and a Tungsten divertor [1]. As it can lead to reduced core temperature and even radiative collapse, high Z core impurity accumulation has to be avoided. Hydrogen minority ion cyclotron heating at sufficiently high power (> 4 MW in JET) is already well known to be an effective cure for this problem [2, 3, 4, 5, 6, 7]. In the context of exploring the available options for a D-T campaign but without actually using T, this paper reports on investigations checking if D majority ion cyclotron resonance heating (ICRH) scenarios exist that can simultaneously ensure a high ion heating efficiency needed for reaching fusion relevant temperatures and impurity chase-out.

2 citations


23 May 2017
TL;DR: This session shows, in two case studies, how the roofline feature of Intel Advisor has been utilized to optimize the performance of kernels of the XGC1 and PICSAR codes in preparation for Intel Knights Landing architecture.
Abstract: In this session we show, in two case studies, how the roofline feature of Intel Advisor has been utilized to optimize the performance of kernels of the XGC1 and PICSAR codes in preparation for Intel Knights Landing architecture. The impact of the implemented optimizations and the benefits of using the automatic roofline feature of Intel Advisor to study performance of large applications will be presented. This demonstrates an effective optimization strategy that has enabled these science applications to achieve up to 4.6 times speed-up and prepare for future exascale architectures.

# Goal/Relevance of Session

The roofline model [1,2] is a powerful tool for analyzing the performance of applications with respect to the theoretical peak achievable on a given computer architecture. It allows one to graphically represent the performance of an application in terms of operational intensity, i.e. the ratio of flops performed to bytes moved from memory, in order to guide optimization efforts. Given the scale and complexity of modern science applications, it can often be a tedious task for the user to perform the analysis on the level of functions or loops to identify where performance gains can be made. With new Intel tools, it is now possible to automate this task, as well as base the estimates of peak performance on measurements rather than vendor specifications. The goal of this session is to demonstrate how the roofline feature of Intel Advisor can be used to balance memory vs. computation related optimization efforts and effectively identify performance bottlenecks. A series of typical optimization techniques (cache blocking, structure refactoring, data alignment, and vectorization), illustrated by the kernel cases, will be addressed.

# Description of the codes

## XGC1

The XGC1 code [3] is a magnetic fusion Particle-In-Cell code that uses an unstructured mesh for its Poisson solver, which allows it to accurately resolve the edge plasma of a magnetic fusion device. After recent optimizations to its collision kernel [4], most of the computing time is spent in the electron push (pushe) kernel, where these optimization efforts have been focused. The kernel code scaled well with MPI+OpenMP but had almost no automatic compiler vectorization, in part due to indirect memory addresses and in part due to low trip counts of low-level loops that would be candidates for vectorization. Particle blocking and sorting have been implemented to increase trip counts of low-level loops and improve memory locality, and OpenMP directives have been added to vectorize compute-intensive loops that were identified by Advisor. The optimizations have improved the performance of the pushe kernel 2x on Haswell processors and 1.7x on KNL. The KNL node-for-node performance has been brought to within 30% of a NERSC Cori Phase I Haswell node and we expect to bridge this gap by reducing the memory footprint of compute intensive routines to improve cache reuse.

## PICSAR

PICSAR is a Fortran/Python high-performance Particle-In-Cell library targeting MIC architectures, first designed to be coupled with the PIC code WARP for the simulation of laser-matter interaction and particle accelerators. PICSAR also contains a stand-alone Fortran kernel for performance studies and benchmarks. An MPI domain decomposition is used between NUMA domains, and a tile decomposition (cache blocking) handled by OpenMP has been added for shared-memory parallelism and better cache management. The so-called current deposition and field gathering steps that compose the PIC time loop constitute major hotspots that have been rewritten to enable more efficient vectorization. Particle communications between tiles and MPI domains have been merged and parallelized. All considered, these improvements provide speedups of 3.1 for order 1 and 4.6 for order 3 interpolation shape factors on KNL configured in SNC4 quadrant flat mode. Performance is similar between a node of Cori Phase I and KNL at order 1 and better on KNL by a factor of 1.6 at order 3 with the considered test case (homogeneous thermal plasma).
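The roofline bound described in the session abstract can be written down in a few lines. The sketch below is a generic statement of the model, not output from Intel Advisor; the function names and the machine numbers (peak compute rate and memory bandwidth) are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def attainable_gflops(operational_intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: performance is capped by the lower of the compute
    ceiling and the memory ceiling (intensity times bandwidth)."""
    if operational_intensity <= 0:
        raise ValueError("operational intensity must be positive")
    return min(peak_gflops, operational_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Operational intensity (flops/byte) above which a kernel is compute bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical machine: 2000 GFLOP/s peak, 400 GB/s memory bandwidth.
PEAK_GFLOPS = 2000.0
BANDWIDTH_GBS = 400.0

for intensity in (0.5, 5.0, 10.0):
    bound = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GBS)
    regime = ("memory bound"
              if intensity < ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
              else "compute bound")
    print(f"{intensity:5.1f} flops/byte -> {bound:7.1f} GFLOP/s ({regime})")
```

A kernel whose measured performance sits well below its bound on the memory-limited side is a candidate for data movement optimizations such as cache blocking, while one on the compute-limited side is a candidate for vectorization.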