Showing papers in "arXiv: Computational Physics in 2013"

Journal Article•DOI•

GALAMOST: GPU-accelerated large-scale molecular simulation toolkit

You-Liang Zhu¹, Hong Liu¹, Zhan-Wei Li², Hu-Jun Qian¹, Giuseppe Milano³, Zhongyuan Lu¹•Institutions (3)

Jilin University¹, Chinese Academy of Sciences², University of Salerno³

08 Oct 2013-arXiv: Computational Physics

TL;DR: In this article, a new molecular simulation toolkit composed of recently developed force fields and specific models is presented for studying the self-assembly, phase transitions, and other properties of polymeric systems at the mesoscopic scale using the computational power of GPUs.

Abstract: A new molecular simulation toolkit composed of recently developed force fields and specific models is presented for studying the self-assembly, phase transitions, and other properties of polymeric systems at the mesoscopic scale using the computational power of GPUs. In addition, the hierarchical self-assembly of soft anisotropic particles and problems related to polymerization can be studied with the corresponding models included in this toolkit.

168 citations

Journal Article•DOI•

Hamiltonian replica-exchange in GROMACS: a flexible implementation

Giovanni Bussi¹•Institutions (1)

International School for Advanced Studies¹

19 Jul 2013-arXiv: Computational Physics

TL;DR: This paper presents a simple and general implementation of Hamiltonian replica exchange for the popular molecular-dynamics software GROMACS, in which arbitrarily different Hamiltonians can be used for the different replicas without incurring any significant performance penalty.

Abstract: A simple and general implementation of Hamiltonian replica exchange for the popular molecular-dynamics software GROMACS is presented. In this implementation, arbitrarily different Hamiltonians can be used for the different replicas without incurring any significant performance penalty. The implementation was validated on a simple toy model - alanine dipeptide in water - and applied to study the rearrangement of an RNA tetraloop, where it was used to compare recently proposed force-field corrections.
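
The exchange step behind Hamiltonian replica exchange can be illustrated with the standard Metropolis criterion. The sketch below is not the GROMACS implementation described in the paper; the function names, the callables `u_i` and `u_j` (potential energies of the two replicas), and the scalar "configurations" are all assumptions made for illustration.

```python
import math
import random


def exchange_delta(u_i, u_j, x_i, x_j, beta_i=1.0, beta_j=1.0):
    """Change in total reduced energy if configurations x_i and x_j are swapped."""
    return (beta_i * (u_i(x_j) - u_i(x_i))
            + beta_j * (u_j(x_i) - u_j(x_j)))


def exchange_probability(u_i, u_j, x_i, x_j, beta_i=1.0, beta_j=1.0):
    """Metropolis acceptance probability min(1, exp(-delta)) for the swap."""
    delta = exchange_delta(u_i, u_j, x_i, x_j, beta_i, beta_j)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(u_i, u_j, x_i, x_j, beta_i=1.0, beta_j=1.0, rng=random):
    """Return (new_x_i, new_x_j, accepted) after one exchange attempt."""
    prob = exchange_probability(u_i, u_j, x_i, x_j, beta_i, beta_j)
    if rng.random() < prob:
        return x_j, x_i, True   # swap accepted
    return x_i, x_j, False      # swap rejected
```

Because each replica only needs the energy of the other replica's configuration under its own Hamiltonian, the criterion places no restriction on how different the two Hamiltonians are.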

152 citations

Journal Article•DOI•

Robust and efficient configurational molecular sampling via Langevin Dynamics

Benedict Leimkuhler¹, Charles Matthews¹•Institutions (1)

University of Edinburgh¹

11 Apr 2013-arXiv: Computational Physics

TL;DR: In this article, a wide variety of numerical methods are evaluated and compared for solving the stochastic differential equations encountered in molecular dynamics, based on the application of deterministic impulses, drifts and Brownian motions in some combination.

Abstract: A wide variety of numerical methods are evaluated and compared for solving the stochastic differential equations encountered in molecular dynamics. The methods are based on the application of deterministic impulses, drifts, and Brownian motions in some combination. The Baker-Campbell-Hausdorff expansion is used to study sampling accuracy following recent work by the authors, which allows determination of the stepsize-dependent bias in configurational averaging. For harmonic oscillators, configurational averaging is exact for certain schemes, which may result in improved performance in the modelling of biomolecules where bond stretches play a prominent role. For general systems, an optimal method can be identified that has very low bias compared to alternatives. In simulations of the alanine dipeptide reported here (both solvated and unsolvated), higher accuracy is obtained without loss of computational efficiency, while allowing a large timestep, and with no impairment of the conformational exploration rate (the effective diffusion rate observed in simulation). The optimal scheme is a uniformly better performing algorithm for molecular sampling, with overall efficiency improvements of 25% or more in practical timestep size achievable in vacuum, and with reductions in the error of configurational averages of a factor of ten or more attainable in solvated simulations at large timestep.
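
As an illustration of combining impulses, drifts, and Brownian motions, the sketch below implements one step of the splitting commonly written BAOAB (B = impulse, A = drift, O = exact Ornstein-Uhlenbeck update). The abstract does not name its optimal scheme, so the choice of BAOAB here is an assumption; the function name, the one-dimensional state, and the parameters `gamma`, `kT`, and `mass` are likewise illustrative.

```python
import math
import random


def baoab_step(q, p, force, dt, gamma, kT, mass=1.0, rng=random):
    """Advance position q and momentum p by one Langevin step of size dt."""
    p = p + 0.5 * dt * force(q)            # B: half impulse from the force
    q = q + 0.5 * dt * p / mass            # A: half drift
    c1 = math.exp(-gamma * dt)             # O: exact OU update of the momentum
    c2 = math.sqrt(kT * mass * (1.0 - c1 * c1))
    p = c1 * p + c2 * rng.gauss(0.0, 1.0)
    q = q + 0.5 * dt * p / mass            # A: half drift
    p = p + 0.5 * dt * force(q)            # B: half impulse
    return q, p
```

With `gamma=0` the O step leaves the momentum unchanged and the update reduces to velocity Verlet, which is a convenient deterministic check of the implementation.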

107 citations

Journal Article•DOI•

Linear-scaling and parallelizable algorithms for stochastic quantum chemistry

George H. Booth, Simon D. Smart, Ali Alavi

30 May 2013-arXiv: Computational Physics

TL;DR: This paper explores an algorithm for Full Configuration Interaction Quantum Monte Carlo (FCIQMC), implemented and available in MOLPRO and as a standalone code, that is designed for high-level parallelism and linear scaling with walker number.

Abstract: For many decades, quantum chemical method development has been dominated by algorithms which involve increasingly complex series of tensor contractions over one-electron orbital spaces. Procedures for their derivation and implementation have evolved to require the minimum amount of logic and rely heavily on computationally efficient library-based matrix algebra and optimized paging schemes. In this regard, the recent development of exact stochastic quantum chemical algorithms to reduce computational scaling and memory overhead requires a contrasting algorithmic philosophy, but one which when implemented efficiently can often achieve higher accuracy/cost ratios with small random errors. Additionally, they can exploit the continuing trend for massive parallelization which hinders the progress of deterministic high-level quantum chemical algorithms. In the Quantum Monte Carlo community, stochastic algorithms are ubiquitous but the discrete Fock space of quantum chemical methods is often unfamiliar, and the methods introduce new concepts required for algorithmic efficiency. In this paper, we explore these concepts and detail an algorithm used for Full Configuration Interaction Quantum Monte Carlo (FCIQMC), which is implemented and available in MOLPRO and as a standalone code, and is designed for high-level parallelism and linear-scaling with walker number. Many of the algorithms are also in use in, or can be transferred to, other stochastic quantum chemical methods and implementations. We apply these algorithms to the strongly correlated Chromium dimer, to demonstrate their efficiency and parallelism.

101 citations

Journal Article•DOI•

Efficient Computation of Power, Force, and Torque in BEM Scattering Calculations

M. T. Homer Reid¹, Steven G. Johnson¹•Institutions (1)

Massachusetts Institute of Technology¹

11 Jul 2013-arXiv: Computational Physics

TL;DR: In this article, the authors present concise, computationally efficient formulas for several quantities of interest (including absorbed and scattered power, optical force (radiation pressure), and torque) in scattering calculations performed using the boundary element method (BEM).

Abstract: We present concise, computationally efficient formulas for several quantities of interest -- including absorbed and scattered power, optical force (radiation pressure), and torque -- in scattering calculations performed using the boundary-element method (BEM) [also known as the method of moments (MOM)]. Our formulas compute the quantities of interest directly from the BEM surface currents with no need ever to compute the scattered electromagnetic fields. We derive our new formulas and demonstrate their effectiveness by computing power, force, and torque in a number of example geometries. Free, open-source software implementations of our formulas are available for download online.

81 citations

Journal Article•DOI•

The Multi-Layer Multi-Configuration Time-Dependent Hartree Method for Bosons: Theory, Implementation and Applications

Lushuai Cao¹, Sven Krönke, Oriol Vendrell, Peter Schmelcher•Institutions (1)

University of Hamburg¹

16 May 2013-arXiv: Computational Physics

TL;DR: The multi-layer multi-configuration time-dependent Hartree method for bosons (ML-MCTDHB), a variational numerically exact ab initio method for studying the quantum dynamics and stationary properties of general bosonic systems, is developed.

Abstract: We develop the multi-layer multi-configuration time-dependent Hartree method for bosons (ML-MCTDHB), a variational numerically exact ab-initio method for studying the quantum dynamics and stationary properties of bosonic systems. ML-MCTDHB takes advantage of the permutation symmetry of identical bosons, which allows for investigations of the quantum dynamics from few to many-body systems. Moreover, the multi-layer feature enables ML-MCTDHB to describe mixed bosonic systems consisting of arbitrarily many species. Multi-dimensional as well as mixed-dimensional systems can be accurately and efficiently simulated via the multi-layer expansion scheme. We provide a detailed account of the underlying theory and the corresponding implementation. We also demonstrate the superior performance by applying the method to the tunneling dynamics of bosonic ensembles in a one-dimensional double well potential, where a single-species bosonic ensemble of various correlation strengths and a weakly interacting two-species bosonic ensemble are considered.

81 citations

Posted Content•

User Guide for the Discrete Dipole Approximation Code DDSCAT 7.3

Bruce T. Draine¹, Piotr J. Flatau²•Institutions (2)

Princeton University¹, Scripps Institution of Oceanography²

26 May 2013-arXiv: Computational Physics

TL;DR: DDSCAT 7.3 is an open-source Fortran-90 software package that applies the discrete dipole approximation to calculate scattering and absorption of electromagnetic waves by targets with arbitrary geometries and complex refractive index.

Abstract: DDSCAT 7.3 is an open-source Fortran-90 software package applying the discrete dipole approximation to calculate scattering and absorption of electromagnetic waves by targets with arbitrary geometries and complex refractive index. The targets may be isolated entities (e.g., dust particles), but may also be 1-d or 2-d periodic arrays of "target unit cells", allowing calculation of absorption, scattering, and electric fields around arrays of nanostructures. The theory of the DDA and its implementation in DDSCAT are presented in Draine (1988) and Draine & Flatau (1994), its extension to periodic structures in Draine & Flatau (2008), and efficient near-field calculations in Flatau & Draine (2012). DDSCAT 7.3 includes support for MPI, OpenMP, and the Intel Math Kernel Library (MKL). DDSCAT supports calculations for a variety of target geometries. Target materials may be both inhomogeneous and anisotropic. It is straightforward for the user to "import" arbitrary target geometries into the code. DDSCAT automatically calculates total cross sections for absorption and scattering and selected elements of the Mueller scattering intensity matrix for user-specified scattering directions. DDSCAT 7.3 can efficiently calculate E and B throughout a user-specified volume containing the target. This User Guide explains how to use DDSCAT 7.3 to carry out electromagnetic scattering calculations, including use of DDPOSTPROCESS, a Fortran-90 code to perform calculations with E and B at user-selected locations near the target. A number of changes have been made since the last release, DDSCAT 7.2.

72 citations

Journal Article•DOI•

LaBonte's method revisited: An effective steepest descent method for micromagnetic energy minimization

Lukas Exl, Simon Bance, Franz Reichel, Thomas Schrefl, Hans Peter Stimming, Norbert J. Mauser

23 Sep 2013-arXiv: Computational Physics

TL;DR: For the computation of static hysteresis loops the steepest descent minimizer is faster than a Landau-Lifshitz micromagnetic solver by more than a factor of two.

Abstract: We present a steepest descent energy minimization scheme for micromagnetics. The method searches on a curve that lies on the sphere which keeps the magnitude of the magnetization vector constant. The step size is selected according to a modified Barzilai-Borwein method. Standard linear tetrahedral finite elements are used for space discretization. For the computation of static hysteresis loops the steepest descent minimizer is faster than a Landau-Lifshitz micromagnetic solver by more than a factor of two. The speed up on a graphic processor is 48 as compared to the fastest single-core CPU implementation.

66 citations

Journal Article•DOI•

High Order Lagrangian ADER-WENO Schemes on Unstructured Meshes - Application of Several Node Solvers to Hydrodynamics and Magnetohydrodynamics

Walter Boscheri, Michael Dumbser, Dinshaw S. Balsara

27 Oct 2013-arXiv: Computational Physics

TL;DR: A class of high-order accurate cell-centered arbitrary Lagrangian-Eulerian (ALE) one-step ADER weighted essentially non-oscillatory (WENO) finite volume schemes is presented for the solution of nonlinear hyperbolic conservation laws on two-dimensional unstructured triangular meshes.

Abstract: In this paper we present a class of high order accurate cell-centered Arbitrary-Lagrangian-Eulerian (ALE) one-step ADER-WENO finite volume schemes for the solution of nonlinear hyperbolic conservation laws on two-dimensional unstructured triangular meshes. High order of accuracy in space is achieved by a WENO reconstruction algorithm, while a local space-time Galerkin predictor allows the schemes to be high order accurate also in time by using an element-local weak formulation of the governing PDE on moving meshes. The mesh motion can be computed by choosing among three different node solvers, which are for the first time compared with each other in this article: the node velocity may be obtained i) either as an arithmetic average among the states surrounding the node, or, ii) as a solution of multiple one-dimensional half-Riemann problems around a vertex, or, iii) by solving approximately a multidimensional Riemann problem around each vertex of the mesh using the genuinely multidimensional HLL Riemann solver. Once the vertex velocity and thus the new node location have been determined by the node solver, the local mesh motion is then constructed by straight edges connecting the vertex positions at the old time level with the new ones at the next time level. If necessary, a rezoning step can be introduced here to overcome mesh tangling or highly deformed elements. We apply the high order algorithm presented in this paper to the Euler equations of compressible gas dynamics as well as to the ideal classical and relativistic MHD equations. We show numerical convergence results up to fifth order of accuracy in space and time together with some classical numerical test problems for each hyperbolic system under consideration.

62 citations

Journal Article•DOI•

Sampling exactly from the normal distribution

Charles F. F. Karney¹•Institutions (1)

SRI International¹

25 Mar 2013-arXiv: Computational Physics

TL;DR: An algorithm is given for sampling exactly from the normal distribution; it reads some number of uniformly distributed random digits in a given base and generates an initial portion of the representation of a normal deviate in the same base, with a mean cost that scales linearly in the precision.

Abstract: An algorithm for sampling exactly from the normal distribution is given. The algorithm reads some number of uniformly distributed random digits in a given base and generates an initial portion of the representation of a normal deviate in the same base. Thereafter, uniform random digits are copied directly into the representation of the normal deviate. Thus, in contrast to existing methods, it is possible to generate normal deviates exactly rounded to any precision with a mean cost that scales linearly in the precision. The method performs no extended precision arithmetic, calls no transcendental functions, and, indeed, uses no floating point arithmetic whatsoever; it uses only simple integer operations. It can easily be adapted to sample exactly from the discrete normal distribution whose parameters are rational numbers.

54 citations

Journal Article•DOI•

Real-space density functional theory on graphical processing units: computational approach and comparison to Gaussian basis set methods

Xavier Andrade¹, Alán Aspuru-Guzik¹•Institutions (1)

Harvard University¹

12 Jun 2013-arXiv: Computational Physics

TL;DR: Results for current-generation GPUs from AMD and Nvidia show that the scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code.

Abstract: We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.

Journal Article•DOI•

Stochastic resonance-free multiple time-step algorithm for molecular dynamics with very large time steps

Ben Leimkuhler¹, Daniel T. Margul², Mark E. Tuckerman³•Institutions (3)

University of Edinburgh¹, New York University², Courant Institute of Mathematical Sciences³

03 Jul 2013-arXiv: Computational Physics

TL;DR: This paper introduces a set of stochastic isokinetic equations of motion that are shown to be rigorously ergodic and that can be integrated using a multiple time-stepping algorithm that can be easily implemented in existing molecular dynamics codes.

Abstract: Molecular dynamics is one of the most commonly used approaches for studying the dynamics and statistical distributions of many physical, chemical, and biological systems using atomistic or coarse-grained models. It is often the case, however, that the interparticle forces drive motion on many time scales, and the efficiency of a calculation is limited by the choice of time step, which must be sufficiently small that the fastest force components are accurately integrated. Multiple time-stepping algorithms partially alleviate this inefficiency by assigning to each time scale an appropriately chosen step-size. However, such approaches are limited by resonance phenomena, wherein motion on the fastest time scales limits the step sizes associated with slower time scales. In atomistic models of biomolecular systems, for example, resonances limit the largest time step to around 5-6 fs. In this paper, we introduce a set of stochastic isokinetic equations of motion that are shown to be rigorously ergodic and that can be integrated using a multiple time-stepping algorithm that can be easily implemented in existing molecular dynamics codes. The technique is applied to a simple, illustrative problem and then to a more realistic system, namely, a flexible water model. Using this approach outer time steps as large as 100 fs are shown to be possible.
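
The basic multiple time-stepping idea of giving each time scale its own step size can be sketched with a two-level, RESPA-style integrator. This is the conventional deterministic scheme that suffers from the resonances mentioned in the abstract, not the stochastic isokinetic method the paper introduces. The split into `fast_force` and `slow_force`, the one-dimensional state, and the function name are assumptions for illustration.

```python
def respa_step(q, p, fast_force, slow_force, dt_outer, n_inner, mass=1.0):
    """One outer step: slow forces kick at dt_outer, fast forces at dt_outer / n_inner."""
    dt_inner = dt_outer / n_inner
    p += 0.5 * dt_outer * slow_force(q)      # half kick from the slow force
    for _ in range(n_inner):                 # velocity Verlet on the fast force
        p += 0.5 * dt_inner * fast_force(q)
        q += dt_inner * p / mass
        p += 0.5 * dt_inner * fast_force(q)
    p += 0.5 * dt_outer * slow_force(q)      # closing half kick from the slow force
    return q, p
```

The slow force is evaluated only twice per outer step, which is where the savings come from when it is the expensive part of the force field.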

Journal Article•DOI•

Multiphysics simulation of corona discharge induced ionic wind

Davide Cagnoni, Francesco Agostini, Thomas Christen, Carlo de Falco, Nicola Parolini, Ivica Stevanovic

27 Jun 2013-arXiv: Computational Physics

TL;DR: This work presents a numerical model for predicting the performance of ionic wind devices or electrostatic fluid accelerators, whose main benefit is the ability to accurately predict the amount of charge injected at the corona electrode.

Abstract: Ionic wind devices or electrostatic fluid accelerators are of increasing interest as tools for thermal management, in particular for semiconductor devices. In this work, we present a numerical model for predicting the performance of such devices, whose main benefit is the ability to accurately predict the amount of charge injected at the corona electrode. Our multiphysics numerical model consists of a highly nonlinear strongly coupled set of PDEs including the Navier-Stokes equations for fluid flow, Poisson's equation for electrostatic potential, charge continuity and heat transfer equations. To solve this system we employ a staggered solution algorithm that generalizes Gummel's algorithm for charge transport in semiconductors. Predictions of our simulations are validated by comparison with experimental measurements and are shown to closely match. Finally, our simulation tool is used to estimate the effectiveness of the design of an electrohydrodynamic cooling apparatus for power electronics applications.

Journal Article•DOI•

Explicitly correlated plane waves: Accelerating convergence in periodic wavefunction expansions

Andreas Grüneis¹, James J. Shepherd¹, Ali Alavi¹, David P. Tew, George H. Booth¹•Institutions (1)

University of Cambridge¹

24 Jul 2013-arXiv: Computational Physics

TL;DR: In this paper, an explicitly correlated plane wave basis for periodic wavefunction expansions at the level of second-order Møller-Plesset perturbation theory (MP2) was investigated and compared to conventional MP2 theory in a finite homogeneous electron gas model.

Abstract: We present an investigation into the use of an explicitly correlated plane wave basis for periodic wavefunction expansions at the level of second-order Møller-Plesset perturbation theory (MP2). The convergence of the electronic correlation energy with respect to the one-electron basis set is investigated and compared to conventional MP2 theory in a finite homogeneous electron gas model. In addition to the widely used Slater-type geminal correlation factor, we also derive and investigate a novel correlation factor that we term Yukawa-Coulomb. The Yukawa-Coulomb correlation factor is motivated by analytic results for two electrons in a box and allows for a further improved convergence of the correlation energies with respect to the employed basis set. We find the combination of the infinitely delocalized plane waves and local short-ranged geminals provides a complementary, and rapidly convergent basis for the description of periodic wavefunctions. We hope that this approach will expand the scope of discrete wavefunction expansions in periodic systems.

Journal Article•DOI•

Information-theoretic tools for parametrized coarse-graining of non-equilibrium extended systems

Markos A. Katsoulakis¹, Petr Plecháč²•Institutions (2)

University of Massachusetts Amherst¹, University of Delaware²

29 Apr 2013-arXiv: Computational Physics

TL;DR: This paper proposes error estimation and controlled-fidelity model reduction methods based on Path-Space Information Theory, combined with statistical parametric estimation of rates for non-equilibrium stationary processes, and proposes an asymptotically equivalent method related to maximum likelihood estimators for stochastic processes.

Abstract: In this paper we focus on the development of new methods suitable for efficient and reliable coarse-graining of non-equilibrium molecular systems. In this context, we propose error estimation and controlled-fidelity model reduction methods based on Path-Space Information Theory, and combine them with statistical parametric estimation of rates for non-equilibrium stationary processes. The approach we propose extends the applicability of existing information-based methods for deriving parametrized coarse-grained models to Non-Equilibrium systems with Stationary States (NESS). In the context of coarse-graining it allows for constructing optimal parametrized Markovian coarse-grained dynamics, by minimizing information loss (due to coarse-graining) on the path space. Furthermore, the associated path-space Fisher Information Matrix can provide confidence intervals for the corresponding parameter estimators. We demonstrate the proposed coarse-graining method in a non-equilibrium system with diffusing interacting particles, driven by out-of-equilibrium boundary conditions.

Journal Article•DOI•

Libsharp - spherical harmonic transforms revisited

Martin Reinecke¹, D. S. Seljebotn²•Institutions (2)

Max Planck Society¹, University of Oslo²

18 Mar 2013-arXiv: Computational Physics

TL;DR: Libsharp is a code library for spherical harmonic transforms (SHTs) that evolved from the libpsht library and addresses several of its shortcomings, adding MPI support for distributed memory systems and SHTs of fields with arbitrary spin, and supporting new developments in CPU instruction sets such as AVX and fused multiply-accumulate (FMA) instructions.

Abstract: We present libsharp, a code library for spherical harmonic transforms (SHTs), which evolved from the libpsht library, addressing several of its shortcomings, such as adding MPI support for distributed memory systems and SHTs of fields with arbitrary spin, but also supporting new developments in CPU instruction sets like the Advanced Vector Extensions (AVX) or fused multiply-accumulate (FMA) instructions. The library is implemented in portable C99 and provides an interface that can be easily accessed from other programming languages such as C++, Fortran, Python etc. Generally, libsharp's performance is at least on par with that of its predecessor; however, significant improvements were made to the algorithms for scalar SHTs, which are roughly twice as fast when using the same CPU capabilities. The library is available at this http URL under the terms of the GNU General Public License.

Posted Content•

HOOMD-blue: A Python package for high-performance molecular dynamics and hard particle Monte Carlo simulations

Joshua A. Anderson¹, Jens Glaser¹, Sharon C. Glotzer¹•Institutions (1)

University of Michigan¹

26 Aug 2013-arXiv: Computational Physics

TL;DR: HOOMD-blue is a particle simulation engine designed for nano- and colloidal-scale molecular dynamics and hard particle Monte Carlo simulations; it has been actively developed since March 2007 and available as open source since August 2008.

Abstract: HOOMD-blue is a particle simulation engine designed for nano- and colloidal-scale molecular dynamics and hard particle Monte Carlo simulations. It has been actively developed since March 2007 and available as open source since August 2008. HOOMD-blue is a Python package with a high performance C++/CUDA backend that we built from the ground up for GPU acceleration. The Python interface allows users to combine HOOMD-blue with other packages in the Python ecosystem to create simulation and analysis workflows. We employ software engineering practices to develop, test, maintain, and expand the code.

Journal Article•DOI•

Adaptive two-regime method: application to front propagation

Martin Robinson¹, Mark B. Flegg², Radek Erban¹•Institutions (2)

University of Oxford¹, Monash University²

22 Dec 2013-arXiv: Computational Physics

TL;DR: This paper uses the Adaptive Two-Regime Method for an in-depth study of front propagation in a stochastic reaction-diffusion system which has its mean-field model given in terms of the Fisher equation.

Abstract: The Adaptive Two-Regime Method (ATRM) is developed for hybrid (multiscale) stochastic simulation of reaction-diffusion problems. It efficiently couples detailed Brownian dynamics simulations with coarser lattice-based models. The ATRM is a generalization of the previously developed Two-Regime Method [Flegg et al, Journal of the Royal Society Interface, 2012] to multiscale problems which require a dynamic selection of regions where detailed Brownian dynamics simulation is used. Typical applications include front propagation or spatio-temporal oscillations. In this paper, the ATRM is used for an in-depth study of front propagation in a stochastic reaction-diffusion system which has its mean-field model given in terms of the Fisher equation [Fisher, Annals of Eugenics, 1937]. It exhibits a travelling reaction front which is sensitive to stochastic fluctuations at the leading edge of the wavefront. Previous studies into stochastic effects on the Fisher wave propagation speed have focused on lattice-based models, but there has been limited progress using off-lattice (Brownian dynamics) models, which suffer due to their high computational cost, particularly at the high molecular numbers that are necessary to approach the Fisher mean-field model. By modelling only the wavefront itself with the off-lattice model, it is shown that the ATRM leads to the same Fisher wave results as purely off-lattice models, but at a fraction of the computational cost. The error analysis of the ATRM is also presented for a morphogen gradient model.
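
For reference, the Fisher equation named above is usually written in the following one-dimensional form. The symbols are assumptions, since the abstract does not fix a notation: u is the normalized concentration, D the diffusion constant, and k the reaction rate. The second expression is the well-known minimal speed of travelling-wave solutions in the deterministic mean-field limit, which is the reference value against which stochastic corrections to the front speed are measured.

```latex
\frac{\partial u}{\partial t}
  = D\,\frac{\partial^{2} u}{\partial x^{2}} + k\,u\,(1-u),
\qquad
c_{\min} = 2\sqrt{kD}
```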

Journal Article•DOI•

Numerical Integration of the Extended Variable Generalized Langevin Equation with a Positive Prony Representable Memory Kernel

Andrew Baczewski¹, Stephen D. Bond•Institutions (1)

Sandia National Laboratories¹

24 Apr 2013-arXiv: Computational Physics

TL;DR: This article derives a family of extended variable integrators for the Generalized Langevin equation with a positive Prony series memory kernel using stability and error analysis and implements the corresponding numerical algorithm in the LAMMPS MD software package.

Abstract: Generalized Langevin dynamics (GLD) arise in the modeling of a number of systems, ranging from structured fluids that exhibit a viscoelastic mechanical response, to biological systems, and other media that exhibit anomalous diffusive phenomena. Molecular dynamics (MD) simulations that include GLD in conjunction with external and/or pairwise forces require the development of numerical integrators that are efficient, stable, and have known convergence properties. In this article, we derive a family of extended variable integrators for the Generalized Langevin equation (GLE) with a positive Prony series memory kernel. Using stability and error analysis, we identify a superlative choice of parameters and implement the corresponding numerical algorithm in the LAMMPS MD software package. Salient features of the algorithm include exact conservation of the first and second moments of the equilibrium velocity distribution in some important cases, stable behavior in the limit of conventional Langevin dynamics, and the use of a convolution-free formalism that obviates the need for explicit storage of the time history of particle velocities. Capability is demonstrated with respect to accuracy in numerous canonical examples, stability in certain limits, and an exemplary application in which the effect of a harmonic confining potential is mapped onto a memory kernel.
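
To make the "convolution-free" remark concrete, the block below writes out one common way of stating the GLE with a positive Prony series kernel and its extended variable form, for a single degree of freedom. The notation (coefficients c_k > 0, time scales tau_k, extended variables Z_k, Wiener processes W_k) is an assumption chosen for illustration and may differ from the parameterization used in the paper.

```latex
% GLE with memory kernel K and random force R obeying fluctuation-dissipation
m\,\dot{v}(t) = F(x(t)) - \int_{0}^{t} K(t-s)\,v(s)\,\mathrm{d}s + R(t),
\qquad
\langle R(t)\,R(s) \rangle = k_{B}T\,K(t-s)

% Positive Prony series kernel
K(t) = \sum_{k=1}^{N} \frac{c_{k}}{\tau_{k}}\, e^{-t/\tau_{k}}, \qquad t \ge 0

% Equivalent extended variable (Markovian) system
m\,\mathrm{d}v = F(x)\,\mathrm{d}t + \sum_{k=1}^{N} Z_{k}\,\mathrm{d}t,
\qquad
\mathrm{d}Z_{k} = -\frac{1}{\tau_{k}}\,Z_{k}\,\mathrm{d}t
                  - \frac{c_{k}}{\tau_{k}}\,v\,\mathrm{d}t
                  + \frac{1}{\tau_{k}}\sqrt{2\,k_{B}T\,c_{k}}\;\mathrm{d}W_{k}
```

Each Z_k carries one exponential term of the memory integral together with its share of the noise, so only N extra variables per degree of freedom are stored instead of the velocity history.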

Posted Content•

GPU peer-to-peer techniques applied to a cluster interconnect

Roberto Ammendola, Massimo Bernaschi, Andrea Biagioni, Mauro Bisson, Massimiliano Fatica¹, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Enrico Mastrostefano², Pierluigi Paolucci, Davide Rossetti, Francesco Simula, Laura Tosoratto, Piero Vicini•Institutions (2)

Nvidia¹, Sapienza University of Rome²

31 Jul 2013-arXiv: Computational Physics

TL;DR: In this paper, the authors describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect.

Abstract: Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, basically by avoiding staging to host memory, they require specific hardware features which are not available on current generation network adapters. In this paper we describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect. Besides, the current software implementation, which integrates this feature by minimally extending the RDMA programming model, is discussed, as well as some issues raised while employing it in a higher level API like MPI. Finally, the current limits of the technique are studied by analyzing the performance improvements on low-level benchmarks and on two GPU-accelerated applications, showing when and how they seem to benefit from the GPU peer-to-peer method.

Journal Article•DOI•

Fixed-node errors in quantum Monte Carlo: interplay of electron density and node nonlinearities

Kevin Rasch, Shuming Hu, Lubos Mitas

09 Oct 2013-arXiv: Computational Physics

TL;DR: In this article, the origin of the fixed-node errors in quantum Monte Carlo calculations is investigated; the key features that affect the fixed-node errors are shown to be the differences in electron density and the degree of node nonlinearity.

Abstract: We elucidate the origin of large differences (two-fold or more) in the fixed-node errors between first- and second-row systems for single-configuration trial wave functions in quantum Monte Carlo calculations. This significant difference in the fixed-node biases is studied across a set of atoms, molecules, and also Si and C solid crystals. The analysis is done over valence isoelectronic systems that share similar correlation energies, bond patterns, geometries, ground states, and symmetries. We show that the key features which affect the fixed-node errors are the differences in electron density and the degree of node nonlinearity. The findings reveal how the accuracy of quantum Monte Carlo varies across a variety of systems, provide new perspectives on the origins of the fixed-node biases in electronic structure calculations of molecular and condensed systems, and carry implications for pseudopotential constructions for heavy elements.

Journal Article•DOI•

Three-dimensional brittle fracture: configurational-force-driven crack propagation

Lukasz Kaczmarczyk, Mohaddeseh Mousavi Nezhad, Chris J. Pearce

22 Apr 2013-arXiv: Computational Physics

TL;DR: In this paper, the authors present a computational framework for quasi-static brittle fracture in 3D solids, based on the concept of configurational mechanics, consistent with Griffith's theory.

Abstract: This paper presents a computational framework for quasi-static brittle fracture in three-dimensional solids. The paper sets out the theoretical basis for determining the initiation and direction of propagating cracks based on the concept of configurational mechanics, consistent with Griffith's theory. Resolution of the propagating crack by the finite element mesh is achieved by restricting cracks to element faces and adapting the mesh to align it with the predicted crack direction. A local mesh improvement procedure is developed to maximise mesh quality in order to improve both accuracy and solution robustness and to remove the influence of the initial mesh on the direction of propagating cracks. An arc-length control technique is derived to enable the dissipative load path to be traced. A hierarchical hp-refinement strategy is implemented in order to improve both the approximation of displacements and crack geometry. The performance of this modelling approach is demonstrated on two numerical examples that qualitatively illustrate its ability to predict complex crack paths. All problems are three-dimensional, including a torsion problem that results in the accurate prediction of a doubly-curved crack.

Journal Article•DOI•

Discrete flow mapping: transport of phase space densities on triangulated surfaces

David J. Chappell, Gregor Tanner, Niels Søndergaard, Dominik Loechel

18 Mar 2013-arXiv: Computational Physics

TL;DR: This paper presents an efficient and widely applicable method, called discrete flow mapping, for solving flow equations for the energy distributions of high-frequency linear wave fields on triangulated surfaces, together with an application in structural dynamics: determining the vibro-acoustic response of a cast aluminium car body component.

Abstract: Energy distributions of high frequency linear wave fields are often modelled in terms of flow or transport equations with ray dynamics given by a Hamiltonian vector field in phase space. Applications arise in underwater and room acoustics, vibro-acoustics, seismology, electromagnetics, and quantum mechanics. Related flow problems based on general conservation laws are used, for example, in weather forecasting or molecular dynamics simulations. Solutions to these flow equations are often large scale, complex and high-dimensional, leading to formidable challenges for numerical approximation methods. This paper presents an efficient and widely applicable method, called discrete flow mapping, for solving such problems on triangulated surfaces. An application in structural dynamics - determining the vibro-acoustic response of a cast aluminium car body component - is presented.

Posted Content•

Electromagnetic Wave Source Conditions

Ardavan Oskooi, Steven G. Johnson

23 Jan 2013-arXiv: Computational Physics

TL;DR: This chapter discusses the relationships between current sources and the resulting electromagnetic waves in FDTD simulations, including the effects of dispersion and discretization, and describes a simple technique to separate incident and scattered fields in order to compensate for imperfect equivalent currents.

Abstract: This chapter discusses the relationships between current sources and the resulting electromagnetic waves in FDTD simulations. First, the "total-field/scattered-field" approach to creating incident plane waves is reviewed and seen to be a special case of the well-known principle of equivalence in electromagnetism: this can be used to construct "equivalent" current sources for any desired incident field, including waveguide modes. The effects of dispersion and discretization are discussed, and a simple technique to separate incident and scattered fields is described in order to compensate for imperfect equivalent currents. The important concept of the local density of states (LDOS) is reviewed, which elucidates the relationship between current sources and the resulting fields, including enhancement of the LDOS via mode cutoffs (Van Hove singularities) and resonant cavities (Purcell enhancement). We also address various other source techniques such as covering a wide range of frequencies and incident angles in a small number of simulations for waves incident on a periodic surface, sources to excite eigenmodes in rectangular supercells of periodic systems, moving sources, and thermal sources via a Monte Carlo/Langevin approach.

Posted Content•

Gradient type optimization methods for electronic structure calculations

Xin Zhang¹, Jinwei Zhu¹, Zaiwen Wen¹, Aihui Zhou¹•Institutions (1)

Chinese Academy of Sciences¹

13 Aug 2013-arXiv: Computational Physics

TL;DR: In this paper, the authors study gradient-based methods for solving the direct minimization problem by constructing new iterations along the gradient on the Stiefel manifold, which can outperform SCF consistently on many practically large systems.

Abstract: The density functional theory (DFT) in electronic structure calculations can be formulated as either a nonlinear eigenvalue or direct minimization problem. The most widely used approach for solving the former is the so-called self-consistent field (SCF) iteration. A common observation is that the convergence of SCF is not clear theoretically, while approaches with a convergence guarantee for solving the latter are often not competitive with SCF numerically. In this paper, we study gradient type methods for solving the direct minimization problem by constructing new iterations along the gradient on the Stiefel manifold. Global convergence (i.e., convergence to a stationary point from any initial solution) as well as the local convergence rate follow directly from the standard theory for optimization on manifolds. A major computational advantage is that the computation of linear eigenvalue problems is no longer needed. The main costs of our approaches arise from the assembling of the total energy functional and its gradient and the projection onto the manifold. These tasks are cheaper than eigenvalue computation and they are often more suitable for parallelization as long as the evaluation of the total energy functional and its gradient is efficient. Numerical results show that they can outperform SCF consistently on many practically large systems.

Journal Article•DOI•

Multiple Time Step Integrators in Ab Initio Molecular Dynamics

Nathan Luehr¹, Thomas E. Markland¹, Todd J. Martínez¹•Institutions (1)

Stanford University¹

10 Nov 2013-arXiv: Computational Physics

TL;DR: Two schemes that enable efficient time-scale separation in ab initio calculations are presented: one based on fragment decomposition and the other on range separation of the Coulomb operator in the electronic Hamiltonian.

Abstract: Multiple time-scale algorithms exploit the natural separation of time-scales in chemical systems to greatly accelerate the efficiency of molecular dynamics simulations. Although the utility of these methods in systems where the interactions are described by empirical potentials is now well established, their application to ab initio molecular dynamics calculations has been limited by difficulties associated with splitting the ab initio potential into fast and slowly varying components. Here we show that such a timescale separation is possible using two different schemes: one based on fragment decomposition and the other on range separation of the Coulomb operator in the electronic Hamiltonian. We demonstrate for both water clusters and a solvated hydroxide ion that multiple time-scale molecular dynamics allows for outer time steps of 2.5 fs, which are as large as those obtained when such schemes are applied to empirical potentials, while still allowing for bonds to be broken and reformed throughout the dynamics. This permits computational speedups of up to 4.4x, compared to standard Born-Oppenheimer ab initio molecular dynamics with a 0.5 fs time step, while maintaining the same energy conservation and accuracy.
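
The general multiple time step idea can be sketched with a reversible RESPA-style splitting on a toy system. This is a minimal illustration under stated assumptions, not the authors' fragment-decomposition or range-separation scheme: the harmonic `fast_force` and `slow_force`, the reduced units, and all function names are hypothetical stand-ins. The ratio of five inner steps per outer step mirrors a 2.5 fs outer step with an assumed 0.5 fs inner step.

```python
def fast_force(x, k_fast=1.0):
    """Toy stand-in for the rapidly varying, cheap part of the force."""
    return -k_fast * x


def slow_force(x, k_slow=0.01):
    """Toy stand-in for the slowly varying, expensive part of the force."""
    return -k_slow * x


def energy(x, v, m=1.0, k_fast=1.0, k_slow=0.01):
    """Total energy of the toy system (kinetic plus both harmonic terms)."""
    return 0.5 * m * v * v + 0.5 * (k_fast + k_slow) * x * x


def respa_step(x, v, dt_outer, n_inner, m=1.0):
    """One outer step: slow half-kick, n_inner velocity Verlet steps, slow half-kick."""
    dt_inner = dt_outer / n_inner
    v += 0.5 * dt_outer * slow_force(x) / m      # slow force evaluated once here
    for _ in range(n_inner):                     # fast force evaluated every inner step
        v += 0.5 * dt_inner * fast_force(x) / m
        x += dt_inner * v
        v += 0.5 * dt_inner * fast_force(x) / m
    v += 0.5 * dt_outer * slow_force(x) / m      # and once more at the end
    return x, v


def max_relative_energy_deviation(n_outer=2000, dt_outer=0.25, n_inner=5):
    """Run the toy trajectory and report the largest relative energy deviation."""
    x, v = 1.0, 0.0
    e0 = energy(x, v)
    worst = 0.0
    for _ in range(n_outer):
        x, v = respa_step(x, v, dt_outer, n_inner)
        worst = max(worst, abs(energy(x, v) - e0) / e0)
    return worst


if __name__ == "__main__":
    print(max_relative_energy_deviation())
```

In this arrangement the expensive slow force is evaluated once per outer step (the trailing half-kick can reuse the value needed by the next leading half-kick), which is where the savings come from when the slow part dominates the cost.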

Posted Content•

Robust Compressive Phase Retrieval via L1 Minimization With Application to Image Reconstruction

Zai Yang, Cishen Zhang, Lihua Xie

01 Feb 2013-arXiv: Computational Physics

TL;DR: For real-valued, nonnegative image reconstruction, the image of interest is shown to be an optimal solution of the formulated l1 minimization in the noise-free case, and numerical simulations demonstrate that the proposed approach is fast, accurate, and robust to measurement noise.

Abstract: Phase retrieval refers to a classical nonconvex problem of recovering a signal from its Fourier magnitude measurements. Inspired by the compressed sensing technique, signal sparsity is exploited in recent studies of phase retrieval to reduce the required number of measurements, known as compressive phase retrieval (CPR). In this paper, l1 minimization problems are formulated for CPR to exploit the signal sparsity, and alternating direction algorithms are presented for problem solving. For real-valued, nonnegative image reconstruction, the image of interest is shown to be an optimal solution of the formulated l1 minimization in the noise-free case. Numerical simulations demonstrate that the proposed approach is fast, accurate, and robust to measurement noise.

Journal Article•DOI•

First Evaluation of the CPU, GPGPU and MIC Architectures for Real Time Particle Tracking based on Hough Transform at the LHC

V. Halyo¹, Patrick LeGresley¹, Paul Lujan¹, V. Karpusenko, Andrey Vladimirov•Institutions (1)

Princeton University¹

28 Oct 2013-arXiv: Computational Physics

TL;DR: A new tracking algorithm based on the Hough transform will be evaluated for the first time on a multi-core Intel Xeon E5-2697v2 CPU, an NVIDIA Tesla K20c GPU, and an Intel Xeon Phi 7120 coprocessor.

Abstract: Recent innovations focused around parallel processing, either through systems containing multiple processors or processors containing multiple cores, hold great promise for enhancing the performance of the trigger at the LHC and extending its physics program. The flexibility of the CMS/ATLAS trigger system allows for easy integration of computational accelerators, such as NVIDIA's Tesla Graphics Processing Unit (GPU) or Intel's Xeon Phi, in the High Level Trigger. These accelerators have the potential to provide faster or more energy efficient event selection, thus opening up possibilities for new complex triggers that were not previously feasible. At the same time, it is crucial to explore the performance limits achievable on the latest generation multicore CPUs with the use of the best software optimization methods. In this article, a new tracking algorithm based on the Hough transform will be evaluated for the first time on a multi-core Intel Xeon E5-2697v2 CPU, an NVIDIA Tesla K20c GPU, and an Intel Xeon Phi 7120 coprocessor. Preliminary time performance will be presented.

Journal Article•DOI•

Generalized Taylor-Duffy Method for Efficient Evaluation of Galerkin Integrals in Boundary-Element Method Computations

M. T. Homer Reid, Steven G. Johnson, Jacob K. White

05 Dec 2013-arXiv: Computational Physics

TL;DR: The Taylor-Duffy approach is generalized to a broad class of integrands, and its efficiency is significantly improved by showing that the dimension of the final numerical integral may often be reduced by one, from 4-n to 3-n, where n is the number of common vertices between the two triangles.

Abstract: We present a generic technique, automated by computer-algebra systems and available as open-source software (scuff-em), for efficient numerical evaluation of a large family of singular and nonsingular 4-dimensional integrals over triangle-product domains, such as those arising in the boundary-element method (BEM) of computational electromagnetism. To date, practical implementation of BEM solvers has often required the aggregation of multiple disparate integral-evaluation schemes to treat all of the distinct types of integrals needed for a given BEM formulation; in contrast, our technique allows many different types of integrals to be handled by the same algorithm and the same code implementation. Our method is a significant generalization of the Taylor-Duffy approach (Taylor 2003; Duffy 1982), which was originally presented for just a single type of integrand; in addition to generalizing this technique to a broad class of integrands, we also achieve a significant improvement in its efficiency by showing how the dimension of the final numerical integral may often be reduced by one. In particular, if $n$ is the number of common vertices between the two triangles, in many cases we can reduce the dimension of the integral from $4-n$ to $3-n$, obtaining a closed-form analytical result for $n=3$ (the common-triangle case).

Journal Article•DOI•

Graphics processing units accelerated semiclassical initial value representation molecular dynamics

Dario Tamascelli¹, Francesco Saverio Dambrosio¹, Riccardo Conte², Michele Ceotto¹•Institutions (2)

University of Milan¹, Emory University²

17 Dec 2013-arXiv: Computational Physics

TL;DR: This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations, and reports a significant reduction in computational time and power consumption relative to CPUs.

Abstract: This paper presents a Graphics Processing Units (GPUs) implementation of the Semiclassical Initial Value Representation (SC-IVR) propagator for vibrational molecular spectroscopy calculations. The time-averaging formulation of the SC-IVR for power spectrum calculations is employed. Details about the GPU implementation of the semiclassical code are provided. Four molecules with an increasing number of atoms are considered and the GPU-calculated vibrational frequencies perfectly match the benchmark values. The computational time scaling of two GPUs (NVIDIA Tesla C2075 and Kepler K20) is compared with that of two CPUs (Intel Core i5 and Intel Xeon E5-2687W), and the critical issues related to the GPU implementation are discussed. The resulting reduction in computational time and power consumption is significant and semiclassical GPU calculations are shown to be environmentally friendly.
