Showing papers in "arXiv: Computational Physics in 2018"


Journal Article
TL;DR: This work presents a fully automated approach for generating datasets for training universal ML potentials, based on active learning (AL) via Query by Committee (QBC), which uses the disagreement between an ensemble of ML potentials to infer the reliability of the ensemble's prediction.
Abstract: The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this work, we present a fully automated approach for the generation of datasets with the intent of training universal ML potentials. It is based on the concept of active learning (AL) via Query by Committee (QBC), which uses the disagreement between an ensemble of ML potentials to infer the reliability of the ensemble's prediction. QBC allows the presented AL algorithm to automatically sample regions of chemical space where the ML potential fails to accurately predict the potential energy. AL improves the overall fitness of ANAKIN-ME (ANI) deep learning potentials in rigorous test cases by mitigating human biases in deciding what new training data to use. AL also reduces the training set size to a fraction of the data required when using naive random sampling techniques. To provide validation of our AL approach we develop the COMP6 benchmark (publicly available on GitHub), which contains a diverse set of organic molecules. Through the AL process, it is shown that the AL-based potentials perform as well as the ANI-1 potential on COMP6 with only 10% of the data, and vastly outperform ANI-1 with 25% of the amount of data. Finally, we show that our proposed AL technique develops a universal ANI potential (ANI-1x) that provides accurate energy and force predictions on the entire COMP6 benchmark. This universal ML potential achieves a level of accuracy on par with the best ML potentials for single molecules or materials, while remaining applicable to the general class of organic molecules comprised of the elements CHNO.

351 citations
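
The Query by Committee idea in the entry above can be illustrated with a minimal sketch. It assumes the disagreement is measured as the standard deviation of the committee's energy predictions and that a structure is queued for new reference calculations once that spread exceeds a cutoff. The function names, the sample energies, and the threshold are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def committee_disagreement(predictions):
    """Return (mean, standard deviation) of one structure's committee predictions."""
    return mean(predictions), pstdev(predictions)


def select_for_labeling(candidates, threshold):
    """Return the names of structures whose committee spread exceeds `threshold`."""
    selected = []
    for name, predictions in candidates.items():
        _, spread = committee_disagreement(predictions)
        if spread > threshold:
            selected.append(name)
    return selected


# Hypothetical energies (arbitrary units) predicted by a 4-member committee.
candidates = {
    "conf_a": [-40.51, -40.50, -40.52, -40.51],  # members agree closely
    "conf_b": [-40.10, -39.60, -40.45, -39.90],  # members disagree
}

# Hypothetical cutoff; in practice it would be tuned to the target accuracy.
picked = select_for_labeling(candidates, threshold=0.1)
print(picked)
```

Only structures on which the committee disagrees are sent for new reference calculations, which is how such a scheme can keep the training set small compared with random sampling.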


Journal Article
TL;DR: In this paper, the authors provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists and emphasize the many natural connections between ML and statistical physics.
Abstract: Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at this https URL )

249 citations


Journal Article
TL;DR: In this paper, automatic protocols to select a number of fingerprints out of a large pool of candidates, based on the correlations that are intrinsic to the training data, have been proposed.
Abstract: Machine learning of atomic-scale properties is revolutionizing molecular modelling, making it possible to evaluate inter-atomic potentials with first-principles accuracy, at a fraction of the cost. The accuracy, speed and reliability of machine-learning potentials, however, depend strongly on the way atomic configurations are represented, i.e. the choice of descriptors used as input for the machine learning method. The raw Cartesian coordinates are typically transformed into "fingerprints", or "symmetry functions", that are designed to encode, in addition to the structure, important properties of the potential-energy surface like its invariances with respect to rotation, translation and permutation of like atoms. Here we discuss automatic protocols to select a number of fingerprints out of a large pool of candidates, based on the correlations that are intrinsic to the training data. This procedure can greatly simplify the construction of neural network potentials that strike the best balance between accuracy and computational efficiency, and has the potential to accelerate by orders of magnitude the evaluation of Gaussian Approximation Potentials based on the Smooth Overlap of Atomic Positions kernel. We present applications to the construction of neural network potentials for water and for an Al-Mg-Si alloy, and to the prediction of the formation energies of small organic molecules using Gaussian process regression.

228 citations


Posted Content
TL;DR: This work extends PINNs to fractional PINNs (fPINNs) to solve space-time fractional advection-diffusion equations (fractional ADEs), and demonstrates their accuracy and effectiveness in solving multi-dimensional forward and inverse problems with forcing terms whose values are only known at randomly scattered spatio-temporal coordinates (black-box forcing terms).
Abstract: Physics-informed neural networks (PINNs) are effective in solving integer-order partial differential equations (PDEs) based on scattered and noisy data. PINNs employ standard feedforward neural networks (NNs) with the PDEs explicitly encoded into the NN using automatic differentiation, while the sum of the mean-squared PDE-residuals and the mean-squared error in initial/boundary conditions is minimized with respect to the NN parameters. We extend PINNs to fractional PINNs (fPINNs) to solve space-time fractional advection-diffusion equations (fractional ADEs), and we demonstrate their accuracy and effectiveness in solving multi-dimensional forward and inverse problems with forcing terms whose values are only known at randomly scattered spatio-temporal coordinates (black-box forcing terms). A novel element of the fPINNs is the hybrid approach that we introduce for constructing the residual in the loss function using both automatic differentiation for the integer-order operators and numerical discretization for the fractional operators. We consider 1D time-dependent fractional ADEs and compare white-box (WB) and black-box (BB) forcing. We observe that for the BB forcing fPINNs outperform FDM. Subsequently, we consider multi-dimensional time-, space-, and space-time-fractional ADEs using the directional fractional Laplacian and we observe relative errors of $10^{-4}$. Finally, we solve several inverse problems in 1D, 2D, and 3D to identify the fractional orders, diffusion coefficients, and transport velocities and obtain accurate results even in the presence of significant noise.

177 citations


Posted Content
TL;DR: A deep learning based approach is demonstrated to build a ROM using the POD basis of canonical DNS datasets, for turbulent flow control applications and finds that a type of Recurrent Neural Network, the Long Short Term Memory (LSTM) shows attractive potential in modeling temporal dynamics of turbulence.
Abstract: Reduced Order Modeling (ROM) for engineering applications has been a major research focus in the past few decades due to the unprecedented physical insight into turbulence offered by high-fidelity CFD. The primary goal of a ROM is to model the key physics/features of a flow-field without computing the full Navier-Stokes (NS) equations. This is accomplished by projecting the high-dimensional dynamics to a low-dimensional subspace, typically utilizing dimensionality reduction techniques like Proper Orthogonal Decomposition (POD), coupled with Galerkin projection. In this work, we demonstrate a deep learning based approach to build a ROM using the POD basis of canonical DNS datasets, for turbulent flow control applications. We find that a type of Recurrent Neural Network, the Long Short Term Memory (LSTM) which has been primarily utilized for problems like speech modeling and language translation, shows attractive potential in modeling temporal dynamics of turbulence. Additionally, we introduce the Hurst Exponent as a tool to study LSTM behavior for non-stationary data, and uncover useful characteristics that may aid ROM development for a variety of applications.

155 citations


Journal Article
TL;DR: A data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks is introduced, and a hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure.
Abstract: We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.

148 citations


Journal Article
TL;DR: QMCPACK is an open source quantum Monte Carlo package for ab-initio electronic structure calculations that supports calculations of metallic and insulating solids, molecules, atoms, and some model Hamiltonians.
Abstract: QMCPACK is an open source quantum Monte Carlo package for ab-initio electronic structure calculations. It supports calculations of metallic and insulating solids, molecules, atoms, and some model Hamiltonians. Implemented real space quantum Monte Carlo algorithms include variational, diffusion, and reptation Monte Carlo. QMCPACK uses Slater-Jastrow type trial wave functions in conjunction with a sophisticated optimizer capable of optimizing tens of thousands of parameters. The orbital space auxiliary field quantum Monte Carlo method is also implemented, enabling cross validation between different highly accurate methods. The code is specifically optimized for calculations with large numbers of electrons on the latest high performance computing architectures, including multicore central processing unit (CPU) and graphical processing unit (GPU) systems. We detail the program's capabilities, outline its structure, and give examples of its use in current research calculations. The package is available at this http URL .

147 citations


Journal Article
TL;DR: In this article, an active learning procedure called Deep Potential Generator (DP-GEN) is proposed for the construction of accurate and transferable machine learning-based models of the potential energy surface (PES) for the molecular modeling of materials.
Abstract: An active learning procedure called Deep Potential Generator (DP-GEN) is proposed for the construction of accurate and transferable machine learning-based models of the potential energy surface (PES) for the molecular modeling of materials. This procedure consists of three main components: exploration, generation of accurate reference data, and training. Application to the sample systems of Al, Mg and Al-Mg alloys demonstrates that DP-GEN can produce uniformly accurate PES models with a minimal number of reference data.

139 citations


Journal Article
TL;DR: QuSpin is an open-source Python package for exact diagonalization and quantum dynamics of arbitrary boson, fermion and spin many-body systems, supporting the use of various (user-defined) symmetries in one and higher dimension and (imaginary) time evolution following a user-specified driving protocol.
Abstract: We present a major update to QuSpin, SciPostPhys.2.1.003 -- an open-source Python package for exact diagonalization and quantum dynamics of arbitrary boson, fermion and spin many-body systems, supporting the use of various (user-defined) symmetries in one and higher dimension and (imaginary) time evolution following a user-specified driving protocol. We explain how to use the new features of QuSpin using seven detailed examples of various complexity: (i) the transverse-field Ising chain and the Jordan-Wigner transformation, (ii) free particle systems: the Su-Schrieffer-Heeger (SSH) model, (iii) the many-body localized 1D Fermi-Hubbard model, (iv) the Bose-Hubbard model in a ladder geometry, (v) nonlinear (imaginary) time evolution and the Gross-Pitaevskii equation on a 1D lattice, (vi) integrability breaking and thermalizing dynamics in the translationally-invariant 2D transverse-field Ising model, and (vii) out-of-equilibrium Bose-Fermi mixtures. This easily accessible and user-friendly package can serve various purposes, including educational and cutting-edge experimental and theoretical research. The complete package documentation is available under this http URL.

112 citations


Posted Content
TL;DR: CGnets, a deep learning approach that learns coarse-grained free energy functions and can be trained by a force-matching scheme, is introduced. CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface.
Abstract: Atomistic or ab-initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multi-body terms that emerge from the dimensionality reduction.

106 citations


Journal Article
TL;DR: It is found that, by including physical parameters that are known to affect permeability in the neural network, the physics-informed CNN generates better results than a regular CNN; however, the improvements vary with the implemented heterogeneity.
Abstract: Fast prediction of permeability directly from images enabled by image recognition neural networks is a novel pore-scale modeling method that has great potential. This article presents a framework that includes (1) generation of porous media samples, (2) computation of permeability via fluid dynamics simulations, (3) training of convolutional neural networks (CNN) with simulated data, and (4) validations against simulations. Comparison of machine learning results and the ground truths suggests excellent predictive performance across a wide range of porosities and pore geometries, especially for those with dilated pores. Owing to such heterogeneity, the permeability cannot be estimated using the conventional Kozeny-Carman approach. Computational time was reduced by several orders of magnitude compared to fluid dynamic simulations. We found that, by including physical parameters that are known to affect permeability in the neural network, the physics-informed CNN generated better results than a regular CNN; however, the improvements vary with the implemented heterogeneity.
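
For context on the Kozeny-Carman approach mentioned above, here is a minimal sketch of its textbook form for a packed bed of uniform spheres, k = φ³ d² / (180 (1 − φ)²). The constant 180, the function name, and the sample porosity and grain diameter are assumptions chosen for illustration and do not come from the article.

```python
def kozeny_carman_permeability(porosity, grain_diameter, constant=180.0):
    """Estimate permeability (m^2) from porosity (dimensionless) and grain diameter (m).

    Textbook Kozeny-Carman form for a bed of uniform spheres; the default
    constant of 180 is the commonly quoted value and is an assumption here.
    """
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must lie strictly between 0 and 1")
    return porosity ** 3 * grain_diameter ** 2 / (constant * (1.0 - porosity) ** 2)


# Hypothetical sample: 40% porosity, 100-micrometre grains.
k = kozeny_carman_permeability(porosity=0.4, grain_diameter=1e-4)
print(f"{k:.3e} m^2")
```

The estimate depends only on porosity and a single length scale, which is consistent with the abstract's point that it cannot account for heterogeneous pore geometries.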

Posted Content
TL;DR: Deep Potential - Smooth Edition (DeepPot-SE), an end-to-end machine learning-based PES model, which is able to efficiently represent the PES for a wide variety of systems with the accuracy of ab initio quantum mechanics models is developed.
Abstract: Machine learning models are changing the paradigm of molecular modeling, which is a fundamental tool for material science, chemistry, and computational biology. Of particular interest is the inter-atomic potential energy surface (PES). Here we develop Deep Potential - Smooth Edition (DeepPot-SE), an end-to-end machine learning-based PES model, which is able to efficiently represent the PES for a wide variety of systems with the accuracy of ab initio quantum mechanics models. By construction, DeepPot-SE is extensive and continuously differentiable, scales linearly with system size, and preserves all the natural symmetries of the system. Further, we show that DeepPot-SE describes finite and extended systems including organic molecules, metals, semiconductors, and insulators with high fidelity.

Journal Article
TL;DR: This approach can optimize 2D-PC structures over a parameter space of a size unfeasibly large for previous optimization methods that were based solely on direct calculations and is also useful for improving other optical characteristics.
Abstract: An approach to optimizing the Q factors of two-dimensional photonic crystal (2D-PC) nanocavities based on deep learning is proposed and demonstrated. We prepare a dataset consisting of 1000 nanocavities generated by randomly displacing the positions of many air holes of a base nanocavity and their Q factors calculated by a first-principle method. We train a four-layer neural network including a convolutional layer to recognize the relationship between the air holes' displacements and the Q factors using the prepared dataset. After the training, the neural network becomes able to estimate the Q factors from the air holes' displacements with an error of 13% in standard deviation. Crucially, the trained neural network can estimate the gradient of the Q factor with respect to the air holes' displacements very quickly based on back-propagation. A nanocavity structure with an extremely high Q factor of 1.58 x 10^9 is successfully obtained by optimizing the positions of 50 air holes over ~10^6 iterations, having taken advantage of the very fast evaluation of the gradient in high-dimensional parameter space. The obtained Q factor is more than one order of magnitude higher than that of the base cavity and more than twice that of the highest Q factors reported so far for cavities with similar modal volumes. This approach can optimize 2D-PC structures over a parameter space of a size unfeasibly large for previous optimization methods based solely on direct calculations. We believe this approach is also useful for improving other optical characteristics.

Posted Content
Kim Albertsson, Piero Altoè, Dustin Anderson, John Anderson, Michael Benjamin Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian John Bevan, Wahid Bhimji, Daniele Bonacorsi, Bjorn Burkle, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Yi-fan Chen, Taylor Childers, Yann Coadou, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Andrea De Simone, Javier Duarte, Martin Erdmann, Jonas Nathanael Eschle, Amir Farbin, Matthew Feickert, Nuno Filipe Castro, Conor Fitzpatrick, Michele Floris, Alessandra Forti, Jordi Garra-Tico, J. Gemmler, Maria Girone, Paul Glaysher, Sergei Gleyzer, Vladimir Gligorov, Tobias Golling, Jonas Graw, Lindsey Gray, Dick Greenwood, Thomas J. Hacker, John T Harvey, Benedikt Hegner, Lukas Heinrich, Ulrich Heintz, Ben Hooberman, Johannes Josef Junggeburth, Michael Kagan, Meghan Kane, Konstantin Kanishchev, Przemysław Karpiński, Zahari Kassabov, Gaurav Kaul, Dorian Kcira, T. Keck, Alexei Klimentov, Jim Kowalkowski, L. Kreczko, A. B. Kurepin, Rob Kutschke, Valentin Kuznetsov, Nicolas Maximilian Köhler, Igor Lakomov, Kevin Lannon, Mario Lassnig, Antonio Limosani, Gilles Louppe, Aashrita Mangu, Pere Mato, Narain Meenakshi, H. Meinhard, Dario Menasce, Lorenzo Moneta, Seth Moortgat, Mark Neubauer, Harvey B Newman, Sydney Otten, Hans Pabst, Michela Paganini, Manfred Paulini, Gabriel Perdue, Uzziel Perez, Attilio Picazio, Jim Pivarski, Harrison Prosper, Fernanda Psihas, A. Radovic, Ryan Reece, A. Rinkevicius, Eduardo Rodrigues, Jamal Rorie, David Rousseau, Aaron G. Sauers, Steven Schramm, Ariel Schwartzman, Horst Severini, Paul Seyfert, Filip Siroky, Konstantin Skazytkin, M. D. Sokoloff, Graeme Stewart, Bob Stienen, Ian Stockdale, Giles Strong, Wei Sun, Savannah Jennifer Thais, Karen Tomko, Eli Upfal, Emanuele Usai, Andrey Ustyuzhanin, Martin Vala, J. Vasel, Sofia Vallecorsa, Mauro Verzetti, Xavier Vilasis-Cardona, Jean-Roch Vlimant, Ilija Vukotic, Sean-Jiun Wang, Gordon Watts, Michael Williams, Wenjing Wu, Stefan Wunsch, Kun Yang, Omar Zapata
TL;DR: In this article, promising future research and development areas for machine learning in particle physics are discussed, together with a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science.
Abstract: Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.

Posted Content
TL;DR: In this paper, a two-step low-rank factorization of the Hamiltonian and cluster operator, accompanied by truncation of small terms, is proposed to reduce the complexity of the computation.
Abstract: The quantum simulation of quantum chemistry is a promising application of quantum computers. However, for N molecular orbitals, the $\mathcal{O}(N^4)$ gate complexity of performing Hamiltonian and unitary Coupled Cluster Trotter steps makes simulation based on such primitives challenging. We substantially reduce the gate complexity of such primitives through a two-step low-rank factorization of the Hamiltonian and cluster operator, accompanied by truncation of small terms. Using truncations that incur errors below chemical accuracy, we are able to perform Trotter steps of the arbitrary basis electronic structure Hamiltonian with $\mathcal{O}(N^3)$ gate complexity in small simulations, which reduces to $\mathcal{O}(N^2 \log N)$ gate complexity in the asymptotic regime, while our unitary Coupled Cluster Trotter step has $\mathcal{O}(N^3)$ gate complexity as a function of increasing basis size for a given molecule. In the case of the Hamiltonian Trotter step, these circuits have $\mathcal{O}(N^2)$ depth on a linearly connected array, an improvement over the $\mathcal{O}(N^3)$ scaling assuming no truncation. As a practical example, we show that a chemically accurate Hamiltonian Trotter step for a 50 qubit molecular simulation can be carried out in the molecular orbital basis with as few as 4,000 layers of parallel nearest-neighbor two-qubit gates, consisting of fewer than 100,000 non-Clifford rotations. We also apply our algorithm to iron-sulfur clusters relevant for elucidating the mode of action of metalloenzymes.

Journal Article
TL;DR: In this paper, the authors conducted density functional theory (DFT) and classical molecular dynamics simulations to study the mechanical, thermal conductivity and stability, electronic and optical properties of single-layer B-graphdiyne.
Abstract: Most recently, boron-graphdiyne, a $\pi$-conjugated two-dimensional (2D) structure made merely from an sp carbon skeleton connected with boron atoms, was successfully realized experimentally through a bottom-up synthetic strategy. Motivated by this exciting experimental advance, we conducted density functional theory (DFT) and classical molecular dynamics simulations to study the mechanical properties, thermal conductivity and stability, and electronic and optical properties of single-layer B-graphdiyne. We particularly analyzed the application of this novel 2D material as an anode for Li, Na, Mg and Ca ion storage. Uniaxial tensile simulation results reveal that B-graphdiyne, owing to its porous structure and flexibility, can yield superstretchability. The single-layer B-graphdiyne was found to exhibit semiconducting electronic character, with a narrow band-gap of 1.15 eV based on the HSE06 prediction. It was confirmed that mechanical straining can be employed to further tune the optical absorbance and electronic band-gap of B-graphdiyne. Ab initio molecular dynamics results reveal that B-graphdiyne can withstand high temperatures, such as 2500 K. The thermal conductivity of suspended single-layer B-graphdiyne was predicted to be very low, ~2.5 W/mK at room temperature. Our first-principles results reveal the outstanding prospect of B-graphdiyne as an anode material with ultrahigh charge capacities of 808 mAh/g, 5174 mAh/g and 3557 mAh/g for Na, Ca and Li ion storage, respectively. The comprehensive insight provided by this investigation highlights the outstanding physics of B-graphdiyne nanomembranes, and suggests them as highly promising candidates for the design of novel stretchable nanoelectronics and energy storage devices.

Journal Article
TL;DR: Li and Na adatoms illustrate outstanding anodic characteristics for rechargeable storage cells as mentioned in this paper, and the insertion of Li/Na into the novel N-graphdiyne materials enhances the electrical conductivity of nanosheets.
Abstract: N-graphdiyne monolayers, a set of carbon-nitride nanosheets, have been synthesized recently through the polymerization of triazine- and pyrazine-based monomers. Since the two-dimensional nano-structures are mainly composed of light-weight nonmetallic elements including carbon and nitrogen, they might be able to provide high storage capacities for rechargeable cells. In this study, we used extensive first principle calculations such as electronic density of states, band structure, adsorption energy, open-circuit voltage, nudged-elastic band and charge analyses to investigate the application of the newly fabricated N-graphdiyne monolayers as the anode material for Li/Na/Mg ion batteries. Our calculations suggest that while Mg foreign atoms poorly interact with monolayers, Li and Na adatoms illustrate outstanding anodic characteristics for rechargeable storage cells. Electronic density of states calculations indicate that the insertion of Li/Na into the novel N-graphdiyne materials enhances the electrical conductivity of nanosheets. Adsorption energy and open-circuit voltage calculations predict that the nanosheets can provide a high storage capacity spectrum of 623-2180 mAh/g which is higher than that for most recently discovered 2D materials (e.g. phosphorene, borophane, and germanene involve Li binding capacities of 433, 504, and 369 mAh/g, respectively) and it is also significantly greater than the capacity of commercial anode materials (e.g. graphite contains a capacity of 372 mAh/g). This study provides valuable insights about the electronic characteristics of newly fabricated N-graphdiyne nanomaterials, rendering them as promising candidates to be used in the growing industry of rechargeable storage devices.
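
The gravimetric capacities quoted above can be related to stoichiometry through the standard expression C = nF / (3.6 M), where n is the number of electrons stored per formula unit of host, F is the Faraday constant, and M is the molar mass of the host in g/mol. The sketch below checks it against the 372 mAh/g graphite value cited in the abstract, assuming the usual LiC6 end composition. The function name is hypothetical and the calculation is not taken from the paper.

```python
FARADAY = 96485.33212  # Faraday constant in C/mol


def theoretical_capacity_mah_per_g(n_electrons, host_molar_mass):
    """Gravimetric capacity in mAh/g for a host of molar mass `host_molar_mass` (g/mol).

    The factor 3.6 converts coulombs to milliampere-hours (1 mAh = 3.6 C).
    """
    return n_electrons * FARADAY / (3.6 * host_molar_mass)


# Graphite, assuming one Li (one electron) per six carbon atoms (LiC6).
CARBON_MOLAR_MASS = 12.011  # g/mol
graphite = theoretical_capacity_mah_per_g(1, 6 * CARBON_MOLAR_MASS)
print(f"{graphite:.1f} mAh/g")
```

Because the capacity scales inversely with the host mass per stored ion, lightweight carbon-nitride sheets that bind several adatoms per unit cell can reach values well above graphite.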

Journal Article
TL;DR: This paper describes the core components of the software of the Belle II experiment at the KEK laboratory in Japan, which provide the foundation for the development of complex algorithms and their efficient application on large data sets.
Abstract: Modern high-energy physics (HEP) enterprises, such as the Belle II experiment at the KEK laboratory in Japan, create huge amounts of data. Sophisticated algorithms for simulation, reconstruction, visualization, and analysis are required to fully exploit the potential of these data. We describe the core components of the Belle II software that provide the foundation for the development of complex algorithms and their efficient application on large data sets.

Journal Article
TL;DR: In this article, the authors presented a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures, and showed that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets.
Abstract: Machine learning methods have shown promise in predicting molecular properties, and given sufficient training data, machine learning approaches can enable rapid high-throughput virtual screening of large libraries of compounds. Graph-based neural network architectures have emerged in recent years as the most successful approach for predictions based on molecular structure, and have consistently achieved the best performance on benchmark quantum chemical datasets. However, these models have typically required optimized 3D structural information for the molecule to achieve the highest accuracy. These 3D geometries are costly to compute for high levels of theory, limiting the applicability and practicality of machine learning methods in high-throughput screening applications. In this study, we present a new database of candidate molecules for organic photovoltaic applications, comprising approximately 91,000 unique chemical structures. Compared to existing datasets, this dataset contains substantially larger molecules (up to 200 atoms) as well as extrapolated properties for long polymer chains. We show that message-passing neural networks trained with and without 3D structural information for these molecules achieve similar accuracy, comparable to state-of-the-art methods on existing benchmark datasets. These results therefore emphasize that for larger molecules with practical applications, near-optimal prediction results can be obtained without using optimized 3D geometry as an input. We further show that learned molecular representations can be leveraged to reduce the training data required to transfer predictions to a new DFT functional.

Posted Content
TL;DR: In this article, a lattice Boltzmann (LB) model based on the Allen-Cahn phase-field theory is proposed for simulating axisymmetric multiphase flows.
Abstract: In this paper, a novel lattice Boltzmann (LB) model based on the Allen-Cahn phase-field theory is proposed for simulating axisymmetric multiphase flows. The most striking feature of the model is that it can handle multiphase flows with large density ratios, a capability unavailable in all previous axisymmetric LB models. The present model utilizes two LB evolution equations, one of which is used to track the fluid interface, while the other is adopted to solve for the hydrodynamic properties. To simulate axisymmetric multiphase flows effectively, an appropriate source term and equilibrium distribution function are introduced into the LB equation for interface tracking, and a simple and efficient forcing distribution function is carefully designed in the LB equation for hydrodynamic properties. Unlike many existing LB models, the source and forcing terms of the model arising from the axisymmetric effect include no additional gradients, and consequently, the present model contains only one non-local phase field variable, which in this regard makes it much simpler. We further conducted the Chapman-Enskog analysis to demonstrate the consistency of our present MRT-LB model with the axisymmetric Allen-Cahn equation and hydrodynamic equations. A series of numerical examples, including a static droplet, oscillation of a viscous droplet, breakup of a liquid thread, and a bubble rising in a continuous phase, are used to test the performance of the proposed model. It is found that the present model generates relatively small spurious velocities and captures interfacial dynamics with higher accuracy than the previously improved axisymmetric LB model. It is also found that our present numerical results show excellent agreement with analytical solutions or available experimental data for a wide range of density ratios, which highlights the strengths of the proposed model.

Posted Content
TL;DR: In this paper, a meshless method is presented to solve the radiative transfer equation in the even parity formulation of the discrete ordinates method in complex 2D and 3D geometries.
Abstract: A meshless method is presented to solve the radiative transfer equation in the even parity formulation of the discrete ordinates method in complex 2D and 3D geometries. Predictions for radiative heat transfer problems obtained by the proposed method are compared with reference results in order to assess the correctness of the method.

Journal Article
TL;DR: In this paper, the authors demonstrate the utility of an unsupervised machine learning tool for the detection of phase transitions in off-lattice systems using principal component analysis (PCA).
Abstract: We demonstrate the utility of an unsupervised machine learning tool for the detection of phase transitions in off-lattice systems. We focus on the application of principal component analysis (PCA) to detect the freezing transitions of two-dimensional hard-disk and three-dimensional hard-sphere systems as well as liquid-gas phase separation in a patchy colloid model. As we demonstrate, PCA autonomously discovers order-parameter-like quantities that report on phase transitions, mitigating the need for a priori construction or identification of a suitable order parameter, thus streamlining the routine analysis of phase behavior. In a companion paper, we further develop the method established here to explore the detection of phase transitions in various model systems controlled by compositional demixing, liquid crystalline ordering, and non-equilibrium active forces.
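As a rough illustration of how PCA can yield an order-parameter-like quantity, the sketch below projects per-configuration feature vectors onto the leading principal component. The data are synthetic toy descriptors (two Gaussian clouds standing in for "fluid" and "solid" configurations), not the hard-disk, hard-sphere, or patchy-colloid features used in the paper, and the names `pca_scores`, `fluid`, and `solid` are hypothetical.

```python
import numpy as np


def pca_scores(features, n_components=1):
    """Project mean-centered feature vectors onto the leading principal components.

    Returns the projected scores and the fraction of variance explained
    by each retained component.
    """
    centered = features - features.mean(axis=0)
    # SVD of the centered data matrix: the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ vt[:n_components].T, explained[:n_components]


# Assumption: synthetic stand-ins for structural descriptors of sampled configurations.
rng = np.random.default_rng(0)
n_per_state, n_features = 50, 20
fluid = rng.normal(0.0, 1.0, size=(n_per_state, n_features))
solid = rng.normal(0.0, 1.0, size=(n_per_state, n_features)) + 3.0

features = np.vstack([fluid, solid])
scores, explained = pca_scores(features)
pc1 = scores[:, 0]  # order-parameter-like coordinate, one value per configuration

print("variance explained by PC1:", explained[0])
print("mean PC1 (toy fluid):", pc1[:n_per_state].mean())
print("mean PC1 (toy solid):", pc1[n_per_state:].mean())
```

In this toy setting the two groups fall on opposite sides of zero along the first component; the overall sign of a principal component is arbitrary, so only the separation is meaningful.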

Journal Article
TL;DR: In this paper, a numerical method for the accurate and efficient simulation of strongly localized light sources, such as quantum dots, embedded in dielectric micro-optical structures is presented.
Abstract: We present a numerical method for the accurate and efficient simulation of strongly localized light sources, such as quantum dots, embedded in dielectric micro-optical structures. We apply the method in order to optimize the photon extraction efficiency of a single-photon emitter consisting of a quantum dot embedded into a multi-layer stack with further lateral structures. Furthermore, we present methods to study the robustness of the extraction efficiency with respect to fabrication errors and defects.

Posted Content
TL;DR: In this article, a unified stochastic particle ESBGK (USP-ESBGK) method was proposed by combining the molecular convection and collision effects to simulate multiscale gas flows ranging from the rarefied to the continuum regime.
Abstract: The stochastic particle method based on the Bhatnagar-Gross-Krook (BGK) or ellipsoidal statistical BGK (ESBGK) model approximates the pairwise collisions in the Boltzmann equation using a relaxation process. Therefore, it is more efficient for simulating gas flows at small Knudsen numbers than its counterparts based on the original Boltzmann equation, such as the Direct Simulation Monte Carlo (DSMC) method. However, the traditional stochastic particle BGK method decouples the molecular motions and collisions in analogy to the DSMC method, and hence its transport properties deviate from physical values as the time step increases. This defect significantly affects its computational accuracy and efficiency for the simulation of multiscale flows, especially when the transport processes in the continuum regime are important. In the present paper, we propose a unified stochastic particle ESBGK (USP-ESBGK) method by combining the molecular convection and collision effects. In the continuum regime, the proposed method can be applied with large temporal and spatial discretization steps and accurately approaches the Navier-Stokes solutions. Furthermore, it can simulate both small-scale non-equilibrium flows and large-scale continuum flows within a unified framework efficiently and accurately. The applications of the USP-ESBGK method to a variety of benchmark problems, including Couette flow, thermal Couette flow, Poiseuille flow, Sod tube flow, cavity flow, and flow through a slit, demonstrate that it is a promising tool for simulating multiscale gas flows ranging from the rarefied to the continuum regime.

Posted Content
TL;DR: This manual provides a general guide to compiling and running BIGSTICK, which comes with numerous sample input files, as well as some of the basic theory underlying the code.
Abstract: We present BIGSTICK, a flexible configuration-interaction open-source shell-model code for the many-fermion problem. Written mostly in Fortran 90 with some later extensions, BIGSTICK utilizes a factorized on-the-fly algorithm for computing many-body matrix elements, and has both MPI (distributed memory) and OpenMP (shared memory) parallelization, and can run on platforms ranging from laptops to the largest parallel supercomputers. It uses a flexible yet efficient many-body truncation scheme, and reads input files in multiple formats, allowing one to tackle both phenomenological (major valence shell space) and ab initio (the so-called no-core shell model) calculations. BIGSTICK can generate energy spectra, static and transition one-body densities, and expectation values of scalar operators. Using the built-in Lanczos algorithm one can compute transition probability distributions and decompose wave functions into components defined by group theory. This manual provides a general guide to compiling and running BIGSTICK, which comes with numerous sample input files, as well as some of the basic theory underlying the code.

Journal Article
TL;DR: In this paper, the authors study the performance of fourth-order gradient expansions of the Kohn-Sham kinetic energy density (KED) in semi-local kinetic energy functionals depending on the density-dependent variables.
Abstract: We study the performance of fourth-order gradient expansions of the kinetic energy density (KED) in semi-local kinetic energy functionals depending on the density-dependent variables. The formal fourth-order expansion is convergent for periodic systems and small molecules but does not improve over the second-order expansion (Thomas-Fermi term plus one-ninth of the von Weizsäcker term). Linear fitting of the expansion coefficients somewhat improves on the formal expansion. Tuning the fourth-order expansion coefficients allows for better reproduction of the Kohn-Sham kinetic energy density than tuning the second-order expansion coefficients alone. The possibility of a much more accurate match with the Kohn-Sham kinetic energy density by using neural networks trained using the terms of the fourth-order expansion as density-dependent variables is demonstrated. We obtain ultra-low fitting errors without overfitting. Small single-hidden-layer neural networks can provide good accuracy in separate KED fits of each compound, while for joint fitting of KEDs of multiple compounds multiple hidden layers were required to achieve good fit quality. The critical issue of data distribution is highlighted. We also show the critical role of pseudopotentials in the performance of the expansion: when the valence density decays too rapidly at the nucleus, as happens with some pseudopotentials, numerical instabilities arise.
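For orientation, the second-order expansion referred to above (Thomas-Fermi term plus one-ninth of the von Weizsäcker term) can be written out as below. This is the standard textbook form in Hartree atomic units for a spin-unpolarized density $n(\mathbf{r})$; the symbols $\tau_{\mathrm{TF}}$, $\tau_{\mathrm{W}}$, and $\tau^{(2)}$ are notation assumed here rather than taken from the paper, and a Laplacian term that integrates to zero is omitted.

```latex
\begin{align}
  \tau_{\mathrm{TF}}(\mathbf{r}) &= \frac{3}{10}\,(3\pi^{2})^{2/3}\, n(\mathbf{r})^{5/3}, \\
  \tau_{\mathrm{W}}(\mathbf{r})  &= \frac{|\nabla n(\mathbf{r})|^{2}}{8\, n(\mathbf{r})}, \\
  \tau^{(2)}(\mathbf{r})         &= \tau_{\mathrm{TF}}(\mathbf{r}) + \frac{1}{9}\,\tau_{\mathrm{W}}(\mathbf{r}).
\end{align}
```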

Journal Article
TL;DR: In this article, the authors compare the performance of two approaches in the study of homogeneous crystallization of two simple metals, Na and Al, and search for the most efficient collective variables that can be expressed as a linear combination of X-ray diffraction peak intensities.
Abstract: Several enhanced sampling methods such as umbrella sampling or metadynamics rely on the identification of an appropriate set of collective variables. Recently two methods have been proposed to alleviate the task of determining efficient collective variables. One is based on linear discriminant analysis, the other on a variational approach to conformational dynamics, and uses time-lagged independent component analysis. In this paper, we compare the performance of these two approaches in the study of the homogeneous crystallization of two simple metals. We focus on Na and Al and search for the most efficient collective variables that can be expressed as a linear combination of X-ray diffraction peak intensities. We find that the performances of the two methods are very similar. However, the method based on linear discriminant analysis, in its harmonic version, is to be preferred because it is simpler and much less computationally demanding.

Posted Content
TL;DR: In this article, the authors examined various integration schemes for the time-dependent Kohn-Sham equations and compared the performance of four different families of propagators: linear multistep, Runge-Kutta, exponential Runge-Kutta, and commutator-free Magnus schemes.
Abstract: We examine various integration schemes for the time-dependent Kohn-Sham equations. Unlike the time-dependent Schrödinger equation, this set of equations is non-linear, due to the dependence of the Hamiltonian on the electronic density. We discuss some of their exact properties, and in particular their symplectic structure. Four different families of propagators are considered, specifically the linear multistep, Runge-Kutta, exponential Runge-Kutta, and the commutator-free Magnus schemes. These have been chosen because they have been largely ignored in the past for time-dependent electronic structure calculations. The performance is analyzed in terms of cost-versus-accuracy. The clear winner, in terms of robustness, simplicity, and efficiency, is a simplified version of a fourth-order commutator-free Magnus integrator. However, in some specific cases, other propagators, such as some implicit versions of the multistep methods, may be useful.

Journal Article
TL;DR: The simulations show that turbostratic stacking of hydrated Na- and Ca-montmorillonite and hydrated montmorillonite with intercalated carbon dioxide is an energetically demanding process accompanied by an increase in the interlayer spacing, while rotational disordering of dry or nearly dry smectite systems can be energetically favorable.
Abstract: Molecular dynamics simulations using classical force fields were carried out to study energetic and structural properties of rotationally disordered clay mineral-water-CO2 systems at pressure and temperature relevant to geological carbon storage. The simulations show that turbostratic stacking of hydrated Na- and Ca-montmorillonite and hydrated montmorillonite with intercalated carbon dioxide is an energetically demanding process accompanied by an increase in the interlayer spacing. On the other hand, rotational disordering of dry or nearly dry smectite systems can be energetically favorable. The distributions of interlayer species are calculated as a function of the rotational angle between adjacent clay layers.

Journal Article
Claas Abert
TL;DR: In this article, an overview of the analytical micromagnetic model as well as its numerical implementation is given, where the main focus is put on the integration of spin-transport effects with classical micromagnetics.
Abstract: Computational micromagnetics has become an indispensable tool for the theoretical investigation of magnetic structures. Classical micromagnetics has been successfully applied to a wide range of applications including magnetic storage media, magnetic sensors, permanent magnets and more. The recent advent of spintronics devices has led to various extensions to the micromagnetic model in order to account for spin-transport effects. This article aims to give an overview of the analytical micromagnetic model as well as its numerical implementation. The main focus is put on the integration of spin-transport effects with classical micromagnetics.