
Showing papers in "Journal of Chemical Theory and Computation in 2020"


Journal ArticleDOI
TL;DR: The updated model presented here, ff19SB, when combined with a more accurate water model such as OPC, should have better predictive power for modeling sequence-specific behavior, protein mutations, and also rational protein design.
Abstract: Molecular dynamics (MD) simulations have become increasingly popular in studying the motions and functions of biomolecules. The accuracy of the simulation, however, is highly determined by the mole...

568 citations


Journal ArticleDOI
TL;DR: In this article, the Variational Quantum Eigensolver (VQE) technique performs separate measurements for multiple parts of the system Hamiltonian, which can be used to obtain estimates of electronic energies.
Abstract: To obtain estimates of electronic energies, the Variational Quantum Eigensolver (VQE) technique performs separate measurements for multiple parts of the system Hamiltonian. Current quantum hardware...

144 citations


Journal ArticleDOI
TL;DR: This work provides an extension of the ANI-1x model that is trained to three additional chemical elements: S, F, and Cl, and is shown to accurately predict molecular energies compared to DFT with a ~10^6 factor speedup, at a negligible slowdown relative to ANI-1x.
Abstract: Machine learning (ML) methods have become powerful, predictive tools in a wide range of applications, such as facial recognition and autonomous vehicles. In the sciences, computational chemists and physicists have been using ML for the prediction of physical phenomena, such as atomistic potential energy surfaces and reaction pathways. Transferable ML potentials, such as ANI-1x, have been developed with the goal of accurately simulating organic molecules containing the chemical elements H, C, N, and O. Here, we provide an extension of the ANI-1x model. The new model, dubbed ANI-2x, is trained to three additional chemical elements: S, F, and Cl. Additionally, ANI-2x underwent torsional refinement training to better predict molecular torsion profiles. These new features open a wide range of new applications within organic chemistry and drug development. These seven elements (H, C, N, O, F, Cl, and S) make up ∼90% of drug-like molecules. To show that these additions do not sacrifice accuracy, we have tested this model across a range of organic molecules and applications, including the COMP6 benchmark, dihedral rotations, conformer scoring, and nonbonded interactions. ANI-2x is shown to accurately predict molecular energies compared to density functional theory with a ∼10^6 factor speedup and a negligible slowdown compared to ANI-1x and shows subchemical accuracy across most of the COMP6 benchmark. The resulting model is a valuable tool for drug development which can potentially replace both quantum calculations and classical force fields for a myriad of applications.

139 citations


Journal ArticleDOI
TL;DR: The problem of finding the minimum number of fully commuting groups of terms covering the whole Hamiltonian is found to be equivalent to the minimum clique cover problem for a graph representing Hamiltonian terms as vertices and commutativity between them as edges.
Abstract: The Variational Quantum Eigensolver approach to the electronic structure problem on a quantum computer involves measurement of the Hamiltonian expectation value. Formally, quantum mechanics allows ...

132 citations
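As an illustration of the grouping problem described in this entry, the sketch below partitions Pauli strings into fully commuting groups with a simple greedy pass. This is only a heuristic for the clique cover and is not claimed to be the paper's algorithm; the function names and the example terms are hypothetical.

```python
def commute(p, q):
    """Two Pauli strings commute iff they hold different non-identity
    Paulis on an even number of qubit positions."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def greedy_commuting_groups(terms):
    """Greedy heuristic: put each term into the first group whose members
    all commute with it, opening a new group when none fits.
    The result is a valid clique cover, not necessarily a minimum one."""
    groups = []
    for term in terms:
        for group in groups:
            if all(commute(term, member) for member in group):
                group.append(term)
                break
        else:
            groups.append([term])
    return groups


# Hypothetical two-qubit Hamiltonian terms (coefficients omitted).
terms = ["ZZ", "XX", "YY", "ZI", "IZ", "XI"]
groups = greedy_commuting_groups(terms)
for group in groups:
    print(group)
```

Each resulting group can in principle be measured together, so fewer groups means fewer separate measurement settings; the greedy result depends on the order in which terms are visited.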


Journal ArticleDOI
TL;DR: This Letter establishes the need to define not only the operators present in the ansatz but also the order in which they appear, which is necessary for adhering to the quantum chemical notion of a "model chemistry", in addition to the general importance of scientific reproducibility.
Abstract: The variational quantum eigensolver (VQE) has emerged as one of the most promising near-term quantum algorithms that can be used to simulate many-body systems such as molecular electronic structures. Serving as an attractive ansatz in the VQE algorithm, unitary coupled cluster (UCC) theory has seen a renewed interest in recent literature. However, unlike the original classical UCC theory, implementation on a quantum computer requires a finite-order Suzuki-Trotter decomposition to separate the exponentials of the large sum of Pauli operators. While previous literature has recognized the nonuniqueness of different orderings of the operators in the Trotterized form of UCC methods, the question of whether or not different orderings matter at the chemical scale has not been addressed. In this Letter, we explore the effect of operator ordering on the Trotterized UCCSD ansatz, as well as the much more compact k-UpCCGSD ansatz recently proposed by Lee et al. [J. Chem. Theory Comput., 2019, 15, 311. arXiv, 2019, quant-ph:1909.09114. https://arxiv.org/abs/1909.09114]. We observe a significant, system-dependent variation in the energies of Trotterizations with different operator orderings. The energy variations occur on a chemical scale, sometimes on the order of hundreds of kcal/mol. This Letter establishes the need to define not only the operators present in the ansatz but also the order in which they appear. This is necessary for adhering to the quantum chemical notion of a "model chemistry", in addition to the general importance of scientific reproducibility. As a final note, we suggest a useful strategy to select out of the combinatorial number of possibilities, a single well-defined and effective ordering of the operators.

128 citations
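The ordering effect described in this entry can be seen in a one-qubit toy model. The sketch below is not a UCC calculation: the Hamiltonian, the two Pauli generators, and the angles are all assumptions chosen for illustration. It applies the same two exponentials in both orders and compares the resulting energies.

```python
import math

# 2x2 Pauli matrices as nested lists (pure Python, no external packages).
IDENTITY = [[1, 0], [0, 1]]
PAULI_X = [[0, 1], [1, 0]]
PAULI_Y = [[0, -1j], [1j, 0]]


def rotation(theta, pauli):
    """exp(-i*theta*P) = cos(theta)*I - i*sin(theta)*P, valid because P*P = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * IDENTITY[r][k] - 1j * s * pauli[r][k] for k in range(2)]
            for r in range(2)]


def apply(matrix, vec):
    """Matrix-vector product for a 2x2 matrix."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


def energy(hamiltonian, vec):
    """Expectation value <v|H|v> for a normalized state v."""
    hv = apply(hamiltonian, vec)
    return (vec[0].conjugate() * hv[0] + vec[1].conjugate() * hv[1]).real


# Toy Hamiltonian H = Z + 0.5 X and fixed angles (illustrative values only).
H = [[1.0, 0.5], [0.5, -1.0]]
a, b = 0.3, 0.7
reference = [1 + 0j, 0j]

# Same operators and same parameters, applied in the two possible orders.
state_y_first = apply(rotation(a, PAULI_X), apply(rotation(b, PAULI_Y), reference))
state_x_first = apply(rotation(b, PAULI_Y), apply(rotation(a, PAULI_X), reference))

energy_y_first = energy(H, state_y_first)
energy_x_first = energy(H, state_x_first)
print(energy_y_first, energy_x_first, energy_y_first - energy_x_first)
```

Because X and Y do not commute, the two products are different unitaries, so at fixed parameters the energies differ. In a variational setting the parameters would be re-optimized for each ordering, which this sketch does not attempt.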


Journal ArticleDOI
TL;DR: The present contribution gathers a large, diverse and accurate set of more than 200 highly-accurate transition energies for states of various natures (valence, Rydberg, singlet, triplet, n → π*, π → π*, ...) to benchmark a series of popular methods for excited state calculations.
Abstract: Following our previous work focussing on compounds containing up to 3 non-hydrogen atoms [J. Chem. Theory Comput. 14 (2018) 4360-4379], we present here highly-accurate vertical transition energies obtained for 27 molecules encompassing 4, 5, and 6 non-hydrogen atoms. To obtain these energies, we use equation-of-motion coupled cluster theory up to the highest technically possible excitation order for these systems (CC3, EOM-CCSDT, and EOM-CCSDTQ), selected configuration interaction (SCI) calculations (with tens of millions of determinants in the reference space), as well as the multiconfigurational n-electron valence state perturbation theory (NEVPT2) method. All these approaches are applied in combination with diffuse-containing atomic basis sets. For all transitions, we report at least CC3/aug-cc-pVQZ vertical excitation energies as well as CC3/aug-cc-pVTZ oscillator strengths for each dipole-allowed transition. We show that CC3 almost systematically delivers transition energies in agreement with higher-level methods with a typical deviation of ±0.04 eV, except for transitions with a dominant double excitation character where the error is much larger. The present contribution gathers a large, diverse and accurate set of more than 200 highly-accurate transition energies for states of various natures (valence, Rydberg, singlet, triplet, n → π*, π → π*, ...). We use this series of theoretical best estimates to benchmark a series of popular methods for excited state calculations: CIS(D), ADC(2), CC2, STEOM-CCSD, EOM-CCSD, CCSDR(3), CCSDT-3, CC3, as well as NEVPT2. The results of these benchmarks are compared to the available literature data.

117 citations


Journal ArticleDOI
TL;DR: The results suggest that orbital optimized excited state DFT methods can be used to push past the limitations of TDDFT to doubly excited, charge-transfer or Rydberg states, making them a useful tool for the practical quantum chemist's toolbox for studying excited states in large systems.
Abstract: We present a general approach to converge excited state solutions to any quantum chemistry orbital optimization process, without the risk of variational collapse. The resulting square gradient minimization (SGM) approach only requires analytic energy/Lagrangian orbital gradients and merely costs 3 times as much as ground state orbital optimization (per iteration), when implemented via a finite difference approach. SGM is applied to both single determinant ΔSCF and spin-purified restricted open-shell Kohn-Sham (ROKS) approaches to study the accuracy of orbital optimized DFT excited states. It is found that SGM can converge challenging states where the maximum overlap method (MOM) or analogues either collapse to the ground state or fail to converge. We also report that ΔSCF/ROKS predict highly accurate excitation energies for doubly excited states (which are inaccessible via TDDFT). Singly excited states obtained via ROKS are also found to be quite accurate, especially for Rydberg states that frustrate (semi)local TDDFT. Our results suggest that orbital optimized excited state DFT methods can be used to push past the limitations of TDDFT to doubly excited, charge-transfer, or Rydberg states, making them a useful tool for the practical quantum chemist's toolbox for studying excited states in large systems.

116 citations
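The square gradient idea in this entry can be sketched on a toy two-variable "energy" with a saddle point at the origin. Everything here (the function, the step size, the finite-difference displacement `lam`) is an assumption for illustration and not the authors' implementation. The gradient of Δ = |∇E|² equals 2·Hessian·∇E, which a central finite difference along ∇E approximates using three gradient evaluations per step, consistent with the per-iteration cost quoted in the abstract.

```python
def grad_energy(p):
    """Analytic gradient of the toy energy E(x, y) = x^2 - y^2 + 0.1*y^4.
    E has a saddle point at (0, 0) and minima at y = +/- sqrt(5)."""
    x, y = p
    return (2.0 * x, -2.0 * y + 0.4 * y ** 3)


def grad_square_gradient(p, lam=1e-3):
    """Finite-difference gradient of Delta = |grad E|^2:
    grad Delta ~= [g(p + lam*g) - g(p - lam*g)] / lam, with g = grad E(p)."""
    g = grad_energy(p)
    plus = grad_energy((p[0] + lam * g[0], p[1] + lam * g[1]))
    minus = grad_energy((p[0] - lam * g[0], p[1] - lam * g[1]))
    return ((plus[0] - minus[0]) / lam, (plus[1] - minus[1]) / lam)


def descend(grad, start, step=0.05, iters=200):
    """Plain fixed-step gradient descent on whatever gradient is supplied."""
    p = start
    for _ in range(iters):
        d = grad(p)
        p = (p[0] - step * d[0], p[1] - step * d[1])
    return p


start = (0.5, 0.3)
sgm_point = descend(grad_square_gradient, start)  # minimizes |grad E|^2
gd_point = descend(grad_energy, start)            # minimizes E itself
print("square-gradient descent ends at", sgm_point)
print("energy descent ends at", gd_point)
```

From the same starting point, descent on the energy slides away from the saddle into a minimum, a rough analogue of variational collapse, while descent on the square gradient settles on the nearby stationary point.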


Journal ArticleDOI
TL;DR: An iterative version of the qubit coupled cluster (QCC) method to find ground electronic energies of molecules on noisy intermediate-scale quantum (NISQ) devices is proposed and an algorithm for constructing this set that scales linearly with the size of the Hamiltonian is reported.
Abstract: An iterative version of the qubit coupled cluster (QCC) method [I. G. Ryabinkin et al., J. Chem. Theory Comput. 2019, 14, 6317] is proposed. The new method seeks to find ground electronic energies ...

114 citations


Journal ArticleDOI
TL;DR: A new version of NCIPLOT, NCIPLOT4, is presented, which allows quantifying the properties of the NCI regions (volume, charge) in small and big systems in a fast manner, and enables characterization and retrieval of local information in supramolecular chemistry and biosystems at the static and dynamic levels.
Abstract: The NonCovalent Interaction index (NCI) enables identification of attractive and repulsive noncovalent interactions from promolecular densities in a fast manner. However, the approach remained up to now qualitative, only providing visual information. We present a new version of NCIPLOT, NCIPLOT4, which allows quantifying the properties of the NCI regions (volume, charge) in small and big systems in a fast manner. Examples are provided of how this new twist enables characterization and retrieval of local information in supramolecular chemistry and biosystems at the static and dynamic levels.

112 citations


Journal ArticleDOI
TL;DR: A review of the current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab-initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Abstract: Machine learning encompasses a set of tools and algorithms which are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab-initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.

110 citations


Journal ArticleDOI
TL;DR: This work shows that a useful paradigm for generating efficient selected CI/exact diagonalization algorithms is driven by fast sorting algorithms, much in the same way iterative diagonalization is based on the paradigm of matrix vector multiplication.
Abstract: Recent advances in selected configuration interaction methods have made them competitive with the most accurate techniques available, creating an increasingly powerful tool for solving quantum Hamiltonians. In this work, we build on recent advances from the adaptive sampling configuration interaction (ASCI) algorithm. We show that a useful paradigm for generating efficient selected CI/exact diagonalization algorithms is driven by fast sorting algorithms, much in the same way iterative diagonalization is based on the paradigm of matrix vector multiplication. We present several new algorithms for all parts of performing a selected CI, which include a new ASCI search, dynamic bit masking, fast orbital rotations, fast diagonal matrix elements, and residue arrays. The ASCI search algorithm can be used in several different modes, which include an integral-driven search and a coefficient-driven search. The algorithms presented here are fast and scalable, and we find that because they are built on fast sorting algorithms they are more efficient than all other approaches we considered. After introducing these techniques, we present ASCI results applied to a large range of systems and basis sets to demonstrate the types of simulations that can be practically treated at the full-CI level with modern methods and hardware, presenting double- and triple-ζ benchmark data for the G1 data set. The largest of these calculations is Si2H6, which is a simulation of 34 electrons in 152 orbitals. We also present some preliminary results for fast deterministic perturbation theory simulations that use hash functions to maintain high efficiency for treating large basis sets.

Journal ArticleDOI
TL;DR: In this article, a multireference selected quantum Krylov (MRSQK) algorithm is proposed for quantum simulation of many-body problems, which is a low-cost alternative to the quantum phase estimation algorithm that generates a target state as a linear combination of nonorthogonal Krylov basis states.
Abstract: We introduce a multireference selected quantum Krylov (MRSQK) algorithm suitable for quantum simulation of many-body problems. MRSQK is a low-cost alternative to the quantum phase estimation algorithm that generates a target state as a linear combination of non-orthogonal Krylov basis states. This basis is constructed from a set of reference states via real-time evolution; thus, avoiding the numerical optimization of parameters. An efficient algorithm for the evaluation of the off-diagonal matrix elements of the overlap and Hamiltonian matrices is discussed and a selection procedure is introduced to identify a basis of orthogonal references that ameliorates the linear dependency problem. Preliminary benchmarks on linear H6, H8, and BeH2 indicate that MRSQK can predict the energy of these systems accurately using very compact Krylov bases.

Journal ArticleDOI
TL;DR: It is shown that state-of-the-art force fields tuned to provide an accurate description of both ordered and disordered proteins can be limited in their ability to accurately describe protein-protein complexes, and an extensive reparameterization of one variant of the Amber protein force field, called DES-Amber, is presented.
Abstract: The accuracy of atomistic physics-based force fields for the simulation of biological macromolecules has typically been benchmarked experimentally using biophysical data from simple, often single-chain systems. In the case of proteins, the careful refinement of force field parameters associated with torsion-angle potentials and the use of improved water models have enabled a great deal of progress toward the highly accurate simulation of such monomeric systems in both folded and, more recently, disordered states. In living organisms, however, proteins constantly interact with other macromolecules, such as proteins and nucleic acids, and these interactions are often essential for proper biological function. Here, we show that state-of-the-art force fields tuned to provide an accurate description of both ordered and disordered proteins can be limited in their ability to accurately describe protein-protein complexes. This observation prompted us to perform an extensive reparameterization of one variant of the Amber protein force field. Our objective involved refitting not only the parameters associated with torsion-angle potentials but also the parameters used to model nonbonded interactions, the specification of which is expected to be central to the accurate description of multicomponent systems. The resulting force field, which we call DES-Amber, allows for more accurate simulations of protein-protein complexes, while still providing a state-of-the-art description of both ordered and disordered single-chain proteins. Despite the improvements, calculated protein-protein association free energies still appear to deviate substantially from experiment, a result suggesting that more fundamental changes to the force field, such as the explicit treatment of polarization effects, may simultaneously further improve the modeling of single-chain proteins and protein-protein complexes.

Journal ArticleDOI
TL;DR: High-dimensional neural network potentials can be employed to automatically generate the potential energy surface of finite-sized clusters at coupled cluster accuracy, namely CCSD(T*)-F12a/aug-cc-pVTZ, and this automated process will allow one to tackle finite systems well beyond the present case.
Abstract: Highly accurate potential energy surfaces are of key interest for the detailed understanding and predictive modeling of chemical systems. In recent years, several new types of force fields, which are based on machine learning algorithms and fitted to ab initio reference calculations, have been introduced to meet this requirement. Here, we show how high-dimensional neural network potentials can be employed to automatically generate the potential energy surface of finite sized clusters at coupled cluster accuracy, namely CCSD(T*)-F12a/aug-cc-pVTZ. The developed automated procedure utilizes the established intrinsic properties of the model such that the configurations for the training set are selected in an unbiased and efficient way to minimize the computational effort of expensive reference calculations. These ideas are applied to protonated water clusters from the hydronium cation, H3O+, up to the tetramer, H9O4+, and lead to a single potential energy surface that describes all these systems at essentially converged coupled cluster accuracy with a fitting error of 0.06 kJ/mol per atom. The fit is validated in detail for all clusters up to the tetramer and yields reliable results not only for stationary points but also for reaction pathways and intermediate configurations as well as different sampling techniques. Per design, the neural network potentials (NNPs) constructed in this fashion can handle very different conditions including the quantum nature of the nuclei and enhanced sampling techniques covering very low as well as high temperatures. This enables fast and exhaustive exploration of the targeted protonated water clusters with essentially converged interactions. In addition, the automated process will allow one to tackle finite systems much beyond the present case.

Journal ArticleDOI
TL;DR: In this article, the adaptive sampling configuration interaction (ASCI) method is used as an approximate full CI solver in the active space, allowing CASSCF-like calculations in large active spaces, and methods that avoid spurious local extrema are highlighted as a practical solution to the orbital optimization problem.
Abstract: The complete active space self-consistent field (CASSCF) method is the principal approach employed for studying strongly correlated systems. However, exact CASSCF can only be performed on small active spaces of ∼20 electrons in ∼20 orbitals due to exponential growth in the computational cost. We show that employing the Adaptive Sampling Configuration Interaction (ASCI) method as an approximate Full CI solver in the active space allows CASSCF-like calculations within chemical accuracy (<1 kcal/mol for relative energies) in active spaces with more than ∼50 active electrons in ∼50 active orbitals, significantly increasing the sizes of systems amenable to accurate multiconfigurational treatment. The main challenge with using any selected CI-based approximate CASSCF is the orbital optimization problem; they tend to exhibit large numbers of local minima in orbital space due to their lack of invariance to active-active rotations (in addition to the local minima that exist in exact CASSCF). We highlight methods that can avoid spurious local extrema as a practical solution to the orbital optimization problem. We employ ASCI-SCF to demonstrate a lack of polyradical character in moderately sized periacenes with up to 52 correlated electrons and compare against heat-bath CI on an iron porphyrin system with more than 40 correlated electrons.

Journal ArticleDOI
TL;DR: LiGaMD provides a powerful enhanced sampling approach for characterizing ligand binding thermodynamics and kinetics simultaneously, which is expected to facilitate computer-aided drug design.
Abstract: Calculations of ligand binding free energies and kinetic rates are important for drug design. However, such tasks have proven challenging in computational chemistry and biophysics. To address this challenge, we have developed a new computational method, ligand Gaussian accelerated molecular dynamics (LiGaMD), which selectively boosts the ligand nonbonded interaction potential energy based on the Gaussian accelerated molecular dynamics (GaMD) enhanced sampling technique. Another boost potential could be applied to the remaining potential energy of the entire system in a dual-boost algorithm (LiGaMD_Dual) to facilitate ligand binding. LiGaMD has been demonstrated on host-guest and protein-ligand binding model systems. Repetitive guest binding and unbinding in the β-cyclodextrin host were observed in hundreds-of-nanosecond LiGaMD_Dual simulations. The calculated guest binding free energies agreed excellently with experimental data with <1.0 kcal/mol errors. Compared with converged microsecond-time scale conventional molecular dynamics simulations, the sampling errors of LiGaMD_Dual simulations were also <1.0 kcal/mol. Accelerations of ligand kinetic rate constants in LiGaMD simulations were properly estimated using Kramers' rate theory. Furthermore, LiGaMD allowed us to capture repetitive dissociation and binding of the benzamidine inhibitor in trypsin within 1 μs simulations. The calculated ligand binding free energy and kinetic rate constants compared well with the experimental data. In summary, LiGaMD provides a powerful enhanced sampling approach for characterizing ligand binding thermodynamics and kinetics simultaneously, which is expected to facilitate computer-aided drug design.
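For orientation, the harmonic boost potential used in the GaMD literature can be written in a few lines. The abstract does not spell out the formula, so the form below (ΔV = ½·k·(E − V)² when V < E, zero otherwise) is the standard GaMD expression as we understand it, and the function name and numerical values are hypothetical. Per the abstract, LiGaMD applies such a boost selectively to the ligand nonbonded interaction energy.

```python
def gamd_boost(v, e_threshold, k):
    """Harmonic GaMD-style boost: added only when the potential energy v
    lies below the threshold energy e_threshold.
    v and e_threshold share the same energy unit; k is the harmonic force
    constant in inverse energy units. All values here are illustrative."""
    if v < e_threshold:
        return 0.5 * k * (e_threshold - v) ** 2
    return 0.0


# Illustrative energies in kcal/mol (not taken from the paper).
threshold = 0.0
force_constant = 0.1
for potential in (-10.0, -5.0, -1.0, 2.0):
    boost = gamd_boost(potential, threshold, force_constant)
    print(f"V = {potential:6.1f}  boost = {boost:5.2f}  V* = {potential + boost:6.2f}")
```

Deep wells receive the largest boost and the surface above the threshold is untouched, which flattens barriers while keeping the boost distribution smooth enough for reweighting.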

Journal ArticleDOI
TL;DR: A benchmark dataset consisting of 185 protein-peptide complexes with peptide length ranging from 5 to 20 residues was employed to evaluate the performance of fourteen docking programs, and a new evaluation parameter, named IL_RMSD, was proposed to measure the docking accuracy.
Abstract: A large number of protein-protein interactions (PPIs) are mediated by the interactions between proteins and peptide segments binding partners, and therefore determination of protein-peptide interactions (PpIs) is quite crucial to elucidate important biological processes and design peptides or peptidomimetic drugs that can modulate PPIs. Nowadays, as a powerful computation tool, molecular docking has been widely utilized to predict the binding structures of protein-peptide complexes. However, although a number of docking programs have been available, the systematic study on the assessment of their performance for PpIs has never been reported. In this study, a benchmark data set called PepSet consisting of 185 protein-peptide complexes with peptide length ranging from 5 to 20 residues was employed to evaluate the performance of 14 docking programs, including three protein-protein docking programs (ZDOCK, FRODOCK, and HawkDock), three small molecule docking programs (GOLD, Surflex-Dock, and AutoDock Vina), and eight protein-peptide docking programs (GalaxyPepDock, MDockPeP, HPEPDOCK, CABS-dock, pepATTRACT, DINC, AutoDock CrankPep (ADCP), and HADDOCK peptide docking). A new evaluation parameter, named IL_RMSD, was proposed to measure the docking accuracy with fnat (the fraction of native contacts). In global docking, HPEPDOCK performs the best for the entire data set and yields the success rates of 4.3%, 24.3%, and 55.7% at the top 1, 10, and 100 levels, respectively. In local docking, overall, ADCP achieves the best predictions and reaches the success rates of 11.9%, 37.3%, and 70.3% at the top 1, 10, and 100 levels, respectively. It is expected that our work can provide some helpful insights into the selection and development of improved docking programs for PpIs. The benchmark data set is freely available at http://cadd.zju.edu.cn/pepset/.

Journal ArticleDOI
TL;DR: In this paper, the second quantization representation of spatial symmetries is developed and transformed to its qubit operator representation, which is then used to reduce the number of qubits required for simulating molecules on quantum computers.
Abstract: Simulating molecules is believed to be one of the early stage applications for quantum computers. Current state-of-the-art quantum computers are limited in size and coherence; therefore, optimizing resources to execute quantum algorithms is crucial. In this work, we develop the second quantization representation of spatial symmetries, which are then transformed to their qubit operator representation. These qubit operator representations are used to reduce the number of qubits required for simulating molecules. We present our results for various molecules and elucidate a formal connection of this work with a previous technique that analyzed generic Z2 Pauli symmetries.

Journal ArticleDOI
TL;DR: In this paper, an efficient quantum embedding framework for realistic ab initio density matrix embedding theory (DMET) calculations in solids is described, and the choice of orbitals and mapping to a lattice, treatment of the virtual space and bath truncation are discussed.
Abstract: We describe an efficient quantum embedding framework for realistic ab initio density matrix embedding theory (DMET) calculations in solids. We discuss in detail the choice of orbitals and mapping to a lattice, treatment of the virtual space and bath truncation, and the lattice-to-embedded integral transformation. We apply DMET in this ab initio framework to a hexagonal boron nitride monolayer, crystalline silicon, and nickel monoxide in the antiferromagnetic phase, using large embedded clusters with up to 300 embedding orbitals. We demonstrate our formulation of ab initio DMET in the computation of ground-state properties such as the total energy, equation of state, magnetic moment, and correlation functions.

Journal ArticleDOI
TL;DR: A neural network, specifically a multilayer perceptron (MLP), is trained as the first example of a machine learning model capable of predicting full adsorption isotherms of different molecules not included in the training of the model, illustrating a new philosophy of training that can be built upon.
Abstract: Tailoring the structure and chemistry of metal-organic frameworks (MOFs) enables the manipulation of their adsorption properties to suit specific energy and environmental applications. As there are millions of possible MOFs (with tens of thousands already synthesized), molecular simulation has frequently been used to rapidly evaluate the adsorption performance of a large set of MOFs. This allows subsequent experiments to focus only on a small subset of the most promising MOFs. In many instances, however, even molecular simulation becomes prohibitively time-consuming, underscoring the need for alternative screening methods, such as machine learning, to precede molecular simulation efforts. In this study, as a proof of concept, we trained a neural network, specifically a multilayer perceptron (MLP), as the first example of a machine learning model capable of predicting full adsorption isotherms of different molecules not included in the training of the model. To achieve this, we trained our MLP on "alchemical" species, represented only by variables derived from their force-field parameters, to predict the loadings of real adsorbates. Alchemical species used for training were small, near-spherical, and nonpolar, enabling the prediction of analogous real molecules relevant for chemical separations such as argon, krypton, xenon, methane, ethane, and nitrogen. MOFs were also represented by simple descriptors (e.g., geometric properties and chemical moieties). The trained model was shown to make accurate adsorption predictions for these six adsorbates in both hypothetical and existing MOFs. The MLP presented here is not expected to be applied "as is" to more complex adsorbates with properties not considered during its training. However, our results illustrate a new philosophy of training that can be built upon with the goal of predicting adsorption isotherms in not only a database of MOFs but also a database of adsorbates and over a range of relevant operating conditions.

Journal ArticleDOI
TL;DR: Molecular dynamics simulations of the quintessential choline chloride-urea mixture are reported, using a newly parameterized force field with scaled charges to account for physical properties of hydrated DES mixtures and indicate that water changes the nanostructure of solution even at very low hydration.
Abstract: Deep eutectic mixtures are a promising sustainable and diverse class of tunable solvents that hold great promise for various green chemical and technological processes. Many deep eutectic solvents (DES) are hygroscopic and find use in applications with varying extents of hydration, hence urging a profound understanding of changes in the nanostructure of DES with water content. Here, we report on molecular dynamics simulations of the quintessential choline chloride-urea mixture, using a newly parametrized force field with scaled charges to account for physical properties of hydrated DES mixtures. These simulations indicate that water changes the nanostructure of solution even at very low hydration. We present a novel approach that uses convex constrained analysis to dissect radial distribution functions into base components representing different modes of local association. Specifically, DES mixtures can be deconvoluted locally into two dominant competing nanostructures, whose relative prevalence (but not their salient structural features) change with added water over a wide concentration range, from dry up to ∼30 wt % hydration. Water is found to be associated strongly with several DES components but remarkably also forms linear bead-on-string clusters with chloride. At high water content (beyond ∼50 wt % of water), the solution changes into an aqueous electrolyte-like mixture. Finally, the structural evolution of the solution at the nanoscale with extent of hydration is echoed in the DES macroscopic material properties. These changes to structure, in turn, should prove important in the way DES acts as a solvent and to its interactions with additive components.

Journal ArticleDOI
TL;DR: A symmetry analysis of the key quantities determining transport probabilities of electrons of different spin orientations helps to identify essential constraints in the theoretical description of the CISS effect and draws an analogy with the appearance of imaginary terms in simple models of barrier scattering, which may help understanding the unusually effective long-range electron transfer in biological systems.
Abstract: The chiral-induced spin selectivity (CISS) effect, which describes the spin-filtering ability of diamagnetic structures like DNA or peptides having chiral symmetry, has emerged in the past years as...

Journal ArticleDOI
TL;DR: An alternative procedure called state-targeted energy projection (STEP) is introduced that is based on level shifting and is identical in cost to a normal SCF procedure, yet converges in numerous cases where MOM suffers variational collapse.
Abstract: Orbital optimization is crucial when using a non-Aufbau Slater determinant that involves promotion of an electron from a (nominally) occupied molecular orbital to an unoccupied one, or else ionization from a molecular orbital that lies below the highest occupied frontier molecular orbital. However, orbital relaxation of a non-Aufbau determinant risks "variational collapse" back to the Aufbau solution of the self-consistent field (SCF) equations. Algorithms such as the maximum overlap method (MOM) that are designed to avoid this collapse are not guaranteed to work, and more robust alternatives increase the cost per SCF iteration. Here, we introduce an alternative procedure called state-targeted energy projection (STEP) that is based on level shifting and is identical in cost to a normal SCF procedure, yet converges in numerous cases where MOM suffers variational collapse. Benchmark calculations on small-molecule reference data suggest that ΔSCF calculations based on STEP are an accurate way to compute both ionization and excitation energies, including core-level ionization and excited states with significant double-excitation character. For the molecule 2,4,6-trifluoroborazine, ΔSCF calculations based on STEP afford excellent agreement with experiment for both vertical and adiabatic ionization energies, the latter requiring geometry optimization of a non-Aufbau valence hole. Semiquantitative agreement with experiment is obtained for the absorption spectrum of chlorophyll a. Finally, the importance of asymptotic exchange and correlation is illustrated by application to Rydberg states using spin-scaled Møller-Plesset perturbation theory with a non-Aufbau reference determinant. Together, these results suggest that STEP offers a reliable and affordable alternative to the MOM procedure for determining non-Aufbau solutions of the SCF equations.

Journal ArticleDOI
TL;DR: The iCIPT2 scheme combines iterative configuration interaction (iCI) with selection and state-specific Epstein-Nesbet second-order perturbation theory (PT2) for the residual dynamic correlation, and its efficacy is demonstrated on benchmark examples including C2, O2, Cr2, and C6H6.
Abstract: Even when starting with a very poor initial guess, the iterative configuration interaction (iCI) approach [J. Chem. Theory Comput. 12, 1169 (2016)] for strongly correlated electrons can converge from above to full CI (FCI) very quickly by constructing and diagonalizing a very small Hamiltonian matrix at each macro/micro-iteration. However, as a direct solver of the FCI problem, iCI is computationally very expensive. The problem can be mitigated by observing that a vast number of configurations have little weight in the wave function and hence do not contribute discernibly to the correlation energy. The real questions are as follows: (a) how to identify those important configurations as early as possible in the calculation and (b) how to account for the residual contributions of those unimportant configurations. It is generally true that if a high-quality yet compact variational space can be determined for describing static correlation, a low-order treatment of the residual dynamic correlation would then be sufficient. While this is common to all selected CI schemes, the "iCI with selection" scheme presented here has the following distinctive features: (1) the full spin symmetry is always maintained by taking configuration state functions (CSF) as the many-electron basis. (2) Although the selection is performed on individual CSFs, it is orbital configurations (oCFGs) that are used as the organizing units. (3) Given a coefficient pruning-threshold Cmin (which determines the size of the variational space for static correlation), the selection of important oCFGs/CSFs is performed iteratively until convergence. (4) At each iteration, for the growth of the wave function, the first-order interacting space is decomposed into disjoint subspaces so as to reduce memory requirement on the one hand and facilitate parallelization on the other hand. (5) Upper bounds (which involve only two-electron integrals) for the interactions between doubly connected oCFG pairs are used to screen each first-order interacting subspace before the first-order coefficients of individual CSFs are evaluated. (6) Upon convergence of the static correlation for a given Cmin, dynamic correlation is estimated using the state-specific Epstein-Nesbet second-order perturbation theory (PT2). The efficacy of the iCIPT2 scheme is demonstrated numerically using benchmark examples, including C2, O2, Cr2, and C6H6.
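The final step described in this abstract, a variational energy followed by an Epstein-Nesbet PT2 estimate for everything left outside the variational space, can be illustrated on a toy matrix. The sketch below is only an illustration under stated assumptions: the function name `en_pt2_correction`, the 6x6 model Hamiltonian, and the hand-picked variational space are hypothetical, and a plain orthonormal basis stands in for the CSFs and oCFG-based selection of iCIPT2.

```python
import numpy as np

def en_pt2_correction(H, var_space):
    """Return (variational energy, Epstein-Nesbet PT2 correction).

    H         : symmetric model Hamiltonian in an orthonormal basis (assumed)
    var_space : indices of the basis functions kept variationally (assumed)
    """
    H = np.asarray(H, dtype=float)
    P = np.asarray(var_space)
    Q = np.setdiff1d(np.arange(H.shape[0]), P)  # external (perturber) space

    # Diagonalize the Hamiltonian restricted to the variational space.
    evals, evecs = np.linalg.eigh(H[np.ix_(P, P)])
    e_var, c = evals[0], evecs[:, 0]

    # Epstein-Nesbet: E2 = sum_k <k|H|Psi0>^2 / (E0 - H_kk) over external k.
    coupling = H[np.ix_(Q, P)] @ c
    e_pt2 = np.sum(coupling**2 / (e_var - np.diag(H)[Q]))
    return e_var, e_pt2

# Hypothetical toy Hamiltonian: well-separated diagonal, weak uniform coupling.
n = 6
H_toy = np.diag(np.arange(n, dtype=float)) + 0.1 * (np.ones((n, n)) - np.eye(n))

e_var, e_pt2 = en_pt2_correction(H_toy, [0, 1])
e_exact = np.linalg.eigvalsh(H_toy)[0]
print(e_var, e_var + e_pt2, e_exact)
```

On this toy problem the PT2-corrected energy lies closer to the exact lowest eigenvalue than the variational energy alone, which is the role the perturbative step plays for the unselected configurations.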

Journal ArticleDOI
TL;DR: This work presents a comprehensive description of how to postprocess the results of a COSMO calculation through to the evaluation of thermodynamic properties, and assembles a large database of COSMO files, consisting of 2261 compounds, freely available to academic and noncommercial users.
Abstract: The COSMO-SAC modeling approach has found wide application in science as well as in a range of industries due to its good predictive capabilities. While other models for liquid phases, as for examp...

Journal ArticleDOI
TL;DR: It is found that commonly used pure density functionals such as BP86, PBE, and M11-L and hybrid functionals with 20-25% Hartree-Fock (HF) exchange tend to consistently underestimate vertical excitation energies (VEEs) relative to the CC2 values, whereas hybrid density functionals with around 50% HF exchange and long-range corrected functionals such as CAM-B3LYP and ωPBE overestimate them.
Abstract: Quantum chemical calculations are important for elucidating light-capturing mechanisms in photobiological systems. The time-dependent density functional theory (TDDFT) has become a popular methodology because of its balance between accuracy and computational scaling, despite its problems in describing, for example, charge transfer states. As a step toward systematically understanding the performance of TDDFT calculations on biomolecular systems, we study here 17 commonly used density functionals, including seven long-range separated functionals, and compare the obtained results with excitation energies calculated at the approximate second order coupled-cluster theory level (CC2). The benchmarking set includes the first five singlet excited states of 11 chemical analogues of biochromophores from the green fluorescent protein, rhodopsin/bacteriorhodopsin (Rh/bR), and the photoactive yellow protein. We find that commonly used pure density functionals such as BP86, PBE, M11-L, and hybrid functionals with 20-25% of Hartree-Fock (HF) exchange (B3LYP, PBE0) have a tendency to consistently underestimate vertical excitation energies (VEEs) relative to the CC2 values, whereas hybrid density functionals with around 50% HF exchange such as BHLYP, PBE50, and M06-2X and long-range corrected functionals such as CAM-B3LYP, ωPBE, ωPBEh, ωB97X, ωB97XD, BNL, and M11 overestimate the VEEs. We observe that calculations using the CAM-B3LYP and ωPBEh functionals with 65% and 100% long-range HF exchange, respectively, lead to an overestimation of the VEEs by 0.2-0.3 eV for the benchmarking set. To reduce the systematic error, we introduce here two new empirical functionals, CAMh-B3LYP and ωhPBE0, for which we adjusted the long-range HF exchange to 50%. The introduced parameterization reduces the mean signed average (MSA) deviation to 0.07 eV and the root mean square (rms) deviation to 0.17 eV as compared to the CC2 values. 
In the present TDDFT calculations using the aug-def2-TZVP basis sets, the best-performing functionals relative to CC2 are ωhPBE0 (rms = 0.17, MSA = 0.06 eV); CAMh-B3LYP (rms = 0.16, MSA = 0.07 eV); and PBE0 (rms = 0.23, MSA = -0.14 eV). For the popular range-separated CAM-B3LYP functional, we obtain an rms value of 0.31 eV and an MSA value of 0.25 eV, which can be compared with the rms and MSA values of 0.37 and -0.31 eV, respectively, as obtained at the B3LYP level.
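The two error measures quoted in this abstract, a mean signed deviation and a root-mean-square deviation relative to reference values, can be sketched as follows. The function names and the excitation energies are hypothetical placeholders chosen for illustration; they are not data from the paper.

```python
import math

def mean_signed_deviation(calc, ref):
    """Average of (calculated - reference); the sign shows systematic over- or underestimation."""
    return sum(c - r for c, r in zip(calc, ref)) / len(ref)

def rms_deviation(calc, ref):
    """Root mean square of (calculated - reference); always >= |mean signed deviation|."""
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(calc, ref)) / len(ref))

# Hypothetical vertical excitation energies in eV (illustrative only).
cc2_ref = [3.00, 3.50, 4.00, 4.50]
tddft = [3.10, 3.40, 4.30, 4.70]

msd = mean_signed_deviation(tddft, cc2_ref)
rms = rms_deviation(tddft, cc2_ref)
print(f"mean signed deviation = {msd:.3f} eV, rms deviation = {rms:.3f} eV")
```

A positive mean signed deviation corresponds to overestimated excitation energies and a negative one to underestimated energies, which is how the abstract separates the long-range corrected functionals from B3LYP and PBE0.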

Journal ArticleDOI
TL;DR: A generic algorithm is presented to improve the accuracy of coarse-grained IDP models using a diverse set of experimental measurements and combines maximum entropy optimization and least squares regression to systematically adjust model parameters and improve the agreement between simulation and experiment.
Abstract: Intrinsically disordered proteins (IDPs) constitute a significant fraction of eukaryotic proteomes. High-resolution characterization of IDP conformational ensembles can help elucidate their roles in a wide range of biological processes but remains challenging both experimentally and computationally. Here, we present a generic algorithm to improve the accuracy of coarse-grained IDP models using a diverse set of experimental measurements. It combines maximum entropy optimization and least-squares regression to systematically adjust model parameters and improve the agreement between simulation and experiment. We successfully applied the algorithm to derive a transferable force field, which we term the maximum entropy optimized force field (MOFF), for de novo prediction of IDP structures. Statistical analysis of force field parameters reveals features of amino acid interactions not captured by potentials designed to work well for folded proteins. We anticipate its combination of efficiency and accuracy will make MOFF useful for studying the phase separation of IDPs, which drives the formation of various biological compartments.
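The maximum entropy idea mentioned in this abstract can be illustrated with a generic single-observable reweighting: find the weights closest to uniform that reproduce an experimental average. This is a minimal textbook-style sketch, not the MOFF parameterization procedure; the function name `maxent_reweight`, the bisection solver, and the sample values are all assumptions made for illustration.

```python
import numpy as np

def maxent_reweight(obs, target, lam_max=50.0, n_iter=200):
    """Maximum entropy weights w_i ~ exp(-lam * obs_i) whose average matches target.

    The reweighted average decreases monotonically with lam, so the Lagrange
    multiplier can be found by bisection on [-lam_max, lam_max].
    """
    obs = np.asarray(obs, dtype=float)
    lo, hi = -lam_max, lam_max
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        logits = -lam * obs
        w = np.exp(logits - logits.max())  # shift exponent for numerical stability
        w /= w.sum()
        if w @ obs > target:
            lo = lam   # average too high: need a larger multiplier
        else:
            hi = lam   # average too low: need a smaller multiplier
    return w, lam

# Hypothetical per-frame observable (e.g. a size measure in nm) and target value.
sim_obs = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
exp_target = 1.8

weights, lam = maxent_reweight(sim_obs, exp_target)
print(weights, lam, weights @ sim_obs)
```

In a force field optimization of the kind the abstract describes, corrections obtained this way would still have to be mapped onto model parameters, which is where the least-squares regression step enters.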

Journal ArticleDOI
TL;DR: A new variant of the so-called "cheap" composite scheme has been purposely developed for the evaluation of the interaction energy of noncovalent molecular complexes, with its various contributions being tested for a set of 15 systems using the accurate interaction energies reported in ref. 17.
Abstract: A new variant of the so-called “cheap” composite scheme has been purposely developed for the evaluation of the interaction energy of noncovalent molecular complexes, with its various contributions ...

Journal ArticleDOI
TL;DR: This approach uses a global optimization procedure to identify low-energy molecular clusters with different numbers of explicit solvent molecules and then employs the Smooth Overlap of Atomic Positions (SOAP) machine learning kernel to quantify the similarity between different low-energy solute environments.
Abstract: Molecular-level understanding and characterization of solvation environments are often needed across chemistry, biology, and engineering. Toward practical modeling of local solvation effects of any solute in any solvent, we report a static and all-quantum mechanics-based cluster-continuum approach for calculating single-ion solvation free energies. This approach uses a global optimization procedure to identify low-energy molecular clusters with different numbers of explicit solvent molecules and then employs the smooth overlap of atomic positions (SOAP) machine learning kernel to quantify the similarity between different low-energy solute environments. From these data, we use sketch maps, a nonlinear dimensionality reduction algorithm, to obtain a two-dimensional visual representation of the similarity between solute environments in differently sized microsolvated clusters. After testing this approach on different ions having charges 2+, 1+, 1-, and 2-, we find that the solvation environment around each ion usually becomes more similar hand in hand with its calculated single-ion solvation free energy. Without needing either dynamics simulations or a priori knowledge of the local solvation structure of the ions, this approach can be used to calculate solvation free energies within 5% of experimental measurements for most cases, and it should be transferable for the study of other systems where dynamics simulations are not easily carried out.

Journal ArticleDOI
TL;DR: This work adopts a hierarchical approach that builds on the "flexible-meccano" model of Bernadó et al. and can begin to exploit the massive parallelism afforded by current and future high-performance computing resources for atomic-resolution characterization of IDPs.
Abstract: Intrinsically disordered proteins (IDPs) constitute a large fraction of the human proteome and are critical in the regulation of cellular processes. A detailed understanding of the conformational d...