Journal ArticleDOI

Convergence analysis of direct minimization and self-consistent iterations

02 Mar 2021-SIAM Journal on Matrix Analysis and Applications (Society for Industrial and Applied Mathematics)-Vol. 42, Iss: 1, pp 243-274
TL;DR: In this article, the numerical solution of subspace optimization problems, consisting of minimizing a smooth functional over the set of orthogonal projectors of fixed rank, is studied. Two simple representatives of the competing classes of methods, the damped self-consistent field (SCF) iteration and the gradient descent algorithm, are compared and their asymptotic convergence rates analyzed.
Abstract: This article is concerned with the numerical solution of subspace optimization problems, consisting of minimizing a smooth functional over the set of orthogonal projectors of fixed rank. Such probl...

Summary (2 min read)

1. Introduction

  • This problem is of interest in a number of contexts, such as matrix approximation, computer vision [1], and electronic structure theory [12, 28, 39, 40, 45, 56], the latter being the main motivation for this work.
  • In the case when E(P ) = Tr(H0P ) for a fixed symmetric matrix H0, one recovers the classical eigenvalue problem H0φi = εiφi.
  • While the convergence of several SCF and direct minimization algorithms has been analyzed from a mathematical point of view (see e.g. [16, 38, 42, 53, 60, 64] and references therein), the two approaches have not been compared in a systematic way to their knowledge.
  • Rather, in this paper, the authors aim to focus on the very simplest representative of each general strategy (SCF and direct minimization).

2. Optimization on Grassmann manifolds

  • The authors focus in this paper on the case of real symmetric matrices, but the study can be easily extended to complex hermitian matrices.
  • Other models from electronic structure can be considered, such as the discretized Hartree-Fock or Kohn-Sham models, where the energy is of the form E(P ) := Tr(H0P ) + Enl(P ) with H0 being the core Hamiltonian (representing the kinetic energy and the external potential) and Enl a nonlinear energy functional depending on the model (representing the interaction between electrons).
  • Assumption 2.2 (the energy functional E is of class C^2) is true for Hartree-Fock models.
  • To study the first-order optimality conditions, the authors start by recalling some classical results about the geometry of the manifold MN .
  • Remark 2.6 relates the operator Ω∗ to the Liouvillian.

3. Algorithms and analysis of convergence

  • The gradient descent algorithm consists in following the steepest descent direction with a fixed step β at each iteration point.
  • Note that, by the use of the retraction R and Assumption 3.1, the projection step has no influence on the convergence of the algorithm for β small.
  • Since Ω∗ + K∗ is positive definite on TP∗MN, for β small enough the spectral radius r(df(P∗)) of the derivative df(P∗) is less than 1, which concludes the proof.
  • Second, the larger ‖Ω∗‖op = εNb − ε1, the more difficult the convergence.
  • The same analysis holds true for the preconditioned SCF algorithm.

4. Numerical tests

  • The authors present here some numerical experiments to illustrate their theoretical results, explore their limits and investigate the global behavior of the algorithms.
  • The results confirm the theory the authors developed in Section 3.3: the gap has a strong influence on the convergence behavior of the SCF algorithm.
  • For larger values of α, the two alternatives of Lemma 4.1 appear.
  • In the first case, with a = 10.26 Bohrs, the simple SCF method appears to be converging for almost 20 iterations, but then diverges, until the density residual stabilizes at a positive value, as predicted in [14] for the Hartree-Fock model.
  • Note that this phenomenon is reminiscent of that observed in Figure 7, where not all the modes were fully excited, making the convergence faster than expected.

5. Conclusion

  • The authors examined the convergence of two simple representatives in the class of direct minimization and SCF algorithms.
  • Accelerated algorithms are generally found to follow the trend suggested by their theoretical results, although the authors showed that the Anderson-accelerated SCF algorithm was able to converge quickly even in the presence of a single very small gap.
  • In quantum chemistry using Gaussian basis sets to solve the Hartree-Fock model or Kohn-Sham density functional theory using hybrid functionals, the rate-limiting step is often the computation of the Fock matrix H(P ).
  • Direct minimization algorithms effectively merge the two loops of the SCF and linear eigensolver, and should therefore be more efficient.
  • Another advantage of direct minimization algorithms is their robustness: the stepsize can be chosen so as to minimize the energy, unlike the damped SCF algorithm, where choosing an appropriate damping parameter is often done empirically.


HAL Id: hal-02546060
https://hal.inria.fr/hal-02546060v2
Submitted on 27 Oct 2020
Convergence analysis of direct minimization and self-consistent iterations
Eric Cancès, Gaspard Kemlin, Antoine Levitt
To cite this version:
Eric Cancès, Gaspard Kemlin, Antoine Levitt. Convergence analysis of direct minimization and self-consistent iterations. SIAM Journal on Matrix Analysis and Applications, Society for Industrial and Applied Mathematics, 2021, 42 (1), 243-274 (32 p.). doi:10.1137/20M1332864. hal-02546060v2.

CONVERGENCE ANALYSIS OF DIRECT MINIMIZATION AND
SELF-CONSISTENT ITERATIONS
ERIC CANCÈS, GASPARD KEMLIN, ANTOINE LEVITT
Abstract. This article is concerned with the numerical solution of subspace optimization problems,
consisting of minimizing a smooth functional over the set of orthogonal projectors of fixed rank. Such
problems are encountered in particular in electronic structure calculation (Hartree-Fock and Kohn-Sham density functional theory (DFT) models). We compare from a numerical analysis perspective
two simple representatives, the damped self-consistent field (SCF) iterations and the gradient descent
algorithm, of the two classes of methods competing in the field: SCF and direct minimization methods.
We derive asymptotic rates of convergence for these algorithms and analyze their dependence on the
spectral gap and other properties of the problem. Our theoretical results are complemented by numerical
simulations on a variety of examples, from toy models with tunable parameters to realistic Kohn-Sham
computations. We also provide an example of chaotic behavior of the simple SCF iterations for a
nonquadratic functional.
1. Introduction
This paper is concerned with the convergence behavior of algorithms to solve the subspace optimization
problem

    min { E(P) | P ∈ R^{N_b × N_b}, P^2 = P = P^*, Tr(P) = N }    (1.1)

consisting of optimizing a C^2 function E : R^{N_b × N_b} → R over the set of rank-N orthogonal projectors P.
Here P^* denotes the adjoint (transpose) of P. This problem can also be reformulated as

    min { E( Σ_{i=1}^{N} φ_i φ_i^* ) | φ_i ∈ R^{N_b}, φ_i^* φ_j = δ_ij, ∀ i, j ∈ {1, ..., N} },    (1.2)

using an orthonormal basis (φ_i)_{i=1,...,N} for the subspace Ran(P). This problem is of interest in a number
of contexts, such as matrix approximation, computer vision [1], and electronic structure theory [12, 28,
39, 40, 45, 56], the latter being the main motivation for this work.
Let H(P) = ∇E(P). The first-order conditions for problem (1.1) are

    P H(P)(1 − P) = (1 − P)H(P)P = 0.

Up to an appropriate choice for the orthonormal basis (φ_i)_{i=1,...,N} of Ran(P), this yields

    H(P)φ_i = ε_i φ_i,    (1.3)

which reveals an alternative interpretation of this problem as a nonlinear eigenvector problem (to be
distinguished from nonlinear eigenvalue problems of the form A(ε)φ = 0, where A : R → R^{N_b × N_b}). In
the case when E(P) = Tr(H_0 P) for a fixed symmetric matrix H_0, one recovers the classical eigenvalue
problem H_0 φ_i = ε_i φ_i. At a minimizer of (1.1), the (ε_i)_{i=1,...,N} are the lowest eigenvalues of H_0, counting
multiplicities.
Problems of the form (1.1) are found in the Hartree-Fock and Kohn-Sham theories of electronic structure [28, 45], both approximations of the many-body Schrödinger equation. In this context, the φ_i are
(discretized) orbitals, the projector P is the density matrix, and the energy E(P) includes linear contributions from the kinetic and external potential energy of the electrons, as well as nonlinear terms arising
from electron-electron interaction. Another notable problem of this form is the nonlinear Schrödinger
or Gross-Pitaevskii equation for Bose-Einstein condensates [6], where N = 1. In all these cases, the
first-order condition (1.3) is interpreted as a self-consistent or mean-field equation: the particles behave
as independent particles in an effective Hamiltonian H(P) (also known as the Fock matrix) involving the
mean-field they create. In the rest of this paper, we will work on the formulation (1.1) without specifying
E for generality.
The minimization problem (1.1) is compact but nonconvex: there exists at least one minimizer, but
the minimizer might not be unique, and local minima might not be global ones. Solving this optimization
problem is of considerable practical interest, and algorithms for doing so date back to the early days of
quantum mechanics [26]. The first introduced and still most popular approach is the self-consistent field
(SCF) method, which, in its original version [48, 54], works as follows: if P_k is the current iterate of the
algorithm, P_{k+1} is found by solving (1.3) for the fixed matrix H(P_k):

    H(P_k) φ_i^k = ε_i^k φ_i^k,    (φ_i^k)^* φ_j^k = δ_ij,

with the ε_i^k sorted in non-decreasing order, and building P_{k+1} as

    P_{k+1} = Σ_{i=1}^{N} φ_i^k (φ_i^k)^*.

This algorithm assumes the Aufbau property, which is that at a minimum P∗ we have P∗ = Σ_{i=1}^{N} φ_i φ_i^*
with φ_i a system of orthogonal eigenvectors associated with the lowest N eigenvalues of H(P∗). This
property holds for the (spin-unconstrained) Hartree-Fock model [4] and the Gross-Pitaevskii models
without magnetic field [10], usually holds for molecular systems in the Kohn-Sham model, but does not
hold in general for Gross-Pitaevskii models with strong magnetic fields.
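To make the iteration concrete, here is a minimal NumPy sketch of this plain (undamped) SCF loop: diagonalize H(P_k) and occupy the N lowest eigenvectors (Aufbau). It is only an illustration under simplifying assumptions; hamiltonian(P) is a placeholder for the model's map P ↦ H(P) (for a linear energy E(P) = Tr(H_0 P) it simply returns the fixed matrix H_0), and the stopping test and tolerance are arbitrary choices, not those used in the paper.

    import numpy as np

    def scf(hamiltonian, P0, N, maxiter=100, tol=1e-10):
        """Plain SCF: diagonalize H(P_k) and occupy the N lowest eigenvectors (Aufbau)."""
        P = P0
        for _ in range(maxiter):
            H = hamiltonian(P)                  # effective Hamiltonian H(P_k)
            _eps, phi = np.linalg.eigh(H)       # eigenvalues in ascending order
            P_new = phi[:, :N] @ phi[:, :N].T   # P_{k+1} = sum_i phi_i^k (phi_i^k)^T
            if np.linalg.norm(P_new - P) < tol:
                return P_new
            P = P_new
        return P

    # With a linear model E(P) = Tr(H0 P), one diagonalization already gives the minimizer.
    Nb, N = 6, 2
    rng = np.random.default_rng(0)
    H0 = rng.standard_normal((Nb, Nb)); H0 = (H0 + H0.T) / 2
    P0 = np.diag([1.0] * N + [0.0] * (Nb - N))
    P = scf(lambda _: H0, P0, N)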
This basic procedure converges for systems where the nonlinearity is weak, but fails to converge
otherwise (see [14] for a comprehensive mathematical analysis of this behavior when the functional E is
a sum of a linear and a quadratic term in P , which is the case for the Hartree-Fock model). A solution
is to damp this procedure, and mix the iterates to accelerate convergence. This gives rise to a variety
of SCF algorithms, among which Broyden-like and Anderson-like mixing algorithms [33, 44, 52, 57], the
Direct Inversion in the Iterative Subspace (DIIS) algorithm [36, 50, 51], the Optimal Damping Algorithm
[13] (ODA), and the Energy-DIIS (EDIIS) algorithm combining the latter two approaches [37].
A second class of algorithms solves the minimization problem (1.1) directly. The minimization set
{P ∈ R^{N_b × N_b}, P^2 = P^* = P, Tr P = N} is diffeomorphic to the Grassmann manifold of the
N-dimensional vector subspaces of R^{N_b}. This set is naturally equipped with the structure of a Riemannian
manifold, and this allows the use of Riemannian optimization algorithms [1, 21]. Direct minimization
algorithms are preferred for the Gross-Pitaevskii model with magnetic fields [3, 18, 27, 29], for which the
Aufbau principle is not satisfied in general. Gradient-type [2, 17, 49, 61, 66], Newton-type [5, 15, 68], and
trust-region methods have also been designed to solve (1.1) for larger values of N. At the time of writing,
direct minimization algorithms are less popular than SCF algorithms in electronic structure calculation,
where N can be very large, but it is not clear whether this is for sound scientific reasons or because SCF
algorithms have been implemented and optimized for decades in the main production codes, which has
not been the case for direct minimization algorithms.
While the convergence of several SCF and direct minimization algorithms has been analyzed from a
mathematical point of view (see e.g. [16, 38, 42, 53, 60, 64] and references therein), the two approaches
have not been compared in a systematic way to our knowledge. The purpose of this paper is to contribute
to fill this gap, by focusing on very simple representatives of each class, namely the damped SCF iteration
and the gradient descent. We emphasize that neither of these two algorithms is a practical choice as
is. The SCF iteration should be accelerated (for instance using the Anderson acceleration technique),
and the gradient information in direct minimization methods should rather be used as part of a quasi-
Newton method (such as the limited-memory BFGS algorithm [1]). Depending on the exact problem
at hand, all these methods should be preconditioned to avoid issues related to small mesh sizes (which
leads to a divergence of the kinetic energy term) and/or large computational domains (which can lead
to a divergence of the Coulomb energy, or the confining potential). We refer to [63] for a recent review
in the context of the Kohn-Sham equations for solids. Rather, in this paper, we aim to focus on the
very simplest representative of each general strategy (SCF and direct minimization). The investigation
of these two basic algorithms is informative on the strengths and weaknesses of the two classes, and is a
first step in the analysis of more complex methods.
The paper is organized as follows. In Section 2, we recall some results about optimization on Grass-
mann manifolds, in particular the first and second order optimality conditions, and prove preparatory
lemmas. In Section 3, we present the two algorithms that are in the scope of this paper: a fixed-step
gradient descent and a damped SCF algorithm. We prove their local convergence as long as the step is
small enough and we derive convergence rates. We find that the convergence rates depend on the spectral
radius of operators (acting on R^{N_b × N_b}) of the form 1 − βJ, with β the fixed step and J = Ω∗ + K∗ for the
gradient descent, J = 1 + Ω∗^{−1} K∗ for the SCF algorithm, where the operators Ω∗ and K∗ are specified
in the next section. Let us just mention at this stage that the lowest eigenvalue of Ω∗ is equal to the
spectral gap between the N-th and (N+1)-st eigenvalues of H(P∗), allowing us to analyze the convergence
rates of the algorithms in terms of natural quantities of the problem. This also shows that the damped
SCF algorithm can be seen as a matrix splitting of the fixed-step gradient descent algorithm.
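For comparison with the SCF sketch above, here is a minimal NumPy sketch of a fixed-step gradient descent on the set of rank-N projectors. It is an illustration under stated assumptions, not the paper's exact implementation: grad_E(P) is a placeholder for ∇E(P), the tangent-space projection is the one given later in (2.2), and the retraction chosen here (projection onto a nearby rank-N projector through an eigendecomposition) is one common choice among several.

    import numpy as np

    def tangent_projection(P, G):
        """Pi_P(G) = P G (1 - P) + (1 - P) G P  (projection on the tangent space at P)."""
        Id = np.eye(len(P))
        return P @ G @ (Id - P) + (Id - P) @ G @ P

    def retract(X, N):
        """Map a symmetric matrix to a rank-N orthogonal projector (spectral retraction)."""
        _, V = np.linalg.eigh(X)
        occ = V[:, -N:]                 # eigenvectors of the N largest eigenvalues
        return occ @ occ.T

    def gradient_descent(grad_E, P0, N, beta=0.1, maxiter=1000, tol=1e-10):
        """Fixed step: P_{k+1} = R(P_k - beta * Pi_{P_k}(grad E(P_k)))."""
        P = P0
        for _ in range(maxiter):
            D = tangent_projection(P, grad_E(P))
            P_new = retract(P - beta * D, N)
            if np.linalg.norm(P_new - P) < tol:
                return P_new
            P = P_new
        return P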
In Section 4, we compare the two algorithms on several test problems. First, we focus on a toy model
for which we can easily tune the gap and observe some fundamental differences between SCF and direct
minimization algorithms, in agreement with the mathematical results established in Section 3. We also
provide an example of chaos in SCF iterations, complementing the results of [14, 38] in the case of a
non-quadratic objective functional E. Then, we analyse a 1D Gross-Pitaevskii model (N = 1) and its
fermionic counterpart for N = 2, for which we investigate the behavior of the algorithms when the gap
closes. We conclude with an example from electronic structure calculation: a Silicon crystal, in the
framework of Kohn-Sham DFT, where we show in particular that accelerated SCF algorithms are less
sensitive to small gaps than the simple damped SCF. Finally, in Section 5 we draw conclusions and
outline perspectives for future work.
2. Optimization on Grassmann manifolds
We focus in this paper on the case of real symmetric matrices, but the study can be easily extended to
complex hermitian matrices. Let H := R^{N_b × N_b}_sym be the vector space of N_b × N_b real symmetric matrices
endowed with the Frobenius inner product ⟨A, B⟩_F := Tr(AB). Let

    M := { P ∈ H | P^2 = P }   and   M_N := { P ∈ H | P^2 = P, Tr(P) = N }.

From a geometrical point of view, M is a compact subset of H with N_b + 1 connected components
M_N, N = 0, ..., N_b, each of them being characterized by the value of Tr(P), namely the rank of the
orthogonal projector P, and being diffeomorphic to the Grassmann manifold Grass(N, N_b) [1]. From
now on, we fix the number of electrons N and we seek the local minimizers of the problem

    min_{P ∈ M_N} E(P),    (2.1)

where E : H → R is a discretized energy functional, for which some examples are given below.
Example 2.1. As an example, we study a discrete Gross-Pitaevskii model in Section 4.4. Other models
from electronic structure can be considered, such as the discretized Hartree-Fock or Kohn-Sham models,
where the energy is of the form

    E(P) := Tr(H_0 P) + E_nl(P)

with H_0 being the core Hamiltonian (representing the kinetic energy and the external potential) and E_nl
a nonlinear energy functional depending on the model (representing the interaction between electrons).
For instance, for the Hartree-Fock model,

    E_nl(P) := (1/2) Tr(G(P) P)   where   (G(P))_ij := Σ_{k,l=1}^{N_b} A_ijkl P_kl,   i, j = 1, ..., N_b,

with A a symmetric tensor of order 4. For more details on these models or electronic structure in general,
we refer to [12, 40, 56].
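To make the structure of this example concrete, here is a small NumPy sketch of such a Hartree-Fock-like energy and of its gradient, the Fock matrix H(P) = H_0 + G(P). The tensor A and the matrix H0 below are random toy data (symmetrized so that the gradient formula holds); they are assumptions for illustration, not a physical model.

    import numpy as np

    Nb = 4
    rng = np.random.default_rng(1)

    # Toy core Hamiltonian and symmetric fourth-order tensor (assumed data).
    H0 = rng.standard_normal((Nb, Nb)); H0 = (H0 + H0.T) / 2
    A = rng.standard_normal((Nb, Nb, Nb, Nb))
    A = (A + A.transpose(1, 0, 2, 3)) / 2   # symmetry in (i, j)
    A = (A + A.transpose(0, 1, 3, 2)) / 2   # symmetry in (k, l)
    A = (A + A.transpose(2, 3, 0, 1)) / 2   # symmetry under (i, j) <-> (k, l)

    def G(P):
        # (G(P))_ij = sum_{k,l} A_ijkl P_kl
        return np.einsum("ijkl,kl->ij", A, P)

    def energy(P):
        # E(P) = Tr(H0 P) + (1/2) Tr(G(P) P)
        return np.trace(H0 @ P) + 0.5 * np.trace(G(P) @ P)

    def fock(P):
        # H(P) = grad E(P) = H0 + G(P), thanks to the symmetries of A
        return H0 + G(P)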
In plane-wave, finite differences, finite elements or wavelets electronic structure calculation codes, the
size N_b of the discretized space is in practice much larger than the number N of electrons. Therefore, it
is not practical to store and manipulate the (dense) matrix P. Instead, algorithms work on the orbitals
(φ_i)_{i=1,...,N} introduced in (1.3). The density matrix P is then recovered as

    P = Σ_{i=1}^{N} φ_i φ_i^*.

All the results in this article are presented in the density matrix framework. However, the algorithms we
study can be expressed in a way that avoids ever forming the density matrix. We refer to [63] for details.
We will need two assumptions for our results.
Assumption 2.2. The energy functional E : H → R is of class C^2 (twice continuously differentiable).

Assumption 2.2 is true for Hartree-Fock models. For Kohn-Sham models, it is true when the density
ρ = Σ_{i=1}^{N} |φ_i|^2 is uniformly bounded away from zero, which is the case for instance in condensed phase
systems. Most of the results presented in this article are local in nature, and therefore this assumption
can be relaxed to local regularity.
Assumption 2.3. P∗ ∈ M_N is a nondegenerate local minimizer of (2.1) in the sense that there exists
some η > 0 such that, for P ∈ M_N in a neighborhood of P∗, we have

    E(P) ⩾ E(P∗) + η ‖P − P∗‖_F^2.

It is very hard in most practical situations to check this assumption, but it seems to be verified in
practice. Notable exceptions are systems invariant with respect to continuous symmetry groups, in which
case E(P) = E(P∗) for all P in the orbit of P∗ along the symmetry group. In this case, the assumption
cannot be true, and ‖P − P∗‖_F must be replaced by the distance from P to the orbit of P∗. Our results
can be extended to this case up to quotienting H by the symmetry group.
Throughout the paper, we will use the following notation:
  • H(P) := ∇E(P) is the gradient, and H∗ := H(P∗);
  • K(P) := Π_P ∇^2 E(P) Π_P is the Hessian projected onto the tangent space at P, and K∗ := K(P∗)
    (the projection Π_P is defined below in Proposition 2.4).
2.1. First-order condition. To study the first-order optimality conditions, we start by recalling some
classical results about the geometry of the manifold M_N.

Proposition 2.4. M_N is a smooth real manifold and its tangent space T_P M_N at P ∈ M_N is given by

    T_P M_N = { X ∈ H | PX + XP = X, Tr(X) = 0 } = { X ∈ H | PXP = (1 − P)X(1 − P) = 0 }.

The orthogonal projection Π_P on T_P M_N for the Frobenius inner product is

    ∀ X ∈ H,   Π_P(X) = PX(1 − P) + (1 − P)XP = [P, [P, X]],    (2.2)

where [A, B] := AB − BA.
This classical result is proved in e.g. [1, Section 3.4]. Using the fact that R^{N_b} = Ran(P) ⊕ Ran(1 − P)
and the induced decomposition of P ∈ M_N and X ∈ H as

    P = [ I_N  0 ; 0  0 ],    X = [ (X)_oo  (X)_ov ; (X)_vo  (X)_vv ],    (2.3)

the projection Π_P is given by

    Π_P(X) = [ 0  (X)_ov ; (X)_vo  0 ].

Here the subscript "o" (resp. "v") stands for occupied (resp. virtual). The first-order optimality condition
at P∗ is Π_{P∗}(H∗) = 0, which can be formulated as follows:

    First-order optimality condition:   P∗ H∗ (1 − P∗) = (1 − P∗) H∗ P∗ = 0.    (2.4)

Note that this condition can be rewritten as [H∗, P∗] = 0, showing that H∗ and P∗ can be codiagonalized.
Let (φ_k)_{1⩽k⩽N_b} be an orthonormal basis of eigenvectors of H∗ associated with the eigenvalues (ε_k)_{1⩽k⩽N_b}
sorted in ascending order. Then P∗ = Σ_{i∈I} φ_i φ_i^*, where I ⊂ {1, ..., N_b}, |I| = N, is the set of occupied
orbitals. The minimizer P∗ is said to satisfy
  • the Aufbau principle if I = {1, ..., N};
  • the strong Aufbau principle if I = {1, ..., N} and if in addition ε_N < ε_{N+1}, in which case
    P∗ = Σ_{i=1}^{N} φ_i φ_i^*.
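As a quick numerical illustration of the first-order condition (2.4) and of the Aufbau property defined above, the sketch below builds a toy symmetric matrix playing the role of H∗, takes P∗ as the projector on its N lowest eigenvectors, and checks that P∗H∗(1 − P∗) = 0, that [H∗, P∗] = 0, and that the occupied set is I = {1, ..., N}. The data are random and the helper names are ours, not the paper's; degenerate eigenvalues are not handled.

    import numpy as np

    Nb, N = 6, 2
    rng = np.random.default_rng(3)

    # Toy symmetric "Hamiltonian" H_star and the projector on its N lowest eigenvectors.
    H_star = rng.standard_normal((Nb, Nb)); H_star = (H_star + H_star.T) / 2
    eps, phi = np.linalg.eigh(H_star)           # ascending eigenvalues
    P_star = phi[:, :N] @ phi[:, :N].T

    Id = np.eye(Nb)
    # First-order condition (2.4): P* H* (1 - P*) = (1 - P*) H* P* = 0, i.e. [H*, P*] = 0.
    print(np.allclose(P_star @ H_star @ (Id - P_star), 0))
    print(np.allclose(H_star @ P_star - P_star @ H_star, 0))

    # Occupied set and (strong) Aufbau property.
    occ = np.array([phi[:, k] @ P_star @ phi[:, k] for k in range(Nb)])
    occupied = np.where(occ > 0.5)[0]
    print(set(occupied) == set(range(N)))       # Aufbau: I = {1, ..., N} (0-based here)
    print(eps[N] > eps[N - 1])                  # strong Aufbau: eps_N < eps_{N+1}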

Citations
Journal ArticleDOI
07 May 2021
TL;DR: In this paper, the authors present a high-throughput screening approach to identify promising novel materials for targeted follow-up investigation using density-functional theory (DFT) codes; DFT is a widely used method for simulating the quantum-chemical behavior of electrons in matter.
Abstract: Density-functional theory (DFT) is a widespread method for simulating the quantum-chemical behaviour of electrons in matter. It provides a first-principles description of many optical, mechanical and chemical properties at an acceptable computational cost [16, 2, 3]. For a wide range of systems the obtained predictions are accurate and shortcomings of the theory are by now well understood [2, 3]. The desire to tackle even bigger systems and more involved materials, however, keeps posing novel challenges that require methods to constantly improve. One example is so-called high-throughput screening approaches, which are becoming prominent in recent years. In these techniques one wishes to systematically scan over huge design spaces of compounds in order to identify promising novel materials for targeted follow-up investigation. This has already led to many success stories [14], such as the discovery of novel earth-abundant semiconductors [11], novel light-absorbing materials [20], electrocatalysts [8], materials for hydrogen storage [13] or for Li-ion batteries [1]. Keeping in mind the large range of physics that needs to be covered in these studies as well as the typical number of calculations (up to the order of millions), a bottleneck in these studies is the reliability and performance of the underlying DFT codes.

25 citations

Journal ArticleDOI
TL;DR: In this paper, the authors prove the existence of the reduced Hartree-Fock equations at finite temperature for a periodic crystal with a small defect, and show total screening of the defect charge by the electrons.
Abstract: We prove the existence of solutions of the reduced Hartree-Fock equations at finite temperature for a periodic crystal with a small defect, and show total screening of the defect charge by the electrons. We also show the convergence of the damped self-consistent field iteration using Kerker preconditioning to remove charge sloshing. As a crucial step of the proof, we define and study the properties of the dielectric operator.

13 citations

Posted Content
TL;DR: Using the Łojasiewicz inequality, it is shown that a Sobolev gradient descent method with adaptive inner product converges exponentially fast to the ground state for the Gross-Pitaevskii eigenproblem.
Abstract: We propose to use the Łojasiewicz inequality as a general tool for analyzing the convergence rate of gradient descent on a Hilbert manifold, without resorting to the continuous gradient flow. Using this tool, we show that a Sobolev gradient descent method with adaptive inner product converges exponentially fast to the ground state for the Gross-Pitaevskii eigenproblem. This method can be extended to a class of general high-degree optimizations or nonlinear eigenproblems under certain conditions. We demonstrate this generalization by several examples, in particular a nonlinear Schrodinger eigenproblem with an extra high-order interaction term. Numerical experiments are presented for these problems.

10 citations

Journal ArticleDOI
22 Jul 2022
TL;DR: In this article , an extension to the atomic cluster expansion (ACE) descriptor was proposed to represent Hamiltonian matrix blocks that transform equivariantly with respect to the full rotation group.
Abstract: Abstract We propose a scheme to construct predictive models for Hamiltonian matrices in atomic orbital representation from ab initio data as a function of atomic and bond environments. The scheme goes beyond conventional tight binding descriptions as it represents the ab initio model to full order, rather than in two-centre or three-centre approximations. We achieve this by introducing an extension to the atomic cluster expansion (ACE) descriptor that represents Hamiltonian matrix blocks that transform equivariantly with respect to the full rotation group. The approach produces analytical linear models for the Hamiltonian and overlap matrices. Through an application to aluminium, we demonstrate that it is possible to train models from a handful of structures computed with density functional theory, and apply them to produce accurate predictions for the electronic structure. The model generalises well and is able to predict defects accurately from only bulk training data.

8 citations

Journal ArticleDOI
TL;DR: In this article , it was shown that the local density of states (LDOS) of a wide class of tight-binding models has a weak body-order expansion, and that the resulting bodyorder expansion for analytic observables such as the electron density or the energy has an exponential rate of convergence both at finite Fermi-temperature as well as for insulators at zero-Fermi temperature.
Abstract: We show that the local density of states (LDOS) of a wide class of tight-binding models has a weak body-order expansion. Specifically, we prove that the resulting body-order expansion for analytic observables such as the electron density or the energy has an exponential rate of convergence both at finite Fermi-temperature as well as for insulators at zero Fermi-temperature. We discuss potential consequences of this observation for modelling the potential energy landscape, as well as for solving the electronic structure problem.

4 citations

References
Journal ArticleDOI
TL;DR: An efficient scheme for calculating the Kohn-Sham ground state of metallic systems using pseudopotentials and a plane-wave basis set is presented and the application of Pulay's DIIS method to the iterative diagonalization of large matrices will be discussed.
Abstract: We present an efficient scheme for calculating the Kohn-Sham ground state of metallic systems using pseudopotentials and a plane-wave basis set. In the first part the application of Pulay's DIIS method (direct inversion in the iterative subspace) to the iterative diagonalization of large matrices will be discussed. Our approach is stable, reliable, and minimizes the number of order N_atoms^3 operations. In the second part, we will discuss an efficient mixing scheme also based on Pulay's scheme. A special "metric" and a special "preconditioning" optimized for a plane-wave basis set will be introduced. Scaling of the method will be discussed in detail for non-self-consistent and self-consistent calculations. It will be shown that the number of iterations required to obtain a specific precision is almost independent of the system size. Altogether an order N_atoms^2 scaling is found for systems containing up to 1000 electrons. If we take into account that the number of k points can be decreased linearly with the system size, the overall scaling can approach N_atoms. We have implemented these algorithms within a powerful package called VASP (Vienna ab initio simulation package). The program and the techniques have been used successfully for a large number of different systems (liquid and amorphous semiconductors, liquid simple and transition metals, metallic and semiconducting surfaces, phonons in simple metals, transition metals, and semiconductors) and turned out to be very reliable. © 1996 The American Physical Society.

81,985 citations

Journal ArticleDOI
TL;DR: In this paper, the Hartree and Hartree-Fock equations are applied to a uniform electron gas, where the exchange and correlation portions of the chemical potential of the gas are used as additional effective potentials.
Abstract: From a theory of Hohenberg and Kohn, approximation methods for treating an inhomogeneous system of interacting electrons are developed. These methods are exact for systems of slowly varying or high density. For the ground state, they lead to self-consistent equations analogous to the Hartree and Hartree-Fock equations, respectively. In these equations the exchange and correlation portions of the chemical potential of a uniform electron gas appear as additional effective potentials. (The exchange portion of our effective potential differs from that due to Slater by a factor of 2/3.) Electronic systems at finite temperatures and in magnetic fields are also treated by similar methods. An appendix deals with a further correction for systems with short-wavelength density oscillations.

47,477 citations

Journal ArticleDOI
TL;DR: The pseudopotential is of an analytic form that gives optimal efficiency in numerical calculations using plane waves as a basis set and is separable and has optimal decay properties in both real and Fourier space.
Abstract: We present pseudopotential coefficients for the first two rows of the Periodic Table. The pseudopotential is of an analytic form that gives optimal efficiency in numerical calculations using plane waves as a basis set. At most, seven coefficients are necessary to specify its analytic form. It is separable and has optimal decay properties in both real and Fourier space. Because of this property, the application of the nonlocal part of the pseudopotential to a wave function can be done efficiently on a grid in real space. Real space integration is much faster for large systems than ordinary multiplication in Fourier space, since it shows only quadratic scaling with respect to the size of the system. We systematically verify the high accuracy of these pseudopotentials by extensive atomic and molecular test calculations. © 1996 The American Physical Society.

5,009 citations

Journal ArticleDOI

4,691 citations

Journal ArticleDOI
TL;DR: The theory proposed here provides a taxonomy for numerical linear algebra algorithms that provide a top level mathematical view of previously unrelated algorithms and developers of new algorithms and perturbation theories will benefit from the theory.
Abstract: In this paper we develop new Newton and conjugate gradient algorithms on the Grassmann and Stiefel manifolds. These manifolds represent the constraints that arise in such areas as the symmetric eigenvalue problem, nonlinear eigenvalue problems, electronic structures computations, and signal processing. In addition to the new algorithms, we show how the geometrical framework gives penetrating new insights allowing us to create, understand, and compare algorithms. The theory proposed here provides a taxonomy for numerical linear algebra algorithms that provide a top level mathematical view of previously unrelated algorithms. It is our hope that developers of new algorithms and perturbation theories will benefit from the theory, methods, and examples in this paper.

2,686 citations

Frequently Asked Questions (14)
Q1. What contributions have the authors mentioned in the paper "Convergence analysis of direct minimization and self-consistent iterations" ?

This article is concerned with the numerical solution of subspace optimization problems, consisting of minimizing a smooth functional over the set of orthogonal projectors of fixed rank. The authors compare from a numerical analysis perspective two simple representatives, the damped self-consistent field ( SCF ) iterations and the gradient descent algorithm, of the two classes of methods competing in the field: SCF and direct minimization methods. The authors derive asymptotic rates of convergence for these algorithms and analyze their dependence on the spectral gap and other properties of the problem. The authors also provide an example of chaotic behavior of the simple SCF iterations for a nonquadratic functional. 

In particular, this is necessary to extend the convergence theory presented in this paper to infinite-dimensional settings. 

For instance, for β = 1 at a = 10.26 Bohrs, the method appears to be converging for almost 20 iterations, up to a reduction in residual of a factor 10^−8.

Solving the linear eigenproblem is then done using iterative block eigensolvers, which can be understood as specialized direct minimization algorithms in the case of a linear energy functional E(P ) = Tr(H0P ). 

In quantum chemistry using Gaussian basis sets to solve the Hartree-Fock model or Kohn-Sham density functional theory using hybrid functionals, the rate-limiting step is often the computation of the Fock matrix H(P ). 

In condensed-matter physics using plane-wave basis sets to solve the Kohn-Sham density functional theory with local or semilocal functionals, the matrices P and H are not stored explicitly. 

Since the second smallest eigenvalue of the matrix h is strictly lower than the third one, for α small enough, the unique minimizer P∗ of (4.8) is onM2 and satisfies the strong Aufbau principle, and both the gradient descent and SCF algorithm locally converge to P∗. 

Another interest of direct minimization algorithms is their robustness, as the choice of a stepsize can be made in order to minimize the energy, unlike the damped SCF algorithm where choosing an appropriate damping parameter is often done empirically. 

The cause of this effect appears to be that the divergent modes break the natural inversion symmetry of the crystal in this particular case: the authors have checked that the divergence occurs much sooner if the authors break this symmetry by perturbing the positions of the atoms around their symmetric positions (at 9 iterations by perturbing the position of one atom by 10%). 

In the first case, with a = 10.26 Bohrs, the simple (undamped) SCF method appears to be converging for almost 20 iterations, but then diverges, until the density residual stabilizes at a positive value, as predicted in [14] for the Hartree-Fock model. 

For the damped SCF algorithm with the ground state of the core Hamiltonian as starting point, surprisingly, the authors observe an asymptotic convergence rate slightly faster than that expected from the spectral radius of the Jacobian matrix 1− βJSCF. 

The authors plot the convergence of the density residual ‖ρΦ(Pn) − ρPn‖2 as a function of the iterations for three values of a, with decreasing gaps. 

Thus the authors expect convergence for β < 4ε^2, and therefore a critical εc of ≈ 0.158 for β = 10^−1 and 0.0158 for β = 10^−3, with a number of iterations proportional to 1/(ε − εc) when ε > εc.
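(Indeed, setting β = 4εc^2 and solving for εc gives εc = √β/2, i.e. εc ≈ 0.158 for β = 10^−1 and εc ≈ 0.0158 for β = 10^−3.)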

In practice, this issue is solved by preconditioning (see Remark 3.7).
• for the damped SCF algorithm, a naive bound would be κ(J_SCF) ⩽ ‖Ω∗‖op (1 + ν^−1 ‖K∗‖op) / η.