
A New Paradigm for Parallel Adaptive Meshing Algorithms

01 Jan 2003-Siam Review (Society for Industrial and Applied Mathematics)-Vol. 45, Iss: 2, pp 291-323
TL;DR: This approach addresses the load balancing problem in a new way, requiring far less communication than current approaches, and allows existing sequential adaptive PDE codes such as PLTMG and MC to run in a parallel environment without a large investment in recoding.
Abstract: We present a new approach to the use of parallel computers with adaptive finite element methods. This approach addresses the load balancing problem in a new way, requiring far less communication than current approaches. It also allows existing sequential adaptive PDE codes such as PLTMG and MC to run in a parallel environment without a large investment in recoding. In this new approach, the load balancing problem is reduced to the numerical solution of a small elliptic problem on a single processor, using a sequential adaptive solver, without requiring any modifications to the sequential solver. The small elliptic problem is used to produce a posteriori error estimates to predict future element densities in the mesh, which are then used in a weighted recursive spectral bisection of the initial mesh. The bulk of the calculation then takes place independently on each processor, with no communication, using possibly the same sequential adaptive solver. Each processor adapts its region of the mesh independently, and a nearly load-balanced mesh distribution is usually obtained as a result of the initial weighted spectral bisection.

Summary (2 min read)

Introduction

  • Key words: adaptivity, finite element methods, a posteriori error estimation, parallel computing. AMS subject classifications: 65M55, 65N55.
  • If an initial mesh is distributed quite fairly among a number of processors, a very good error estimator (coupled with adaptive refinement) quickly produces a very bad work load imbalance among the processors.
  • A similar approach appeared recently in [18].
  • MC also discretizes the solution over piecewise linear triangular or tetrahedral elements, and as in PLTMG, error estimates for each element are produced by solving a local problem using the edge-based quadratic bump functions.
  • Since the target size of all problems solved in step 2 is the same, and each subregion initially has approximately equal error, the authors expect the final composite mesh to have approximately equal errors, and approximately equal numbers of elements, in each of the refined subregions created in step 2.

The global refined mesh.

  • Each processor then adaptively solved the problem with a target value of 100000 vertices.
  • The uniformity of color in the a posteriori error graph indicates that their strategy did a reasonable job of globally equilibrating the error despite the lack of communication in the mesh generation phase.
  • Each processor then continued with Step 2 of the paradigm, with a target value of 100000 vertices, as in the first example.
  • The final global solve was also an interior point iteration, with the domain decomposition/multigraph solver used for the resulting linear systems.
  • In Figure 8, the authors show the initial partition into 16 subregions and the final global refined mesh.

A posteriori error estimate for the final solution.

  • The two previous examples demonstrated the effectiveness of the parallel algorithm for linear and nonlinear scalar problems and variational inequalities in 2D.
  • The stress tensor T is a function of the gradient ∇u of the unknown displacement u, and the corresponding deformation mapping ϕ and deformation gradient ∇ϕ are given by ϕ = id + u and ∇ϕ = I + ∇u.
  • The mesh is generated by adaptively bisecting an initial mesh consisting of an icosahedron volume filled with tetrahedra.
  • The four subdomain problems are then solved independently by MC, starting from the complete coarse mesh and coarse mesh solution.

3. Computational considerations.

  • In this section the authors describe the algorithm they use for partitioning the coarse mesh so that each subregion has approximately equal error.
  • The authors now briefly describe some details of their procedure for computing the second eigenvector of (3.1); a sketch of one weighted spectral bisection step follows this list.
  • Taken together (3.5)–(3.7) indicate that for good performance, the difficulty of the problem must in some sense scale in proportion to the number of processors.
  • In their scheme, each subregion contributes equations corresponding to all fine mesh points, including its interface.
  • Thus the final matrix and mesh from Step 2 of the paradigm can be reused once again in the domain decomposition solver.
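
The partitioning these bullets summarize is a weighted recursive spectral bisection. Below is a minimal runnable sketch of a single bisection step, assuming SciPy; it splits elements by cumulative error weight along the Fiedler-vector ordering, whereas the paper's implementation computes the second eigenvector with a specially biased Rayleigh quotient iteration (see the FAQ excerpts at the end of this page).

```python
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import eigsh

def weighted_spectral_bisection(adjacency, error):
    """One weighted spectral bisection step: split the elements into two
    sets of roughly equal total a posteriori error using the second
    Laplacian eigenvector (the Fiedler vector) of the element graph.

    adjacency : (n, n) sparse symmetric 0/1 element-adjacency matrix
    error     : length-n array of per-element error estimates (weights)
    """
    error = np.asarray(error, dtype=float)
    degree = np.ravel(adjacency.sum(axis=1))
    laplacian = csr_matrix(diags(degree) - adjacency, dtype=float)
    # Two smallest eigenpairs; eigenvector 0 is constant, eigenvector 1
    # (the Fiedler vector) induces a one-dimensional element ordering.
    _, vecs = eigsh(laplacian, k=2, which="SM")
    order = np.argsort(vecs[:, 1])
    # Cut the ordering where the accumulated error reaches half the total.
    cut = np.searchsorted(np.cumsum(error[order]), 0.5 * error.sum())
    labels = np.zeros(len(error), dtype=int)
    labels[order[cut:]] = 1
    return labels  # recurse on each half to reach 2^m subregions
```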


A NEW PARADIGM FOR PARALLEL ADAPTIVE MESHING
ALGORITHMS
RANDOLPH E. BANK AND MICHAEL HOLST
SIAM REV. © 2003 Society for Industrial and Applied Mathematics
Vol. 45, No. 2, pp. 291–323
Abstract. We present a new approach to the use of parallel computers with adaptive finite
element methods. This approach addresses the load balancing problem in a new way, requiring
far less communication than current approaches. It also allows existing sequential adaptive PDE
codes such as PLTMG and MC to run in a parallel environment without a large investment in
recoding. In this new approach, the load balancing problem is reduced to the numerical solution of a
small elliptic problem on a single processor, using a sequential adaptive solver, without requiring any
modifications to the sequential solver. The small elliptic problem is used to produce a posteriori error
estimates to predict future element densities in the mesh, which are then used in a weighted recursive
spectral bisection of the initial mesh. The bulk of the calculation then takes place independently on
each processor, with no communication, using possibly the same sequential adaptive solver. Each
processor adapts its region of the mesh independently, and a nearly load-balanced mesh distribution
is usually obtained as a result of the initial weighted spectral bisection. Only the initial fan-out of
the mesh decomposition to the processors requires communication. Two additional steps requiring
boundary exchange communication may be employed after the individual processors reach an adapted
solution, namely, the construction of a global conforming mesh from the independent subproblems,
followed by a final smoothing phase using the subdomain solutions as an initial guess. We present
a series of convincing numerical experiments that illustrate the effectiveness of this approach. The
justification of the initial refinement prediction step, as well as the justification of skipping the two
communication-intensive steps, is supported by some recent [J. Xu and A. Zhou, Math. Comp., to
appear] and not so recent [J. A. Nitsche and A. H. Schatz, Math. Comp., 28 (1974), pp. 937–958;
A. H. Schatz and L. B. Wahlbin, Math. Comp., 31 (1977), pp. 414–442; A. H. Schatz and L. B.
Wahlbin, Math. Comp., 64 (1995), pp. 907–928] results on local a priori and a posteriori error
estimation. This revision of the original article [R. E. Bank and M. J. Holst, SIAM J. Sci. Comput.,
22 (2000), pp. 1411–1443] updates the numerical experiments, and reflects the knowledge we have
gained since the original paper appeared.
Key words. adaptivity, finite element methods, a posteriori error estimation, parallel computing
AMS subject classifications. 65M55, 65N55
PII. S0000000000000000
1. Introduction. One of the most difficult obstacles to overcome in making
effective use of parallel computers for adaptive finite element codes such as PLTMG [5]
and MC [24] is the load balancing problem. As an adaptive method adjusts the mesh
according to the features of the solution, elements in some areas are refined, whereas
others are not. If an initial mesh is distributed quite fairly among a number of
processors, a very good error estimator (coupled with adaptive refinement) quickly
produces a very bad work load imbalance among the processors.
A number of static and dynamic load balancing approaches for unstructured
meshes have been proposed in the literature [21, 22, 23, 27, 40, 43]; most of the
dynamic strategies involve repeated application of a particular static strategy. One of
Received by the editors April 7, 1999; accepted for publication (in revised form) December 6,
1999; published electronically November 2, 2000. The work of the first author was supported by
the National Science Foundation under contracts DMS-9706090, DMS-9973276, and DMS-0208449.
The work of the second author was supported by the National Science Foundation under CAREER
award 9875856 and under contracts DMS-9973276 and DMS-0208449. The UCSD Scicomp Beowulf
cluster was built using funds provided by the National Science Foundation through SCREMS Grant
0112413, with matching funds from the University of California at San Diego.
http://www.siam.org/journals/sirev/45-2/00000.html
Department of Mathematics, University of California at San Diego, La Jolla, CA 92093
(rbank@ucsd.edu, mholst@math.ucsd.edu).
the difficulties in all of these approaches is the amount of communication that must
be performed both to assess the current load imbalance severity, and to redistribute
the work among the processors once the imbalance is detected and an improved distri-
bution is calculated. The calculation of the improved work distribution can be quite
inexpensive (such as geometric or inertia tensor-based methods), or it may be a costly
procedure, with some approaches requiring the solution of an associated eigenvalue
problem or evolution of a heat equation to near equilibrium [44]. These calculations
may themselves require communication if they must be solved in parallel using the
existing (poor) distribution.
In recent years, clusters of fast workstations have replaced the more traditional
parallel computer of the past. While this type of parallel computer is now within
reach of an organization with even a modest hardware budget, it is usually difficult to
produce an efficient parallel implementation of an elliptic PDE solver; this is simply
due to the fact that elliptic continuum mechanics problems necessarily lead to tightly
coupled discrete problems, requiring substantial amounts of communication for their
solution. The load balancing problem is also more pronounced on workstation clusters:
even at 100 Mbit/sec speed, the cluster communication speeds are quite slow compared
to modern workstation CPU performance, and the communication required to detect
and correct load imbalances results in severe time penalties.
1.1. A new approach to parallel adaptive finite element methods. In this
work, we present an alternative approach that addresses the load balancing problem in
a new way, requiring far less communication than current approaches. This approach
also allows existing sequential adaptive PDE codes such as PLTMG and MC to run in
a parallel environment without a large investment in recoding.
Our approach has three main components:
1. We solve a small problem on a coarse mesh, and use a posteriori error es-
timates to partition the mesh. Each subregion has approximately the same
error, although subregions may vary considerably in terms of numbers of
elements or grid points.
2. Each processor is provided the complete coarse mesh and instructed to se-
quentially solve the entire problem, with the stipulation that its adaptive
refinement should be limited largely to its own partition. The target number
of elements and gridpoints for each problem is the same.
3. A final mesh is computed using the union of the refined partitions provided
by each processor. This mesh is regularized and a final solution computed
using a standard domain decomposition or parallel multigrid technique.
The above approach has several interesting features. First, the load balancing
problem (step 1) is reduced to the numerical solution of a small elliptic problem on
a single processor, using a sequential adaptive solver such as PLTMG or MC, without
requiring any modifications to the sequential solver. Second, the bulk of the calcula-
tion (step 2) takes place independently on each processor and can also be performed
with a sequential solver such as PLTMG or MC with no modifications necessary for
communication. (In PLTMG, one line of code was added that artificially multiplied a
posteriori error estimates for elements outside a processor’s partition by $10^{-6}$. In MC,
two lines were added to prevent elements outside the processor’s partition from en-
tering the initial refinement queue.) Step 2 was motivated by recent work of Mitchell
[30, 31, 32] on parallel multigrid methods. A similar approach appeared recently
in [18]. The use of a posteriori error estimates in mesh partitioning strategies has also
been considered in [36].
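
To fix ideas, the control flow of the three-step paradigm can be sketched as follows. The helpers solve_adaptively, partition_by_error, and regularize_and_solve are hypothetical placeholders for the sequential adaptive solver (PLTMG or MC) and the surrounding partitioning and domain decomposition machinery; they are not actual routines of either package.

```python
# Schematic only: all three helper functions below are invented names
# standing in for calls into a sequential solver such as PLTMG or MC.

def bank_holst_paradigm(coarse_mesh, pde, n_procs, target=100_000):
    # Step 1 (one processor): solve a small problem on the coarse mesh,
    # then partition it into n_procs subregions of approximately equal
    # a posteriori error via weighted recursive spectral bisection.
    u0, err0 = solve_adaptively(coarse_mesh, pde)
    regions = partition_by_error(coarse_mesh, err0, n_procs)

    # Step 2 (all processors, no communication): every processor gets the
    # complete coarse mesh; estimates outside its own partition are
    # artificially damped (multiplied by 1e-6 in PLTMG) so refinement is
    # largely confined to that partition.
    subproblems = []
    for k in range(n_procs):  # executed concurrently in practice
        damped = [e if r == k else 1.0e-6 * e for e, r in zip(err0, regions)]
        subproblems.append(
            solve_adaptively(coarse_mesh, pde, errors=damped, target=target))

    # Step 3 (boundary-exchange communication): form the union of the
    # refined partitions, regularize it to a global conforming mesh, and
    # run a final domain decomposition or parallel multigrid solve seeded
    # with the subdomain solutions restricted to their partitions.
    return regularize_and_solve(subproblems, regions, pde)
```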

Fig. 1. A typical element $t$, with vertices $\nu_1, \nu_2, \nu_3$ and edges $\ell_1, \ell_2, \ell_3$.
The only parts of the calculation requiring communication are (1) the initial fan-
out of the mesh distribution to the processors, once the decomposition is determined
by the error estimator; (2) the mesh regularization, requiring local communication
to produce a global conforming mesh; and (3) the final solution phase, that might
require local communication (boundary exchanges). Note that a good initial guess
for step 3 is provided in step 2 by taking the solution from each subregion restricted
to its partition. Note also that the initial communication step to fan-out the mesh
decomposition information is not actually required, since each processor can compute
the decomposition independently (with no communication) as a preprocessing step.
1.2. Justification. Perhaps the largest issue arising in connection with this pro-
cedure is whether it is well founded, particularly in light of the continuous dependence
of the solution of an elliptic equation on data throughout the domain. To address this
issue, we first note that the primary goal of step 2 above is adaptive mesh generation.
In other words, the most important issue is not how accurately the problem is solved
in step 2 of this procedure, but rather the quality of the (composite) adaptively gener-
ated mesh. These two issues are obviously related, but one should note that it is not
necessary to have an accurate solution in order to generate a well adapted mesh. In-
deed, the ability to generate good meshes from relatively inaccurate solutions explains
the success of many adaptive methods.
A secondary goal of step 2 is the generation of an initial guess for the solution on
the final composite mesh. This aspect of the algorithm will be addressed in section
4. Here we focus on the primary issue of grid generation, and in particular on a
posteriori error estimates, as such estimates provide the link between the computed
solution and the adaptive meshing procedure. Here we consider in detail the schemes
used in PLTMG and MC, but similar points can be made in connection with other
adaptive algorithms.
PLTMG uses a discretization based on continuous piecewise linear triangular fi-
nite elements. The error is approximated in the subspace of discontinuous piecewise
quadratic polynomials that are zero at the vertices of the mesh. In particular, let
$u - u_h$ denote the error and let t denote a generic triangle in the mesh. In its adaptive
algorithms, PLTMG approximates the error in triangle t using the formula

\[
\|\nabla(u - u_h)\|_t^2 \equiv \int_t |\nabla(u - u_h)|^2 \, dx \approx v^t B v, \tag{1.1}
\]

where (see Figure 1)

\[
\nu_i = \begin{pmatrix} x_i \\ y_i \end{pmatrix} \quad \text{for } 1 \le i \le 3, \tag{1.2}
\]

\[
\ell_i = \nu_j - \nu_k \quad \text{for } (i, j, k) \text{ a cyclic permutation of } (1, 2, 3), \tag{1.3}
\]

\[
v = \begin{pmatrix} \ell_1^t M_t \ell_1 \\ \ell_2^t M_t \ell_2 \\ \ell_3^t M_t \ell_3 \end{pmatrix}, \qquad
M_t = \frac{1}{2} \begin{pmatrix} u_{xx} & u_{xy} \\ u_{xy} & u_{yy} \end{pmatrix}, \tag{1.4}
\]

\[
B = \frac{1}{48 |t|} \begin{pmatrix}
\ell_1^t \ell_1 + \ell_2^t \ell_2 + \ell_3^t \ell_3 & -2 \ell_1^t \ell_2 & -2 \ell_1^t \ell_3 \\
-2 \ell_2^t \ell_1 & \ell_1^t \ell_1 + \ell_2^t \ell_2 + \ell_3^t \ell_3 & -2 \ell_2^t \ell_3 \\
-2 \ell_3^t \ell_1 & -2 \ell_3^t \ell_2 & \ell_1^t \ell_1 + \ell_2^t \ell_2 + \ell_3^t \ell_3
\end{pmatrix}. \tag{1.5}
\]
Equations (1.1)–(1.5) are derived by comparing the approximation error for linear
and quadratic interpolation on t. See [5, 11] for details. The second derivatives in
the 2 × 2 matrix $M_t$ are taken as constants on t. The values of the second derivatives
are extracted as a postprocessing step from the a posteriori error estimates. The
remaining information needed to compute the right-hand side of (1.1) is generated
directly from the geometry of element t.
To be effective, the approximation of the derivatives need not be extremely ac-
curate. Many adaptive algorithms, and in particular those in PLTMG, are directed
toward creating meshes in which the errors in all elements are equilibrated. Typi-
cally, adaptive algorithms develop refined meshes starting from rather coarse meshes.
For reasons of efficiency, often many elements are refined between recomputing the
approximate solution $u_h$.
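
As a concrete illustration of (1.1)–(1.5), the sketch below (our own function, assuming the reconstruction of the equations above, including the signs of the off-diagonal entries of B) evaluates the indicator $v^t B v$ for a single triangle:

```python
import numpy as np

def element_error_indicator(verts, M_t):
    """Evaluate v^t B v from (1.1)-(1.5) for one triangle.

    verts : (3, 2) array of vertex coordinates nu_1, nu_2, nu_3
    M_t   : (2, 2) symmetric matrix, one half of the (constant) second
            derivatives of u on t, as in (1.4)
    """
    nu = np.asarray(verts, dtype=float)
    M_t = np.asarray(M_t, dtype=float)
    # Edge vectors l_i = nu_j - nu_k, cyclic permutations (1.3).
    l = np.array([nu[1] - nu[2], nu[2] - nu[0], nu[0] - nu[1]])
    # |t| = element area, from the cross product of two edge vectors.
    area = 0.5 * abs(l[0, 0] * l[1, 1] - l[0, 1] * l[1, 0])
    # v_i = l_i^t M_t l_i, as in (1.4).
    v = np.array([li @ M_t @ li for li in l])
    # B from (1.5): common diagonal entry, off-diagonal -2 l_i^t l_j.
    d = sum(float(li @ li) for li in l)
    B = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            B[i, j] = d if i == j else -2.0 * float(l[i] @ l[j])
    B /= 48.0 * area
    return float(v @ B @ v)  # ~ ||grad(u - u_h)||_t^2
```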
MC also discretizes the solution over piecewise linear triangular or tetrahedral
elements, and as in PLTMG, error estimates for each element are produced by solving
a local problem using the edge-based quadratic bump functions. However, while these
local problems involve inverting 3 × 3 matrices for scalar problems in 2D, the local
problems are substantially more costly in 3D. In particular, for 3D elasticity, the local
problems require the inversion of 18 × 18 matrices (6 bump functions and 3 unknowns
per spatial point). Therefore, MC also provides an alternative less-expensive error
estimator, namely, the residual of the strong form of the equation, following, e.g., [42].
In this paper, the numerical results involving MC are produced using the residual-
based estimator.
While there is considerable theoretical support for the a posteriori error bounds
that form the foundation for adaptive algorithms (see, e.g., the book of Verfürth
[42] and its references), the adaptive algorithms themselves are largely heuristic, in
particular, those aspects described above. However, there is a large and growing
body of empirical evidence that such algorithms are indeed robust and effective for
a wide class of problems. In particular, they are effective on coarse meshes, and on
highly nonuniform meshes. The types of meshes likely to be generated in our parallel
algorithm are qualitatively not very different from typical meshes where a posteriori
error estimates are known to perform quite well.
In our procedure, we artificially set the errors to be very small in regions outside
the subregion assigned to a given processor, so the standard refinement procedure is
“tricked” into equilibrating the error on just one subregion. Since the target size of
all problems solved in step 2 is the same, and each subregion initially has approxi-
mately equal error, we expect the final composite mesh to have approximately equal
errors, and approximately equal numbers of elements, in each of the refined subre-
gions created in step 2. That is, the composite mesh created in step 3 should have
roughly equilibrated errors in all of its elements. This last statement is really just
an expectation, since we control only the target number of elements added in each
subregion and do not control the level of error directly. This and other assumptions
forming the foundation of our load balancing algorithms are discussed in more detail
in section 3.2.
We note that the standard adaptive procedures in PLTMG and MC have additional
refinement criteria to ensure conforming and shape regular meshes. Thus, some ele-
ments outside the given subregion but near its interface are typically refined in order
to enforce shape regularity; the result is a smooth transition from the small elements
of the refined region to larger elements in the remainder of the domain. If the target
number of elements is large in comparison with the number of elements on the coarse
mesh, this should be a relatively small effect.
To summarize, we expect that for any given problem, our algorithm should per-
form comparably to the standard algorithm applied to the same initial mesh in terms
of the quality of the adaptive local mesh refinement.
2. Examples. In this section, we present some simple examples of the algorithm
presented in section 1.
2.1. A convection-diffusion equation. In this example, we use PLTMG to
solve the convection-diffusion equation
\[
-\nabla \cdot (\nabla u + \beta u) = 1 \quad \text{in } \Omega, \tag{2.1}
\]

\[
u = 0 \quad \text{on } \partial\Omega,
\]

where $\beta = (0, 10^5)^t$, and $\Omega$ is the region depicted in Figure 2. This coarse triangulation
has 1676 elements and 1000 vertices.
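
For orientation, the piecewise linear discretization rests on the standard weak form of (2.1); this is a routine derivation under the homogeneous Dirichlet condition, not a formula quoted from the paper:

\[
\text{find } u \in H_0^1(\Omega) \text{ such that } \int_\Omega (\nabla u + \beta u) \cdot \nabla v \, dx = \int_\Omega v \, dx \quad \text{for all } v \in H_0^1(\Omega).
\]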
We partitioned the domain into four subregions with approximately equal error
using the recursive spectral bisection algorithm, described in more detail in section 3.
Then four independent problems were solved, each starting from the coarse grid and
coarse grid solution. In each case the mesh is adaptively refined until a mesh with
approximately 4000 unknowns (located at triangle vertices) is obtained. The mesh
for one subdomain and the corresponding solution are shown in Figure 3. Notice
that the refinement is largely confined to the given region, but some refinement in
adjacent regions is needed in order to maintain shape regularity. We emphasize that
these four problems are solved independently, by the standard sequential adaptive
solver PLTMG. The only change to the code used for problem k ($1 \le k \le 4$) was to
multiply a posteriori error estimates for elements in regions $j \ne k$ by $10^{-6}$, causing the
adaptive refinement procedure to rarely choose these elements for refinement, except
to maintain shape regularity.
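
The effect of that one-line change can be pictured as follows; PLTMG itself is Fortran, so this Python fragment with invented names is purely illustrative:

```python
import numpy as np

def damp_foreign_errors(errors, elem_region, k, factor=1.0e-6):
    """Multiply a posteriori error estimates outside partition k by a tiny
    factor, so the adaptive procedure rarely selects those elements except
    where shape regularity forces refinement near the interface."""
    errors = np.asarray(errors, dtype=float).copy()
    errors[np.asarray(elem_region) != k] *= factor
    return errors
```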
The meshes from these four subproblems are combined to form a globally refined
mesh with 24410 triangles and 12808 vertices. This mesh is shown in Figure 4. The
final global solution generated by our domain decomposition/multigraph solver [9, 8,
12] is also shown in Figure 4.
We next repeated this experiment but in a more realistic setting. This time
the coarse mesh had 15153 triangles and 8000 vertices, and was partitioned into

Citations
Journal ArticleDOI
TL;DR: The solvation of biomolecules with a computational biophysics view toward describing the phenomenon is discussed, and the main focus lies on the computational aspect of the models.
Abstract: An understanding of molecular interactions is essential for insight into biological systems at the molecular scale. Among the various components of molecular interactions, electrostatics are of special importance because of their long-range nature and their influence on polar or charged molecules, including water, aqueous ions, proteins, nucleic acids, carbohydrates, and membrane lipids. In particular, robust models of electrostatic interactions are essential for understanding the solvation properties of biomolecules and the effects of solvation upon biomolecular folding, binding, enzyme catalysis, and dynamics. Electrostatics, therefore, are of central importance to understanding biomolecular structure and modeling interactions within and among biological molecules. This review discusses the solvation of biomolecules with a computational biophysics view toward describing the phenomenon. While our main focus lies on the computational aspect of the models, we provide an overview of the basic elements of biomolecular solvation (e.g. solvent structure, polarization, ion binding, and non-polar behavior) in order to provide a background to understand the different types of solvation models.

174 citations

Journal ArticleDOI
TL;DR: Based on two-grid discretizations, some local and parallel finite element algorithms for the Stokes problem are proposed and analyzed in this article, motivated by the observation that low frequency components can be approximated well by a relatively coarse grid and high frequency component can be computed on a fine grid.
Abstract: Based on two-grid discretizations, some local and parallel finite element algorithms for the Stokes problem are proposed and analyzed in this paper. These algorithms are motivated by the observation that for a solution to the Stokes problem, low frequency components can be approximated well by a relatively coarse grid and high frequency components can be computed on a fine grid by some local and parallel procedure. One technical tool for the analysis is some local a priori estimates that are also obtained in this paper for the finite element solutions on general shape-regular grids.

123 citations

01 Jan 2006
TL;DR: This paper surveys the techniques that are necessary for constructing computationally efficient parallel multigrid solvers, including traditional spatial partitioning and more novel additive multilevel methods.
Abstract: This paper surveys the techniques that are necessary for constructing computationally efficient parallel multigrid solvers. Both geometric and algebraic methods are considered. We first cover the sources of parallelism, including traditional spatial partitioning and more novel additive multilevel methods. We then cover the parallelism issues that must be addressed: parallel smoothing and coarsening, operator complexity, and parallelization of the coarsest grid solve.

116 citations

Journal ArticleDOI
TL;DR: A chain of algorithms for molecular surface and volumetric mesh generation takes as inputs the centers and radii of all atoms of a molecule and the toolchain outputs both triangular and tetrahedral meshes that can be used for molecular shape modeling and simulation.
Abstract: We describe a chain of algorithms for molecular surface and volumetric mesh generation. We take as inputs the centers and radii of all atoms of a molecule and the toolchain outputs both triangular and tetrahedral meshes that can be used for molecular shape modeling and simulation. Experiments on a number of molecules are demonstrated, showing that our methods possess several desirable properties: feature-preservation, local adaptivity, high quality, and smoothness (for surface meshes). We also demonstrate an example of molecular simulation using the finite element method and the meshes generated by our method. The approaches presented and their implementations are also applicable to other types of inputs such as 3D scalar volumes and triangular surface meshes with low quality, and hence can be used for generation/improvement of meshes in a broad range of applications.

113 citations

Journal ArticleDOI
TL;DR: A new parallel semiconductor device simulation using the dynamic load balancing approach based on the adaptive finite volume method with a posteriori error estimation has been developed and successfully implemented on a 16-PC Linux cluster with a message passing interface library.
Abstract: We present a new parallel semiconductor device simulation using the dynamic load balancing approach. This semiconductor device simulation based on the adaptive finite volume method with a posteriori error estimation has been developed and successfully implemented on a 16-PC Linux cluster with a message passing interface library. A constructive monotone iterative technique is also applied for solution of the system of nonlinear algebraic equations. Two different parallel versions of the algorithm to perform a complete device simulation are proposed. The first is a dynamic parallel domain decomposition approach, and the second is a parallel current-voltage characteristic points simulation. This implementation shows that a well-designed load balancing simulation can significantly reduce the execution time up to an order of magnitude. Compared with the measured data, numerical results on various submicron VLSI devices are presented, to show the accuracy and efficiency of the method.

100 citations

References
Journal ArticleDOI
TL;DR: The application of numerical methods are presented to enable the trivially parallel solution of the Poisson-Boltzmann equation for supramolecular structures that are orders of magnitude larger in size.
Abstract: Evaluation of the electrostatic properties of biomolecules has become a standard practice in molecular biophysics. Foremost among the models used to elucidate the electrostatic potential is the Poisson-Boltzmann equation; however, existing methods for solving this equation have limited the scope of accurate electrostatic calculations to relatively small biomolecular systems. Here we present the application of numerical methods to enable the trivially parallel solution of the Poisson-Boltzmann equation for supramolecular structures that are orders of magnitude larger in size. As a demonstration of this methodology, electrostatic potentials have been calculated for large microtubule and ribosome structures. The results point to the likely role of electrostatics in a variety of activities of these structures.

6,918 citations


"A New Paradigm for Parallel Adaptiv..." refers background in this paper

  • ...[3, 4]), one is primarily interested in a high-quality approximation to the derivatives of a function rather than the function itself; in some situations these problems can be well approximated without the final global solution phase....


  • ...This work, which is described in detail in [3, 4], showed a very high degree of parallel efficiency even for hundreds of processors, due to the lack of need for a final global solve (see section 4....


Journal ArticleDOI
TL;DR: Parlett as discussed by the authors presents mathematical knowledge that is needed in order to understand the art of computing eigenvalues of real symmetric matrices, either all of them or only a few.
Abstract: According to Parlett, 'Vibrations are everywhere, and so too are the eigenvalues associated with them. As mathematical models invade more and more disciplines, we can anticipate a demand for eigenvalue calculations in an ever richer variety of contexts.' Anyone who performs these calculations will welcome the reprinting of Parlett's book (originally published in 1980). In this unabridged, amended version, Parlett covers aspects of the problem that are not easily found elsewhere. The chapter titles convey the scope of the material succinctly. The aim of the book is to present mathematical knowledge that is needed in order to understand the art of computing eigenvalues of real symmetric matrices, either all of them or only a few. The author explains why the selected information really matters and he is not shy about making judgments. The commentary is lively but the proofs are terse.

3,115 citations

Book
28 May 1996
TL;DR: Introduction.
Abstract: Introduction. A Simple Model Problem. Abstract Nonlinear Equations. Finite Element Discretizations of Elliptic PDEs. Practical Implementation. Bibliography. Subject Index.

2,253 citations


"A New Paradigm for Parallel Adaptiv..." refers background in this paper

  • ..., the book of Verfürth [42] and its references), the adaptive algorithms themselves are largely heuristic, in particular those aspects described above....


  • ...While there is considerable theoretical support for the a posteriori error bounds that form the foundation for adaptive algorithms (see, e.g., the book of Verfürth [42] and its references), the adaptive algorithms themselves are largely heuristic, in particular those aspects described above....


Journal ArticleDOI
TL;DR: In this paper, it is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph.
Abstract: The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorith...

1,762 citations

Journal ArticleDOI
TL;DR: A unified theory for a diverse group of iterative algorithms, such as Jacobi and Gauss–Seidel iterations, diagonal preconditioning, domain decomposition methods, multigrid methods, multilevel nodal basis preconditioners, and hierarchical basis methods, is presented by using the notions of space decomposition and subspace correction.
Abstract: The main purpose of this paper is to give a systematic introduction to a number of iterative methods for symmetric positive definite problems. Based on results and ideas from various existing works...

1,176 citations

Frequently Asked Questions (11)
Q1. What contributions have the authors mentioned in the paper "A new paradigm for parallel adaptive meshing algorithms∗" ?

The authors present a new approach to the use of parallel computers with adaptive finite element methods. Two additional steps requiring boundary exchange communication may be employed after the individual processors reach an adapted solution, namely, the construction of a global conforming mesh from the independent subproblems, followed by a final smoothing phase using the subdomain solutions as an initial guess. The authors present a series of convincing numerical experiments that illustrate the effectiveness of this approach. This revision of the original article [R. E. Bank and M. J. Holst, SIAM J. Sci. Comput., 22 (2000), pp. 1411–1443] updates the numerical experiments, and reflects the knowledge the authors have gained since the original paper appeared.

In PLTMG, mesh refinement on interface edges is restricted to simple bisection, although their adaptive refinement procedure generally allows the mesh points to move. 

The use of this subspace-iteration–like calculation rather than a simple eigenvector update provides a means to bias the overall Rayleigh quotient iteration towards convergence to ψ2. 

The smoothing procedure locally optimizes the following shape measure function for a given d-simplex s, in an iterative fashion, similar to the approach in [11]:

\[
\eta(s, d) = \frac{2^{2(1 - \frac{1}{d})}\, 3^{\frac{d-1}{2}}\, |s|^{\frac{2}{d}}}{\sum_{0 \le i < j \le d} |e_{ij}|^2},
\]

where the sum runs over the squared lengths of the edges $e_{ij}$ of $s$.
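
Transcribed directly into code under the reconstruction above (the function name is ours, and the normalization constant is taken verbatim from the formula):

```python
import math
from itertools import combinations
import numpy as np

def shape_measure(verts):
    """Shape measure eta(s, d) for a d-simplex s given its d+1 vertices.

    verts : (d+1, d) array of vertex coordinates
    Larger values indicate better-shaped simplices; the measure tends
    to zero as the simplex degenerates.
    """
    verts = np.asarray(verts, dtype=float)
    d = verts.shape[1]
    # |s| = simplex measure (area/volume) via the edge-matrix determinant.
    vol = abs(np.linalg.det(verts[1:] - verts[0])) / math.factorial(d)
    # Sum of squared edge lengths |e_ij|^2 over all vertex pairs.
    edge_sq = sum(float(np.sum((verts[i] - verts[j]) ** 2))
                  for i, j in combinations(range(d + 1), 2))
    return (2 ** (2 * (1 - 1 / d)) * 3 ** ((d - 1) / 2)
            * vol ** (2 / d) / edge_sq)
```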

For each interior point iteration, the authors used just one inner DD iteration, for a grand total of 4 domain decomposition/multigraph iterations.

Under some reasonable assumptions about the approximation properties of a finite element space $S_0^h$ defined over $\Omega$ (existence of superapproximation, inverse, and trace inequalities), the following a priori error estimate holds for the global Galerkin solution $u_h$ to a Poisson-like linear elliptic equation:

\[
\|u - u_h\|_{H^1(\Omega_k)} \le C \left( \inf_{v \in S_0^h} \|u - v\|_{H^1(\Omega_k^0)} + \|u - u_h\|_{L^2(\Omega)} \right),
\]

and the following a posteriori error estimate holds (where $\eta(u_h)$ is a locally computable jump function):

\[
\|u - u_h\|_{H^1(\Omega_k)} \le C \left( \|h \, \eta(u_h)\|_{L^2(\Omega_k^0)} + \|u - u_h\|_{L^2(\Omega)} \right).
\]

In many realistic application problems, the features of the domain and/or the equation coefficients provide more than enough complexity to lead to good scalability. 

This trade-off will be most effective in situations where Np is much larger than Nc (e.g., Np > 10Nc) so that the redundant computation represents a small fraction of the total cost. 

The system matrix in (4.4) is the matrix used in the final adaptive refinement step on processor 1 (with possible modifications due to global fine mesh regularization).

The resulting data structure (mappings of corresponding vertices deduced from the matching edges) forms the basis of the interprocessor communication steps of their domain decomposition solver. 

Without systematically and continually excluding this eigenvector, the Rayleigh quotient iteration could easily converge to $\psi_1$.
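
One minimal way to realize this exclusion, assuming a connected graph so that $\psi_1$ is the constant vector (the paper's implementation is sparse and uses a subspace-iteration-like update rather than this plain dense deflation):

```python
import numpy as np

def fiedler_by_deflated_rqi(L, tol=1.0e-8, max_iter=50, seed=0):
    """Rayleigh quotient iteration for psi_2 of a graph Laplacian L,
    deflating the constant eigenvector psi_1 at every step so that the
    iteration cannot drift toward it. Dense illustration only."""
    n = L.shape[0]
    psi1 = np.ones(n) / np.sqrt(n)        # psi_1 for a connected graph
    x = np.random.default_rng(seed).standard_normal(n)
    x -= psi1 * (psi1 @ x)                # remove the psi_1 component
    x /= np.linalg.norm(x)
    for _ in range(max_iter):
        rho = x @ L @ x                   # Rayleigh quotient
        try:
            y = np.linalg.solve(L - rho * np.eye(n), x)
        except np.linalg.LinAlgError:
            break                         # shifted matrix singular: converged
        y -= psi1 * (psi1 @ y)            # re-deflate after every solve
        x = y / np.linalg.norm(y)
        if np.linalg.norm(L @ x - (x @ L @ x) * x) < tol:
            break
    return x
```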