A New Paradigm for Parallel Adaptive Meshing Algorithms
Summary
Introduction
- Key words: adaptivity, finite element methods, a posteriori error estimation, parallel computing.
- If an initial mesh is distributed quite fairly among a number of processors, a very good error estimator (coupled with adaptive refinement) quickly produces a very bad work load imbalance among the processors.
- A similar approach appeared recently in [18].
- MC also discretizes the solution over piecewise linear triangular or tetrahedral elements, and as in PLTMG, error estimates for each element are produced by solving a local problem using the edge-based quadratic bump functions.
- Since the target size of all problems solved in step 2 is the same, and each subregion initially has approximately equal error, the authors expect the final composite mesh to have approximately equal errors, and approximately equal numbers of elements, in each of the refined subregions created in step 2.
Figure: the global refined mesh.
- Each processor then adaptively solved the problem with a target value of 100000 vertices.
- The uniformity of color in the a posteriori error graph indicates that their strategy did a reasonable job of globally equilibrating the error despite the lack of communication in the mesh generation phase.
- Each processor then continued with Step 2 of the paradigm, with a target value of 100000 vertices, as in the first example.
- The final global solve was also an interior point iteration, with the domain decomposition/multigraph solver used for the resulting linear systems.
- In Figure 8, the authors show the initial partition into 16 subregions and the final global refined mesh.
Figure: a posteriori error estimate for the final solution.
- The two previous examples demonstrated the effectiveness of the parallel algorithm for linear and nonlinear scalar problems and variational inequalities in 2D.
- The stress tensor T is a function of the gradient ∇u of the unknown displacement u, and the corresponding deformation mapping ϕ and deformation gradient ∇ϕ are given by ϕ = id + u and ∇ϕ = I + ∇u.
- The mesh is generated by adaptively bisecting an initial mesh consisting of an icosahedron volume filled with tetrahedra.
- The four subdomain problems are then solved independently by MC, starting from the complete coarse mesh and coarse mesh solution.
3. Computational considerations.
- In this section the authors describe the algorithm they use for partitioning the coarse mesh so that each subregion has approximately equal error.
- The authors now briefly describe some details of their procedure for computing the second eigenvector of (3.1).
- Taken together (3.5)–(3.7) indicate that for good performance, the difficulty of the problem must in some sense scale in proportion to the number of processors.
- In their scheme, each subregion contributes equations corresponding to all fine mesh points, including its interface.
- Thus the final matrix and mesh from Step 2 of the paradigm can be reused once again in the domain decomposition solver.
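The partitioning step described in this section — split the coarse element graph using the second Laplacian eigenvector so that each subregion carries approximately equal error — can be sketched as follows. This is a generic Fiedler-vector bisection with a greedy error-halving cut; the function name, the dense eigensolver, and the toy graph are illustrative assumptions, not the authors' actual Rayleigh-quotient-based implementation.

```python
import numpy as np

def fiedler_bisect(adj, error):
    """Bisect a coarse-mesh element graph into two subregions of roughly
    equal a posteriori error using the second Laplacian eigenvector."""
    deg = adj.sum(axis=1)
    L = np.diag(deg) - adj                  # graph Laplacian of the element graph
    _, vecs = np.linalg.eigh(L)             # dense solve; fine for a coarse mesh
    fiedler = vecs[:, 1]                    # second ("Fiedler") eigenvector
    order = np.argsort(fiedler)             # order elements along the Fiedler axis
    cum = np.cumsum(error[order])
    cut = int(np.searchsorted(cum, cum[-1] / 2.0)) + 1   # cut at half the total error
    part = np.zeros(len(error), dtype=int)
    part[order[cut:]] = 1
    return part

# Toy example: a path of 6 elements where element 0 carries most of the error.
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
error = np.array([4.0, 1.0, 1.0, 1.0, 1.0, 1.0])
part = fiedler_bisect(adj, error)
print(part)   # the high-error element ends up in the smaller subregion
```

Recursive application of such a bisection yields the 2^k subregions of approximately equal error that Step 1 of the paradigm requires.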
Frequently Asked Questions (11)
Q2. What is the general rule of the mesh refinement procedure?
In PLTMG, mesh refinement on interface edges is restricted to simple bisection, although their adaptive refinement procedure generally allows the mesh points to move.
Q3. How is the Rayleigh quotient iteration biased towards convergence to ψ2?
The use of this subspace-iteration–like calculation rather than a simple eigenvector update provides a means to bias the overall Rayleigh quotient iteration towards convergence to ψ2.
Q4. What is the shape measure function for a given d-simplex?
The smoothing procedure locally optimizes the following shape measure function for a given d-simplex s, in an iterative fashion, similar to the approach in [11]:

η(s, d) = 2^{2(1−1/d)} · 3^{(d−1)/2} · |s|^{2/d} / ∑_{0≤i<j≤d} |e_{ij}|²
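The shape measure above can be coded directly from its pieces (|s| is the simplex volume, e_ij its edges). A small sketch — the function name and vertex layout are assumptions, not MC's API; note the measure is scale invariant and penalizes slivers:

```python
import itertools
import math
import numpy as np

def eta(verts):
    """Shape measure for a d-simplex s given as a (d+1, d) vertex array:
        eta(s, d) = 2^(2(1-1/d)) * 3^((d-1)/2) * |s|^(2/d) / sum_{i<j} |e_ij|^2
    """
    verts = np.asarray(verts, dtype=float)
    d = verts.shape[1]
    edges = verts[1:] - verts[0]
    vol = abs(np.linalg.det(edges)) / math.factorial(d)   # |s|, the simplex volume
    ssq = sum(np.sum((verts[i] - verts[j]) ** 2)          # sum of squared edge lengths
              for i, j in itertools.combinations(range(d + 1), 2))
    return 2 ** (2 * (1 - 1 / d)) * 3 ** ((d - 1) / 2) * vol ** (2 / d) / ssq

equilateral = [[0.0, 0.0], [1.0, 0.0], [0.5, math.sqrt(3) / 2]]
sliver = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.01]]
print(eta(equilateral) > eta(sliver))   # True: the well-shaped triangle scores higher
```

The smoothing loop then nudges each free vertex to increase the minimum of η over the simplices sharing it.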
Q5. How many iterations did the variational inequality take?
For each interior point iteration, the authors used just one inner DD iteration, for a grand total of 4 domain decomposition/multigraph iterations.
Q6. What is the a priori error estimate for the global Galerkin solution uh?
Under some reasonable assumptions about the approximation properties of a finite element space S^h_0 defined over Ω (existence of superapproximation, inverse, and trace inequalities), the following a priori error estimate holds for the global Galerkin solution u_h to a Poisson-like linear elliptic equation:

‖u − u_h‖_{H¹(Ω_k)} ≤ C ( inf_{v ∈ S^h_0} ‖u − v‖_{H¹(Ω_k^0)} + ‖u − u_h‖_{L²(Ω)} ),

and the following a posteriori error estimate holds (where η(u_h) is a locally computable jump function):

‖u − u_h‖_{H¹(Ω_k)} ≤ C ( ‖h η(u_h)‖_{L²(Ω_k^0)} + ‖u − u_h‖_{L²(Ω)} ).
Q7. What is the trade-off between the domain and the equation coefficients?
In many realistic application problems, the features of the domain and/or the equation coefficients provide more than enough complexity to lead to good scalability.
Q8. What is the trade-off for Np?
This trade-off will be most effective in situations where Np is much larger than Nc (e.g., Np > 10Nc) so that the redundant computation represents a small fraction of the total cost.
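The arithmetic behind this rule of thumb is simple: each processor's roughly Np-vertex fine mesh retains the shared Nc-vertex coarse mesh, so the duplicated fraction of its work is about Nc/Np. A one-line illustration (the function is hypothetical, a rough model rather than the paper's cost analysis):

```python
def redundant_fraction(Np, Nc):
    """Rough model: of the ~Np vertices each processor carries, about Nc
    duplicate the shared coarse mesh kept by every other processor."""
    return Nc / Np

print(redundant_fraction(100_000, 5_000))   # 0.05 -> only ~5% redundant work
```

With Np > 10 Nc the redundancy stays below 10%, which is why the trade-off is most effective in that regime.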
Q9. What is the system matrix used in the adaptive refinement step on processor 1?
The system matrix in (4.4) is the matrix used in the final adaptive refinement step on processor 1 (with possible modifications due to global fine mesh regularization).
Q10. What is the basis of the interprocessor communication steps of their domain decomposition solver?
The resulting data structure (mappings of corresponding vertices deduced from the matching edges) forms the basis of the interprocessor communication steps of their domain decomposition solver.
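A minimal sketch of the idea — assumed data layout, not MC's actual structures: when two processors traverse the matching edge lists of a shared interface in the same order, pairing up edge endpoints yields the vertex-correspondence map used for communication.

```python
def match_interface_vertices(edges_a, edges_b):
    """edges_a, edges_b: lists of (start_vertex_id, end_vertex_id) tuples for
    the same geometric interface, traversed in the same order on both sides.
    Returns a dict mapping side-A vertex ids to side-B vertex ids."""
    pairing = {}
    for (a0, a1), (b0, b1) in zip(edges_a, edges_b):
        pairing[a0] = b0   # matching edges share endpoints in order
        pairing[a1] = b1
    return pairing

# Two processors name the same three interface vertices differently:
side_a = [(10, 11), (11, 12)]
side_b = [(7, 3), (3, 5)]
print(match_interface_vertices(side_a, side_b))   # {10: 7, 11: 3, 12: 5}
```

Once built, the map lets each processor scatter and gather interface values by local vertex id without any geometric search at solve time.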
Q11. How does the Rayleigh quotient iteration work?
Without systematically and continually excluding this eigenvector, the Rayleigh quotient iteration could easily converge to ψ1.
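The deflation idea can be sketched in a few lines: a plain Rayleigh quotient iteration on a graph Laplacian that re-projects out the known first eigenvector ψ1 = (1,…,1)/√n after every solve, so the iteration cannot drift toward it. This is an illustrative simplification, not the authors' subspace-iteration-like variant:

```python
import numpy as np

def second_eigvec(L, tol=1e-10, max_iter=50):
    """Rayleigh quotient iteration for an eigenvector of the graph
    Laplacian L orthogonal to psi1, deflating psi1 at every step."""
    n = L.shape[0]
    psi1 = np.ones(n) / np.sqrt(n)          # known first eigenvector (lambda_1 = 0)
    x = np.arange(n, dtype=float)           # any nonconstant starting guess
    x -= psi1 * (psi1 @ x)                  # deflate psi1 from the start
    x /= np.linalg.norm(x)
    for _ in range(max_iter):
        mu = x @ L @ x                      # Rayleigh quotient
        if np.linalg.norm(L @ x - mu * x) < tol:
            break                           # converged: x is an eigenvector
        # Shifted solve; lstsq tolerates the near-singular shift near convergence.
        y = np.linalg.lstsq(L - mu * np.eye(n), x, rcond=None)[0]
        y -= psi1 * (psi1 @ y)              # re-deflate after every solve
        x = y / np.linalg.norm(y)
    return x
```

Keeping every iterate orthogonal to ψ1 preserves the fast local convergence of Rayleigh quotient iteration while aiming it at ψ2 rather than ψ1.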