Improved Fast Rerouting Using Postprocessing
Summary (4 min read)
1 INTRODUCTION
- Communication networks have become a critical infrastructure of today's digital society: enterprises that outsource their IT infrastructure to the cloud, as well as many applications related to health monitoring, power grid management, or disaster response [1], depend on the uninterrupted availability of such networks.
- When encountering a failure, a packet is rerouted onto the next arborescence according to some pre-defined order.
- This paper presents an algorithmic framework for postprocessing state-of-the-art FRR mechanisms based on network decompositions, to improve the resilience, performance, and flexibility of fast rerouting.
- The authors show that focusing on arc-disjoint arborescence network decompositions is not a limitation, by proving that arborescence-based decompositions are as good as any deterministic local failover method.
2 IMPOSSIBILITY OF BEATING ARBORESCENCES
- The authors first motivate their focus on failover algorithms based on arborescence network decompositions, showing that this approach provides not only high resilience but also competitive route quality (in terms of path lengths).
- The additive stretch of the routing scheme is then the maximum stretch along all failover routes, i.e., from all v to t.
- The authors start with some definitions for arborescence-based re-routing.
- For those nodes, there is no shorter path to the destination after such a failure, and hence from a competitive point of view, their failover route is optimal.
- To this end, the authors will show that there are k-connected k-regular graphs where every deterministic local algorithm has to take large detours, even though short routes are available.
3 THE POSTPROCESSING FRAMEWORK
- This section presents their algorithmic framework to postprocess arborescence-based network decompositions for improved resilience and performance.
- The authors consider two classes of objectives in this paper and present two examples each.
- Of the above correctness conditions, (1), (4), and (5) are always satisfied, while (2) and (3) are irrelevant.
- Based on this arc-swap operation, the idea of their algorithmic framework is then to swap arcs only if it improves a certain objective function, see Algorithm 2.
- When exactly two arcs e, e′ are swapped in a given valid arborescence decomposition, both must be outgoing arcs of the same node v; otherwise at least one arborescence becomes disconnected, i.e., the decomposition is invalid.
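The arc-swap operation can be illustrated with a minimal sketch on a toy 3-connected graph (K4), assuming arborescences are stored as per-node next-hop maps toward a common root. The helper names (`is_valid`, `swap`) and the example decomposition are illustrative, not taken from the paper's code; the full framework would additionally accept a swap only if it improves the chosen objective function.

```python
# Sketch of the arc-swap operation: exchange node v's outgoing arcs between
# two arborescences and keep the result only if both trees remain valid.
ROOT = 0

def is_valid(tree):
    """Every non-root node must reach the root without cycles."""
    for v in tree:
        seen, cur = set(), v
        while cur != ROOT:
            if cur in seen:
                return False          # cycle: invalid arborescence
            seen.add(cur)
            cur = tree[cur]           # follow the unique outgoing arc
    return True

def swap(trees, i, j, v):
    """Exchange v's outgoing arcs between trees i < j; reject if either breaks."""
    ti, tj = dict(trees[i]), dict(trees[j])
    ti[v], tj[v] = trees[j][v], trees[i][v]
    if is_valid(ti) and is_valid(tj):
        return trees[:i] + [ti] + trees[i + 1:j] + [tj] + trees[j + 1:]
    return trees                      # swap would disconnect a tree

# Three arc-disjoint arborescences of K4 toward root 0 (next-hop per node).
trees = [{1: 0, 2: 1, 3: 2}, {1: 2, 2: 0, 3: 1}, {1: 3, 2: 3, 3: 0}]
swapped = swap(trees, 0, 1, 3)        # swap node 3's arcs between T1 and T2
print(swapped[0], swapped[1])
```

Note that arc-disjointness is preserved automatically, since the two arcs merely exchange trees.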
4 USE CASES AND EVALUATION
- The authors' framework for postprocessing a decomposition can be configured with different objective functions, depending on the specific needs.
- In the following, the authors discuss and evaluate different use cases, namely two traffic scenario optimization use cases (for stretch/load) and two pure network decomposition optimizations (SRLG and independent paths).
- For the experimental evaluation, the authors generate 100 instances of undirected (bi-directional) 5-regular random graphs on 100 nodes, using the NetworkX library implementation of Steger and Wormald's algorithm [16].
- The authors then compare the unoptimized and optimized arborescences by failing a fraction of the network links picked at random, and simulate a circular arborescence routing process on the resulting infrastructure.
- In the latter case, they continue on the next available arborescence, i.e., if a packet has used arborescence Ti up to the failed link, it will then follow arborescence Ti+1 provided that the corresponding outgoing link is available, or try arborescences Ti+2, . . . otherwise.
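The circular routing process described above can be sketched as follows, again on K4 with three hand-built arc-disjoint arborescences toward root 0. The setup is assumed for illustration; a failed undirected link removes both of its directed arcs, and a crude loop guard stands in for the scheme's real termination argument.

```python
# Minimal simulation of circular arborescence routing: follow tree i and,
# upon hitting a failed outgoing link, retry with the next tree at the
# current node (T_{i+1}, T_{i+2}, ...).
ROOT = 0
TREES = [{1: 0, 2: 1, 3: 2}, {1: 2, 2: 0, 3: 1}, {1: 3, 2: 3, 3: 0}]

def route(src, failed_links, trees=TREES):
    """Return the hop count of the failover route, or None if routing loops."""
    k = len(trees)
    node, i, hops, switches = src, 0, 0, 0
    while node != ROOT:
        nxt = trees[i % k][node]
        if frozenset((node, nxt)) in failed_links:
            i += 1                    # link failed: switch to the next tree
            switches += 1
            if switches > 4 * k:      # crude loop guard for the sketch
                return None
            continue
        node, hops = nxt, hops + 1
    return hops

failed = {frozenset((1, 0))}
print(route(1, failed))               # T1's arc 1->0 failed; T2 routes 1->2->0
```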
4.1 Impact of the Original Network Decomposition
- The authors first study the impact of the network arborescence decomposition algorithm (that is, the input of the optimization process) on the optimization efficiency, before analyzing the optimization scenario in more detail.
- Both of them are described next in more detail.
- Hence, as each of the O(|E|) arcs might get tested O(k) times, the construction finishes in O(|E|² k²) time.
- The greedy decomposition is analogous to the random decomposition and is used for the experimental evaluation in [17].
- First, one can observe that the Random arborescence decomposition (top) performs worse than the Greedy arborescence decomposition before optimization: for instance, facing x = 20 random link failures, the median stretch is 11 for Random and only 5 for Greedy, and 10% of the samples have a stretch above 22 for Random but only above 9 for Greedy.
4.2 Optimization Use Cases
- A first fundamental objective is to ensure that failover routes are short.
- Given a subset of nodes that are deemed crucial and need to send packets to some destination node (the root of the arborescence), as well as a set of links highly susceptible to failures, the packets should reach the destination with short detours even if all or a subset of these links fail.
- The next two metrics exhibit a mirrored trend compared to the stretch figure: optimizing load efficiently reduces the load in both the median and the 10% worst cases.
- When the number of SRLG links increases, the algorithm manages to put proportionally fewer such links in the last arborescences.
- Thus, paths are already independent with high probability (949/1000 on average), though this quantity varies considerably across networks (high dispersion of values).
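The load objective can be illustrated with a small sketch on the first K4 arborescence from before: every non-root node sends one unit of flow to the root along its tree route, and the load of an arc is the number of flows crossing it. The traffic model here is an assumption for illustration, not necessarily the paper's exact one.

```python
# Compute per-arc loads of an arborescence (next-hop map toward root 0)
# under unit all-to-root traffic; the objective would be the maximum load.
from collections import Counter

ROOT = 0
T1 = {1: 0, 2: 1, 3: 2}               # next-hop per node, toward root 0

def arc_loads(tree):
    loads = Counter()
    for src in tree:                  # one unit of flow from every non-root node
        cur = src
        while cur != ROOT:
            nxt = tree[cur]
            loads[(cur, nxt)] += 1    # each traversed arc carries this flow
            cur = nxt
    return loads

loads = arc_loads(T1)
print(loads[(1, 0)], max(loads.values()))   # the arc into the root carries all 3 flows
```

Under the swap framework, a candidate arc swap would then be accepted only if it lowers this maximum arc load.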
4.3 Runtime Analysis
- The authors now turn their attention to the runtime of their optimization framework.
- The single-threaded code is executed on a 24-core Intel Xeon E5-2620 platform with 32 GB of memory.
- Figure 9 presents the distribution of those results.
- It shows that optimizing stretch or load on an 80-node topology takes on average around 750 seconds.
- Quite surprisingly, connectivity only has a slight impact on runtime.
4.4 Optimizing Network Decomposition Heuristics
- So far in this section, the authors evaluated their postprocessing framework on network decomposition algorithms that always yield a valid output.
- Recent work [14] also proposed a heuristic called Bonsai that attempts to generate arborescences of small depth, with no guarantee that a valid output is produced.
- This is in contrast to the random and greedy schemes, which build arborescences sequentially.
- Even though the Bonsai round-robin scheme outperforms the greedy and random schemes regarding stretch quality in evaluations in [14], it has the downside that it might not produce a valid decomposition.
4.5 Experiments on Real World Graphs
- To complement their experiments on synthetic graphs, the authors also ran them on well-connected cores of network topologies, taken from the Topology Zoo data set [18].
- The authors trim the Topology Zoo graphs such that only the well-connected cores remain, as follows.
- Next, the authors replace each node of degree 3 with three edges between its three neighbors.
- The results of the experiments are very similar to the results on synthetic graphs.
- In all cases, the optimizations are computed quickly and yield improvements in the same percentage range as the authors have observed on synthetic graphs.
5 AN EXAMPLE ILP MODEL FOR THE CIRCULAR ROUTING SCHEME
- The existence of a valid circular routing scheme based on k arc-disjoint spanning arborescences in a given network graph containing a known set of failed links can also be analyzed with the aid of Integer Linear Programming (ILP) tools.
- To illustrate one of the possible approaches, the authors formulate an example mathematical model of the corresponding ILP optimization problem for path lengths and stretch below.
- The remaining terms in Formula (2) guarantee that the corresponding binary variables are set to 0, unless the positive value is required to satisfy the constraints.
- Then, the authors eliminate the forbidden combinations of used arborescences, which is enforced by the following groups of constraints: (16: Non-consecutive trees A) and (19: Prohibited rerouting B).
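A minimal sketch of the ILP's variable layout can clarify constraint group (1: Arc in one tree): binary variables x[arc, tree] indicate which arborescence an arc belongs to, and each arc may be assigned to at most one of the k trees. The encoding below is an assumption for illustration (no solver involved); it simply checks that the K4 decomposition used earlier is a feasible point for this constraint group.

```python
# Encode the K4 decomposition as binary ILP variables x[(arc, tree)] and
# verify constraint (1): each arc belongs to at most one arborescence.
from itertools import product

K = 3
ROOT = 0
TREES = [{1: 0, 2: 1, 3: 2}, {1: 2, 2: 0, 3: 1}, {1: 3, 2: 3, 3: 0}]
ARCS = [(u, v) for u, v in product(range(4), repeat=2) if u != v]

# Binary assignment induced by the decomposition (root 0 has no outgoing arc).
x = {(a, i): int(a[0] != ROOT and TREES[i].get(a[0]) == a[1])
     for a in ARCS for i in range(K)}

# Constraint group (1): sum over trees of x[(a, i)] is at most 1 for every arc a.
ok = all(sum(x[(a, i)] for i in range(K)) <= 1 for a in ARCS)
print(ok)
```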
7 CONCLUSION
- This paper was motivated by the computational challenges involved in computing network decompositions which do not only provide basic connectivity but also account for the quality of routes after failures.
- The authors proposed and evaluated a simple solution which improves an arbitrary network decomposition, using fast postprocessing, in terms of basic traffic engineering metrics such as route length and load.
- Furthermore, the authors showed that their framework can also be used to improve resiliency for shared risk link groups: an important extension in practice.
- Lastly, in order to guarantee reproducibility and facilitate other researchers to build upon their algorithms, their code is publicly available at https://gitlab.cs.univie.ac.at/ctpapers/fast-failover.
Frequently Asked Questions (14)
Q2. What have the authors stated for future works in "Improved fast rerouting using postprocessing" ?
The authors understand their work as the first step and believe that it opens several interesting avenues for future research. In particular, it will be interesting to study alternative postprocessing algorithms, and derive formal performance guarantees for them. It would also be interesting to study further use cases for their framework, beyond the ones given in this paper, e. g., for SRLGs combined with load and stretch.
Q3. What is the effect of optimizing for low load?
To achieve low load, some flows must take detours; hence, optimizing for low load generally leads to higher stretch, as the authors' subsequent experiments show.
Q4. What is the way to analyze arborescences?
The existence of a valid circular routing scheme based on k arc-disjoint spanning arborescences in a given network graph containing a known set of failed links can also be analyzed with the aid of Integer Linear Programming (ILP) tools.
Q5. What is the first group of constraints?
The first group of constraints (1: Arc in one tree) guarantees that each arc in the network graph belongs to at most one of k arc-disjoint spanning arborescences covering the graph.
Q6. What is the motivation behind this paper?
This paper was motivated by the computational challenges involved in computing network decompositions which do not only provide basic connectivity but also account for the quality of routes after failures.
Q7. What are the main drawbacks of static fast rerouting algorithms in the data plane?
The authors in this paper are interested in static fast rerouting algorithms in the data plane, which rely on precomputed failover rules and do not require packet header rewriting.
Q8. How many failures are there in Greedy arborescences?
Even under a high number of failures (e.g. 40), the median of routing failures is 0 for both optimized and unoptimized arborescences; only the 10% worst unoptimized arborescences rise to a low 5% failure rate.
Q9. What is the effect of stretch optimization on the routing failure rate?
One can first observe (top) that this optimization has an impact on the routing failure rate: before optimizing, some packets do not reach their destination, but after swapping, the failure rate is 0.
Q10. What is the objective of swapping edges?
Figure 8 (right) presents the results of swapping edges with the objective of increasing the number of independent paths from all nodes in all arborescence pairs.
Q11. How many arborescences can be swapped before an improvement of the objective function is required?
The authors note that their algorithmic framework can also be generalized to swap multiple (i.e., more than two) arcs before an improvement of the objective function is required, even from multiple nodes at once.
Q12. What is the shortest path between the source and the root node?
To minimize the maximum path stretch among all user demands d in a network graph containing failed links (arcs belonging to the set F), the authors first introduce additional virtual unit flows to find the shortest paths between the source nodes sd and the root node r, and then determine the maximum path stretch based on the difference in length between the actually used paths (circular routing) and the reference paths (the shortest paths avoiding the failed links).
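The stretch computation described in this answer can be sketched with plain BFS: the stretch of a route is the length of the path actually taken by circular routing minus the length of the shortest path that avoids the failed links. The toy graph and the used route length below are assumptions for illustration.

```python
# Stretch = (hops of the route actually taken) - (shortest surviving path).
from collections import deque

def bfs_dist(adj, src, dst, failed):
    """Shortest hop count from src to dst, avoiding failed (undirected) links."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in adj[u]:
            if v not in dist and frozenset((u, v)) not in failed:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None                       # destination unreachable

# K4 adjacency; link (1, 0) failed, and the failover route taken was 1->2->0.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
failed = {frozenset((1, 0))}
used_len = 2                          # hops of the route actually taken
stretch = used_len - bfs_dist(adj, 1, 0, failed)
print(stretch)                        # the shortest surviving path 1->2->0 also has 2 hops
```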
Q13. What is the problem with rewriting packet headers?
The problem is particularly challenging in scenarios where packet headers cannot be used to carry meta-information about encountered failures: such header rewriting is often undesired and introduces overhead (related to header rewriting itself, but also in terms of additional rules required at the routers to process such information).
Q14. Why are there no additional information about failure scenarios and failover objectives?
In particular, the authors are motivated by the observation that in practice, additional information about failure scenarios and failover objectives may be available, e.g., about shared risk link groups [11], [12], [13] or about critical flows for which it is important to be routed along short paths, even after failures.