This work proposes--and proves formally--three generic, low-complexity deadlock avoidance mechanisms that require only local information, are topology- and routing-independent, and have a virtual channel count bounded by the length of the longest path.
Abstract:
Recently, the use of graph-based network topologies has been proposed as an alternative to traditional networks such as tori or fat-trees due to their very good topological characteristics. However, they pose practical implementation challenges such as the lack of deadlock avoidance strategies. Previous proposals either lack flexibility, underutilise network resources or are exceedingly complex. We propose--and prove formally--three generic, low-complexity deadlock avoidance mechanisms that only require local information. Our methods are topology- and routing-independent and their virtual channel count is bounded by the length of the longest path. We evaluate our algorithms through an extensive simulation study to measure the impact on performance using both synthetic and realistic traffic. First, we compare against a well-known HPC mechanism for dragonfly and achieve a similar performance level. Then we move to graph-based networks and show that our mechanisms can greatly outperform traditional, spanning-tree-based mechanisms, even if these use a much larger number of virtual channels. Overall, our proposal provides a simple, flexible and high-performance deadlock-avoidance solution.
TL;DR: The unified data and storage network architecture is described, reporting on the status of development of different testbeds and highlighting preliminary benchmark results obtained through the execution of scientific, engineering and data analytics scalable application kernels.
TL;DR: A multi-objective optimization-based framework to explore possible network topologies to be implemented in the EU-funded ExaNeSt project shows that the generated solutions can provide better topological characteristics and also higher performance for parallel applications.
TL;DR: This paper proposes a new class of paths that can be used without additional networking hardware and counts its members that are shorter than or of equal length to these "minimal paths".
TL;DR: In this paper, the authors propose a memory-cube network called Diagonal Memory Network (DMN) for low-latency and low-voltage memory-read communication, which reduces the use of hardware resources by more than 31%.
TL;DR: In this article, a deadlock-free routing algorithm for arbitrary interconnection networks using the concept of virtual channels is presented, where the necessary and sufficient condition for deadlock-free routing is the absence of cycles in a channel dependency graph.
TL;DR: A deadlock-free routing algorithm can be generated for arbitrary interconnection networks using the concept of virtual channels, which is used to develop deadlock-free routing algorithms for k-ary n-cubes, for cube-connected cycles, and for shuffle-exchange networks.
TL;DR: Experiments in the testbed demonstrate that BCube is fault-tolerant and load-balancing, and that it significantly accelerates representative bandwidth-intensive applications.
TL;DR: Results from theoretical analysis, simulations, and experiments show that DCell is a viable interconnection structure for data centers, that it can be incrementally expanded, and that a partial DCell provides the same appealing features.
TL;DR: There is a distributed randomized algorithm that can route every packet to its destination without two packets passing down the same wire at any one time, and finishes within time $O(\log N)$ with overwhelming probability for all such routing requests.
Q1. What are the contributions in "High-performance, low-complexity deadlock avoidance for arbitrary topologies/routings" ?
The authors propose--and prove formally--three generic, low-complexity deadlock avoidance mechanisms that only require local information. The authors evaluate their proposed mechanisms against previous proposals through an extensive simulation study to measure the impact on performance using both synthetic and realistic traffic. First, the authors compare against a well-known HPC mechanism for dragonfly and achieve a similar performance level. Then the authors move to graph-based networks and show that their mechanisms can greatly outperform traditional, spanning-tree-based mechanisms, even if these use a much larger number of virtual channels. Overall, the authors find that their proposal provides a simple, flexible and high-performance deadlock-avoidance solution.
Q2. What future works have the authors mentioned in the paper "High-performance, low-complexity deadlock avoidance for arbitrary topologies/routings" ?
Two of the ideas that the authors are investigating for future work are: (i) Analysis of the generated paths so as to be able to instrument routing functions in such a way that they reduce the number of VC transitions.
Q3. What are the two ideas that the authors are investigating for future works?
Two of the ideas that the authors are investigating for future work are: (i) Analysis of the generated paths so as to be able to instrument routing functions in such a way that they reduce the number of VC transitions.
Q4. What is the rationale for using ECMP for balanced traffic?
The rationale is that ECMP combines the gains of using shortest paths for balanced (uniform) traffic with those of using multipath for unbalanced (adversarial) traffic.
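As a rough illustration of the idea, the sketch below shows one common way to realise equal-cost multipath: every neighbour that lies on some shortest path to the destination is a candidate next hop, and a hash of the flow identifier picks among them. The function names (`distances_to`, `ecmp_next_hops`, `ecmp_pick`), the adjacency-list representation and the use of CRC32 as the hash are assumptions made for this example; the paper's own ECMP implementation is not described here.

```python
from collections import deque
import zlib


def distances_to(adj, dst):
    """Breadth-first search giving the hop distance of every node to dst.

    Assumes an undirected graph stored as {node: [neighbours]}.
    """
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ecmp_next_hops(adj, node, dst):
    """All neighbours of node that are one hop closer to dst (equal-cost set)."""
    dist = distances_to(adj, dst)
    return sorted(v for v in adj[node]
                  if dist.get(v, float("inf")) == dist[node] - 1)


def ecmp_pick(adj, node, dst, flow_id):
    """Pick one equal-cost next hop by hashing the (hypothetical) flow id.

    Assumes node != dst and that dst is reachable from node.
    """
    hops = ecmp_next_hops(adj, node, dst)
    return hops[zlib.crc32(flow_id.encode("utf-8")) % len(hops)]


# Four-node ring: two shortest paths exist between opposite nodes.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
candidates = ecmp_next_hops(ring, 0, 2)   # both neighbours are equal-cost
chosen = ecmp_pick(ring, 0, 2, "flow-a")  # one of the candidates
```

Because only shortest-path neighbours are candidates, uniform traffic keeps minimal hop counts, while different flows between the same pair of nodes can be spread over several paths.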
Q5. What are the two types of routing algorithms to avoid deadlock?
There exist basically two types of routing algorithms to create deadlock-free paths: those that avoid the creation of cycles in the channel dependency graph (CDG) and those that break the cycles in the CDG using VCs.
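The distinction can be made concrete with a small sketch. It builds a CDG from a set of routed paths and checks it for cycles, first with a single virtual channel and then with a textbook hop-indexed assignment in which hop i of every path uses VC i. The hop-indexed scheme is used here only as a familiar example of breaking cycles with VCs; it is not claimed to be one of the paper's three mechanisms, and all function names are hypothetical.

```python
def channels_single_vc(path):
    """Every hop of the path uses VC 0."""
    return [(u, v, 0) for u, v in zip(path, path[1:])]


def channels_hop_vc(path):
    """Textbook hop-indexed scheme (assumption): hop i uses VC i."""
    return [(u, v, hop) for hop, (u, v) in enumerate(zip(path, path[1:]))]


def build_cdg(paths, to_channels):
    """Channel dependency graph: edge a -> b if some path uses b right after a."""
    cdg = {}
    for path in paths:
        channels = to_channels(path)
        for c in channels:
            cdg.setdefault(c, set())
        for a, b in zip(channels, channels[1:]):
            cdg[a].add(b)
    return cdg


def has_cycle(cdg):
    """Iterative depth-first search with white/grey/black colouring."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {c: WHITE for c in cdg}
    for start in cdg:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(cdg[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:      # back edge: cycle found
                    return True
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(cdg[nxt])))
                    break
            else:                            # all successors explored
                colour[node] = BLACK
                stack.pop()
    return False


# Clockwise two-hop routes on a four-node ring.
ring_paths = [[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]]
cyclic = has_cycle(build_cdg(ring_paths, channels_single_vc))
acyclic = not has_cycle(build_cdg(ring_paths, channels_hop_vc))
```

With a single VC the four routes form a dependency cycle around the ring. With hop-indexed VCs every dependency goes from a lower to a higher VC index, so no cycle can form, and the number of VCs needed equals the hop count of the longest path.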
Q6. What are the main drawbacks of the first group of algorithms?
All of them require complex searches over the CDG; the main drawback of these approaches is therefore the computational and memory complexity of the algorithms.
Q7. What is the way to generate a random id?
For this reason, a small module that reads several local sensors (e.g. voltage, temperature, internal clock, etc.) and hashes them together to generate a random id at boot-up seems like a more flexible solution.
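A minimal software sketch of such a module is shown below, assuming the sensor readings are available as a name-to-value mapping. The function name `boot_id`, the choice of SHA-256, the 32-bit id width and the sample readings are all assumptions for illustration; a switch would do this in hardware with whatever hash and sensors it has available.

```python
import hashlib
import struct


def boot_id(readings, bits=32):
    """Hash local sensor readings into an identifier of the given bit width.

    readings: mapping from a sensor name to a numeric value (hypothetical).
    bits: width of the resulting id, between 1 and 256.
    """
    digest = hashlib.sha256()
    for name in sorted(readings):            # fixed order keeps the hash stable
        digest.update(name.encode("utf-8"))
        digest.update(struct.pack("<d", float(readings[name])))
    # Keep the top `bits` bits of the 256-bit digest.
    return int.from_bytes(digest.digest(), "big") >> (256 - bits)


# Made-up readings standing in for voltage, temperature and a clock counter.
sample = {"voltage_mv": 1198.7, "temp_c": 41.3, "clock_ticks": 982451653}
switch_id = boot_id(sample)
```

Small physical differences between switches lead to different readings and therefore, with high probability, to different ids, without any central assignment.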
Q8. What are the main considerations for the proposed approach?
All these considerations show the feasibility of their approach and also that it imposes very low overhead on the switch architecture and requires no system-level support.