A routing methodology for achieving fault tolerance in direct networks

María Engracia Gómez, Member, IEEE, Nils Agne Nordbotten, José Flich,
Pedro López, Member, IEEE Computer Society, Antonio Robles, Member, IEEE Computer Society,
Jose Duato, Member, IEEE, Tor Skeie, and Olav Lysne, Member, IEEE
Abstract—Massively parallel computing systems are being built with thousands of nodes. The interconnection network plays a key
role for the performance of such systems. However, the high number of components significantly increases the probability of failure.
Additionally, failures in the interconnection network may isolate a large fraction of the machine. It is therefore critical to provide an
efficient fault-tolerant mechanism to keep the system running, even in the presence of faults. This paper presents a new fault-tolerant
routing methodology that does not degrade performance in the absence of faults and tolerates a reasonably large number of faults
without disabling any healthy node. In order to avoid faults, for some source-destination pairs, packets are first sent to an intermediate
node and then from this node to the destination node. Fully adaptive routing is used along both subpaths. The methodology assumes a
static fault model and the use of a checkpoint/restart mechanism. However, there are scenarios where the faults cannot be avoided
solely by using an intermediate node. Thus, we also provide some extensions to the methodology. Specifically, we propose disabling
adaptive routing and/or using misrouting on a per-packet basis. We also propose the use of more than one intermediate node for some
paths. The proposed fault-tolerant routing methodology is extensively evaluated in terms of fault tolerance, complexity, and
performance.
Index Terms—Fault tolerance, direct networks, adaptive routing, virtual channels, bubble flow control.
1 INTRODUCTION
THERE exist many compute-intensive applications that
require a huge amount of processing power (nuclear
weapons simulations, protein folding, global climate
modeling, galaxy interaction simulations, etc.). These
applications require continued research and technology
development to deliver computers with steadily increasing
computing power. The required levels of computing
power can only be achieved with massively parallel
computers, such as the Earth Simulator [19], the ASCI
Red [1], and the BlueGene/L [5].
The huge number of processors and associated devices
(memories, switches, and links, etc.) significantly affects the
probability of failure. Each individual component can fail
and, thus, the probability of failure of the entire system
increases dramatically. One of the JASON Defense Advi-
sory Panel reports from 2003, about the requirements for
ASCI, states that “Scaling to PetaFlop using present
machine architectures implies very large number of
processors—of order 100,000, perhaps—might be needed.
Such large numbers raises serious questions of scalability of
code performance and of machine reliability.”
Thus, in these systems, it is critical to keep the system
running, even in the presence of failures. In addition,
failures in the interconnection network may isolate a large
fraction of the machine, containing many healthy proces-
sors that otherwise could have been used. Although
network components, like switches and links, are robust,
they are working close to their technological limits and,
therefore, they are prone to failures. Increasing clock
frequencies leads to a higher power dissipation, which
again could lead to premature failures. Therefore, fault-
tolerant mechanisms for interconnection networks are
becoming a critical design issue for large massively parallel
computers [25], [47], [26], [48], [37], [38].
Faults can be classified as transient or permanent.
Transient faults are usually handled by communication
protocols, using CRCs to detect faults and retransmitting
packets. In order to deal with permanent faults in a system,
two fault models can be used: static or dynamic. In a static
fault model, it is assumed that all the faults are known in
advance when the machine is (re)booted. In order to
implement it, once a fault is detected, all the processes in
the system are halted, the network is emptied, and a
management application is run in order to deal with the
faulty component. The management application detects
where the fault is, computes the information required by
the nodes in order to tolerate the fault, and distributes the
information. Then, the system is rebooted and the processes
are resumed. This fault model needs to be combined with
checkpointing techniques in order to be effective. Applying
checkpointing minimizes the fault’s impact on applications
because they are restarted from the latest checkpoint. In a
dynamic fault model, once a new fault is found, actions are
400 IEEE TRANSACTIONS ON COMPUTERS, VOL. 55, NO. 4, APRIL 2006
• M.E. Gómez, J. Flich, P. López, A. Robles, and J. Duato are with the
Department of Computer Engineering, Universidad Politécnica de
Valencia, Camino de Vera, 14, 46071-Valencia, Spain.
E-mail: {megomez, jflich, plopez, arobles, jduato}@disca.upv.es.
• N.A. Nordbotten, T. Skeie, and O. Lysne are with the Simula Research
Laboratory, PO Box 134, N-1325, Lysaker, Norway.
E-mail: {nilsno, tskeie, olavly}@simula.no.
The first two authors are listed in alphabetical order.
Manuscript received 7 Feb. 2005; revised 5 Aug. 2005; accepted 5 Oct. 2005;
published online 22 Feb. 2006.
For information on obtaining reprints of this article, please send e-mail to:
tc@computer.org, and reference IEEECS Log Number TC-0037-0205.
0018-9340/06/$20.00 © 2006 IEEE Published by the IEEE Computer Society
taken in order to appropriately handle the faulty compo-
nent while the system keeps running. For instance, a source
node that detects a faulty component in a path can switch to
a different path.
Although there are different ways to deal with faults in
the interconnection network (see Section 2), most of the
solutions proposed in the literature are based on designing
fault-tolerant routing algorithms that are able to find an
alternative path when a packet meets a fault along the path
to its destination. Most of the proposed fault-tolerant
routing strategies require a significant amount of extra
hardware resources (e.g., virtual channels) to route packets
around faulty components in the case of failure. Alterna-
tively, there exist some fault-tolerant routing strategies that
use a very small number of extra resources to handle
failures, at the expense of either disabling certain healthy
nodes [25], thus reducing the processing power, or
dramatically increasing the latencies for some packets
[42]. Moreover, when faults occur, with most of those
fault-tolerant strategies, link utilization may become sig-
nificantly unbalanced, thus leading to premature network
saturation and consequently degrading network perfor-
mance even more.
In this paper, we overcome all these limitations,
proposing a fault-tolerant methodology for interconnection
networks that does not degrade performance at all in the
absence of faults, inflicts minimal performance degradation
in the presence of faults, and tolerates a reasonably large
number of faults. Indeed, fault tolerance is achieved
without disabling any healthy node, without requiring too
many extra hardware resources, and without introducing
any significant penalty (e.g., extra latency) when routing
packets in a faulty network. The methodology assumes a
static fault model and the use of a checkpoint/restart
mechanism.
The methodology is based on the use of intermediate
nodes for routing [22].¹ That is, for some source-destination
pairs, packets are first forwarded to an intermediate node
and then from that node to the destination, in this way
splitting the routing path into two subpaths. Basically, we
avoid faulty links by reducing the number of possible
adaptive paths between source and destination nodes. In
particular, we remove those paths along which packets
could encounter any of the faults. Notice, though, that
adaptive routing is still used along both subpaths. How-
ever, for some few paths, an intermediate node is not
enough to avoid the faults. In order to increase the fault
tolerance degree, two extensions of the methodology are
proposed. These extensions restrict routing even more to
ensure that packets will not encounter the fault. The first
one disables adaptive routing and/or uses misrouting on a
per packet basis ([23], [24]). The second extension extends
the idea of intermediate nodes. Instead of using one
intermediate node, it allows the use of multiple intermedi-
ate nodes for some paths [34], enabling adaptive routing to
be used for all subpaths.
The methodology is valid for any network topology² and
allows the use of fully adaptive routing even in the presence
of failures.³ To avoid deadlock, only three virtual channels
are required even for tori.⁴
In this paper, we present the fault-tolerant routing
methodology, fully exploring the impact of each of its
mechanisms ([22], [23], [34]). We compare them from a
uniform point of view and in the same scenarios, trying to
meet the trade-offs between fault tolerance, network
performance, and complexity.
The rest of the paper is organized as follows: Section 2
describes related work on fault tolerance. In Section 3, the
methodology based on using intermediate nodes is pre-
sented. The extensions to the methodology are presented in
Section 4. In Section 5, the different combinations of the
proposed methodology are evaluated and compared using
the same scenarios in terms of complexity, fault tolerance,
and performance. Finally, in Section 6, some conclusions
are drawn.
2 RELATED WORK
Basically, there are three ways to tolerate faults in
interconnection networks: component redundancy, fault-
tolerant routing algorithms, and reconfiguration. Using
component redundancy has been the easiest way to provide
fault tolerance. Components in the system are replicated
and, once a failed component is detected, it is simply
replaced by its redundant copy. The main drawbacks of this
approach are the high extra cost of the spare components
and the nonnegligible probability that the circuits required
to switch to the spare components may fail. An enhanced
version of this technique does not require spare compo-
nents. It simply bypasses faulty components, together with
some healthy components, to maintain network regularity.
For instance, in the BlueGene/L project [5], the nodes are
connected by using a 3D torus. The full BlueGene/L
supercomputer is constituted by 65,536 nodes, which are
allocated over 64 racks of 1,024-nodes, with two 512-node
midplanes per rack [20]. Once a failure is detected, all the
nodes included in the midplane (512-nodes) that contains
the faulty node/link are marked as faulty.
Another powerful technique is based on reconfiguring
the routing tables in the case of failure, adapting them to the
new topology after the failure [7], [44], [32], [33]. This
approach is appropriate for switch-based networks
(Myrinet [2], Quadrics [35], InfiniBand [27]) in which the
topology is defined by the end user. When using reconfi-
guration, any number of faults is tolerated without
requiring additional resources [38], as long as the network
remains connected. This technique is extremely flexible, but
this flexibility may also kill performance due to the need of
using generic routing algorithms as a consequence of the
irregularity in the resulting network topology. Often,
generic routing algorithms achieve poor performance when
applied to regular networks (e.g., 3D tori), as shown in [39].
This is because, in these cases, generic routing schemes are
usually not able to provide minimal routing in all cases,
regardless of whether they are deterministic or adaptive
GÓMEZ ET AL.: A ROUTING METHODOLOGY FOR ACHIEVING FAULT TOLERANCE IN DIRECT NETWORKS 401
1. Intermediate nodes were introduced by Valiant [45] for other
purposes, such as traffic balancing.
2. For the sake of simplicity, we will focus on torus and mesh networks.
3. If intermediate nodes are used, fully adaptive routing refers to each
subpath.
4. Note that two virtual channels are already required to provide
deadlock-free fully adaptive routing [36].
[41].⁵ Furthermore, they often provide worse traffic balance
than that provided by the routing schemes specifically
designed for these networks.
A large number of fault-tolerant routing algorithms for
multiprocessor systems have been proposed, especially for
mesh and torus topologies. Some approaches [30], [46] use
global status information of the network, whereas others
use only local status information. In particular, adaptive
routing is used together with link status in [15] and [31].
However, these approaches require using a large number of
virtual channels, depending on either the number of
tolerated faults [15] or the number of dimensions of the
topology [31]. Real systems implement very simple me-
chanisms that partly address the problem, such as the
direction order routing used in the Cray T3E [43]. Other
solutions consist of using routing algorithms together with
additional resources (virtual channels). Some of these
solutions are based on block faults [8], [4], [9], [10], [11],
[47], whereas others allow individual faults [21], [17], [12].
In the former case, several healthy nodes must be marked as
faulty, reducing the system’s processing capacity, in order
to build fault regions (either rectangular or nonconvex
regions). Packets are routed around these fault regions. To
this end, several virtual channels must be used. Finally,
some routing solutions ([28], [16]) are based on performing
misrouting and backtracking of packet headers. Despite
tolerating any number of faults, these strategies often
strongly penalize the network performance.
To overcome these drawbacks, a software-based fault-
tolerant routing approach [42] can be used. When a packet
encounters a fault, it is ejected from the network and is later
forwarded through an alternative path. This mechanism is
very flexible and supports many failure patterns, without
either marking healthy nodes as faulty or requiring
additional virtual channels. However, some packets may
suffer high latencies due to the packet ejection and
reinjection, and these packets also consume memory
bandwidth in the node where the ejection/reinjection is
performed. An approach that minimizes the number of
required virtual channels and tolerates a fairly large
number of faults, at the expense of disabling some healthy
nodes, was recently proposed in [25]. This algorithm is
based on a static fault model and only requires two virtual
channels per link. In the absence of faults, dimension-order
routing (DOR) is used. When faults prevent the use of DOR,
a set of nodes must be sacrificed (lamb nodes) in order to
guarantee that every survivor node, a node that is neither
faulty nor a lamb, can reach every survivor node by at most
two rounds of DOR. Deadlock-freedom is guaranteed
provided that a different virtual channel is used during
each round. The main drawbacks of this routing algorithm
are that a significant number of nodes must be disabled for
packet transmission/reception (but not for routing) in order
to support communication among the remaining nodes and
that it does not support adaptive routing.
It is important to highlight the main differences between
our proposal, described in Sections 3-4, and other
approaches in the literature that also use a small number
of extra resources. In particular, unlike [42], in no case does
our proposed methodology require ejecting/reinjecting a
packet at an intermediate node, thus reducing latency
drastically. Moreover, unlike [25], our fault-tolerant meth-
odology does not need to deactivate any lamb node to
achieve good fault tolerance. Furthermore, the proposed
methodology allows packets to be adaptively routed, thus
increasing the overall network throughput.
3 METHODOLOGY
In this section, we will describe the basic mechanism for
achieving fault tolerance, that is, routing through inter-
mediate nodes. For this purpose, we will assume a k-ary
n-cube (torus) or an n-dimensional mesh network with
minimal adaptive routing based on Duato’s protocol [18]. In
the absence of faults, packets are routed using fully
adaptive routing, with at least two virtual channels (i.e.,
one adaptive channel and one escape channel) per physical
link. The adaptive channels enable routing through any
minimal path, whereas the escape channels guarantee
deadlock freedom by using a deterministic routing function
free from cyclic dependencies. At each hop, packets that
cannot use any of the adaptive channels that provide a
minimal path to their respective destinations use the escape
channel provided by the deterministic routing function.⁶
Also, a static fault model with checkpointing is assumed.
Detection of faults, checkpointing, and distribution of
routing info is performed as part of the static fault model
and, thus, will not be further discussed in this paper.
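
The fault-free routing decision described in this section can be sketched as follows. This is a minimal sketch for an n-dimensional mesh only (no wraparound links), not the authors' router: the names `minimal_hops`, `dor_hop`, `route`, and the `adaptive_free` callback are hypothetical, and the rule of taking the first free adaptive channel stands in for whatever selection function a real router would use.

```python
from typing import Callable, List, Optional, Tuple

Node = Tuple[int, ...]
Hop = Tuple[int, int]  # (dimension, direction), direction is +1 or -1


def minimal_hops(current: Node, dest: Node) -> List[Hop]:
    """All moves that bring a packet one hop closer to dest in a mesh."""
    hops = []
    for dim, (c, d) in enumerate(zip(current, dest)):
        if c != d:
            hops.append((dim, 1 if d > c else -1))
    return hops


def dor_hop(current: Node, dest: Node) -> Optional[Hop]:
    """Dimension-order routing: correct the lowest unfinished dimension first."""
    hops = minimal_hops(current, dest)
    return hops[0] if hops else None


def route(current: Node, dest: Node,
          adaptive_free: Callable[[Node, Hop], bool]) -> Optional[Tuple[str, Hop]]:
    """Use any free adaptive minimal channel, otherwise fall back to the escape channel."""
    for hop in minimal_hops(current, dest):
        if adaptive_free(current, hop):
            return ("adaptive", hop)
    hop = dor_hop(current, dest)
    return ("escape", hop) if hop is not None else None
```

Because the decision is taken again at every hop, a packet that used the escape channel once can return to an adaptive channel at the next node, as noted in footnote 6.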
If faulty components can be encountered when routing
packets between a source-destination pair, the methodology
avoids these faults by using intermediate nodes for routing.
Packets are first forwarded to a suitable intermediate node
and, then, from this node to their final destination. This
way, intermediate nodes are used in order to obtain greater
control over the paths followed by packets, thereby
avoiding the faults. Notice that the packets are not ejected
from the network at the intermediate node. Fig. 1 shows a
source-destination pair that uses an intermediate node. The
original routing algorithm (based on Duato’s protocol) is
used in both subpaths. By using intermediate nodes, areas
Fig. 1. The use of an intermediate node (I) limits the number of possible
paths, from the source (S) to the destination (D), enabling the fault (F) to
be avoided.
5. Some generic routing strategies are able to provide minimal paths at
the expense of using a large number of virtual channels, often depending on
the network size.
6. Notice that packets can again be routed through adaptive channels
(when free) after using an escape channel.
containing faults are avoided at the expense of reducing the
number of possible paths.
Deadlocks are avoided by using a separate escape
channel for each phase. That is, one escape channel is used
(if required) from the source to the intermediate node and
another one from the intermediate node to the destination.
This way, each phase defines a separate virtual network
and packets change virtual network at the intermediate
node. Thus, there are no dependencies between the two
subpaths, allowing two independent paths to be followed.⁷
Although each virtual network relies on a different escape
channel, they share the same adaptive channel(s). Thus, a
total of at least three virtual channels are required (one
adaptive and two escape). Notice that deadlock-free
minimal adaptive routing, based on Duato’s protocol,
requires at least two virtual channels, so only one additional
virtual channel is required.
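
The two-phase scheme can be illustrated with a small sketch of the state a packet would carry. Everything named here (`Packet`, `current_target`, `escape_channel`, the `phase` field) is hypothetical, and the choice that packets without an intermediate node stay on escape channel 0 is an assumption made for the sketch, not something the text specifies.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Node = Tuple[int, ...]


@dataclass
class Packet:
    dest: Node
    intermediate: Optional[Node] = None  # None when no fault has to be avoided
    phase: int = 0                       # 0: S -> I (or S -> D), 1: I -> D


def current_target(packet: Packet, here: Node) -> Node:
    """Switch phase on arrival at I, then return the node to route towards."""
    if packet.intermediate is None:
        return packet.dest
    if packet.phase == 0 and here == packet.intermediate:
        # The packet changes virtual network here; it is not ejected.
        packet.phase = 1
    return packet.intermediate if packet.phase == 0 else packet.dest


def escape_channel(packet: Packet) -> int:
    """One escape channel per phase; the adaptive channel(s) are shared."""
    return packet.phase
```

Keeping the phase in the packet is what lets both subpaths reuse the same unmodified routing function, with only the target node and the escape channel index changing at I.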
The escape channels use deterministic dimension order
routing (DOR) and, for tori, also the bubble flow control
mechanism [6]. The bubble flow control mechanism avoids
deadlocks in the dimension rings of torus topologies by
ensuring that there is always an empty buffer that allows
packets to advance along the ring. With this mechanism, a
packet that is injected into the network, crosses a network
dimension, or originates from an adaptive channel requires
two free buffers (i.e., one for the packet and one additional
free buffer) in order to be allowed in the escape channel.
Indeed, a packet changing virtual network at an inter-
mediate node should also be considered as entering the ring
and, therefore, requires two free buffers. Alternatively, if
the bubble flow control mechanism is not used, deadlock
freedom can be provided in torus topologies through the
use of additional virtual channels [13]. If this latter
approach were used, each subpath (i.e., virtual network)
would require two escape channels, resulting in a total of
five virtual channels (i.e., four escape channels and at least
one adaptive channel). Anyway, this is an implementation
issue, aimed at guaranteeing deadlock-freedom along the
escape paths, that has no influence on the proposed fault-tolerant
routing methodology.⁸
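
The admission rule of the bubble flow control mechanism can be written as a single check. This is a sketch under stated assumptions: the `Origin` enumeration and the function name are hypothetical, and the one-buffer requirement for a packet that keeps moving inside the same escape ring is not stated in the text above; it is taken from the usual description of bubble flow control.

```python
from enum import Enum, auto


class Origin(Enum):
    SAME_RING = auto()               # already travelling in this escape ring
    INJECTION = auto()               # newly injected into the network
    DIMENSION_CHANGE = auto()        # crossing from another dimension
    ADAPTIVE_CHANNEL = auto()        # coming from an adaptive channel
    VIRTUAL_NETWORK_CHANGE = auto()  # changing escape channel at an intermediate node


def may_enter_escape_channel(free_buffers: int, origin: Origin) -> bool:
    """A packet entering the ring must leave one buffer (the bubble) free behind it."""
    required = 1 if origin is Origin.SAME_RING else 2
    return free_buffers >= required
```

Treating the virtual network change as an entry into the ring is the only case that the intermediate-node methodology adds to the standard rule.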
Next, a methodology for identifying the intermediate
nodes is presented.
3.1 Intermediate Nodes for Adaptive Routing
In what follows, we will denote the source node as S and the
destination node as D. The intermediate node is denoted as I.
Faulty links are denoted as $F_i$. A node failure can easily be
modeled as the failure of all the links of a node.
When minimal adaptive routing is used, the intermediate
node I should have the following properties so that the
fault(s) $F_i$ are avoided when routing packets from S via I to D:
1. I is reachable from S.
2. D is reachable from I.
3. There is no $I'$ (fulfilling the previous requirements)
giving a shorter path than I.
The first requirement guarantees that packets can be
routed from S to I and the second one that packets can be
routed from I to D. The third requirement guarantees that
the final path is the shortest possible. We define that, when
minimal adaptive routing is used, a node $N_2$ is reachable
from a node $N_1$ if and only if, for all i, $F_i$ is not on any
minimal path from $N_1$ to $N_2$.
To identify the possible intermediate nodes, let $T_{RS}$ be
the set of nodes reachable from S and $T_D$ the set of nodes
from which D is reachable. Furthermore, let $l(x, y)$ be the
length of the minimal path, in the fault-free case, from x to
y. We then define $T_j$ (for $j \geq 0$) in the following way: A
node N is in $T_j$ if and only if $l(S, N) + l(N, D) = l(S, D) + j$.
This way, $T_j$ for different values of j defines nonoverlapping
sets of nodes, as shown in Fig. 2. These sets can easily
be identified by starting with the nodes that are traversed
on any minimal path from S to D (i.e., $j = 0$) and continuing
outward.
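
The definition of the sets $T_j$ translates directly into code. The sketch below assumes a torus described by a tuple of ring sizes; the function names `ring_distance`, `torus_distance`, and `t_sets` are hypothetical, and the exhaustive scan over all nodes is chosen for clarity rather than the outward search mentioned above.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Set, Tuple

Node = Tuple[int, ...]


def ring_distance(a: int, b: int, k: int) -> int:
    """Hops between positions a and b on a ring of k nodes."""
    d = abs(a - b)
    return min(d, k - d)


def torus_distance(x: Node, y: Node, dims: Tuple[int, ...]) -> int:
    """l(x, y): fault-free minimal path length in a torus."""
    return sum(ring_distance(a, b, k) for a, b, k in zip(x, y, dims))


def t_sets(source: Node, dest: Node, dims: Tuple[int, ...]) -> Dict[int, Set[Node]]:
    """Group every node N by j = l(S, N) + l(N, D) - l(S, D)."""
    base = torus_distance(source, dest, dims)
    sets: Dict[int, Set[Node]] = defaultdict(set)
    for node in product(*(range(k) for k in dims)):
        j = torus_distance(source, node, dims) + torus_distance(node, dest, dims) - base
        sets[j].add(node)
    return dict(sets)
```

Since every node receives exactly one value of j, the returned sets are nonoverlapping and together cover the whole network.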
Theorem 1. Let j be the smallest integer for which $T_j \cap T_{RS} \cap T_D$
is nonempty. A node N fulfills all three requirements of an
intermediate node I if and only if $N \in T_j \cap T_{RS} \cap T_D$.
Proof. We prove the theorem by induction. The theorem is
true for $j = 0$ (i.e., for minimal routes):
• Let us assume that there is one node N in the set
that does not fulfill the requirements of an
intermediate node. Then, N would either have
to be unreachable from S, not have a valid route
to D, or not be on a minimal path from S to D. If
N is unreachable from S, it is by definition not in
$T_{RS}$. If N does not have a valid route to D, it is by
definition not in $T_D$. If N is not on a minimal path
from S to D, it is by definition not in $T_0$. Because
of the properties of set intersections, N must be in
7. Note that, in certain cases, it may be unnecessary to carry out the
virtual network transition at the intermediate node as long as cyclic channel
dependencies are not introduced. However, these situations cannot be
foreseen when using adaptive routing. Applying deterministic routing
along the first subpath could help in some cases. However, the difficulties
in guaranteeing in all cases that cyclic channel dependencies are not
introduced prevent us from removing the need of using an additional
virtual network.
8. We suggest the use of the bubble flow control mechanism in tori
because it allows a more efficient implementation of the proposed
methodology (i.e., it requires a smaller number of virtual channels to
implement the escape paths). The bubble flow control mechanism is
currently being used in the BlueGene/L supercomputer [5].
Fig. 2. The nodes in the sets $T_j$, for $j \leq 5$, for a particular
source-destination pair in a 2D torus.
all three sets, $T_{RS}$, $T_D$, and $T_0$, to be in the set
$T_0 \cap T_{RS} \cap T_D$. Thus, we have a contradiction.
• Let us then assume that there is one node N,
outside the set, which fulfills the requirements of
an intermediate node. N would then have to be
outside at least one of the sets $T_{RS}$, $T_D$, or $T_0$. If
N is outside $T_{RS}$, it is unreachable from S and,
therefore, does not fulfill requirement one. If N is
outside $T_D$, it has no valid route to D and,
therefore, does not fulfill requirement two. If N is
outside $T_0$, N is not on a minimal path from S to
D. Thus, we have a contradiction in all three
cases.
If the theorem is true for $j = m$, then the theorem is
also true for $j = m + 1$: Concerning requirements one
and two, the arguments made for $j = 0$ also hold for
$j = m + 1$. Furthermore, when $j = m + 1$, no route S-I-D
exists for $j < m + 1$. Indeed, as each increase of j adds
one additional hop to the path S-I-D, all the intermediate
nodes found when $j = m + 1$ yield paths S-I-D of equal
lengths. Finally, for the same reason, no shorter path can
be found for $j > m + 1$. The theorem therefore fulfills all
three requirements. □
This way, to identify possible intermediate nodes, we
start by considering the minimal paths ($j = 0$) and then, if
necessary, nonminimal paths ($j > 0$) to avoid the fault(s). By
minimizing j, preference is given to the shortest connected
paths. We illustrate the intermediate node selection in the
next section by applying Theorem 1 in two example
scenarios.
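
The selection rule of Theorem 1 can be sketched as a brute-force search over all nodes. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, faults are given as bidirectional links between adjacent nodes, and reachability is tested with the observation that a link (a, b) lies on some minimal path from $N_1$ to $N_2$ exactly when $l(N_1, a) + 1 + l(b, N_2) = l(N_1, N_2)$ in one of its two directions.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Set, Tuple

Node = Tuple[int, ...]
Link = Tuple[Node, Node]  # bidirectional link between two adjacent nodes


def make_distance(dims: Tuple[int, ...], torus: bool) -> Callable[[Node, Node], int]:
    """Return l(x, y), the fault-free minimal path length in a mesh or torus."""
    def dist(x: Node, y: Node) -> int:
        total = 0
        for a, b, k in zip(x, y, dims):
            d = abs(a - b)
            total += min(d, k - d) if torus else d
        return total
    return dist


def reachable(n1: Node, n2: Node, faults: Iterable[Link],
              dist: Callable[[Node, Node], int]) -> bool:
    """True iff no faulty link lies on any minimal path from n1 to n2."""
    length = dist(n1, n2)
    for a, b in faults:
        # Crossing the link in either direction must not give a minimal path.
        if dist(n1, a) + 1 + dist(b, n2) == length:
            return False
        if dist(n1, b) + 1 + dist(a, n2) == length:
            return False
    return True


def intermediate_nodes(source: Node, dest: Node, dims: Tuple[int, ...],
                       faults: Iterable[Link],
                       torus: bool = True) -> Optional[Tuple[int, Set[Node]]]:
    """Return (j, nodes) for the smallest j with a nonempty T_j, T_RS, T_D intersection."""
    dist = make_distance(dims, torus)
    faults = list(faults)
    by_j = {}
    for node in product(*(range(k) for k in dims)):
        # Membership in T_RS and T_D, i.e. requirements one and two.
        if reachable(source, node, faults, dist) and reachable(node, dest, faults, dist):
            j = dist(source, node) + dist(node, dest) - dist(source, dest)
            by_j.setdefault(j, set()).add(node)
    if not by_j:
        return None  # a single intermediate node cannot avoid the faults
    j = min(by_j)
    return j, by_j[j]
```

When the faults do not touch any minimal path from S to D, the search returns j = 0 together with S and D themselves, which corresponds to routing without an intermediate node.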
3.2 Example Scenarios
Fig. 3 shows a scenario with five link faults. Because there
are faults present in some of the minimal paths between S
and D, an intermediate node is needed. In order to find a
minimal path, we look for an intermediate node within $T_0$.
As shown in Fig. 3, there are several nodes within $T_0$ that
are either reachable from S or are able to reach D. However,
we are only interested in nodes with all of these attributes,
that is, the nodes given by the set $T_0 \cap T_{RS} \cap T_D$. In this
scenario, there is only one such node, that is, the one
identified as a possible intermediate node in Fig. 3. By using
this node as the intermediate node, it is guaranteed that the
faults are not encountered when packets are first routed
from S to I and then from I to D.
Notice that, in a mesh, if all the minimal paths are faulty,
it is not possible to find a suitable intermediate node even
when considering nonminimal paths (i.e., $T_j$ for $j > 0$)
when fully adaptive routing is used. This is because it is
then impossible to position the intermediate node in such a
way that all the minimal paths from S to I and from I to D
are fault free. In a torus, however, such faults can be
avoided by using a nonminimal path given by taking the
opposite direction to the minimal path. Thus, if all the
minimal paths in the torus shown in Fig. 2 were blocked by
faults, one could, for example, use the node two hops to the
left of the source as an intermediate node and, thus, get a
nonminimal path in the opposite direction of the ring.
Because this node is in $T_2$, the path length of this path
equals the minimal path plus two. To handle such a
situation in a mesh, it is necessary to use one of the
complementary mechanisms described in the next section.
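
The torus case can be checked on a single ring. The numbers below are assumptions chosen for illustration and are not read from Fig. 2: a ring of 8 nodes with S at position 0, D at position 3, and I two hops from S in the opposite direction, at position 6.

```latex
\begin{align*}
l(S, D) &= \min(3,\ 8 - 3) = 3, \\
l(S, I) &= \min(6,\ 8 - 6) = 2, \\
l(I, D) &= \min(3,\ 8 - 3) = 3, \\
l(S, I) + l(I, D) &= 5 = l(S, D) + 2 \quad\Longrightarrow\quad I \in T_2 .
\end{align*}
```

In this ring the only minimal path from I to D continues away from S, so neither subpath uses a link of the blocked minimal path from S to D.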
4 COMPLEMENTARY MECHANISMS
In some situations, like the one in the last example, it is
impossible to avoid all the faults by using the methodology
presented in the previous section. Therefore, we now
present some alternative extensions to the methodology.
First, we present how the intermediate node concept can be
extended to use more than one intermediate node for some
source-destination pairs. Then, we present alternative
solutions based on using additional mechanisms, that is,
disabling adaptive routing and/or using misrouting for
some paths.
4.1 Multiple Intermediate Nodes
By using more than one intermediate node, additional
control over the paths followed by packets is gained,
enabling more faults to be avoided while still using
adaptive routing for all the paths. In order to still guarantee
deadlock freedom, an additional virtual channel is needed
for each additional intermediate node.⁹ This way, each
subpath continues to use a different escape channel. So,
when at most two intermediate nodes are used in each path,
a total of four virtual channels is required (i.e., three escape
channels and one adaptive channel).
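
The virtual channel counts quoted in Sections 3 and 4.1 follow one rule: each subpath needs its own escape channel(s), and the adaptive channel(s) are shared. A minimal sketch, with the function and parameter names being hypothetical:

```python
def virtual_channels(intermediate_nodes: int,
                     escape_per_subpath: int = 1,
                     adaptive: int = 1) -> int:
    """Total virtual channels per physical link.

    escape_per_subpath is 1 for DOR with bubble flow control and 2 for a
    torus that avoids ring deadlocks with extra virtual channels instead.
    """
    if intermediate_nodes < 0 or escape_per_subpath < 1 or adaptive < 1:
        raise ValueError("arguments out of range")
    subpaths = intermediate_nodes + 1
    return adaptive + subpaths * escape_per_subpath
```

For example, `virtual_channels(1)` gives the three channels of the basic methodology and `virtual_channels(2)` the four channels needed when up to two intermediate nodes are allowed.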
When using multiple intermediate nodes, we refer to the
intermediate nodes as $I_x$, where $I_1$ denotes the first
intermediate node in a route. We will first present a
methodology for using two intermediate nodes. Then, we
generalize this methodology so that it can be used, in a
recursive way, for any number of intermediate nodes.
4.1.1 Two Intermediate Nodes
When using two intermediate nodes, we are looking for
intermediate nodes $I_1$ and $I_2$ so that:
• $I_1$ is reachable from S.
• $I_2$ is reachable from $I_1$.
• D is reachable from $I_2$.
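
The three conditions above can be checked by exhaustive search on a small network. The sketch below is a brute-force illustration only and should not be read as the procedure the authors use to pick $I_1$ and $I_2$; all function names are hypothetical, faults are bidirectional links between adjacent nodes, and ties are resolved by returning every pair with the shortest total length.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Set, Tuple

Node = Tuple[int, ...]
Link = Tuple[Node, Node]


def make_distance(dims: Tuple[int, ...], torus: bool) -> Callable[[Node, Node], int]:
    """Return l(x, y), the fault-free minimal path length in a mesh or torus."""
    def dist(x: Node, y: Node) -> int:
        total = 0
        for a, b, k in zip(x, y, dims):
            d = abs(a - b)
            total += min(d, k - d) if torus else d
        return total
    return dist


def reachable(n1: Node, n2: Node, faults: Iterable[Link],
              dist: Callable[[Node, Node], int]) -> bool:
    """True iff no faulty link lies on any minimal path from n1 to n2."""
    length = dist(n1, n2)
    return all(dist(n1, a) + 1 + dist(b, n2) != length and
               dist(n1, b) + 1 + dist(a, n2) != length
               for a, b in faults)


def two_intermediate_nodes(source: Node, dest: Node, dims: Tuple[int, ...],
                           faults: Iterable[Link], torus: bool = False
                           ) -> Optional[Tuple[int, Set[Tuple[Node, Node]]]]:
    """Return (path length, {(I1, I2), ...}) for the shortest valid S-I1-I2-D routes."""
    dist = make_distance(dims, torus)
    faults = list(faults)
    nodes = list(product(*(range(k) for k in dims)))
    from_source = [n for n in nodes if reachable(source, n, faults, dist)]
    to_dest = [n for n in nodes if reachable(n, dest, faults, dist)]
    best_len, best = None, set()
    for i1 in from_source:
        for i2 in to_dest:
            if not reachable(i1, i2, faults, dist):
                continue
            total = dist(source, i1) + dist(i1, i2) + dist(i2, dest)
            if best_len is None or total < best_len:
                best_len, best = total, {(i1, i2)}
            elif total == best_len:
                best.add((i1, i2))
    return None if best_len is None else (best_len, best)
```

On a 3 x 3 mesh with S = (0, 0), D = (0, 2) and the link between (0, 0) and (0, 1) faulty, every minimal path is blocked and no single intermediate node exists, yet this search finds routes such as S, (1, 0), (1, 1), D.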
9. If the bubble flow control mechanism was not used in a torus
topology, two virtual channels would be required for each additional
intermediate node.
Fig. 3. The faults are avoided by the use of an intermediate node. The
shaded area identifies the nodes in $T_0$.
Authorized licensed use limited to: UNIVERSIDAD POLITECNICA DE VALENCIA. Downloaded on November 4, 2009 at 11:13 from IEEE Xplore. Restrictions apply.
