Journal Article

A two-stage switch with load balancing scheme maintaining packet sequence

19 Jun 2006 - IEEE Communications Letters (IEEE) - Vol. 10, Iss. 4, pp. 290-292
TL;DR: A novel load-balancing scheme for two-stage switches, which does not disturb the sequence of packets and can achieve 100% throughput under not only uniform but also non-uniform traffic.
Abstract: In this letter, we propose a novel load-balancing scheme for two-stage switches, which does not disturb the sequence of packets. The proposed scheme uses chamber queues (CQs) in front of the second crossbar fabric as well as VOQs in front of the first crossbar. While the proposed scheme is very simple, it can achieve 100% throughput under not only uniform but also non-uniform traffic. Moreover, the simulation results show that the average delay of packets in the proposed two-stage switch is lower than that of the original two-stage switch.
Citations
Journal Article
TL;DR: A framework for designing feedback-based scheduling algorithms is proposed for elegantly solving the notorious packet missequencing problem of a load-balanced switch and it is shown that the efforts made in load balancing and keeping packets in order can complement each other.
Abstract: A framework for designing feedback-based scheduling algorithms is proposed for elegantly solving the notorious packet missequencing problem of a load-balanced switch. Unlike existing approaches, we show that the efforts made in load balancing and keeping packets in order can complement each other. Specifically, at each middle-stage port between the two switch fabrics of a load-balanced switch, only a single-packet buffer for each virtual output queueing (VOQ) is required. Although packets belonging to the same flow pass through different middle-stage VOQs, the delays they experience at different middle-stage ports will be identical. This is made possible by properly selecting and coordinating the two sequences of switch configurations to form a joint sequence with both staggered symmetry property and in-order packet delivery property. Based on the staggered symmetry property, an efficient feedback mechanism is designed to allow the right middle-stage port occupancy vector to be delivered to the right input port at the right time. As a result, the performance of load balancing as well as the switch throughput is significantly improved. We further extend this feedback mechanism to support the multicabinet implementation of a load-balanced switch, where the propagation delay between switch linecards and switch fabrics is nonnegligible. As compared to the existing load-balanced switch architectures and scheduling algorithms, our solutions impose a modest requirement on switch hardware, but consistently yield better delay-throughput performance. Last but not least, some extensions and refinements are made to address the scalability, implementation, and fairness issues of our solutions.

37 citations


Cites background or methods from "A two-stage switch with load balanc..."

  • ...Load-balanced switches have received a great deal of attention recently [8]–[19] because they are more scalable and can provide close to 100% throughput....

    [...]

  • ...As a result, the duration of a time slot in [17] and [19] would be much longer than that shown in Fig....

    [...]

  • ...Our work in this paper, developed independently, is most closely related to [19]....

    [...]

  • ...If it is used for implementing feedback path (as in [17] and [19]), occupancy vector cannot be piggybacked onto data packet....

    [...]

  • ...Unlike [19], this gives additional flexibility of selecting an optimal joint sequence for a given traffic pattern....

    [...]

Journal Article
TL;DR: The presented results show that, by running a properly constrained scheduling algorithm to avoid or minimize crosstalk, it is possible to operate an AWG-based switch with large port counts without significant performance degradation.
Abstract: Array waveguide grating (AWG)-based optical switching fabrics are receiving increasing attention due to their simplicity and good performance. However, AWGs are affected by coherent crosstalk that can significantly impair system operation when the same wavelength is used simultaneously on several input ports. To permit large port counts in an N × N AWG, a possible solution is to schedule data transmissions across the AWG so as to prevent switch configurations that generate large crosstalk. We study the properties and the existence conditions of switch configurations able to control coherent crosstalk. The presented results show that, by running a properly constrained scheduling algorithm to avoid or minimize crosstalk, it is possible to operate an AWG-based switch with large port counts without significant performance degradation.

15 citations


Cites background from "A two-stage switch with load balanc..."

  • ...To avoid this, either resequencing modules must be introduced at the outputs of the second stage, or more complex queuing structures and policies must be used between the two stages [16]–[18]....

    [...]

Journal Article
TL;DR: A number of simple electro-optic switch architectures based on successive wavelength selection, WDM multiplexing and space switching are evaluated, attempting to achieve scalable switching fabrics with good throughput performance on average, thus lower total insertion losses as well as lower power consumption.

10 citations

Proceedings Article
14 Jun 2009
TL;DR: It is shown that distributed schedulers with predetermined connection patterns can be used to avoid these harmful arrangements, and more realistic port count limits are calculated for both scheduler types.
Abstract: Packet switches with optical fabrics can potentially scale to higher capacities. It is also potentially possible to improve their reliability, and reduce both their footprint and power consumption. A well-known alternative for implementing hardwired switches is Arrayed Waveguide Grating (AWG). Ideally, AWG insertion losses do not depend on the number of input-output ports, meaning that scalability is theoretically infinite. However, accurate second-order assessment has demonstrated that in-band crosstalk exponentially increases the power penalty, limiting the realistic useful size of AWG commercial devices to about 10-15 ports (13-18 dB) [1]. On the other hand, the in-band crosstalk at AWG outputs depends on the connection pattern set by the scheduling algorithm and this port count limitation is calculated for worst-case scenarios. In this paper, we show that distributed schedulers with predetermined connection patterns can be used to avoid these harmful arrangements. We also show that the probability of worst-case patterns is very low, allowing us to set a more realistic port limit for general centralized schedulers and very small losses. With these results, we calculate more realistic port count limits for both scheduler types.

9 citations


Cites background from "A two-stage switch with load balanc..."

  • ...Insertion losses in WR architectures should ideally be independent of the number of input-output ports, in which case scalability would theoretically be infinite....

    [...]

Proceedings Article
08 Dec 2008
TL;DR: This paper shows the asymptotically minimal node degree for any topology to achieve a constant ideal throughput under a uniform traffic pattern when the channel bandwidth is fixed, and introduces a unidirectional direct interconnection topology, named Plus 2^i (P2i), which is the first load-balanced architecture constructed on multi-hop direct interconnection topologies without the packet reordering problem.
Abstract: Load-balanced architectures appear to be a promising way to scale Internet to extra high capacity. However, architectures based on mesh topology have a node-degree of N, which prevents these architectures from large node numbers. This consideration motivates us to study the properties of node degree and its impact on the corresponding load-balanced architectures. In this paper we first show the asymptotically minimal node degree for any topology to achieve a constant ideal throughput under uniform traffic pattern when the channel bandwidth is fixed. We further introduce a unidirectional direct interconnection topology, named Plus 2^i (P2i), with this minimal node degree and prove that it has an ideal throughput of no less than twice the channel bandwidth under uniform traffic pattern. Based on the property, we provide the P2i load-balanced (PLB) architecture. Using this architecture, we show that scalability, 100% throughput and packet ordering can be all achieved and the scheduling algorithm is easy to implement. To the best of our knowledge, this is the first load-balanced architecture constructed on multi-hop direct interconnection topologies without packet reordering problem.

8 citations

References
Journal Article
TL;DR: The main objective of this sequel is to solve the out-of-sequence problem that occurs in the load balanced Birkhoff-von Neumann switch with one-stage buffering by adding a load-balancing buffer in front of the first stage and a resequencing-and-output buffer after the second stage.

328 citations


"A two-stage switch with load balanc..." refers background in this paper

  • ...Most of recent research works on two-stage switches have been focused on solving this out-of-sequence problem [2]–[4]....

    [...]

  • ...The first solution proposed in [2] exploits a resequencing buffer after the second stage....

    [...]

Journal Article
TL;DR: A switch architecture with two-stage switching fabrics and one-stage switching fabrics that scales up with the speed of fiber optics is proposed.

232 citations

Proceedings Article
07 Nov 2002
TL;DR: This paper presents an algorithm called full frames first (FFF), that prevents mis-sequencing while maintaining the performance benefits (in terms of throughput and delay) of the basic two-stage switch.
Abstract: High performance packet switches frequently use a centralized scheduler (also known as an arbiter) to determine the configuration of a non-blocking crossbar. The scheduler often limits the scalability of the system because of the frequency and complexity of its decisions. A paper by C.-S. Chang et al. (2001) introduced an interesting two-stage switch, in which each stage uses a trivial deterministic sequence of configurations. The switch is simple to implement at high speed and has been proved to provide 100% throughput for a broad class of traffic. Furthermore, there is a bound between the average delay of the two-stage switch and that of an ideal output-queued switch. However, in its simplest form, the switch mis-sequences packets by an arbitrary amount. In this paper, building on the two-stage switch, we present an algorithm called full frames first (FFF), that prevents mis-sequencing while maintaining the performance benefits (in terms of throughput and delay) of the basic two-stage switch. FFF comes at some additional cost, which we evaluate in this paper.

141 citations
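
The mis-sequencing described in the abstract above can be reproduced with a very small simulation. The sketch below is illustrative only: it assumes the commonly described round-robin pattern in which, at slot t, input i connects to middle port (i + t) mod N and middle port j connects to output (j + t) mod N. The abstract only says each stage uses "a trivial deterministic sequence of configurations", so the exact pattern, the function name `simulate`, and the traffic scenario are assumptions made for this example.

```python
from collections import deque


def simulate(n, arrivals, slots):
    """Simulate a basic two-stage switch with fixed round-robin patterns.

    arrivals maps a slot number to a list of (input, output, label) tuples.
    Returns a list of (slot, output, label) departures in departure order.
    Assumed pattern (not taken from the source): at slot t, input i is
    connected to middle port (i + t) % n, and middle port j is connected
    to output (j + t) % n.
    """
    input_q = [deque() for _ in range(n)]
    # mid_voq[j][k]: FIFO at middle port j holding packets for output k
    mid_voq = [[deque() for _ in range(n)] for _ in range(n)]
    departures = []
    for t in range(slots):
        # Second stage: each middle port serves the output it is connected to.
        for j in range(n):
            k = (j + t) % n
            if mid_voq[j][k]:
                departures.append((t, k, mid_voq[j][k].popleft()))
        # New packets arrive at the inputs.
        for i, k, label in arrivals.get(t, []):
            input_q[i].append((k, label))
        # First stage: each input forwards its head packet to a middle port.
        for i in range(n):
            if input_q[i]:
                k, label = input_q[i].popleft()
                mid_voq[(i + t) % n][k].append(label)
    return departures


# Flow A (input 0 -> output 0) sends A0 then A1. Cross traffic X and Y
# happens to pile up in front of A0 at middle port 1.
arrivals = {
    2: [(2, 0, "X")],
    3: [(1, 0, "Y")],
    4: [(0, 0, "A0")],
    5: [(0, 0, "A1")],
}
departures = simulate(3, arrivals, slots=12)
order = [label for _, _, label in departures]
print(order)
```

In this scenario A1 is spread to a middle port whose queue for output 0 is empty, so it leaves the switch before A0, which is exactly the kind of reordering that schemes such as FFF are designed to prevent.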

Proceedings Article
01 Dec 2001
TL;DR: A novel architecture, a combined input-crosspoint-output buffered (CIXOB-k, where k is the size of the crosspoint buffer) Switch, which provides 100% throughput under uniform and unbalanced traffic and provides timing relaxation and scalability.
Abstract: We propose a novel architecture, a combined input-crosspoint-output buffered (CIXOB-k, where k is the size of the crosspoint buffer) Switch. CIXOB-k architecture provides 100% throughput under uniform and unbalanced traffic. It also provides timing relaxation and scalability. CIXOB-k is based on a switch with combined input-crosspoint buffering (CIXB-k) and round-robin arbitration. CIXB-k has a better performance than a non-buffered crossbar that uses iSLIP arbitration scheme. CIXOB-k uses a small speedup to provide 100% throughput under unbalanced traffic. We analyze the effect of the crosspoint buffer size and the switch size under uniform and unbalanced traffic for CIXB-k. We also describe solutions for relaxing the crosspoint memory amount and scalability for a CIXOB-k switch with a large number of ports.

129 citations

Proceedings Article
07 Mar 2004
TL;DR: This paper proposes a scalable solution, called the mailbox switch, that solves the out-of-sequence problem in the two-stage switch architecture and proposes a recursive way to construct the switch fabrics for the set of symmetric connection patterns.
Abstract: Traditionally, conflict resolution in an input-buffered switch is solved by finding a matching between inputs and outputs per time slot. To do this, a switch not only needs to gather the information of the virtual output queues at the inputs, but also uses the gathered information to compute a matching. As such, both the communication overhead and the computation overhead make it difficult to scale. Recent works on the two-stage switch architecture in [6], [7], [12], [8] showed that conflict resolution can be easily solved over time and space without communication and computation overhead. However, the main problem of such a two-stage switch architecture is that packets might be out of sequence. The main objective of this paper is to propose a scalable solution, called the mailbox switch, that solves the out-of-sequence problem in the two-stage switch architecture. The key idea of the mailbox switch is to use a set of symmetric connection patterns to create a feedback path for packet departure times. With the information of packet departure times, the mailbox switch can schedule packets so that they depart in the order of their arrivals. Despite the simplicity of the mailbox switch, we show via both the theoretical models and simulations that the throughput of the mailbox switch can be as high as 75%. With limited resequencing delay, a modified version of the mailbox switch achieves 95% throughput. We also propose a recursive way to construct the switch fabrics for the set of symmetric connection patterns. If the number of inputs, N, is a power of 2, we show that the switch fabric for the mailbox switch can be built with (N/2) log2 N 2 × 2 switches.

74 citations
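
Two quantities from the mailbox switch abstract above are easy to check numerically: the (N/2) log2 N count of 2 × 2 switches, and what it means for a connection pattern to be symmetric. The sketch below is a hedged illustration, not the paper's construction. The pattern "input i connects to output (t - i) mod N" is one hypothetical family with the symmetry property, chosen for this example, and all function names are invented here.

```python
def two_by_two_switch_count(n: int) -> int:
    """Evaluate (N/2) * log2(N) for a power-of-two port count N."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2 and at least 2")
    return (n // 2) * (n.bit_length() - 1)


def symmetric_pattern(n: int, t: int) -> list[int]:
    """Hypothetical pattern: input i is connected to output (t - i) % n."""
    return [(t - i) % n for i in range(n)]


def is_symmetric(pattern: list[int]) -> bool:
    """True if 'i connects to j' always implies 'j connects to i'."""
    return all(pattern[j] == i for i, j in enumerate(pattern))


for n in (2, 8, 64):
    print(n, two_by_two_switch_count(n))

pattern = symmetric_pattern(8, t=3)
print(pattern, is_symmetric(pattern))
```

For N = 8 the count evaluates to 4 × 3 = 12. The symmetry check shows why such patterns can carry feedback: whenever port i can send to port j, port j can send back to port i in the same slot.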