Showing papers on "Multistage interconnection networks" published in 1997

Journal Article•DOI•

Low-cost scalable switching solutions for broadband networking: the ATLANTA architecture and chipset

Fabio M. Chiussi, Joseph George Kneuer, Vijay Pochampalli Kumar

01 Dec 1997-IEEE Communications Magazine

TL;DR: The ATLANTA™ switching architecture uses an innovative structure with ingress and egress buffers, where selective backpressure is applied from the fabric to the ingress cards, achieving "sharing" of the distributed buffers and buffer utilization comparable with a centralized shared-memory switch.

Abstract: The ATLANTA™ switching architecture has the following distinguishing characteristics: (1) it is nonblocking, (2) it scales modularly over a wide range of switching and buffering capacities using commonly available implementation technology, (3) it achieves high buffer utilization while using distributed buffers, (4) it has low complexity, and (5) it provides a clear path for future growth in features. The ATLANTA architecture uses an innovative structure with ingress and egress buffers, where selective backpressure is applied from the fabric to the ingress cards. Selective backpressure makes the buffers in the ingress cards act as an extension of the output buffers in the fabric, achieving "sharing" of the distributed buffers and buffer utilization comparable with a centralized shared-memory switch. The advantage is that the majority of the buffers are in the ingress and egress port cards, and are implemented using low-cost off-the-shelf memories regardless of the total switching capacity. Different arrangements are possible for the switch fabric. In the smallest configuration, the fabric consists of a single standalone switching module; for larger switching capacities, the fabric is a modular three-stage memory/space/memory (MSM) arrangement. The ATLANTA architecture provides optimal support of multicast traffic. The ATLANTA chipset provides the complete set of building blocks for implementing ATM switches ranging in capacity from 622 Mb/s to 25 Gb/s. The chipset consists of four chips: two devices to be used in the fabric and two in the port cards. The port devices provide full-duplex ingress and egress functionality at 622 Mb/s port rate (plus the overhead due to the local header used internally to the switch). The physical interface to the incoming/outgoing lines supports the UTOPIA II multiplexing standard, and the port devices manage multiplexing/demultiplexing from/to a maximum of 30 subports per port. Although our current implementation of the architecture is targeted primarily to ATM, the principles behind the architecture are more general, and apply to IP switching and routing technologies.

109 citations

Journal Article•DOI•

Performance analysis of buffering schemes in wormhole routers

Y. M. Boura¹, Chita R. Das²•Institutions (2)

Siemens¹, Pennsylvania State University²

01 Jun 1997-IEEE Transactions on Computers

TL;DR: It is demonstrated that middle buffering with virtual channels provides better performance than input buffering with virtual channels in multistage interconnection networks, two-dimensional meshes, and hypercubes.

Abstract: Wormhole switched input-buffered and middle-buffered routers with virtual channels are analyzed in this paper. Middle buffering refers to the placement of virtual channels between the demultiplexers and multiplexers of a crossbar switch. An analytical model for multistage interconnection networks using middle-buffered switches is developed. In addition, extensive simulation is conducted to assess the performance of the two buffering techniques in different network topologies. The study demonstrates that middle buffering with virtual channels provides better performance than input buffering with virtual channels in multistage interconnection networks, two-dimensional meshes, and hypercubes.

40 citations

Journal Article•DOI•

Performance evaluation of switch-based wormhole networks

Lionel M. Ni¹, Yadong Gui², Sherry Moore³•Institutions (3)

Michigan State University¹, Chinese Academy of Sciences², Sun Microsystems³

01 May 1997-IEEE Transactions on Parallel and Distributed Systems

TL;DR: In this article, four wormhole multistage interconnection networks (MINs) are considered: traditional MINs, dilated MINs (DMINs), MINs with virtual channels (VMINs) and bidirectional MINs.

Abstract: Multistage interconnection networks (MINs) are a popular class of switch-based network architectures for constructing scalable parallel computers. Four wormhole MINs built from k×k switches, where k = 2^j for some j, are considered in this paper: traditional MINs (TMINs), dilated MINs (DMINs), MINs with virtual channels (VMINs), and bidirectional MINs (BMINs). The first three MINs are unidirectional networks, and we show that the cube interconnection pattern can provide contention-free and channel-balanced partitioning of binary cube clusters. BMINs based on butterfly interconnection are essentially a fat tree, and their routing properties are described. Performance comparison among these four networks using simulation experiments is presented with respect to different network traffic patterns. Both DMINs (dilation two) and BMINs have a similar hardware complexity. We conclude that a two-dilated MIN outperforms the corresponding BMIN (or fat tree) for most of the traffic conditions and is a better choice for the design of scalable parallel computers.
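
To make the "k×k switches" parameter concrete, the following is a minimal sketch of how the size of a unidirectional MIN follows from the port count and the switch radix, and of textbook destination-tag routing. It assumes a delta-class (banyan-type) network with N = k^n ports; the function names `min_dimensions` and `destination_tag_route` are illustrative and do not come from the paper.

```python
def min_dimensions(n_ports: int, k: int) -> tuple[int, int]:
    """Return (stages, switches_per_stage) for an n_ports x n_ports MIN
    built from k x k switches. Assumes n_ports is an exact power of k."""
    stages = 0
    size = 1
    while size < n_ports:
        size *= k
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be a power of k")
    return stages, n_ports // k


def destination_tag_route(dest: int, k: int, stages: int) -> list[int]:
    """Output port selected at each stage under destination-tag routing:
    the base-k digits of the destination, most significant digit first."""
    digits = []
    for _ in range(stages):
        digits.append(dest % k)
        dest //= k
    return digits[::-1]


if __name__ == "__main__":
    print(min_dimensions(64, 4))            # (3, 16)
    print(destination_tag_route(27, 4, 3))  # [1, 2, 3]
```

Under these assumptions a 64-port network of 4×4 switches needs three stages of 16 switches each. Dilation and virtual channels, as studied in the paper, add channels per link without changing these counts.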

35 citations

Journal Article•DOI•

The single-queue switch: a building block for switches with programmable scheduling

M.R. Hashemi¹, A. Leon-Garcia•Institutions (1)

University of Toronto¹

01 Jun 1997-IEEE Journal on Selected Areas in Communications

TL;DR: An ATM switch architecture which uses only a single shift-register-type buffering element to store and queue cells, and within the same (physical) queue, switches the cells by organizing them in logical queues destined for different output lines is proposed.

Abstract: We introduce a new approach to ATM switching. We propose an ATM switch architecture which uses only a single shift-register-type buffering element to store and queue cells, and within the same (physical) queue, switches the cells by organizing them in logical queues destined for different output lines. The buffer is also a sequencer which allows flexible ordering of the cells in each logical queue to achieve any appropriate scheduling algorithm. This switch is proposed for use as the building block of large-scale multistage ATM switches because of low hardware complexity and flexibility in providing (per-VC) scheduling among the cells. The switch can also be used as scheduler/controller for RAM-based switches. The single-queue switch implements output queueing and performs full buffer sharing. The hardware complexity is low. The number of input and output lines can vary independently without affecting the switch core. The size of the buffering space can be increased simply by cascading the buffering elements.

32 citations

Proceedings Article•DOI•

OPTIMA: Tb/s ATM switching system architecture

Naoaki Yamanaka, Seisho Yasukawa¹, Eiji Oki¹, Takashi Kurimoto¹•Institutions (1)

Nippon Telegraph and Telephone¹

25 May 1997

TL;DR: The proposed OPTIMA architecture and 640 Gb/s system can be applied to realize future broadband ATM networks and an 8×8 interconnection is realized.

Abstract: A Tb/s throughput ATM switching architecture, OPTIMA, is proposed for a quasi-non-blocking large switch. The switch uses hardware self-rearrangement with a three-stage network; that is, traffic control is automatically performed by hardware. The switch thus acts as a non-blocking switch. In addition, optical wavelength routing is used to avoid interconnection limitations. An 8×8 interconnection is realized that uses 8 wavelengths to transfer 10 Gb/s signals. A 640 Gb/s OPTIMA prototype is described. The proposed OPTIMA architecture and 640 Gb/s system can be applied to realize future broadband ATM networks.

28 citations

Journal Article•DOI•

Optimal software multicast in wormhole-routed multistage networks

Hong Xu¹, Yadong Gui, Lionel M. Ni²•Institutions (2)

Cisco Systems, Inc.¹, Michigan State University²

01 Jun 1997-IEEE Transactions on Parallel and Distributed Systems

TL;DR: The results of implementations on a 64-node SP-1 show that the proposed algorithm significantly outperforms the application-level broadcast primitives provided by currently existing collective communication libraries including the public domain MPI.

Abstract: Multistage interconnection networks are a popular class of interconnection architecture for constructing scalable parallel computers (SPCs). The focus of this paper is on the multistage network system which supports wormhole routed turnaround routing. Existing machines characterized by such a system model include the IBM SP-1 and SP-2, TMC CM-5, and Meiko CS-2. Efficient collective communication among processor nodes is critical to the performance of SPCs. A system-level multicast service, in which the same message is delivered from a source node to an arbitrary number of destination nodes, is fundamental in supporting collective communication primitives including the application-level broadcast, reduction, and barrier synchronization. This paper addresses how to efficiently implement multicast services in wormhole-routed multistage networks, in the absence of hardware multicast support, by exploiting the properties of the turnaround switching technology. An optimal multicast algorithm is proposed. The results of implementations on a 64-node SP-1 show that the proposed algorithm significantly outperforms the application-level broadcast primitives provided by currently existing collective communication libraries including the public domain MPI.
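
As a rough illustration of software (unicast-based) multicast, the sketch below builds a recursive-halving schedule in which every node that already holds the message forwards it once per step, so a multicast to d destinations finishes in ⌈log2(d + 1)⌉ steps. This is a generic scheme, not the paper's algorithm: it ignores how the node ordering must be chosen for the turnaround-routed network so that the sends within a step do not contend. The name `multicast_schedule` is an assumption made for this example.

```python
def multicast_schedule(nodes: list[int]) -> list[list[tuple[int, int]]]:
    """nodes[0] is the source; the rest are destinations in the chosen order.
    Returns one list of (sender, receiver) pairs per communication step."""
    steps = []
    # The first node of every segment already holds the message.
    segments = [nodes]
    while any(len(seg) > 1 for seg in segments):
        sends, next_segments = [], []
        for seg in segments:
            if len(seg) == 1:
                next_segments.append(seg)
                continue
            mid = (len(seg) + 1) // 2
            # The holder hands the upper half of its segment to seg[mid].
            sends.append((seg[0], seg[mid]))
            next_segments.append(seg[:mid])
            next_segments.append(seg[mid:])
        steps.append(sends)
        segments = next_segments
    return steps


if __name__ == "__main__":
    for step, sends in enumerate(multicast_schedule(list(range(8))), start=1):
        print(step, sends)
```

With a source and seven destinations the schedule takes three steps: one send, then two, then four.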

26 citations

Journal Article•DOI•

Performance of multistage bus networks for a distributed shared memory multiprocessor

Laxmi N. Bhuyan¹, Ravi Iyer, T. Askar, Ashwini K. Nanda, M. Kumar•Institutions (1)

Texas A&M University¹

01 Jan 1997-IEEE Transactions on Parallel and Distributed Systems

TL;DR: The authors develop self routing techniques for the various paths, present an algorithm to route a request along the path with minimum distance, and analyze the probabilities of a packet taking different routes to show that the MBN provides similar performance to a BMIN while offering simplicity in hardware and more fault-tolerance than a conventional MIN.

Abstract: A multistage bus network (MBN) is proposed to overcome some of the shortcomings of the conventional multistage interconnection networks (MINs), single bus, and hierarchical bus interconnection networks. The MBN consists of multiple stages of buses connected in a manner similar to the MINs and has the same bandwidth at each stage. A switch in an MBN is similar to a MIN switch except that there is a single bus connection instead of a crossbar. MBNs support bidirectional routing, and there exist a number of paths between any source and destination pair. The authors develop self-routing techniques for the various paths, present an algorithm to route a request along the path with minimum distance, and analyze the probabilities of a packet taking different routes. Further, they derive a performance analysis of a synchronous packet-switched MBN in a distributed shared memory environment and compare the results with those of an equivalent bidirectional MIN (BMIN). Finally, they present the execution time of various applications on the MBN and the BMIN through an execution-driven simulation. They show that the MBN provides similar performance to a BMIN while offering simplicity in hardware and more fault-tolerance than a conventional MIN.

22 citations

Journal Article•DOI•

Design and performance evaluation of a banyan network based interconnection structure for ATM switches

Sema Oktug¹, Mehmet Ufuk Çağlayan•Institutions (1)

Boğaziçi University¹

01 Jun 1997-IEEE Journal on Selected Areas in Communications

TL;DR: It is shown that, for the proposed design, a higher degree of heterogeneity results in better performance than the baseline network and another banyan network based parallel interconnection network.

Abstract: This paper presents a new self-routing packet network called the plane interconnected parallel network (PIPN). In the proposed design, the traffic arriving at the network is shaped and routed through two banyan network based interconnected planes. The interconnections between the planes distribute the incoming load more homogeneously over the network. The throughput of the network under uniform and heterogeneous traffic requirements is studied analytically and by simulation. The results are compared with the results of the baseline network and another banyan network based parallel interconnection network. It is shown that, for the proposed design, a higher degree of heterogeneity results in better performance.

22 citations

Journal Article•DOI•

The performance of the Cedar multistage switching network

Josep Torrellas¹, Zheng Zhang¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Apr 1997-IEEE Transactions on Parallel and Distributed Systems

TL;DR: It is argued that intuitive optimizations for multistage switching networks may not be cost-effective, and changes to increase the network bandwidth at the root of the traffic convergence tree and to delay traffic convergence up until the final stages of the network are suggested.

Abstract: While multistage switching networks for vector multiprocessors have been studied extensively, detailed evaluations of their performance are rare. Indeed, analytical models, simulations with pseudosynthetic loads, studies focused on average-value parameters, and measurements of networks disconnected from the machine, all provide limited information. In this paper, instead, we present an in-depth empirical analysis of a multistage switching network in a realistic setting: We use hardware probes to examine the performance of the omega network of the Cedar shared-memory machine executing real applications. The machine is configured with 16 vector processors. The analysis suggests that the performance of multistage switching networks is limited by traffic nonuniformities. We identify two major nonuniformities that degrade Cedar's performance and are likely to slow down other networks too. The first one is the contention caused by the return messages in a vector access as they converge from the memories to one processor port. This traffic convergence penalizes vector reads and, more importantly, causes tree saturation. The second nonuniformity is the uneven contention delays induced by a relatively fair scheme to resolve message collisions. Based on our observations, we argue that intuitive optimizations for multistage switching networks may not be the most cost-effective ones. Instead, we suggest changes to increase the network bandwidth at the root of the traffic convergence tree and to delay traffic convergence up until the final stages of the network.
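
The traffic convergence described above can be illustrated with a small model. The sketch assumes a textbook omega network of 2×2 switches with destination-tag routing, which is not necessarily the switch size used in Cedar, and counts how many distinct links carry traffic after each stage when every source sends to the same destination. The names `omega_path` and `links_used_per_stage` are hypothetical.

```python
def omega_path(src: int, dest: int, n_bits: int) -> list[int]:
    """Link position occupied after each stage of an omega network with
    2**n_bits ports, assuming 2x2 switches and destination-tag routing."""
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        dest_bit = (dest >> (n_bits - 1 - stage)) & 1
        # Perfect shuffle (rotate left), then the switch sets the low bit
        # to the routing bit, which overwrites the rotated-in bit.
        pos = ((pos << 1) & mask) | dest_bit
        path.append(pos)
    return path


def links_used_per_stage(sources, dest: int, n_bits: int) -> list[int]:
    """Number of distinct links carrying traffic to dest after each stage."""
    paths = [omega_path(s, dest, n_bits) for s in sources]
    return [len({p[stage] for p in paths}) for stage in range(n_bits)]


if __name__ == "__main__":
    # 16 sources all replying to port 5: the paths form a converging tree.
    print(links_used_per_stage(range(16), 5, 4))  # [8, 4, 2, 1]
```

In this model the number of links available to the converging traffic halves at every stage, which is one way to see why added bandwidth matters most near the root of the convergence tree.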

21 citations

Journal Article•DOI•

Design principles for practical self-routing nonblocking switching networks with O(N·log N) bit-complexity

Ted H. Szymanski¹•Institutions (1)

McGill University¹

01 Oct 1997-IEEE Transactions on Computers

TL;DR: The designs of electrical and optical switch cores with Terabits of bisection bandwidth for Networks-of-Workstations (NOWs) are described and meet Shannon's lower bound on memory requirements.

Abstract: Principles for designing practical self-routing nonblocking N×N circuit-switched connection networks with optimal Θ(N·log N) hardware at the bit-level of complexity are described. The overall principles behind the architecture can be described as "Expand-Route-Contract". A self-routing nonblocking network with w-bit wide datapaths can be achieved by expanding the datapaths to w+z independent bit-serial connections, routing these connections through self-routing networks with blocking, and by contracting the data at the output and recovering the w-bit wide datapaths. For an appropriate redundancy z, the blocking probability can be made arbitrarily small and the fault tolerance arbitrarily high. By using efficient space domain concentrators, the architecture yields self-routing nonblocking switching networks with an optimal O(N·log N) bits of memory or O(N·log N·log log log N) logic gates. By using a linear-cost time domain concentrator, the architecture yields self-routing nonblocking switching networks with an optimal Θ(N·log N) bits of memory or logic gates. These designs meet Shannon's lower bound on memory requirements, established in the 1950s. The number of stages of crossbars can match the theoretical minimum, which has not been achieved by previous self-routing networks. The architecture is feasible with existing electrical or optical technologies. The designs of electrical and optical switch cores with Terabits of bisection bandwidth for Networks-of-Workstations (NOWs) are described.

17 citations

Journal Article•DOI•

Design and analysis of high performance multistage interconnection networks

S.K. Bhogavilli¹, H. Abu-Amara²•Institutions (2)

University of Nevada, Las Vegas¹, Nortel²

01 Jan 1997-IEEE Transactions on Computers

TL;DR: A new control design for single queue MINs is proposed that reduces the duration of the clock period by making use of output buffers and acknowledgments and develops an analytical model to compare its performance with the existing designs reported in the literature.

Abstract: Small switching elements are the key components of multistage interconnection networks (MINs) used in multiprocessors and in high speed switching fabrics. Clock design for synchronous MINs is an important issue. The existing models assume that the clock period consists of two parts. The control messages are transferred between switching stages during the first part, and the actual data transfer takes place during the second part. We propose a new control design for single queue MINs that reduces the duration of the clock period by making use of output buffers and acknowledgments. The reduction in the clock period comes from the addition of two-unit output buffers, introducing a sophisticated hardware control mechanism, and sacrificing the FIFO feature. We develop an analytical model to compare its performance with the existing designs reported in the literature. We validate our model with extensive simulation studies.

Book Chapter•DOI•

A General Performance Model for Multistage Interconnection Networks

Christos Bouras¹, John Garofalakis¹, Paul G. Spirakis¹, Vassilis Triantafillou¹•Institutions (1)

Research Academic Computer Technology Institute¹

26 Aug 1997

TL;DR: This paper analyzes the general case of Multistage Interconnection Networks, made of k × k switches with finite, infinite or zero length buffers (unbuffered) and derives an approximation for the steady state distributions in the second stage and beyond.

Abstract: In this paper we analyze the general case of Multistage Interconnection Networks (MINs), made of k × k switches with finite, infinite or zero length buffers (unbuffered). The exact solution of the steady state distribution of the first stage is derived for all cases. We use this to get an approximation for the steady state distributions in the second stage and beyond. In the case of unbuffered switches we reach the known exact solution for all the stages of the MIN. Our results are validated by extensive simulations.
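
For the unbuffered case, the abstract does not spell out the "known exact solution". A plausible reading, taken here as an assumption, is the classical stage-by-stage recurrence for unbuffered delta networks under uniform random traffic: if each input of a k×k switch carries a packet with probability p, each output carries one with probability 1 − (1 − p/k)^k. The sketch below iterates that recurrence; `unbuffered_min_throughput` is an illustrative name.

```python
def unbuffered_min_throughput(k: int, stages: int, input_load: float = 1.0) -> float:
    """Per-output load after `stages` stages of unbuffered k x k switches,
    assuming independent, uniformly addressed requests at every stage."""
    p = input_load
    for _ in range(stages):
        # An output is idle only if none of the k inputs requests it.
        p = 1.0 - (1.0 - p / k) ** k
    return p


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(n, round(unbuffered_min_throughput(2, n), 4))
```

With 2×2 switches and fully loaded inputs, the load drops to 0.75 after one stage and to about 0.609 after two, which shows how quickly unbuffered networks lose throughput as stages are added.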

Journal Article•DOI•

Beam-array combination with planar integrated optics for three-dimensional multistage interconnection networks

Seok Ho Song¹, Jong Sool Jeong¹, El-Hang Lee¹•Institutions (1)

Electronics and Telecommunications Research Institute¹

10 Aug 1997-Applied Optics

TL;DR: Experimental results on the beam combination of signal- and power-beam arrays at a node stage for three-dimensional multistage interconnection networks show the feasibility of cascading operations in the planar integrated optics.

Abstract: We propose a configuration of planar integrated optics for three-dimensional multistage interconnection networks. To show the feasibility of cascading operations in the planar integrated optics, we present experimental results on the beam combination of signal- and power-beam arrays at a node stage. The beam-combination efficiency measured in the experiment is ∼42% of the theoretical limit.

Journal Article•DOI•

Sharing memory in banyan-based ATM switches

D. Basak¹, A.K. Choudhury, E.L. Hahne•Institutions (1)

Bell Labs¹

01 Jun 1997-IEEE Journal on Selected Areas in Communications

TL;DR: A buffer management technique called delayed pushout is applied to a multistage ATM switch in which shared-memory switching elements are arranged in a banyan topology, and a synergy emerges when pushout, backpressure, and this threshold are all employed together.

Abstract: We study a multistage ATM switch in which shared-memory switching elements are arranged in a banyan topology. By "shared-memory," we mean that each switching element uses output queueing and shares its local cell buffer memory among all its output ports. We apply a buffer management technique called delayed pushout that was originally designed for multistage ATM switches with hierarchical topologies. Delayed pushout combines a pushout mechanism, for sharing memory efficiently among queues within the same switching element, and a backpressure mechanism, for sharing memory across switch stages. The backpressure component has a threshold to restrict the amount of sharing between stages. A synergy emerges when pushout, backpressure, and this threshold are all employed together. Using a computer simulation of the switch under bursty traffic, we study delayed pushout as well as several simpler pushout and backpressure schemes under a variety of traffic conditions. Of the five schemes we simulate, delayed pushout is the only one that performs well under all load conditions.
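
The pushout component on its own can be sketched as follows. This is a plain pushout policy for a single shared-memory switching element, not the authors' delayed pushout: it has no backpressure between stages and no threshold, and the choice to evict the tail cell of the longest queue is an assumption made for illustration. The class name `SharedMemoryElement` is hypothetical.

```python
from collections import deque


class SharedMemoryElement:
    """Output-queued switching element whose queues share one cell buffer."""

    def __init__(self, n_outputs: int, capacity: int):
        self.queues = [deque() for _ in range(n_outputs)]
        self.capacity = capacity
        self.dropped = 0

    def occupancy(self) -> int:
        return sum(len(q) for q in self.queues)

    def enqueue(self, cell, output: int) -> None:
        if self.occupancy() >= self.capacity:
            longest = max(range(len(self.queues)),
                          key=lambda i: len(self.queues[i]))
            self.dropped += 1
            if len(self.queues[longest]) <= len(self.queues[output]):
                return  # the arriving cell targets a longest queue: drop it
            # Otherwise push out the tail cell of the longest queue.
            self.queues[longest].pop()
        self.queues[output].append(cell)

    def dequeue(self, output: int):
        q = self.queues[output]
        return q.popleft() if q else None


if __name__ == "__main__":
    element = SharedMemoryElement(n_outputs=2, capacity=4)
    for cell, out in [("a", 0), ("b", 0), ("c", 0), ("d", 1), ("e", 1)]:
        element.enqueue(cell, out)
    print([list(q) for q in element.queues], element.dropped)
```

In the usage example the fifth cell arrives at a full buffer and displaces a cell from the longer queue, so memory drifts toward an even split between the two outputs.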

Proceedings Article•DOI•

Tree-based multicasting on wormhole routed multistage interconnection networks

V. Varavithy¹, P. Mohapatra•Institutions (1)

Iowa State University¹

11 Aug 1997

TL;DR: An asynchronous tree-based multicasting algorithm is developed in which deadlocks are prevented by serializing the initiations of branching operations that have potential for creating deadlocks.

Abstract: In this paper, we propose a tree-based multicasting algorithm for Multistage Interconnection Networks. We first analyze the necessary conditions for deadlocks in MINs. Based on these observations, an asynchronous tree-based multicasting algorithm is developed in which deadlocks are prevented by serializing the initiations of branching operations that have potential for creating deadlocks. The serialization is done using a technique based on grouping of the switching elements. The preliminary simulation results are encouraging: the algorithm lowers the latency by almost a factor of 4 when compared with the software multicasting approach proposed earlier.

Proceedings Article•DOI•

BATMAN: a new architectural design of a very large next generation gigabit switch

Muh-rong Yang, Gin-Kou Ma

08 Jun 1997

TL;DR: A new architectural design of a very large next generation gigabit switch, called BATMAN (Banyan ATM Architectural Network), is introduced, which allows for the modular growth of its size from small to very large dimensions without sacrificing its overall delay/throughput performance.

Abstract: In spite of recent advances in technology, the limitation on switching size is the primary implementation constraint. Practical dimensions are limited to the small size of a module. To build a larger switch, multiple modules are interconnected in a multistage configuration. Moreover, the internal switching fabrics of these interconnected modules are usually sped up to a higher data rate in order to reduce excessive queuing delay. In this paper, a new architectural design of a very large next generation gigabit switch, called BATMAN (Banyan ATM Architectural Network), is introduced. The proposed switch has the structure of an N×N Banyan network, recursively followed by 2^k groups of shared buffers and N/2^k 2^k×2^k Banyan networks, where k is incremented from 1 to [log_2 N / log_2 ρ], and ρ is the speed-up factor. In its simplest form, it has the structure of an N×N Banyan network, N/4 groups of shared buffers, and N/4 4×4 Banyan routing networks. In each Banyan network module, universal packet timeslot (UPTS) is adopted. Because the hardware complexity of the proposed switch architecture is low, the architecture allows for the modular growth of its size from small to very large dimensions without sacrificing its overall delay/throughput performance.

Proceedings Article•DOI•

Performance of buffered multistage interconnection networks in case of packet multicasting

Dietmar Tutsch, Günter Hommel

19 Mar 1997

TL;DR: A timed Petri net model is used to derive the performance of buffered Banyan networks, in which messages may also be multicasted, and the automatic generation of timed Petri net models is possible for arbitrary destination patterns of the packets.

Abstract: Multistage Banyan networks are frequently proposed as connections in multiprocessor systems. There exist several studies to determine the performance of networks in which messages are unicasted (one processor sends a message to one and only one other processor). In this paper, a timed Petri net model is used to derive the performance of buffered Banyan networks, in which messages may also be multicasted (one processor can send a message to more than one other processor). We consider a Banyan network with 2×2 switches and the two cases of complete and partial broadcasting within the switching elements. An algorithm is presented to calculate the destination distribution in all network stages for arbitrary destination patterns of incoming uniform packet traffic. Thus, the automatic generation of timed Petri net models is possible for arbitrary destination patterns of the packets. The dependency upon the network size is also considered.

Journal Article•DOI•

Evaluation of multi-queue buffered multistage interconnection networks under uniform and non-uniform traffic patterns

Jianxun Jason Ding¹, Laxmi N. Bhuyan²•Institutions (2)

Intel¹, Texas A&M University²

01 Jul 1997-International Journal of Systems Science

TL;DR: A unified model for analysing multistage interconnection networks with multi-queue buffered strategies shows that the DAFC scheme has the best performance over all the four buffer allocation schemes under both uniform and non-uniform load.

Abstract: This paper presents a unified model for analysing multistage interconnection networks with multi-queue buffered strategies. Buffering strategies include SAFC (Statically Allocated Fully Connected), SAMQ (Statically Allocated Multi-Queue), DAMQ (Dynamically Allocated Multi-Queue), and DAFC (Dynamically Allocated Fully Connected) schemes. We develop a unified model to evaluate the performance of all these buffer allocation schemes under the uniform and non-uniform traffic patterns. The analytical model is validated through extensive simulations. Using the unified model, we conducted performance comparisons for the four buffer allocation schemes under both uniform and non-uniform traffic load. It is shown that the DAFC scheme has the best performance over all the four buffer allocation schemes under both uniform and non-uniform load.

Proceedings Article•DOI•

A study of an SCI switch fabric

H. Liebhart¹, E. Brenner², A. Bogaerts³•Institutions (3)

Information Technology University¹, Graz University of Technology², CERN³

12 Jan 1997

TL;DR: The scalable coherent interface (SCI) defines a high-speed interconnect system that provides a coherent memory system and specifies a topology-independent communication protocol with the possibility of connecting up to 64 K nodes.

Abstract: The scalable coherent interface (SCI) defines a high-speed interconnect system that provides a coherent memory system. It specifies a topology-independent communication protocol with the possibility of connecting up to 64 K nodes. SCI switches are the key components in building large SCI systems effectively. An SCI switch which uses several internal buses is studied as well as more complex systems composed of several switches. Computer simulations are used to compare the different models and to determine system parameters.

Characterizing bit permutation networks

Gerard J. Chang, Frank K. Hwang, Li-Da Tong

01 Jan 1997

TL;DR: This paper studied a more general class of networks, which is called (m + 1)-stage d-nary bit permutation networks, and characterized the equivalence of such networks by a sequence of positive integers.

Abstract: In recent years, many multistage interconnection networks using 2 × 2 switching elements have been proposed for parallel architectures. Typical examples are baseline networks, banyan networks, shuffle-exchange networks, and their inverses. As these networks are blocking, such networks with extra stages have also been studied extensively. These include Benes networks and Δ ⊕ Δ' networks. Recently, Hwang et al. studied k-extra-stage networks, which are a generalization of the above networks. They also investigated the equivalence issue among some of these networks. In this paper, we study a more general class of networks, which we call (m + 1)-stage d-nary bit permutation networks. We characterize the equivalence of such networks by a sequence of positive integers.
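
To illustrate what a bit permutation is in this setting, the sketch below applies a permutation of address-bit positions to every link address and shows that the perfect shuffle used between stages of shuffle-exchange networks is simply a left rotation of the address bits. It covers only the binary case (d = 2) and is not taken from the paper; `permute_bits` and `perfect_shuffle_perm` are illustrative names.

```python
def permute_bits(address: int, perm: list[int]) -> int:
    """Return the address whose bit i equals bit perm[i] of `address`
    (bit 0 is the least significant bit)."""
    result = 0
    for i, src in enumerate(perm):
        result |= ((address >> src) & 1) << i
    return result


def perfect_shuffle_perm(n_bits: int) -> list[int]:
    """Bit permutation of the perfect shuffle: a left rotation, so new
    bit i is taken from old bit i - 1 (mod n_bits)."""
    return [(i - 1) % n_bits for i in range(n_bits)]


if __name__ == "__main__":
    perm = perfect_shuffle_perm(3)
    print([permute_bits(a, perm) for a in range(8)])
    # [0, 2, 4, 6, 1, 3, 5, 7]
```

Describing inter-stage wiring by such permutations of bit positions is what lets networks of this kind be compared through small combinatorial descriptions rather than full link lists.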

Proceedings Article•DOI•

Two-bounce free-space arbitrary interconnection architecture

Marc P. Christensen¹, Michael W. Haney¹•Institutions (1)

George Mason University¹

22 Jun 1997

TL;DR: In this article, the two bounce free-space arbitrary interconnection architecture is introduced, which combines the global optical interconnection with the minimum nonblocking multistage interconnection network, the Benes network, to achieve arbitrary interconnections across a multichip backplane.

Abstract: The two-bounce free-space arbitrary interconnection architecture is introduced. It requires 3 stages of local electronic routing and 2 passes, or bounces, through a common retro-reflective optical system. The concept combines the global optical interconnection with the minimum nonblocking multistage interconnection network, the Benes network, to achieve arbitrary interconnections across a multichip backplane. The arbitrary interconnection requires only one additional pass through the optical system. The architecture is experimentally validated with an optical module and a fiber-coupled LED and detector array to simulate the smart pixel I/O placement in the backplane of the module. The architecture is further evaluated using VCSEL arrays and a CCD camera for resolution and registration measurements.

Proceedings Article•DOI•

Generalized non-blocking copy networks

P.P. To¹, Tony T. Lee•Institutions (1)

The Chinese University of Hong Kong¹

08 Jun 1997

TL;DR: It is shown that if the set of input connection requests is ordered, the broadcast Clos network is non-blocking and route assignment can be done by using the rank of each connection request, and the proposed copy network is the generalization of Lee's architecture (1988).

Abstract: A generalized non-blocking copy network based on a broadcast Clos (1953) network is proposed. We show that if the set of input connection requests is ordered, the broadcast Clos network is non-blocking and route assignment can be done by using the rank of each connection request. Packet replications and routing are achieved by the generalized interval splitting algorithm. We show that the broadcast Clos network can be considered as the cascade combination of a reverse omega network and a broadcast omega network. The construction of copy network is therefore no longer limited to 2×2 switching elements. By recursively constructing the reverse omega and the omega networks using 2×2 switching elements, we show that the proposed copy network is the generalization of Lee's architecture (1988).

Proceedings Article•DOI•

Performance and complexity of multicast cross-path ATM switches

R.H. Lin¹, C.H. Lam, T.T. Lee•Institutions (1)

The Chinese University of Hong Kong¹

09 Apr 1997

TL;DR: A newly proposed large-scale ATM switch called the cross-path switch has been shown to be capable of handling multirate traffic efficiently and it is observed that, to achieve the same throughput and loss requirement, the second architecture may require fewer switching elements than the first one.

Abstract: A newly proposed large-scale ATM switch called the cross-path switch has been shown to be capable of handling multirate traffic efficiently. We study two replication approaches to enhance the switch to support multicasting. The first approach replicates multicast cells at both the input and output stages, while the second one replicates cells at the input stage only. A feasible configuration for each scheme is considered and the effect of multicast traffic on the switch performance in terms of the throughput and cell loss probability is studied. We observed that, to achieve the same throughput and loss requirement, the second architecture may require fewer switching elements than the first one.

Journal Article•DOI•

An analytical model on network blocking probability

Yuanyuan Yang

01 Sep 1997-IEEE Communications Letters

TL;DR: A new analytical model on the blocking probability of the three-stage Clos (1953) network is presented that can more accurately describe the blocking behavior of the network and is consistent with the deterministic nonblocking condition.

Abstract: We present a new analytical model on the blocking probability of the three-stage Clos (1953) network. Due to the effect of approximations, a common problem with previously proposed analytical models is that they may not be very accurate in some cases. In particular, the blocking probability in these models contradicts the well-known deterministic nonblocking condition for the Clos network. The most notable feature of the newly proposed model is that it can more accurately describe the blocking behavior of the network and is consistent with the deterministic nonblocking condition.
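
To make the kind of contradiction described above concrete, the following minimal sketch compares the deterministic strict-sense nonblocking condition for a symmetric three-stage Clos network, m ≥ 2n − 1 (Clos, 1953), with one classical approximation, Lee's link-independence formula. The abstract does not name the earlier models it criticizes, so the choice of Lee's formula is an assumption made for illustration. The function names and the parameter `a` (occupancy of each input link) are hypothetical, and this is not the model proposed in the paper.

```python
def is_strictly_nonblocking(n: int, m: int) -> bool:
    """Clos (1953) condition for a symmetric three-stage network.

    n: inputs per first-stage switch, m: number of middle-stage switches.
    """
    return m >= 2 * n - 1


def lee_blocking_probability(n: int, m: int, a: float) -> float:
    """Lee's approximation (assumed here, not taken from the paper).

    a: probability that an input link is busy (0 <= a <= 1).
    Each inter-stage link is assumed busy independently with
    probability p = a * n / m; a path through one middle switch is
    free only if both of its links are free.
    """
    p = a * n / m
    if not 0.0 <= p <= 1.0:
        raise ValueError("link occupancy a * n / m must lie in [0, 1]")
    return (1.0 - (1.0 - p) ** 2) ** m


if __name__ == "__main__":
    n, m, a = 4, 7, 0.9  # m = 2n - 1, so the network cannot block
    print(is_strictly_nonblocking(n, m))
    print(lee_blocking_probability(n, m, a))  # still strictly positive
```

With n = 4 and m = 7 the deterministic condition holds, yet the approximation still returns a positive blocking probability for any a > 0, which illustrates the inconsistency an improved model would need to remove.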

Proceedings Article•DOI•

The Dual-Banyan (DB) switch: a high-performance buffered-Banyan ATM switch

C. Kolias¹, Leonard Kleinrock•Institutions (1)

University of California, Los Angeles¹

08 Jun 1997

TL;DR: A high-performance buffered-Banyan switch which encompasses multiple input-queueing as its buffering strategy is presented and described, and simulation results are given to demonstrate its throughput, mean waiting time and cell-loss performance considering different switch and buffer sizes.

Abstract: Multistage interconnection networks (MINs) are very popular in ATM switching since they can achieve high-performance switching and are easy to implement and expand due to their modular design. In this paper we present and describe in detail a high-performance buffered-Banyan switch which encompasses multiple input-queueing as its buffering strategy. We call this switching architecture Dual-Banyan switch. Simulation results are given to demonstrate its throughput, mean waiting time and cell-loss performance considering different switch and buffer sizes. We further compare it to the simple, single-queue buffered Banyan network, assuming, for reasons of fairness, the same total buffer capacity with respect to uniform and non-uniform traffic patterns.

Journal Article•DOI•

Optical TDM sorting networks for high-speed switching

R. Kannan¹, Daeshik Lee, K.Y. Lee, Harry F. Jordan•Institutions (1)

University of Michigan¹

01 Jun 1997-IEEE Transactions on Communications

TL;DR: Two multichannel time slot sorters which sort N² time-division multiplexed (TDM) optical inputs, arranged as N frames with N time slots per frame using O(N log² N) optical switch elements are proposed.

Abstract: The general time-space-time switching problem in telecommunications requires the use of multichannel time slot interchangers. We propose two multichannel time slot sorters which sort N² time-division multiplexed (TDM) optical inputs, arranged as N frames with N time slots per frame using O(N log² N) optical switch elements. The TDM optical inputs are sorted in place without expanding the space-time fabric into a space-division switch. The hardware components used are 2×2 optical switches (LiNbO₃ directional couplers) and optical delay lines connected in a feedforward fashion. Two space-time variants of the spatial odd-even merge algorithm are used to design the sorters. By maintaining the number of shift-exchange operations invariant at each stage, the proposed sorters use fewer switches than previously proposed sorters using switches with feedback line delays. The use of local control at each 2×2 switch makes the proposed sorters more practical for high-speed optical inputs than Benes-based time slot permuters with global control and high latency, which affects interframe distance. Both time slot sorters support pipelining of input frames and sorted outputs are available at each time slot after an initial frame delay. The proposed sorters find practical application in the time-domain equivalents of space-division, nonblocking, self-routing packet switches using the sort-banyan architecture, such as the Starlite switch, Sunshine switch, etc.
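
As a point of reference for the spatial odd-even merge algorithm mentioned above, the sketch below generates the comparator list of Batcher's classical odd-even merge sorting network for a power-of-two number of inputs and applies it to a list. Each comparator plays the role of a locally controlled 2×2 compare-exchange element. This is only the textbook spatial network; it does not model the paper's space-time variants, optical delay lines, or frame structure, and all function names are illustrative.

```python
def oddeven_merge(lo, hi, r):
    """Yield comparators merging two sorted halves of [lo, hi] (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)        # even subsequence
        yield from oddeven_merge(lo + r, hi, step)    # odd subsequence
        for i in range(lo + r, hi - r, step):         # final clean-up layer
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def oddeven_merge_sort_range(lo, hi):
    """Yield comparators that sort positions lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort_range(lo, mid)
        yield from oddeven_merge_sort_range(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def oddeven_merge_sort(length):
    """Comparator list for `length` inputs; length must be a power of two."""
    if length < 1 or length & (length - 1):
        raise ValueError("length must be a power of two")
    yield from oddeven_merge_sort_range(0, length - 1)


def apply_network(values):
    """Run the fixed comparator sequence on a copy of `values`."""
    out = list(values)
    for i, j in oddeven_merge_sort(len(out)):
        if out[i] > out[j]:          # local decision at one 2x2 element
            out[i], out[j] = out[j], out[i]
    return out


if __name__ == "__main__":
    print(apply_network([7, 3, 6, 0, 5, 1, 4, 2]))
    print(len(list(oddeven_merge_sort(8))))  # number of comparators used
```

The comparator sequence is fixed in advance and independent of the data, which is why each element can be controlled locally; the total number of comparators grows as O(n log² n) in the number of inputs n.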

Proceedings Article•DOI•

Performance and implementation aspects of higher order head-of-line blocking switch boxes

M. Jurczyk¹•Institutions (1)

Purdue University¹

11 Aug 1997

TL;DR: An analytical upper bound of the achievable network bandwidth under nonuniform traffic patterns is derived and compared to simulation results and it is discussed how central memory buffered switch boxes can be efficiently changed into higher order HOL-blocking switch boxes through only minor changes in the switch box control path.

Abstract: Nonuniform traffic can degrade the overall performance of multistage interconnection networks substantially. In the literature, this performance degradation has been traced back to higher order head-of-line blocking (higher order HOL-blocking) effects within the network. This paper further elaborates on higher order HOL-blocking networks, on their performance under nonuniform traffic patterns, and on methods for efficiently implementing switch boxes to construct higher order HOL-blocking networks. An analytical upper bound of the achievable network bandwidth under nonuniform traffic patterns is derived and compared to simulation results. Furthermore, it is discussed how central memory buffered switch boxes can be efficiently changed into higher order HOL-blocking switch boxes through only minor changes in the switch box control path. With those switch boxes, high network performance under nonuniform traffic patterns can be achieved with regular hardware effort.

Journal Article•DOI•

Interconnection network front-end controller combining to reduce hot spots effects

Hamid Sharif¹, Hamid Vakilzadian¹•Institutions (1)

University of Nebraska–Lincoln¹

01 Nov 1997-Computer Communications

TL;DR: This paper proposes a new request combining based architecture to reduce the hot spot performance degradation in multistage interconnection networks, referred to as interconnection network front-end controller combining (IN-FEC).

Proceedings Article•DOI•

A growable ATM switch with embedded multi-channel/multicasting property

Keun-Bae Kim¹, Peter Yifey Yan, Kyeong Soo Kim, Otto Schmid, Paul S. Min (+1 more)•Institutions (1)

Electronics and Telecommunications Research Institute¹

03 Nov 1997

TL;DR: This work proposes a high-capacity switch network called the Multi-channel ATM Switch with Crossbar Oriented Network (MASCON), implements it as a custom ASIC, and constructs multi-stage interconnected networks (MINs) using the multi-channel concept to give greater throughput than is possible with MINs constructed from single-channel modules.

Abstract: We propose a high capacity switch network called the Multi-channel ATM Switch with Crossbar Oriented Network (MASCON). Implementing this switch network as a custom ASIC, we construct multi-stage interconnected networks (MINs) using the multi-channel concept to give greater throughput than that possible from MINs constructed from single-channel modules. Flexible multi-channel switching is supported in which any k ports may be grouped logically to form a higher bandwidth pipe. Multi-channel switching results in improved cell loss and delay performance due to the economy of scale in aggregating shared resources. MASCON is an internally non-blocking switch. Sixteen inputs/outputs are implemented with a 622 Mbps port speed at each port for a total module capacity of 10 Gbps. A fully shared buffer is incorporated into the design. MASCON's shared buffering can interact well with MIN input and output buffering through the use of our backpressure flow-control scheme.
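
The quoted module capacity follows directly from the port count and port speed given above. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the figure is treated as raw aggregate port bandwidth, which is an assumption since the abstract does not say how overhead is counted.

```python
def module_capacity_mbps(ports: int, port_speed_mbps: int) -> int:
    """Aggregate capacity as ports times per-port speed (assumed definition)."""
    return ports * port_speed_mbps


if __name__ == "__main__":
    total = module_capacity_mbps(16, 622)
    # 16 * 622 Mbps = 9952 Mbps, which the abstract rounds to 10 Gbps.
    print(total, "Mbps, about", round(total / 1000), "Gbps")
```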

Proceedings Article•DOI•

An LSI implementation of the simple serial synchronized multistage interconnection network

T. Kamei¹, M. Sasahara, Hideharu Amano•Institutions (1)

Keio University¹

28 Jan 1997

TL;DR: The SSS-PBSF chip uses the PBSF connection structure, which connects banyan networks in a 3D direction to obtain a higher bandwidth than a crossbar, and adopts a simple serial synchronized (SSS) control mechanism to solve the pin-limitation problem.

Abstract: A high-speed switch is a critical component of multiprocessors. Multistage interconnection networks (MINs) have been utilized as switches for connecting processors and memory modules in multiprocessors. Unlike the crossbar, a MIN consists of small switching elements and provides a high bandwidth with relatively small hardware. Most traditional MINs are blocking networks, and packets are transferred in a store-and-forward manner between switching elements over bit-parallel (8-64 bit) lines. Because the width of the communication paths and the transfer manner cause pin-limitation problems and a complicated structure, high-density implementation and high-speed clocks cannot be exploited. In order to solve these problems, we implemented the SSS-PBSF chip. This switch uses the PBSF connection structure, which can obtain a higher bandwidth than that of a crossbar by connecting banyan networks in a 3D direction. A simple serial synchronized (SSS) style control mechanism is adopted both for high-speed operation and for solving the pin-limitation problem.
