
Showing papers on "Latency (engineering) published in 2003"


Proceedings ArticleDOI
22 Mar 2003
TL;DR: An experiment investigating the effect of latency on other metrics of VE effectiveness (physiological response, simulator sickness, and self-reported sense of presence) found that participants in the low latency condition had a higher self-reported sense of presence and a statistically higher change in heart rate between the two rooms.
Abstract: Previous research has shown that even low end-to-end latency can have adverse effects on performance in virtual environments (VE). This paper reports on an experiment investigating the effect of latency on other metrics of VE effectiveness: physiological response, simulator sickness, and self-reported sense of presence. The VE used in the study includes two rooms: the first is normal and non-threatening; the second is designed to evoke a fear/stress response. Participants were assigned to either a low latency (~50 ms) or high latency (~90 ms) group. Participants in the low latency condition had a higher self-reported sense of presence and a statistically higher change in heart rate between the two rooms than did those in the high latency condition. There were no significant relationships between latency and simulator sickness.

278 citations


Proceedings ArticleDOI
01 Dec 2003
TL;DR: This paper proposes a heuristic solution for the problem of minimum-energy convergecast which also works toward minimizing data latency; surprisingly, the results show that this algorithm's performance for broadcasting is better than that of other broadcast techniques.
Abstract: In wireless sensor networks (WSN), the dissemination of data among various sensors (broadcast) and the collection of data from all sensors (convergecast or data aggregation) are common communication operations. With increasing demands on efficient use of battery power, many efficient broadcast tree construction and channel allocation algorithms have been proposed. Generally, convergecast is preceded by broadcast, hence the tree used for broadcast is also used for convergecast. Our research shows that this approach is inefficient in terms of latency and energy consumption. In this paper we propose a heuristic solution for the problem of minimum-energy convergecast which also works toward minimizing data latency. The algorithm constructs a tree using a greedy approach, where new nodes are added to the tree such that the weight of the branch to which a node is added is minimized. The algorithm then allocates direct sequence spread spectrum or frequency hopping spread spectrum codes. Simulation results show that the energy consumed and the communication latency of our approach are lower than those of some existing approaches for convergecast. We have then used our algorithm to perform broadcast. Surprisingly, our results show that this algorithm's performance for broadcasting is better than that of other broadcast techniques.
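The greedy rule described in the abstract (attach each new node to the branch currently carrying the least weight) can be sketched as follows. This is a minimal illustration of that idea, not the authors' algorithm: the `reachable` predicate, the unit traffic load, and the attachment order are all assumptions.

```python
# Hypothetical sketch of greedy convergecast-tree construction: each new
# node attaches to the reachable tree node whose branch (path to the sink)
# currently carries the least aggregated load.

def build_tree(sink, nodes, reachable):
    """reachable(u, v) -> True if node u can communicate with node v."""
    parent = {sink: None}
    load = {sink: 0}          # units of traffic forwarded through each node
    for n in nodes:
        candidates = [t for t in parent if reachable(n, t)]
        # pick the attachment point on the lightest branch
        best = min(candidates, key=lambda t: load[t])
        parent[n] = best
        load[n] = 0
        # one unit of traffic from n now flows through every ancestor
        t = best
        while t is not None:
            load[t] += 1
            t = parent[t]
    return parent
```

With full reachability this rule spreads load by deepening the tree; with realistic radio ranges it balances traffic across the sink's branches.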

84 citations


Proceedings ArticleDOI
01 Nov 2003
TL;DR: The present design allows us to instantiate arbitrary network topologies and offers low latency and high throughput; it is part of the platform the authors are developing for reconfigurable systems.
Abstract: An efficient methodology for building the billion-transistor systems on chip of tomorrow is a necessity. Networks on chip promise to be the solution to the numerous technological, economic and productivity problems. We believe that different types of networks are required for each application domain. Our approach therefore is to have a very flexible, highly scalable network design that easily accommodates the various needs. This paper presents the design of our network on chip, which is part of the platform we are developing for reconfigurable systems. The present design allows us to instantiate arbitrary network topologies and offers low latency and high throughput.

72 citations


Proceedings ArticleDOI
22 Apr 2003
TL;DR: The implementation of shielded processors in RedHawk Linux and their benefits are described and the results of real time performance benchmarks are presented.
Abstract: The low latency and preemption patches represent significant progress toward making standard Linux a more responsive system for real-time applications. These patches allow guarantees on worst-case interrupt response time of slightly above a millisecond. However, these guarantees can only be met when there is no networking or graphics activity in the system. This paper describes the implementation of shielded processors in RedHawk Linux and their benefits. It also presents the results of real-time performance benchmarks. Interrupt response time guarantees are significantly below one millisecond and can be met even in the presence of networking and graphics activity.

46 citations


Proceedings ArticleDOI
11 May 2003
TL;DR: The authors propose the use of OBS to realize a geographically distributed packet switch for metro rings by combining a multi-token based protocol for contention-free and loss-free transmission of bursts, known as the lightring protocol, with the creation of bursts that contain packets belonging to multiple traffic flows.
Abstract: Optical burst switching (OBS) provides statistical multiplexing capabilities at the optical layer with relaxed hardware requirements when compared to optical packet switching. One of the open challenges of OBS is to assemble as many packets as possible in the same burst, while at the same time ensuring low latency of the transmitted packets. The authors propose the use of OBS to realize a geographically distributed packet switch for metro rings. High efficiency of ring bandwidth use and low packet latency are obtained at the ring node by combining a multi-token based protocol for contention-free and loss-free transmission of bursts, known as the lightring protocol, with the creation of bursts that contain packets belonging to multiple traffic flows (classified by priority and destination). As illustrated in the paper, the proposed solution yields throughput that is significantly higher than that offered by a centralized packet switch connected to the ring nodes via dedicated optical circuits. Latency of real-time packets is kept to a few dozen milliseconds under a variety of network scenarios. The solution scales well geographically for metro applications.
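The tension the abstract describes, packing many packets per burst while keeping latency low, is commonly resolved with a size threshold plus a maximum-delay timer. The sketch below illustrates that generic mechanism under assumed thresholds; it is not the paper's lightring protocol, and the per-(priority, destination) flow keying merely mirrors the classification mentioned above.

```python
# Illustrative timer/size-threshold burst assembly: packets queue per
# (priority, destination) flow; a burst is emitted when it reaches
# max_bytes, or when its oldest packet has waited max_delay.

class BurstAssembler:
    def __init__(self, max_bytes, max_delay):
        self.max_bytes, self.max_delay = max_bytes, max_delay
        self.queues = {}   # (priority, dest) -> (first_arrival_time, [sizes])

    def add(self, now, priority, dest, size):
        key = (priority, dest)
        first, sizes = self.queues.get(key, (now, []))
        sizes.append(size)
        self.queues[key] = (first, sizes)
        if sum(sizes) >= self.max_bytes:      # size threshold reached
            return self._emit(key)
        return None

    def tick(self, now):
        """Emit every burst whose oldest packet exceeded max_delay."""
        due = [k for k, (first, _) in self.queues.items()
               if now - first >= self.max_delay]
        return [self._emit(k) for k in due]

    def _emit(self, key):
        first, sizes = self.queues.pop(key)
        return (key, len(sizes), sum(sizes))  # (flow, packet count, bytes)
```

The timer bounds worst-case packet latency; the size threshold bounds burst overhead.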

31 citations


Patent
Da-Shan Shiu1, Li Zhang1, Eugene Sy1
24 Apr 2003
TL;DR: In this paper, a controller receives a frequency switch command and generates a signal at a time determined in accordance with a system timer, with DC cancellation control and gain control, with signaling to control the iterations without need for processor intervention.
Abstract: Techniques for improved low latency frequency switching are disclosed. In one embodiment, a controller receives a frequency switch command and generates a frequency switch signal at a time determined in accordance with a system timer. In another embodiment, gain calibration is initiated subsequent to the frequency switch signal delayed by the expected frequency synthesizer settling time. In yet another embodiment, DC cancellation control and gain control are iterated to perform gain calibration, with signaling to control the iterations without need for processor intervention. Various other embodiments are also presented. Aspects of the embodiments disclosed may yield the benefit of reducing latency during frequency switching, allowing for increased measurements at alternate frequencies, reduced time spent on alternate frequencies, and the capacity and throughput improvements that follow from minimization of disruption of an active communication session and improved neighbor selection.

30 citations


Patent
06 Oct 2003
TL;DR: In this article, a technique that enables a shared communications medium to achieve an increased data rate under lossy conditions while maintaining low latency is disclosed, which incorporates two aspects that enable the improved performance.
Abstract: A technique that enables a shared communications medium to achieve an increased data rate under lossy conditions while maintaining low latency is disclosed. The technique incorporates two aspects that enable the improved performance. The first aspect comprises the rigorous use of a single message flow between two stations at any given time with interframe spaces that are adjusted to allow an uninterrupted flow of frames. An admission control protocol enforces the single flow. The second aspect is the creation of high shared channel utilization (i.e., 'efficiency'). Efficiency is achieved by generating enough opportunities for stations to get on the air, in part by minimizing backoff intervals when a priority flow is needed.

27 citations


Proceedings ArticleDOI
15 Sep 2003
TL;DR: The performance results show that all three interconnects achieve low latency, high bandwidth and low host overhead, however, they show quite different performance behaviors when handling completion notification, unbalanced communication patterns and different communication buffer reuse patterns.
Abstract: In this paper we present a comprehensive performance evaluation of three high speed cluster interconnects: InfiniBand, Myrinet and Quadrics. We propose a set of micro-benchmarks to characterize different performance aspects of these interconnects. Our micro-benchmark suite includes not only traditional tests and performance parameters, but also those specifically tailored to the interconnects' advanced features, such as user-level access for performing communication and remote direct memory access. In order to explore the full communication capability of the interconnects, we have implemented the micro-benchmark suite at the low-level messaging layer provided by each interconnect. Our performance results show that all three interconnects achieve low latency, high bandwidth and low host overhead. However, they show quite different performance behaviors when handling completion notification, unbalanced communication patterns and different communication buffer reuse patterns.
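Latency micro-benchmarks of the kind described above are classically built on a ping-pong test: halve the round-trip time over many iterations. The sketch below shows that generic pattern over a local socket pair purely for illustration; the paper itself measures at each interconnect's native messaging layer, not over sockets.

```python
# Generic ping-pong latency microbenchmark (illustrative only): average
# the round-trip time over many iterations and halve it to estimate
# one-way message latency.

import socket
import time

def pingpong_latency(rounds=1000, size=1):
    a, b = socket.socketpair()          # stand-in for a real interconnect
    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(msg); b.recv(size)    # ping
        b.sendall(msg); a.recv(size)    # pong
    elapsed = time.perf_counter() - start
    a.close(); b.close()
    return elapsed / (2 * rounds)       # one-way latency estimate
```

Varying `size` yields the bandwidth curve; varying buffer reuse between iterations probes the registration-cache effects the paper highlights.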

26 citations


Proceedings ArticleDOI
11 May 2003
TL;DR: A low latency handoff protocol for MIPv4, the post-registration handoff method, is evaluated and a simple queuing model is proposed to study the influence of various parameters on the protocol performance.
Abstract: In this paper, we evaluate a low latency handoff protocol for MIPv4, the post-registration handoff method. This mechanism proposed by the IETF tries to improve the performance of hierarchical mobile IP by decreasing the handoff latency. We give a detailed description of the protocol behavior by means of an ns simulation and propose a simple queuing model to study the influence of various parameters on the protocol performance.

24 citations


Proceedings Article
01 Jan 2003

24 citations


Journal ArticleDOI
01 Jul 2003
TL;DR: This paper presents the design and analysis of a lightweight service for message-passing communication and parallel process coordination, based on the message passing interface specification, for unicast and collective communications.
Abstract: Rapid increases in the complexity of algorithms for real-time signal processing applications have led to performance requirements exceeding the capabilities of conventional digital signal processor (DSP) architectures. Many applications, such as autonomous sonar arrays, are distributed in nature and amenable to parallel computing on embedded systems constructed from multiple DSPs networked together. However, to realize the full potential of such applications, a lightweight service for message-passing communication and parallel process coordination is needed that is able to provide high throughput and low latency while minimizing processor and memory utilization. This paper presents the design and analysis of such a service, based on the message passing interface specification, for unicast and collective communications.

Book ChapterDOI
TL;DR: An analytical model is proposed to study the influence of various system parameters on the performance of the two protocols and the results are compared to show the impact of packet loss on handoff latency.
Abstract: In this paper we compare the performance of two low latency handoff protocols for MIPv4, Pre- and Post-Registration Handoff. These mechanisms proposed by the IETF aim at improving the performance of Hierarchical Mobile IP with respect to handoff latency and packet loss. We propose an analytical model to study the influence of various system parameters on the performance of the two protocols, followed by a comparison of the two schemes. We describe several handoff implementations over a wireless access based on the IEEE 802.11 standard and analyze them by means of an ns simulation.
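To illustrate how an analytical model can relate system parameters to handoff latency, consider a simple M/M/1 approximation in which handoff signalling messages queue at a router. This is purely illustrative and not the authors' model; the arrival rate, service rate, and message count below are assumed parameters.

```python
# Illustrative M/M/1 approximation: signalling messages arrive at rate lam
# and are served at rate mu; the mean sojourn time of each message adds to
# the base handoff delay.

def mm1_sojourn(lam, mu):
    """Mean time a message spends queued plus in service (W = 1/(mu-lam))."""
    if lam >= mu:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (mu - lam)

def handoff_latency(base_delay, lam, mu, n_messages):
    # each of the n handoff signalling messages queues once
    return base_delay + n_messages * mm1_sojourn(lam, mu)
```

As the background load `lam` approaches the service rate `mu`, the queueing term dominates, which is exactly the kind of sensitivity such a model exposes.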

Proceedings Article
01 Jan 2003
TL;DR: The truncation error for a two-pass decoder is analyzed in a problem of phonetic speech recognition for very demanding latency constraints, i.e. for applications where a look-ahead length < 100 ms is required.
Abstract: The truncation error for a two-pass decoder is analyzed in a problem of phonetic speech recognition for very demanding latency constraints (look-ahead length < 100ms) and for applications where ...

15 Oct 2003
TL;DR: This paper describes how all the asynchronous overhead can be completely removed by instead running the entire coherence protocol in the requesting processor, and how this technique is applicable to both page-based and fine-grain software shared memory.
Abstract: Software implementations of shared memory are still far behind the performance of hardware-based shared memory implementations and are not viable options for most fine-grain shared-memory applications. The major source of their inefficiency comes from the cost of interrupt-based asynchronous protocol processing, not from the actual network latency. As the raw hardware latency of inter-node communication decreases, the asynchronous overhead in the communication becomes more dominant. Elaborate schemes, involving dedicated hardware and/or dedicated protocol processors, have been suggested to cut the overhead. This paper describes how all the asynchronous overhead can be completely removed by instead running the entire coherence protocol in the requesting processor. This not only removes the asynchronous overhead, but also makes use of a processor that otherwise would stall. The technique is applicable to both page-based and fine-grain software shared memory. Our proof-of-concept implementation, DSZOOM-EMU, is a fine-grained software-based shared memory. It demonstrates a protocol-handling overhead below a microsecond for all the actions involved in a remote load operation, to be compared to the fastest implementation to date of around ten microseconds. The all-software protocol is implemented assuming only some basic low-level primitives in the cluster interconnect. Based on a remote atomic and simple remote put/get operations, the requesting processor can assume the role of the directory agent, traditionally assumed by a remote protocol agent in the home node in other implementations. The implementation is thread-safe and allows all processors in a node to simultaneously perform remote operations.
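The core move, letting the requesting processor play directory agent via a remote atomic plus remote get, can be caricatured as follows. This is a very schematic sketch and not DSZOOM's actual protocol; a local lock stands in for the interconnect's remote atomic, and the single data word stands in for a cache line.

```python
# Schematic sketch: the requesting processor itself acquires the directory
# entry with a (simulated) remote atomic, fetches the data with a remote
# get, and releases the entry. No interrupt handler or protocol agent ever
# runs on the home node.

import threading

class DirectoryEntry:
    def __init__(self, data):
        self._lock = threading.Lock()   # stands in for a remote atomic
        self.data = data

    def remote_cas_acquire(self):
        return self._lock.acquire(blocking=False)

    def remote_release(self):
        self._lock.release()

def remote_load(entry):
    # the requester spins until its remote atomic succeeds, then does
    # all protocol work itself (a processor that would otherwise stall)
    while not entry.remote_cas_acquire():
        pass
    value = entry.data                  # remote get
    entry.remote_release()
    return value
```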

Book ChapterDOI
18 May 2003
TL;DR: The proposed method allows the mobile node to perform Low Latency Handoff quickly as well as securely, and re-uses the previously assigned session keys in the key exchange between the old FA and the new FA.
Abstract: Mobile IP Low Latency Handoffs[1] allow greater support for real-time services on a Mobile IPv4 network by minimising the period of time when a mobile node is unable to send or receive IP packets due to the delay in the Mobile IP Registration process. However, on a Mobile IP network with AAA servers that are capable of performing Authentication, Authorization, and Accounting (AAA) services, every Regional Registration has to traverse to the home network to obtain the new session keys, distributed by the home AAA server, for a new Mobile IP session. This communication delay is the time taken to reauthenticate the mobile node and to traverse between the foreign and home networks, even if the mobile node has previously been authorized by the old foreign agent. In order to reduce this extra time overhead, we present a method that performs Low Latency Handoff without requiring further involvement by the home AAA server. The method re-uses the previously assigned session keys. To provide confidentiality of the session keys during the key exchange between the old FA and the new FA, it uses a key-sharing method with a trusted third party. The proposed method allows the mobile node to perform Low Latency Handoff quickly as well as securely.

Proceedings ArticleDOI
Krishna Kant1, Ravi Iyer1
13 Oct 2003
TL;DR: The results indicate that the proposed technique has a potential to reduce address bus width in most cases and data bus widths in some cases while maintaining equal or better performance than in the uncompressed case.
Abstract: As microprocessors scale rapidly in frequency, the design of fast and efficient interconnects becomes extremely important for low latency data access and high performance. We evaluate a technique for reducing the interconnect width by exploiting the spatial and temporal locality in communication transfers (addresses & data). The width reduction implies a number of other advantages including higher operating frequency, reduced pin-count, lower chip & board cost, etc. We evaluate the effectiveness of the proposed scheme by performing trace-driven simulations for two well-known commercial server workloads (SPECWeb99 and TPC-C). We also study the sensitivity of the compression hit ratio with respect to the number of bits compressed, size of the encoding/decoding table used and the replacement policy. The results indicate that the proposed technique has a potential to reduce address bus width in most cases and data bus widths in some cases while maintaining equal or better performance than in the uncompressed case.
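The general idea of exploiting locality to narrow a bus, a small encoding table kept consistent at both ends, so a hit sends only a short index plus the low-order bits, can be sketched as below. This is a hedged illustration of that class of scheme, not the paper's exact design; the table size, split point, and LRU replacement are assumptions (the paper itself studies several table sizes and replacement policies).

```python
# Illustrative width-compression scheme: look up the upper address bits in
# a small LRU-managed encoding table; on a hit, transmit only the table
# index and the low-order bits instead of the full address.

from collections import OrderedDict

class WidthCompressor:
    def __init__(self, entries=16, low_bits=8):
        self.table = OrderedDict()          # upper-bits value, LRU-ordered
        self.entries, self.low_bits = entries, low_bits
        self.hits = self.total = 0

    def transfer(self, addr):
        self.total += 1
        upper = addr >> self.low_bits
        if upper in self.table:
            self.table.move_to_end(upper)   # refresh LRU position
            self.hits += 1
            return ("index", list(self.table).index(upper))
        if len(self.table) >= self.entries:
            self.table.popitem(last=False)  # evict least recently used
        self.table[upper] = True
        return ("full", addr)               # miss: send the full address

    def hit_ratio(self):
        return self.hits / self.total if self.total else 0.0
```

The hit ratio of this table is precisely the quantity whose sensitivity to table size and replacement policy the paper studies.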

09 Sep 2003
TL;DR: A local area sub-network design avoiding optical buffering, bit level synchronization and regeneration is presented, and uses wavelength striping of the packets as a method of achieving low latency and high capacity for use in the target systems.
Abstract: We present a local area sub-network design avoiding optical buffering, bit level synchronization and regeneration. Using currently available components we calculate acceptable utilisation when scalability is limited to local, system, storage and desk area networks. The architecture draws upon well-understood computer networking concepts, and uses wavelength striping of the packets as a method of achieving low latency and high capacity for use in the target systems.

Journal Article
TL;DR: In this paper, two adaptive power-aware schemes, PT with secure region (PTSR) and adaptive prefetching with sliding caches (APSC), are proposed to achieve low latency and low power consumption for dynamic request patterns.
Abstract: Caching and prefetching are common techniques in mobile broadcast environments. Several prefetching policies have been proposed in the literature to either achieve the minimum access latency or save power as much as possible. However, little work has been carried out to adaptively balance these two metrics in a dynamic environment where the user request arrival patterns change over time. This paper investigates how both low latency and low power consumption can be achieved for dynamic request patterns. Two adaptive power-aware schemes, PT with secure region (PTSR) and adaptive prefetching with sliding caches (APSC), are proposed. Experimental results show the proposed policies, in particular APSC, achieve a much lower power consumption while maintaining a similar access latency as the existing policies.

Proceedings Article
22 May 2003
TL;DR: SensorBox is a low cost, low latency, high-resolution interface for obtaining gestural data from sensors for use in realtime with a computer-based interactive system.
Abstract: SensorBox is a low cost, low latency, high-resolution interface for obtaining gestural data from sensors for use in realtime with a computer-based interactive system. We discuss its implementation, benefits, current limitations, and compare it with several popular interfaces for gestural data acquisition.

01 Jan 2003
TL;DR: A memory-buffer-free switch node for connection circuits set up by packet routing is presented and implemented in a test chip with two switch nodes, addressing the SoC integration problem.
Abstract: Switch nodes in a 2D mesh SoC connection network have been suggested to solve the SoC integration problem. We present a memory-buffer-free switch node for connection circuits set up by packet routing. A test chip with two switch nodes has been implemented. The function of a switch node is to set up and tear down connections based on small packets and then to transport the payload data without buffering and with very low latency. The silicon cost of a node is 0.5 mm² in a 2-metal-layer 0.8 micrometer AMS CMOS technology. The connection latency cost is one clock cycle per switch node. The test chip works properly at 50 MHz.
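Using the figures quoted above (one clock cycle per switch node at 50 MHz), the per-connection latency is simple to tabulate; the 5-hop path below is just an assumed example.

```python
# Back-of-envelope latency for a circuit through the mesh: one clock
# cycle per switch node, at the quoted 50 MHz clock.

def connection_latency_ns(hops, clock_hz=50e6):
    cycle_ns = 1e9 / clock_hz      # 20 ns per cycle at 50 MHz
    return hops * cycle_ns

# e.g. a 5-hop path costs 5 cycles = 100 ns at 50 MHz
```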

Proceedings ArticleDOI
TL;DR: A critical analysis of intrinsic limitations of electrical interconnect indicates that these limitations can be overcome, and a scheme for this is presented, based on the utilization of upper-level metals as transmission lines.
Abstract: Global interconnects have been identified as a serious limitation to chip scaling, due to their limited bandwidth and large delay. A critical analysis of intrinsic limitations of electrical interconnect indicates that these limitations can be overcome. This basic analysis is presented, together with design constraints. We demonstrate a scheme for this, based on the utilization of upper-level metals as transmission lines. A global communication architecture based on a global mesochronous, local synchronous approach allows very high data-rate per wire and therefore very high bandwidth in buses of limited width. As an example, we demonstrate a 320μm wide bus with a capacity of 160Gb/s in a nearly standard 0.18μm process.

Book ChapterDOI
07 Dec 2003
TL;DR: A hybrid routing algorithm is presented which can fully exploit multicast in order to reduce the network bandwidth used, and which makes the overlay network dynamically adapt to changes in the underlying network topology caused by node or link failure.
Abstract: Efficient routing algorithms and self-configuration are two key challenges in the area of large-scale content-based publish-subscribe systems. In this paper we first propose a hierarchical system model with multicast clustering. Then a hybrid routing algorithm is presented, which can fully exploit multicast in order to reduce the network bandwidth used. We also propose a multicast clustering replication protocol and a content-based multicast tree protocol to make the overlay network dynamically adapt to changes in the underlying network topology caused by node or link failure, requiring no manual tuning or system administration. Simulation results show that the system has low cost, and events delivered over it experience moderately low latency.

Proceedings Article
01 Jan 2003
TL;DR: This paper directly addresses the performance of the core OPS module; results obtained from simulation models show that the proposed asynchronous OPS architecture exhibits low latency and low packet loss together with relatively high throughput.
Abstract: The prime objective of the EPSRC-funded OPSnet project is the design and demonstration of an asynchronous DWDM optical packet switch (OPS) capable of directly carrying IP packets over DWDM-based core networks at transport rates in the order of 100 Gb/s and above. To achieve such an objective demands a highly flexible and innovative core switch architecture. The operation and performance of such an architecture is the subject of this paper. The paper directly addresses the performance of the core OPS module, and results obtained from simulation models show that the proposed asynchronous OPS architecture exhibits low latency and low packet loss together with relatively high throughput.

Proceedings ArticleDOI
19 Nov 2003
TL;DR: This research aims at implementing a SoC platform that could support high throughput and low latency real time video streaming and presents a custom embedded processor used at the core of the platform.
Abstract: Network processing devices emerged as a result of the growing demand for enhanced, flexible next generation communication services. Vendors are increasingly in need of a network processor solution that allows meeting bandwidth requirements and new features while shortening time-to-market. This research aims at implementing a SoC platform that could support high throughput and low latency real-time video streaming. This paper explores a hardware/software system on chip (SoC) solution for a protocol conversion application. A methodology to develop an architecture for a SoC platform is discussed. We also present a custom embedded processor used at the core of our platform.

Patent
17 Jan 2003
TL;DR: An apparatus for and method of implementing a cluster lock processing system using highly scalable, off-the-shelf commodity processors is described in this article, which is the central component of a clustered computer system, providing locking and coordination between multiple host systems.
Abstract: An apparatus for and method of implementing a cluster lock processing system using highly scalable, off-the-shelf commodity processors. The cluster lock processing system is the central component of a clustered computer system, providing locking and coordination between multiple host systems. The host systems are coupled to the cluster lock processing system using off-the-shelf, low latency interconnects. The cluster lock processing system is composed of multiple commodity platforms that are also coupled to each other using low latency interconnects. Failure of one of the commodity platforms that comprise the cluster lock processing system results in no loss of functionality or interruption of service. This is made possible through the use of specialized software that runs on the commodity platforms. Through the use of custom software and inexpensive hardware the overall system cost is dramatically reduced when compared to typical solutions that use custom built hardware. By allowing the individual commodity platforms to be physically separated, the cluster lock processing system also provides for resiliency against physical damage to an individual platform that may be caused by a catastrophic site failure.

Journal Article
TL;DR: The current development, principles and implementations of VIA are analyzed, and a user-level high-performance communication software package, MyVIA, based on Myrinet is presented, which is compliant with the VIA specification.
Abstract: The virtual interface architecture (VIA) established a communication model with low latency and high bandwidth, and defined a standard for user-level high-performance communication in cluster systems. In this paper, the current development, principles and implementations of VIA are analyzed, and a user-level high-performance communication software package, MyVIA, based on Myrinet is presented, which is compliant with the VIA specification. First, the design principles and framework of MyVIA are described, and then optimization techniques for MyVIA are proposed, including UTLB, continuous physical memory and varied NIC buffers, pipelining based on resources and DMA chains, a physical descriptor ring and a dynamic cache. The experimental results indicate that the bandwidth of MyVIA for 4KB messages is 250MB/s and the lowest one-way latency is 8.46ms, showing that the performance of MyVIA surpasses that of other VIA implementations.

Proceedings ArticleDOI
07 Apr 2003
TL;DR: The European Union project 'HOLMS' aims at demonstrating the feasibility of an optical bus system for CPU memory access based on planar integrated free-space optics (PIFSO) in combination with fibre and PCB integrated waveguide optics to demonstrate a novel architecture of low latency memory access.
Abstract: In computer architecture, bandwidth and memory latency represent a major bottleneck. One possibility for solving these problems is the use of optical interconnections, with their inherent capability for large fanin and fanout, low skew, etc. Today, the technology for producing integrated chips with optical and electronic connections is well advanced, and the barrier to their adoption in computer systems is shrinking. The European Union project 'High-Speed Opto-Electronic Memory Systems' (HOLMS) aims at demonstrating the feasibility of an optical bus system for CPU memory access. The bus system is based on planar integrated free-space optics (PIFSO) in combination with fibre and PCB integrated waveguide optics. The goal is to demonstrate a novel architecture for low latency memory access. Here, we will discuss the task of the free-space optics. The assignment of the PIFSO is to perform all fanin and fanout operations for the interconnection between CPU and memory. Longer distances, such as connections between CPU and memory, are broadcast by waveguides in the PCB, and fibres are used to combine two PCBs into a multiprocessor system. The first task consists of the design and realization of the interface between the PIFSO and the PCB integrated waveguides. Besides the optical coupling, the main challenge is to find an optical solution that allows large mechanical tolerances in the packaging of the different parts of the system. The large number of optical lines and their fanout and fanin are a challenge for design and construction, too. Design issues will be discussed and first experimental results will be presented.

Proceedings ArticleDOI
16 Jun 2003
TL;DR: In this article, the issues involved in combining electronic and optical design constraints in opto-electronic multichip (OE-MCM) modules are explored with the aim of implementing a low latency opto-electronic memory system.
Abstract: This paper explores the issues involved in combining electronic and optical design constraints in opto-electronic multichip (OE-MCM) modules. It focuses on the OE-MCM components used in the HOLMS EU project, which aims at implementing a low latency opto-electronic memory system.

01 Jan 2003
TL;DR: Performance comparisons between PVM and MPI, as well as the optimizations achieved exploiting the GAMMA (Genoa Active Message MAchine) Active Message paradigm, are presented, and a GAMMA implementation for 3COM 3c966 NICs is presented.
Abstract: The main solutions currently adopted in deploying parallel applications are based on the use of high-performance parallel platforms and Networks of Workstations (NOW) exploiting off-the-shelf communication hardware. However, the former solutions are highly expensive, while the latter achieve only limited performance. An optimal solution consists of employing NOWs (workstations or high-end PCs) combined with high-performance network cards, effective parallel environments such as PVM or MPI, and modifications to the standard communication protocol layer. In this paper, performance comparisons between PVM and MPI, as well as the optimizations achieved by exploiting the GAMMA (Genoa Active Message MAchine) Active Message paradigm, are presented, and a GAMMA implementation for 3COM 3c966 NICs is presented.

Journal Article
TL;DR: The pCoR programming model as mentioned in this paper combines primitives to launch remote processes and threads with communication over Myrinet and achieves high performance communication among threads of parallel/distributed applications.
Abstract: In this paper we present some implementation details of a programming model - pCoR - that combines primitives to launch remote processes and threads with communication over Myrinet. Basically, we present the efforts we have made to achieve high performance communication among threads of parallel/distributed applications. The expected advantages of multiple threads launched across a low latency cluster of SMP workstations are emphasized with a graphical application that manages huge maps consisting of several JPEG images.