Showing papers presented at "Parallel and Distributed Processing Techniques and Applications in 1995"


Proceedings Article
01 Jan 1995
TL;DR: SVMFortran is an extension of Fortran77 for programming shared virtual memory systems that provides special notations for work distribution to optimize data locality and load balance; special emphasis is placed on the source-code-related methods of OPAL and the underlying trace generation with the performance monitor SAM.
Abstract: Programming distributed memory parallel computers with message passing is often considered to be a difficult task. To overcome the drawbacks of this programming style, several efforts have been made in the field of new parallel programming languages. One example is the shared virtual memory programming model. SVMFortran is an extension of Fortran77 for programming shared virtual memory systems. It provides special notations for work distribution to optimize data locality and load balance. There exist a number of different tools for performance debugging in message passing systems, but none of these would fulfill the special requirements of the SVM programming model. Moreover, we observed that the user requirements for performance debugging strongly recommend tools which are adapted to the programmer's view of the program. Therefore, special emphasis is placed on the source-code-related methods of OPAL and the underlying trace generation with the performance monitor SAM.

21 citations


Proceedings Article
01 Jan 1995
TL;DR: This work describes a Bulk-Synchronous Processing model implementation of a plasma simulation and use of BSP analysis techniques for tuning the program for arbitrary architectures and compares the performance of the BSP implementation with a version using MPI.
Abstract: Computationally intensive applications with frequent communication and synchronization require careful design for efficient execution on networks of workstations. We describe a Bulk-Synchronous Processing (BSP) model implementation of a plasma simulation and use of BSP analysis techniques for tuning the program for arbitrary architectures. In addition, we compare the performance of the BSP implementation with a version using MPI. Our results indicate that the BSP model, serving as a basis for an efficient implementation, compares favorably with MPI.

19 citations
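
The BSP analysis referred to in the entry above is commonly expressed as a per-superstep cost of w + h·g + l, where w is the largest local computation, h the largest number of words any processor sends or receives, g the per-word communication cost, and l the barrier synchronization cost. The following is a minimal sketch of that standard textbook cost model, not the specific analysis used in the paper; the names `Machine`, `superstep_cost`, and `program_cost` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Machine:
    """Hypothetical BSP machine parameters, in units of one local operation."""
    g: float  # cost to communicate one word
    l: float  # cost of one barrier synchronization


def superstep_cost(w_max: float, h_max: float, m: Machine) -> float:
    """Standard BSP cost of one superstep: w + h*g + l."""
    return w_max + h_max * m.g + m.l


def program_cost(supersteps: Iterable[Tuple[float, float]], m: Machine) -> float:
    """Total cost of a program given (w_max, h_max) for each superstep."""
    return sum(superstep_cost(w, h, m) for w, h in supersteps)
```

With g = 2 and l = 50, a superstep with w = 100 and h = 10 costs 100 + 10·2 + 50 = 170 under this model. Tuning for a different architecture then amounts to re-evaluating the same program profile with that machine's g and l.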


Proceedings Article
Ted Herman
01 Jan 1995
TL;DR: An impossibility theorem is given for a class of superstabilizing mutual exclusion protocols in a ring of processors, where a local fault consists of any transient fault at a single processor and the passage predicate specifies that there be at most one token in the ring.
Abstract: A superstabilizing protocol is a protocol that (i) is self-stabilizing, meaning that it can recover from an arbitrarily severe transient fault; and (ii) can recover from a local transient fault while satisfying a passage predicate during recovery. This paper investigates the possibility of superstabilizing protocols for mutual exclusion in a ring of processors, where a local fault consists of any transient fault at a single processor; the passage predicate specifies that there be at most one token in the ring, with the single exception of a spurious token colocated with the transient fault. The first result of the paper is an impossibility theorem for a class of superstabilizing mutual exclusion protocols. Two unidirectional protocols are then presented to show that conditions for impossibility can independently be relaxed so that superstabilization is possible using either additional time or communication registers. A bidirectional protocol subsequently demonstrates that superstabilization in O(1) time is possible. All three superstabilizing protocols are optimal with respect to the number of communication registers used.

11 citations
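
As background for the self-stabilization property named in the entry above, here is a minimal sketch of Dijkstra's classic K-state token ring, which recovers to exactly one token from any initial state. It is not one of the paper's superstabilizing protocols and does not enforce the passage predicate during recovery; the function names and the randomized central scheduler are assumptions made for illustration.

```python
import random


def privileged(x):
    """Return the indices of processes that currently hold a token."""
    n = len(x)
    tokens = [0] if x[0] == x[n - 1] else []
    tokens.extend(i for i in range(1, n) if x[i] != x[i - 1])
    return tokens


def step(x, k, rng):
    """Let one privileged process (chosen by a central scheduler) move."""
    i = rng.choice(privileged(x))
    if i == 0:
        x[0] = (x[0] + 1) % k   # the distinguished process increments
    else:
        x[i] = x[i - 1]         # every other process copies its predecessor


def stabilize(x, k, rng, max_steps=10_000):
    """Run until exactly one token remains; return the number of moves made."""
    steps = 0
    while len(privileged(x)) != 1 and steps < max_steps:
        step(x, k, rng)
        steps += 1
    return steps
```

Calling `stabilize` on an arbitrary list of states in `range(k)`, with `k` at least the ring size, leaves the ring with a single token, and further calls to `step` keep it that way. A superstabilizing protocol adds a guarantee about what happens during recovery from a single-processor fault, which this sketch does not address.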


Proceedings Article
01 Jan 1995
TL;DR: The results of performance measurements are presented which indicate that dynamic object replication is a feasible approach to improving the efficiency of object invocation in distributed environments.
Abstract: Replication is a technique which is widely used for speeding up access to passive objects. Consistency requirements of distributed object oriented programming languages, however, limit the scope of object replication to coarse grain objects or objects which are modified infrequently. This is due to the fact that the overhead for global synchronization and preserving replica consistency often outweighs the benefits of local object access. This paper presents an algorithm which dynamically chooses whether an object invocation should be realized either through remote invocation or through accessing a local object replica. The algorithm uses heuristics which consider recent object access patterns as well as the cost for access synchronization and replica consistency. Finally, the results of performance measurements are presented which indicate that dynamic object replication is a feasible approach to improving the efficiency of object invocation in distributed environments.

10 citations
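
The entry above describes choosing between remote invocation and a local replica from recent access patterns and consistency costs, without giving the heuristic itself. The sketch below shows one way such a decision could look, purely as an assumption: the class `ReplicationAdvisor`, its cost parameters, and the sliding-window rule are hypothetical and are not taken from the paper.

```python
from collections import deque


class ReplicationAdvisor:
    """Hypothetical cost-based choice between remote invocation and a replica."""

    def __init__(self, remote_cost, local_cost, sync_cost, window=16):
        self.remote_cost = remote_cost  # cost of one remote invocation
        self.local_cost = local_cost    # cost of one access to a local replica
        self.sync_cost = sync_cost      # extra cost per write to keep replicas consistent
        self.history = deque(maxlen=window)  # recent accesses: True means write

    def record(self, is_write):
        """Remember the kind of the most recent invocation."""
        self.history.append(bool(is_write))

    def use_replica(self):
        """Prefer a replica if it would have been cheaper over the recent window."""
        if not self.history:
            return False
        writes = sum(self.history)
        reads = len(self.history) - writes
        cost_remote = len(self.history) * self.remote_cost
        cost_replica = reads * self.local_cost + writes * (self.local_cost + self.sync_cost)
        return cost_replica < cost_remote
```

Under these assumed costs, a read-dominated window favors the replica, while a write-dominated window makes the synchronization overhead outweigh the benefit of local access, which matches the trade-off the abstract describes.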


Proceedings Article
01 Jan 1995
TL;DR: The CellFlow method for object space subdivision that exploits frame coherency to implement a look-ahead scheme of object dataflow is presented, which exploits the communication features of modern scalable multicomputers to achieve near linear speedup by means of latency hiding.
Abstract: We present the CellFlow method for object space subdivision that exploits frame coherency to implement a look-ahead scheme of object dataflow. The implementation of this scheme exploits the communication features of modern scalable multicomputers to achieve near linear speedup by means of latency hiding. We demonstrate the performance of our approach in the field of volume rendering by implementing incremental rotation of the volumetric object about its center. The simplicity of the algorithm, its optimal embedding in popular network topologies, and minimal congestion-free communication among processors are its main advantages. Results are shown for implementation on the Cray T3D, a distributed memory 3D torus architecture. Computation and communication load balancing issues among processors are also addressed.

10 citations


Proceedings Article
01 Jan 1995
TL;DR: The factors that affect the derivation of computation and data partitions on scalable shared memory multiprocessors (SSMMs) are identified and it is shown that these factors necessitate an SSMM-conscious approach.
Abstract: In this paper we identify the factors that affect the derivation of computation and data partitions on scalable shared memory multiprocessors (SSMMs). We show that these factors necessitate an SSMM-conscious approach. In addition to remote memory access, which is the sole factor on distributed memory multiprocessors, cache affinity, memory contention and false sharing are important factors that must be considered. Experimental evidence is presented to demonstrate the impact of these factors on performance using three applications on the KSR1 and the Hector multiprocessors.

8 citations


Proceedings Article
01 Jan 1995
TL;DR: The Hypercube Sandwich Network (HSN) is a cascaded conference network that provides distributed processing and signal transmission among members of disjoint sets of generic send/receive devices called conferees.
Abstract: This paper presents a novel cascaded conference network that provides distributed processing and signal transmission among members of disjoint sets of generic send/receive devices called conferees. It assumes an online request model in which idle groups of conferees may request the formation of a conference interconnection. Once a conference is established, all conferees remain connected until the entire conference is dissolved. The Hypercube Sandwich Network (HSN) consists of two components. A bidirectional permutation network is used for routing purposes to and from a hypercube of special processing elements for the purpose of conference formation. The HSN achieves strictly nonblocking performance for N conferees using O(N √(log N)) processing elements, and this is shown to be tight to within a log^{1/4} N factor. Previous constructions required a quadratic number of processing elements for strictly nonblocking performance or could only provide wide-sense nonblocking conferencing. If the stronger requirement is made that the communication delay is logarithmic in the conference size, a simple algorithm is presented for wide-sense nonblocking conferencing in an HSN with O(N log N) processing elements.

7 citations


Proceedings Article
01 Jan 1995
TL;DR: A pair of modified elliptic gear wheels, to be installed in a metering chamber casing of positive displacement type flow meters, in which metering errors caused by the trapping phenomenon between meshed gear teeth are eliminated by the cycloid gear form, while the involute gear form side of long radius keeps the coupled gear wheels meshed.
Abstract: A pair of elliptic gear wheels to be installed in a metering chamber casing of positive displacement type flow meters and have a modified gear wheel and the change of gear form of an effective gearing pressure angle of an involute gear form side of long radius and of an elimination of trapping phenomenon of cycloid gear form of short radius of wheels. This pair of gear wheels has super precision measuring functions and excellent repeatability, by utilizing a computer which compensates flow rate errors instantaneously and send signals for indicating corrected values. Meshing of the pair of wheels at a top of an involute gear form on a middle portion of a pitch circle can keep meshing of the pair of wheels and a cycloid gear form side of short radius eliminates trapping phenomenon. This pair of modified gear wheels has features that metering errors caused by the trapping phenomenon between meshed gear teeth will be eliminated by an effect of the cycloid gear form and the involute gear form side of long radius keeps meshing of each coupled gear wheel.

7 citations


Proceedings Article
31 Dec 1995
TL;DR: This project shows that a real-time portable software MPEG decoder is feasible in a general-purpose parallel machine and can be easily ported to other parallel machines.
Abstract: We present a real-time MPEG software decoder that uses message-passing libraries such as MPL, p4 and MPI. The parallel MPEG decoder currently runs on the IBM SP system but can be easily ported to other parallel machines. This paper discusses our parallel MPEG decoding algorithm as well as the parallel programming environment under which it runs. Several technical issues are discussed, including balancing of decoding speed, memory limitation, I/O capacities, and optimization of MPEG decoding components. This project shows that a real-time portable software MPEG decoder is feasible in a general-purpose parallel machine.

5 citations


Proceedings Article
01 Jan 1995
TL;DR: A unique communication protocol is presented and shown to provide single cycle transfers between nodes and a mechanism for efficient flow control communication is discussed.
Abstract: The NuMesh system defines a high-speed communication substrate optimized for off-line routing. By determining possible communication paths at compile time, highly efficient hardware and software constructs can be exploited to yield superior network performance. Limited gate delays between NuMesh registers, as well as single cycle message transfers, allow for a high clock frequency and low network latency. A highly pipelined architecture for this communication is presented and a mechanism for efficient flow control communication is discussed. A unique communication protocol is presented and shown to provide single cycle transfers between nodes. Virtual pipes are discussed as a communication protocol for nodes to communicate when running applications on the NuMesh system. Preliminary results and a description of the current hardware and software status are listed.

5 citations


Proceedings Article
01 Jan 1995
TL;DR: A simple and relatively inexpensive containerized vehicle storage system for holding self-parked vehicles that includes a building housing having an upper level and a lower level, with the lower level being situated below level of vehicle entrance into the housing.
Abstract: A simple and relatively inexpensive containerized vehicle storage system for holding self-parked vehicles. In one embodiment, the system includes a building housing having an upper level and a lower level, with the lower level being situated below level of vehicle entrance into the housing. A plurality of containers are positioned in at least two vertically stacked columns in the housing. Each container is identically configured, and includes a weight tolerant structural shell. The shell is formed by a floor, sidewall and roof arranged to define a shell entrance and an oppositely situated shell exit to permit respective entry and exit of a vehicle into and from the shell of the container. The shell is typically configured to support the weight of a conventional automobile positioned inside the shell, and further support a stack of about ten similarly loaded and configured containers. Optionally, the shell entrance and shell exit are identical, with the vehicle exiting by backing out from the shell entrance/exit. In this embodiment, the container can include an integrally formed endwall positioned opposite the shell entrance. Endwalls of containers in a first column are positioned adjacent to shell entrances of containers in a second column.





Proceedings Article
01 Jan 1995
TL;DR: A pump assembly for conventionally driven centrifugal pumps having a rolling element comprised of product-lubricated ceramic or hybrid anti-friction bearings, which perform well with poor lubrication.
Abstract: A pump assembly for conventionally driven centrifugal pumps having a rolling element comprised of product lubricated ceramic or hybrid anti-friction bearings. The ceramic bearings are comprised of ceramic balls and ceramic races, whereas the hybrid bearings are comprised of ceramic balls with races made of another material. The ability of these bearings to perform well with poor lubrication allows the fluid that is being pumped to be used to lubricate and cool the bearings. An alternate embodiment comprises a double suction pump with the same durability and cost saving advantages.

Proceedings Article
01 Jan 1995
TL;DR: A connector includes a contact housing and a signal contact with a finger portion, a base portion, and a foot; retaining tabs on the base portion are interference fit into cooperating retention channels in the housing.
Abstract: A connector includes a contact housing and a signal contact. The contact housing has at least two cooperating retention channels. The signal contact is coupled to the contact housing. The signal contact includes a finger portion, a base portion, and a foot. The base portion has first and second retaining tabs. The first and second retaining tabs are interference fit into cooperating retention channels. The foot is defined in the base portion between the first and second retaining tabs.

Proceedings Article
01 Jan 1995
TL;DR: The general conclusion proposed is that an MP computer containing a small number of high-performance processors may not be the best choice for I/O-dominant applications.
Abstract: A general law is proposed that states "a large number of slower processors may be better than a small number of faster processors for I/O-dominant applications". The need for such a guideline is demonstrated because simple linear sums of individual processor performances do not provide an accurate estimation of I/O performance for a parallel computer. Furthermore, the law was formulated to allow better cost estimations when choosing the number and type of processor for a Massively Parallel (MP) I/O application. The law is confirmed with a simple proof, analytical model, simulation, and benchmarking. A Distributed Cache Subsystem (DCS) technique is proposed to further improve the performance of MP computers running I/O-dominant applications. Using simulations and benchmarks, the DCS technique has shown the potential to achieve very high performance using standard sequential file systems. The general conclusion proposed is that an MP computer containing a small number of high-performance processors may not be the best choice for I/O-dominant applications.


Proceedings Article
01 Jan 1995
TL;DR: The method begins by surgically implanting at least one dental implant fixture within the mouth in the area of the occlusion, after which an abutment is added to each dental implant fixture.
Abstract: A system and method for filling a dental occlusion with a metal free prosthesis. The method begins by surgically implanting at least one dental implant fixture within the mouth in the area of the occlusion. Once the implant fixture is healed, an abutment is added to each dental implant fixture. An impression of the area of the occlusion is taken directly over any abutment. The impression is used to create a dental model of the area of the occlusion. A metal free prosthesis is fabricated from the model. The metal free prosthesis is then anchored to the abutments.

Proceedings Article
01 Jan 1995
TL;DR: A high volume low pressure (hereinafter HVLP) blower has a blower housing having mounted therein the blower motor and blower turbine for delivering HVLP air to an air outlet.
Abstract: A high volume low pressure (hereinafter HVLP) blower having a blower motor with a cooling fan for passing cooling air over the blower motor and a blower turbine connected to and driven by the blower motor for delivering HVLP air to an air outlet. An air intake communicates fresh incoming air to the cooling fan and to the blower turbine. The HVLP blower has a blower housing having mounted therein the blower motor and blower turbine. The blower housing has exhaust vents therein for exhausting cooling air exhaust to the atmosphere. An exhaust plenum within the blower housing prevents mixing of the incoming air with the cooling air exhaust and provides a first passageway for cooling air exhaust to be communicated from the cooling fan to the exhaust vents, such that the exhaust vents direct the cooling air exhaust away from the air intake.

Proceedings Article
01 Jan 1995
TL;DR: A new class of topologies for constructing multicomputers called Small ATM Switch based Interconnection Networks (SASIN) is introduced, which is hierarchical, symmetrical, has a low diameter and is easy to implement.
Abstract: Networks of workstations are becoming a popular platform for parallel computing. Asynchronous Transfer Mode (ATM) network connections can improve the communication characteristics of these parallel clusters. This paper introduces a new class of topologies for constructing multicomputers called Small ATM Switch based Interconnection Networks (SASIN). The Round Table Network (one kind of SASIN) is introduced and analyzed. It is hierarchical, symmetrical, has a low diameter and is easy to implement. The bisection width, diameter, number of switching elements and efficiency of this network are compared to other interconnection topologies.

Proceedings Article
01 Jan 1995
TL;DR: This analysis shows that sophisticated local bisection heuristics combined with the multilevel method result in high quality orderings that can be computed in a reasonable amount of time.
Abstract: In this paper we compare nested dissection orderings obtained by different graph bisection heuristics. In the context of parallel sparse matrix factorization the quality of an ordering is not only determined by its fill-reducing capability, but also depends on the difficulty with which a balanced mapping of the load onto the processors of the parallel computer can be found. Our analysis shows that sophisticated local bisection heuristics combined with the multilevel method result in high quality orderings. Furthermore, these orderings can be computed in a reasonable amount of time.

Proceedings Article
01 Jan 1995
TL;DR: A binding apparatus includes a cutting mechanism for cutting a primary continuous form vertically into two equal sheets of secondary continuous form, and a pressing mechanism for pressing a predetermined number of sets of pieces of superposed sheets so that the pieces are pasted together and bound into a booklet.
Abstract: A binding apparatus includes a cutting mechanism for cutting a primary continuous form vertically into two equal sheets of secondary continuous form, a primary pasting mechanism for pasting a side edge of one of the secondary continuous form sheets in the form of a line or band, a secondary continuous form superposing and feeding mechanism for superposing the other secondary continuous form sheet on the one secondary continuous form sheet so that cross perforation lines are overlapped with each other and feeding the superposed continuous form sheets, a secondary pasting mechanism for pasting a side edge of the other secondary continuous form sheet in the form of the line or band, a separating mechanism for separating the superposed secondary continuous form sheets along the cross perforation lines so that pieces of superposed sheets are obtained, a piling mechanism for piling two pieces of superposed sheets so that the piece of superposed sheet fed from the separating mechanism creeps under the previously fed piece of superposed sheet, the piling mechanism piling a predetermined number of sets of the pieces of superposed sheets, and a pressing mechanism for pressing the predetermined number of sets of the pieces of superposed sheets so that the pieces are pasted together to be bound into a booklet.

Proceedings Article
01 Jan 1995
TL;DR: This work presents a parallel/distributed algorithm for multivariate numerical integration and examines its performance on nCUBE-2 and PVM and shows that good speedups can be achieved for a variety of integration problems, in particular for problems with integrand singularities.
Abstract: In this paper, we present a parallel/distributed algorithm for multivariate numerical integration and examine its performance on nCUBE-2 and PVM. The test results address the effect of algorithm elements such as a heuristic load balancing technique. We show that good speedups can be achieved for a variety of integration problems, in particular for problems with integrand singularities. This work is a part of a project (ParInt) whose main goal is to package a number of practical multivariate integration algorithms on a variety of multi-processor systems and make them available to researchers and practitioners in various disciplines of science and engineering.

Proceedings Article
01 Jan 1995
TL;DR: A highly reliable electrode of high strength which undergoes little change even with the lapse of time is provided, and a method for making the same, as well as a vacuum valve using such electrode and a vacuum circuit breaker using such vacuum valve.
Abstract: According to the present invention there are provided a highly reliable electrode of high strength which undergoes little change even with the lapse of time, and a method for making the same, as well as a vacuum valve using such electrode and a vacuum circuit breaker using such vacuum valve. The vacuum circuit breaker has a fixed electrode and a movable electrode, each comprising an arc electrode, an arc electrode support member for supporting the arc electrode, and a coil electrode contiguous to the arc electrode support member, the arc electrode, the arc electrode support member and the coil electrode being formed as an integral structure by melting, not by bonding, particularly the arc electrode support member and the coil electrode being constituted by a Cu alloy containing 0.05-2.5% by weight of at least one of Cr, Ag, W, V and Zr.

Proceedings Article
01 Jan 1995
TL;DR: A vehicle with a lifting boom whose load-carrying frame has a front section and a rear section; the rear section carries a platform on which a lifting device is mounted, in particular operatively associated with a three-point hitch, so that the vehicle can also be used as an agricultural machine.
Abstract: A vehicle having a lifting boom, comprising a load-carrying frame (12) having a front section (12a) and a rear section (12b), a lifting boom (18) extending parallel to the longitudinal axis of the vehicle and articulated to the frame (12) in the rear section (12b), an operating and driving cab (28) placed on a side of the vehicle (18) and an internal combustion engine placed on the opposite side of the boom (18) with respect to the operating cab (28). The rear section (12b) of the frame has a platform on which a lifting device is mounted, in particular operatively associated to a three-point hitch, whereby the vehicle can be used also as an agricultural machine.


Proceedings Article
01 Jan 1995
TL;DR: A reversible cable reel which includes two cover shells having a respective center shaft axially connected to each other, two cable wheels respectively mounted around the center shafts inside the cover shells, and two spiral springs, each having an inner end connected to the respective cover shell and an outer end connected to the respective cable wheel.
Abstract: A reversible cable reel which includes two cover shells having a respective center shaft axially connected to each other, two cable wheels respectively mounted around the center shafts inside the cover shells, two spiral springs respectively mounted around the center shafts inside the cover shells, each spiral spring having an inner end connected to the respective cover shell and an outer end connected to the respective cable wheel, two cables respectively wound round the cable wheels, each cable having a fixed end fastened to the respective cable wheel and a free end extended out of the respective cover shell and mounted with a module plug, a first terminal unit mounted in one cable wheel and connected to conductors in one cable, a second terminal unit mounted in the other cable wheel and connected to conductors in the other cable and separated from the first terminal unit by a cover plate, the second terminal unit having contact means projecting into respective through holes on the cover plate into contact with the first terminal unit.

Proceedings Article
01 Jan 1995
TL;DR: An improved method and apparatus for emptying a load of turkeys from multilayer containers is disclosed, and the mechanisms for moving the push member may be mounted externally of the coop structure.
Abstract: An improved method and apparatus for emptying a load of turkeys from multilayer containers is disclosed. Each container or coop includes a liftable gate that enables access into the container and a slidably moveable push member that may also comprise the back wall of the container. The push member is moved from the rear of the container and toward the access port thereof. As the push member moves and engages turkeys in the coop, the turkeys are moved by the push member across the container floor and out of the access port onto a series of conveyors. The push members of a plurality of coops can be interconnected to simultaneously unload a plurality of containers. The mechanisms for moving the push member may be mounted externally of the coop structure. The same mechanism may also be used for returning the push member to and for locking it in its initial transport position.