
Showing papers by "Miguel Castro" published in 1994


Proceedings Article
14 Aug 1994
TL;DR: A checkpoint protocol for a multithreaded distributed shared memory system based on the entry consistency memory model that allows transparent recovery from single node failures and, in some cases, from multiple node failures.
Abstract: Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application’s execution, increases with the execution time and the number of workstations used. If no provision is made for handling failures, it is unlikely that long running applications will terminate successfully. One solution to this problem is process checkpointing. This paper presents a checkpoint protocol for a multithreaded distributed shared memory system based on the entry consistency memory model. The protocol allows transparent recovery from single node failures and, in some cases, from multiple node failures. A simple mechanism is used to determine if the system can be brought to a consistent state in the event of multiple machine crashes. The protocol keeps a distributed log of shared data accesses in the volatile memory of the processes, taking advantage of the independent failure characteristics of workstation clusters. Periodically, or whenever the log reaches a highwater mark, each process checkpoints its state, independently from the others. The protocol needs no extra messages during the failure-free period, since all checkpoint control information is piggybacked on the memory coherence protocol messages.
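The abstract's main mechanics (a volatile per-process log, a high-water mark that triggers an independent checkpoint, and control information piggybacked on coherence messages) can be illustrated with a minimal sketch. This is not the paper's protocol: the names `CoherenceMessage`, `Process`, `piggyback`, and `stable_storage` are hypothetical, and both what gets logged and the choice to truncate the log at each checkpoint are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CoherenceMessage:
    """Hypothetical coherence message; checkpoint info rides in `piggyback`."""
    kind: str
    obj_id: str
    piggyback: dict = field(default_factory=dict)


class Process:
    """Toy process that logs shared data accesses in volatile memory."""

    def __init__(self, pid, high_water_mark):
        self.pid = pid
        self.high_water_mark = high_water_mark
        self.log = []             # volatile log, lost if this process crashes
        self.checkpoint_seq = 0   # number of checkpoints taken so far
        self.stable_storage = []  # stand-in for checkpoints saved to disk
        self.messages_sent = 0

    def send(self, kind, obj_id):
        # Checkpoint control information is attached to a message the
        # coherence protocol sends anyway, so no extra message is needed.
        self.messages_sent += 1
        info = {"pid": self.pid, "ckpt": self.checkpoint_seq}
        return CoherenceMessage(kind, obj_id, info)

    def receive(self, msg):
        # Assumption: the receiver records the access and the sender's info.
        self.log.append((msg.kind, msg.obj_id, msg.piggyback))
        if len(self.log) >= self.high_water_mark:
            self.checkpoint()

    def checkpoint(self):
        # Taken independently of other processes. Truncating the log here
        # is an assumption of this sketch, not a claim about the paper.
        self.checkpoint_seq += 1
        self.stable_storage.append(
            {"seq": self.checkpoint_seq, "logged": len(self.log)}
        )
        self.log.clear()
```

In this sketch, a receiver with `high_water_mark=3` checkpoints on its third logged access, while the sender's message count equals the number of coherence messages alone.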

69 citations


Journal Article
TL;DR: The binding energies, structural parameters, and vibrational frequencies of FeCO, FeCO⁻, and FeCO⁺ were studied with a linear combination of Gaussian-type orbitals-density functional (LCGTO-DF) method.
Abstract: The binding energies, structural parameters, and vibrational frequencies of FeCO, FeCO⁻, and FeCO⁺ were studied with a linear combination of Gaussian-type orbitals-density functional (LCGTO-DF) method. The ground state of FeCO is found to be ³Σ⁻ and the calculated dissociation energy, with respect to ground state Fe(⁵D, 3d⁶4s²) and CO (¹Σ⁺), is 30 kcal/mol; after correcting for the atomic states separation of the iron atom this value becomes 17 kcal/mol, which is relatively close to the most recent experimental values, 8.1±3.5 to 10.5±3.7 kcal/mol. Quartet ground states were found for both FeCO⁺ and FeCO⁻ and the calculated dissociation energies (with respect to ground state Fe⁺, Fe⁻, and CO) are 50 and 31 kcal/mol, respectively. There is agreement between theory and experiment in that D(FeCO⁺)≳D(FeCO⁻)≳D(FeCO). The ωₑ values we calculate for FeCO are, in cm⁻¹, 658 (Fe–C stretch), 1982 (C–O stretch), and 368 (bend). These values are reasonably close to their experimental counterparts, 530±10, 1950±10, and 330±50. F...

56 citations


Proceedings Article
12 Sep 1994
TL;DR: DiSOM addresses two issues: good performance, achieved by a close integration of the programming model with the synchronization mechanisms, and fault tolerance, with an efficient checkpointing algorithm that requires no extra messages during the failure-free period.
Abstract: DiSOM is a software-based distributed shared memory system for a multicomputer composed of heterogeneous nodes connected by a high-speed, low-latency network [Guedes 93]. The current prototype comprises a Sun SPARCcenter 2000 with 10 processors and several SPARCstation 10, i486 PC, and DEC Alpha machines, connected by ATM and Ethernet. Programs in DiSOM are written using a shared-memory multiprocessor model where synchronization objects are explicitly associated with data items. Programs are composed of a set of parallel threads of execution. These threads share data objects and synchronize by explicit calls to system-provided synchronization constructs. The system traps these calls and uses the information to drive both distributed synchronization and the memory coherence protocol. DiSOM uses the entry consistency memory model [Bershad 93] to ensure coherence. This model guarantees memory consistency as long as an access to a data item is enclosed between an acquire and a release on the synchronization object associated with the data item. DiSOM addresses two issues we believe to be crucial in distributed shared memory systems: good performance, achieved by a close integration of the programming model with the synchronization mechanisms, and fault tolerance, with an efficient checkpointing algorithm that requires no extra messages during the failure-free period.
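The programming model described above, where every access to a data item sits between an acquire and a release on its associated synchronization object, might look roughly like the following single-machine sketch. It is not DiSOM's API: `SyncObject`, `worker`, and `run` are hypothetical names, and an ordinary `threading.Lock` stands in for distributed synchronization and the coherence protocol.

```python
import threading


class SyncObject:
    """Hypothetical synchronization object tied to a set of data items."""

    def __init__(self, **items):
        self._lock = threading.Lock()
        self._items = dict(items)
        self._owner = None  # thread id of the current holder, if any

    def acquire(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        # A real DSM system would bring the guarded items up to date here.

    def release(self):
        self._check_held()
        self._owner = None
        self._lock.release()

    def _check_held(self):
        # Reject accesses made outside an acquire/release pair.
        if self._owner != threading.get_ident():
            raise RuntimeError("access outside acquire/release")

    def read(self, name):
        self._check_held()
        return self._items[name]

    def write(self, name, value):
        self._check_held()
        self._items[name] = value


def worker(sync, n):
    """Increment the shared counter n times, each inside acquire/release."""
    for _ in range(n):
        sync.acquire()
        try:
            sync.write("counter", sync.read("counter") + 1)
        finally:
            sync.release()


def run(threads=4, n=1000):
    """Run several workers on one shared counter and return its final value."""
    sync = SyncObject(counter=0)
    pool = [threading.Thread(target=worker, args=(sync, n)) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    sync.acquire()
    try:
        return sync.read("counter")
    finally:
        sync.release()
```

Because the data item is reachable only through its synchronization object, an unguarded access fails loudly in this sketch instead of silently reading a stale value.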

5 citations