
Showing papers on "Data-intensive computing published in 1999"


Journal ArticleDOI
TL;DR: This paper shows how Java’s object-oriented features are used to build a flexible software framework that makes it easy for programmers to write different volunteer computing applications, while allowing researchers to study and develop the underlying mechanisms behind them.

162 citations


Proceedings ArticleDOI
01 May 1999
TL;DR: A cache in which the line (fetch) size is continuously adjusted by hardware, based on observed application accesses to the line, can improve the miss rate, even relative to the optimal fixed line size, as well as significantly reduce memory traffic.
Abstract: The cache line size has a significant effect on miss rate and memory traffic. Today's computers use a fixed line size, typically 32B, which may not be optimal for a given application. The optimal size may also change during application execution. This paper describes a cache in which the line (fetch) size is continuously adjusted by hardware based on observed application accesses to the line. The approach can improve the miss rate, even relative to the optimal fixed line size, as well as significantly reduce memory traffic.
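
The paper describes a hardware mechanism that the abstract does not detail; the Java sketch below is only a software illustration of the underlying policy, with names and thresholds invented for the example: widen the fetch size when most of a fetched line is used before eviction, and narrow it when little of it is.

```java
// Illustrative sketch (not the paper's hardware design): a per-line policy that
// widens the fetch size when most of a fetched line is touched before eviction,
// and narrows it when it is not. Thresholds and sizes are assumptions.
public class AdaptiveLinePolicy {
    private int fetchSizeBytes = 32;          // current fetch size for this line
    private static final int MIN = 16, MAX = 128;

    /** Called when a line is evicted; usedBytes = bytes actually referenced. */
    public void onEviction(int usedBytes) {
        double utilization = (double) usedBytes / fetchSizeBytes;
        if (utilization > 0.75 && fetchSizeBytes < MAX) {
            fetchSizeBytes *= 2;              // strong spatial locality: fetch more
        } else if (utilization < 0.25 && fetchSizeBytes > MIN) {
            fetchSizeBytes /= 2;              // little of the line was used: fetch less
        }
    }

    public int fetchSizeBytes() { return fetchSizeBytes; }

    public static void main(String[] args) {
        AdaptiveLinePolicy p = new AdaptiveLinePolicy();
        int[] usedOnEviction = {32, 32, 30, 8, 4, 4};
        for (int used : usedOnEviction) {
            p.onEviction(Math.min(used, p.fetchSizeBytes()));
            System.out.println("fetch size now " + p.fetchSizeBytes() + "B");
        }
    }
}
```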

155 citations


Proceedings ArticleDOI
03 Aug 1999
TL;DR: This work describes an architecture for data intensive applications where a high-speed distributed data cache is used as a common element for all of the sources and sinks of data, and provides standard interfaces to a large, application-oriented, distributed, on-line, transient storage system.
Abstract: Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data at multiple sites around the world. The technologies, the middleware services, and the architectures that are used to build useful high-speed, wide area distributed systems, constitute the field of data intensive computing. We describe an architecture for data intensive applications where we use a high-speed distributed data cache as a common element for all of the sources and sinks of data. This cache-based approach provides standard interfaces to a large, application-oriented, distributed, on-line, transient storage system. We describe our implementation of this cache, how we have made it "network aware", and how we do dynamic load balancing based on the current network conditions. We also show large increases in application throughput by access to knowledge of the network conditions.
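
The cache implementation itself is not shown in the abstract; as a rough illustration of what "network aware" source selection can mean, the hypothetical Java sketch below picks the replica server with the best recently observed throughput. The names and the selection rule are assumptions, not the paper's code.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only (not the paper's implementation): pick the data-cache
// server expected to deliver a block fastest, given recently observed per-server
// network throughput, and refresh the choice as conditions change.
public class NetworkAwareSelector {
    public static class Server {
        final String host;
        volatile double observedMBps;   // updated by a background network monitor
        Server(String host, double mbps) { this.host = host; this.observedMBps = mbps; }
    }

    /** Return the replica with the highest recently observed throughput. */
    public static Server pickSource(List<Server> replicas) {
        return replicas.stream()
                .max(Comparator.comparingDouble(s -> s.observedMBps))
                .orElseThrow(() -> new IllegalArgumentException("no replicas"));
    }

    public static void main(String[] args) {
        List<Server> replicas = List.of(
                new Server("cache-a.example.org", 42.0),
                new Server("cache-b.example.org", 87.5),
                new Server("cache-c.example.org", 12.3));
        System.out.println("read from " + pickSource(replicas).host);
    }
}
```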

99 citations


Journal ArticleDOI
TL;DR: A set of cost measures that can be applied to parallel algorithms to predict their computation, data access and communication performance make it possible to compare different parallel implementation strategies for data mining techniques without benchmarking each one.
Abstract: This article presents a set of cost measures that can be applied to parallel algorithms to predict their computation, data access and communication performance. These measures make it possible to compare different parallel implementation strategies for data mining techniques without benchmarking each one.
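
The article's specific cost measures are not reproduced in the abstract; the sketch below shows, under assumed parameters, the general shape of such a model: predicted time as the sum of computation, data access, and communication terms for a candidate parallel implementation strategy.

```java
// Illustrative cost model in the spirit of the article (the actual measures are
// defined in the paper): predicted time = computation + data access + communication
// for a candidate strategy on p processors. All parameter values are assumptions.
public class ParallelCostModel {
    double tCompute(long ops, double opsPerSec, int p)        { return ops / (opsPerSec * p); }
    double tDataAccess(long bytesRead, double diskMBps, int p) {
        return (bytesRead / 1e6) / (diskMBps * p);             // data striped over p disks
    }
    double tComm(long msgs, long bytes, double latencySec, double bwMBps) {
        return msgs * latencySec + (bytes / 1e6) / bwMBps;     // latency term + volume term
    }

    public static void main(String[] args) {
        ParallelCostModel m = new ParallelCostModel();
        int p = 8;
        double total = m.tCompute(4_000_000_000L, 1e8, p)
                     + m.tDataAccess(2_000_000_000L, 20.0, p)
                     + m.tComm(10_000, 500_000_000L, 1e-3, 10.0);
        System.out.printf("predicted time on %d processors: %.1f s%n", p, total);
    }
}
```

Comparing such totals for different partitioning or communication strategies is what lets implementations be ranked without benchmarking each one.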

65 citations


Proceedings ArticleDOI
01 Jun 1999
TL;DR: The design and implementation of a system, called Ajents, which provides the software infrastructure necessary to support a similar level of seamless access to organization-wide or world-wide heterogeneous computing resources.
Abstract: The rapid proliferation of the World-Wide Web has been due to the seamless access it provides to information that is distributed both within organizations and around the world. In this paper, we describe the design and implementation of a system, called Ajents, which provides the software infrastructure necessary to support a similar level of seamless access to organization-wide or world-wide heterogeneous computing resources. Ajents introduces class libraries which are written entirely in Java and run on any standards-compliant Java virtual machine. These class libraries implement and combine several important features that are essential to supporting distributed and parallel computing using Java. These features include: the ability to easily create objects on remote hosts, to interact with those objects through either synchronous or asynchronous remote method invocations, and to freely migrate objects to heterogeneous hosts. While some of these features have been implemented in other systems, Ajents supports the combination of all of these features using techniques that permit them to operate together in a fashion that is more transparent and/or less restrictive than existing systems. Our experimental results show that, in our test environment: we are able to achieve good speedup on a sample parallel application; the overheads introduced by our implementation do not adversely affect remote method invocation times; and (somewhat surprisingly) the cost of migration does not greatly impact the execution time of an example application.
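
The Ajents class libraries are not shown in the abstract; as a plain-Java illustration of the asynchronous-invocation idea (not the Ajents API), the sketch below submits work and collects results through Futures, which is what lets a caller overlap several remote calls.

```java
import java.util.concurrent.*;

// Plain-Java sketch of asynchronous invocation, not the Ajents API: the caller
// gets a Future immediately and harvests the result later, so several calls to
// (possibly remote) objects can proceed concurrently.
public class AsyncInvocationSketch {
    interface Worker { double integrate(double lo, double hi); }

    static class LocalWorker implements Worker {        // stands in for a remote object
        public double integrate(double lo, double hi) {
            double sum = 0, n = 1_000_000, h = (hi - lo) / n;
            for (int i = 0; i < n; i++) { double x = lo + (i + 0.5) * h; sum += x * x * h; }
            return sum;                                  // midpoint rule for x^2 on [lo, hi]
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Worker w = new LocalWorker();
        // "Asynchronous invocation": submit now, do other work, collect later.
        Future<Double> f1 = pool.submit(() -> w.integrate(0, 1));
        Future<Double> f2 = pool.submit(() -> w.integrate(1, 2));
        System.out.println("partial results: " + f1.get() + ", " + f2.get());
        pool.shutdown();
    }
}
```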

54 citations


Journal ArticleDOI
TL;DR: Systems such as the IPG will be able to support larger applications than ever before and will enable new types of applications, such as multidisciplinary collaboration environments that couple geographically dispersed compute, data, scientific instruments, and people resources together using a suite of grid-wide services.
Abstract: NASA's Information Power Grid is an example of an emerging, exciting concept that can potentially make high-performance computing power accessible to general users as easily and seamlessly as electricity from an electrical power grid. In the IPG system, high-performance computers located at geographically distributed sites will be connected via a high-speed interconnection network. Users will be able to submit computational jobs at any site, and the system will seek the best available computational resources, transfer the user's input data sets to that system, access other needed data sets from remote sites, perform the specified computations and analysis, and then return the resulting data sets to the user. Systems such as the IPG will be able to support larger applications than ever before. New types of applications will also be enabled, such as multidisciplinary collaboration environments that couple geographically dispersed compute, data, scientific instruments, and people resources together using a suite of grid-wide services. IPG's fundamental technology comes from current research results in the area of large-scale computational grids. Figure 1 provides an intuitive view of a wide-area computational grid.
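
The IPG services themselves are not specified in the abstract; the sketch below is a hypothetical illustration of the submission flow it describes (pick the best available site, stage inputs, compute, return results). All names are invented for the example and are not part of the IPG software.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the submission flow the abstract describes: choose the
// best available site for a job, stage its inputs there, run it, and copy the
// results back. None of these names come from the IPG itself.
public class GridSubmissionSketch {
    record Site(String name, int freeCpus, double loadFactor) {}
    record Job(String id, List<String> inputFiles, int cpusNeeded) {}

    /** Pick the least-loaded site that has enough free CPUs for the job. */
    static Site bestSite(List<Site> sites, Job job) {
        return sites.stream()
                .filter(s -> s.freeCpus() >= job.cpusNeeded())
                .min(Comparator.comparingDouble(Site::loadFactor))
                .orElseThrow(() -> new IllegalStateException("no site can run " + job.id()));
    }

    public static void main(String[] args) {
        List<Site> sites = List.of(new Site("site-a", 128, 0.40),
                                   new Site("site-b", 64, 0.15),
                                   new Site("site-c", 256, 0.80));
        Job job = new Job("cfd-17", List.of("mesh.dat", "bc.dat"), 48);
        Site chosen = bestSite(sites, job);
        System.out.println("stage " + job.inputFiles() + " to " + chosen.name()
                + ", run, then return result data sets to the user");
    }
}
```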

53 citations



Journal ArticleDOI
TL;DR: Efficient algorithms are presented for computing the reliability of a distributed program running on restricted classes of networks, including series-parallel, 2-tree, tree, and star structures.

43 citations


ReportDOI
01 Dec 1999
TL;DR: This report presents a specification for the Portals 3.0 message passing interface, designed to support a parallel computing platform composed of clusters of commodity workstations connected by a commodity system area network fabric.
Abstract: This report presents a specification for the Portals 3.0 message passing interface. Portals 3.0 is intended to allow scalable, high-performance network communication between nodes of a parallel computing system. Specifically, it is designed to support a parallel computing platform composed of clusters of commodity workstations connected by a commodity system area network fabric. In addition, Portals 3.0 is well suited to massively parallel processing and embedded systems. Portals 3.0 represents an adoption of the data movement layer developed for massively parallel processing platforms, such as the 4500-node Intel TeraFLOPS machine.
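
The Portals 3.0 interface itself is not reproduced here; the toy Java sketch below only illustrates the style of data movement such specifications build on, in which the receiver pre-describes memory regions and an incoming message is matched to one and deposited directly. This is an analogy under assumed names, not the Portals API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration (not the Portals API): the receiver posts memory descriptors in
// advance; an arriving message is matched by its match bits and lands directly in
// the posted buffer, without further involvement from the receiving process.
public class OneSidedMatchSketch {
    record MemoryDescriptor(long matchBits, byte[] buffer) {}
    private final List<MemoryDescriptor> posted = new ArrayList<>();

    /** Pre-post a receive region identified by match bits. */
    void post(long matchBits, int length) {
        posted.add(new MemoryDescriptor(matchBits, new byte[length]));
    }

    /** Deliver an incoming message into the first descriptor whose match bits agree. */
    boolean deliver(long matchBits, byte[] payload) {
        for (MemoryDescriptor md : posted) {
            if (md.matchBits() == matchBits && md.buffer().length >= payload.length) {
                System.arraycopy(payload, 0, md.buffer(), 0, payload.length);
                return true;    // data landed directly in the posted region
            }
        }
        return false;           // no matching descriptor was posted
    }

    public static void main(String[] args) {
        OneSidedMatchSketch node = new OneSidedMatchSketch();
        node.post(0x42L, 1024);
        System.out.println("matched: " + node.deliver(0x42L, "halo exchange data".getBytes()));
    }
}
```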

42 citations


Proceedings ArticleDOI
05 Sep 1999
TL;DR: This paper describes a Java-based platform, called JAMES, that provides support for parallel computing, along with a software module that supports the well-known master/worker model; experimental results compare the master/worker model with the usual model of migratory agents.
Abstract: Mobile code is a promising model for distributed computing and it has been exploited in several areas of applications. One of the areas that may benefit from the use of mobile agent technology is parallel processing. This paper describes a Java-based platform, called JAMES, that provides support for parallel computing. We have implemented a software module that supports the well-known model of master/worker and we have exploited the use of parallel computing in some distributed tasks. We present some experimental results that compare the master/worker model with the usual model of migratory agents. Then, we compare the use of mobile agents with two other solutions for parallel computing: MPI and JET. The results are quite promising.
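
The JAMES agent API is not shown in the abstract; the plain-Java sketch below illustrates the master/worker pattern the paper evaluates: a master hands independent tasks to a pool of workers and combines their results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Plain-Java sketch of the master/worker model (not the JAMES agent API): the
// "master" distributes independent work units to a pool of "workers" and then
// gathers and combines the partial results.
public class MasterWorkerSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);   // the workers
        List<Future<Long>> results = new ArrayList<>();
        for (int task = 0; task < 16; task++) {                      // the master distributes work
            final int chunk = task;
            results.add(workers.submit(() -> {
                long sum = 0;                                        // an independent work unit
                for (long i = chunk * 1_000_000L; i < (chunk + 1) * 1_000_000L; i++) sum += i;
                return sum;
            }));
        }
        long total = 0;
        for (Future<Long> f : results) total += f.get();             // the master gathers results
        workers.shutdown();
        System.out.println("total = " + total);
    }
}
```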

27 citations


Journal ArticleDOI
01 Dec 1999
TL;DR: This paper discusses the evolution of heterogeneous concurrent computing in the context of the parallel virtual machine (PVM) system, highlighting the system-level infrastructures that are required, the aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments.
Abstract: Heterogeneous network-based distributed and parallel computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss the evolution of heterogeneous concurrent computing, in the context of the parallel virtual machine (PVM) system, a widely adopted software system for network computing. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the PVM project, and comment on ongoing and future work.

Book ChapterDOI
12 Apr 1999
TL;DR: This paper explores the history and future directions of the field of data intensive computing, describes a specific medical application example, and surveys some of the technologies used to build useful high-speed, wide area distributed systems.
Abstract: Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data from around the world, as well as employing large-scale computation. The distributed systems that solve large-scale problems will always involve aggregating and scheduling many resources. Data must be located and staged, cache and network capacity must be available at the same time as computing capacity, etc. Every aspect of such a system is dynamic: locating and scheduling resources, adapting running application systems to availability and congestion in the middleware and infrastructure, responding to human interaction, etc. The technologies, the middleware services, and the architectures that are used to build useful high-speed, wide area distributed systems, constitute the field of data intensive computing. This paper explores some of the history and future directions of that field, and describes a specific medical application example.

ReportDOI
TL;DR: This paper describes an architecture for data intensive applications where a high-speed distributed data cache is used as a common element for all of the sources and sinks of data and provides standard interfaces to a large, application-oriented, distributed, on-line, transient storage system.
Abstract: Modern scientific computing involves organizing, moving, visualizing, and analyzing massive amounts of data at multiple sites around the world. The technologies, the middleware services, and the architectures that are used to build useful high-speed, wide area distributed systems, constitute the field of data intensive computing. In this paper the authors describe an architecture for data intensive applications where they use a high-speed distributed data cache as a common element for all of the sources and sinks of data. This cache-based approach provides standard interfaces to a large, application-oriented, distributed, on-line, transient storage system. They describe their implementation of this cache, how they have made it network aware, and how they do dynamic load balancing based on the current network conditions. They also show large increases in application throughput by access to knowledge of the network conditions.



Proceedings Article
01 Jan 1999
TL;DR: A Gardens cluster monitoring system called Gardmon is designed and developed; it is a portable, flexible, interactive, scalable, location-transparent, and comprehensive environment for monitoring Gardens runtime activities.
Abstract: QUT's Gardens project aims to create a virtual parallel machine out of a network of non-dedicated computers (workstations/PCs). These systems are interconnected through low-latency, high-bandwidth communication links such as Myrinet. Gardens is an integrated programming language and system designed to utilize idle workstation CPU cycles to support adaptive parallel computing. A Gardens computation consists of a network of communicating tasks, dynamically mapped onto a network of processors. Tasks are created dynamically, and each task consists of a stack and a collection of heap segments in which dynamic data structures are stored. We designed and developed a Gardens cluster monitoring system called Gardmon. It is a portable, flexible, interactive, scalable, location-transparent, and comprehensive environment for monitoring Gardens runtime activities. It follows a client-server methodology and provides transparent access, from a monitoring machine, to all nodes being monitored. The features of Gardmon for monitoring the Gardens adaptive parallel computing system appear satisfactory.

Proceedings ArticleDOI
12 Apr 1999
TL;DR: This work evaluates the performance of a Java/WWW-based infrastructure to gain insights into the feasibility and projected cost of initializing and maintaining wide-area hierarchies that contain up to one million nodes.
Abstract: The millions of Java-capable computers on the Internet provide the hardware needed for a national and international computing infrastructure, a virtual parallel computer that can be tapped for many uses. To date, however, little is known about the cost and feasibility of building and maintaining such global, large scale structures. In this work, we evaluate the performance of a Java/WWW-based infrastructure to gain insights into the feasibility and projected cost of initializing and maintaining wide-area hierarchies that contain up to one million nodes.

Journal ArticleDOI
01 Mar 1999
TL;DR: This is a good time for parallel computer development and research in both academia and industry; the paper analyzes some of the reasons for the sudden acceptance of the relatively old parallel computing field, outlines the key properties of a successful parallel computer of the 1990s, and identifies some important research areas and key technologies for the future.
Abstract: This is a good time for parallel computer development and research in both academia and industry. The performance improvements predicted by Moore's Law have proven to be quite accurate over many years. However, the doubling of processor performance every 18 months cannot keep up with the growing demand of many applications. The performance of database applications has been doubling every nine to ten months. At last, parallel computer technology has come to play an important role in the commercial marketplace. Multiprocessing has been an active research area for almost 40 years and commercial parallel computers have been available for more than 35 years. After getting off to a slow start, this area has now taken off. Shared memory multiprocessors have dominated this development. This is an area that has sprung out of tireless research and numerous published breakthrough results. The article analyzes some of the reasons for the sudden acceptance of the relatively old parallel computing field, outlines the key properties of a successful parallel computer of the 1990's, and identifies some important research areas and key technologies for the future.

Book ChapterDOI
12 Apr 1999
TL;DR: This research demonstrates a Java based system that allows a naive user to make effective use of local resources for parallel computing and provides a "point-and-click" interface that manages idle workstations, dedicated clusters and remote computational resources so that they can be used for parallel computing.
Abstract: Recent advances in software and hardware for clustered computing have allowed scientists and computing specialists to take advantage of commodity processors in solving challenging computational problems. The setup, management and coding involved in parallel programming along with the challenges of heterogeneous computing machinery prevent most non-technical users from taking advantage of compute resources that may be available to them. This research demonstrates a Java based system that allows a naive user to make effective use of local resources for parallel computing. The DOGMA system provides a “point-and-click” interface that manages idle workstations, dedicated clusters and remote computational resources so that they can be used for parallel computing. Just as the “web browser” enabled use of the Internet by the “Masses”, we see simplified user interfaces to parallel processing as being critical to widespread use. This paper describes many of the barriers to widespread use and shows how they are addressed in this research.

Proceedings ArticleDOI
31 May 1999
TL;DR: This study identifies the common features of this problem domain and presents case studies for the application of the proposed approach, JAM (Java Applet in Massively parallel computing), which could yield one of the most powerful high performance distributed computing schemes yet achieved for the authors' problem domain.
Abstract: Traditional high performance computing is based on massively parallel processor (MPP) supercomputers or high-end workstation clusters connected with high-speed networks. These approaches require a tightly coupled federation of computing resources that results in high cost and complex administration. Today, the ubiquitous World Wide Web provides a new opportunity and paradigm for high performance distributed computing with millions of Internet-connected computers. We present a novel approach to achieving high performance distributed computing on the Internet with distributed object technology: JAM (Java Applet in Massively parallel computing). Our study identifies the common features of this problem domain and presents case studies for the application of the proposed approach. With the massive availability of Internet computers, this approach could yield one of the most powerful high performance distributed computing schemes yet achieved for our problem domain.

Journal ArticleDOI
TL;DR: The effect of major factors on the performance and scalability of a cluster of workstations connected by an Ethernet network is illustrated by evaluating the performance of this computing environment in the execution of a parallel ray tracing application through analytical modeling and extensive experimentation.
Abstract: Parallel computing on clusters of workstations is receiving much attention from the research community. Unfortunately, many aspects of parallel computing on this kind of computing engine are not well understood. Some of these issues include the workstation architectures, the network protocols, the communication-to-computation ratio, the load balancing strategies, and the data partitioning schemes. The aim of this paper is to assess the strengths and limitations of a cluster of workstations by capturing the effects of the above issues. This has been achieved by evaluating the performance of this computing environment in the execution of a parallel ray tracing application through analytical modeling and extensive experimentation. We were successful in illustrating the effect of major factors on the performance and scalability of a cluster of workstations connected by an Ethernet network. Moreover, our analytical model was accurate enough to agree closely with the experimental results. Thus, we feel that such an investigation would be helpful in understanding the strengths and weaknesses of an Ethernet cluster of workstations in the execution of parallel applications.
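
The paper's analytical model is not reproduced in the abstract; the sketch below shows, under assumed parameters, the kind of model involved: per-scene computation divided among p workstations plus communication that is serialized on the shared Ethernet segment, which caps the achievable speedup.

```java
// Minimal sketch of a cluster speedup model of the kind the paper describes (the
// paper's own model and parameters are not reproduced here): compute time shrinks
// with p while communication on the shared Ethernet grows with p.
public class ClusterSpeedupSketch {
    /** Predicted speedup for p nodes, given serial time t1 and per-node comm time c. */
    static double speedup(int p, double t1, double commPerNode) {
        double parallelTime = t1 / p + p * commPerNode;  // shared medium: comm grows with p
        return t1 / parallelTime;
    }

    public static void main(String[] args) {
        double t1 = 120.0;   // assumed seconds to ray-trace the scene on one workstation
        double comm = 0.4;   // assumed seconds of Ethernet transfer per participating node
        for (int p = 1; p <= 16; p *= 2) {
            System.out.printf("p=%2d  speedup=%.2f%n", p, speedup(p, t1, comm));
        }
    }
}
```

The p * commPerNode term is what makes the communication-to-computation ratio, rather than raw node count, the quantity that determines scalability on a shared Ethernet.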

Proceedings Article
01 Jan 1999
TL;DR: An undergraduate distributed computing course that focuses on the fundamental principles common to multimedia, client-server, parallel, web and collaborative computing.
Abstract: This paper proposes an undergraduate distributed computing course that focuses on the fundamental principles common to multimedia, client-server, parallel, web and collaborative computing. This computer science course should actively engage the students in exploring the concepts of distributed computing. Several extended projects using the language Java

Journal ArticleDOI
TL;DR: DOGMA is presented, a Java based system which simplifies parallel computing on heterogeneous computers and provides a unified environment for developing high performance parallel applications on heterogeneous systems.
Abstract: Heterogeneous distributed computing has traditionally been a problematic undertaking which increases in complexity as heterogeneity increases. This paper presents results obtained using DOGMA--a Java based system which simplifies parallel computing on heterogeneous computers. The performance of Java just-in-time compilers currently approaches C++ for many applications, making Java a serious contender for parallel application development. DOGMA provides support for dedicated clusters as well as idle workstations through the use of a web based browse-in feature or the DOGMA screen saver. DOGMA supports parallel programming in both a traditional message passing form and a novel object-oriented approach. This research provides a unified environment for developing high performance parallel applications on heterogeneous systems.

Journal ArticleDOI
TL;DR: If a computational problem can be solved in a loosely coupled distributed memory environment, a Beowulf cluster, or Pile of PCs (POP), may be the answer, and it "weighs in" at a price point traditional parallel computer manufacturers cannot touch.
Abstract: Linux is just now making a significant impact on the computing industry, but it has been a powerful tool for computer scientists and computational scientists for a number of years. Aside from the obvious benefits of working with a freely available, reliable, and efficient open source operating system [1], the advent of Beowulf-style cluster computing, pioneered by Donald Becker, Thomas Sterling, et al. [2] at NASA's Goddard Space Flight Center, extends the utility of Linux to the realm of high performance parallel computing. Today, these commodity PC-based clusters are cropping up in federal research laboratories, industrial R&D centers, universities, and even small colleges [3, 4]. If a computational problem can be solved in a loosely coupled distributed memory environment, a Beowulf cluster, or Pile of PCs (POP), may be the answer; and it "weighs in" at a price point traditional parallel computer manufacturers cannot touch.


Journal ArticleDOI
TL;DR: Trends in many areas of computation are moving toward multifunctional applications in science and engineering; continuing system performance improvement is necessary in areas ranging from CSE (computing in science and engineering) to ERP (enterprise resource planning), and parallelism can provide the increasing speed and memory sizes needed.
Abstract: Trends in many areas of computation are moving toward multifunctional applications in science and engineering. Developing multifunctional applications requires reusing existing applications. This has led many forward-looking independent software vendors (ISVs) to team up to produce joint products. The products that result provide users with more benefits but at the cost of requiring more capable computer systems. Continuing system performance improvement is necessary in areas ranging from CSE (computing in science and engineering) to ERP (enterprise resource planning), and parallelism can provide the increasing speed and memory sizes needed. Parallelism in computing is as old as the first two decades of electronic computers, and it is as new as four-processor parallel workstations or 4000-processor massively parallel supercomputers. Most ISVs are now motivated to develop parallel applications, but new efforts are often haunted by the broken schedules and superlinear resource demands of many past parallelism projects. While the Department of Energy's ASCI project, for example, can still afford to put multiyear efforts into new application development, most development projects cannot. Most of the software used today, even in large enterprises, is produced by ISVs-not users or enterprises. Because the 20000-odd ISVs are mostly small, discipline-focused companies, practical approaches to parallel software development are crucial.

Proceedings ArticleDOI
12 Apr 1999
TL;DR: The runtime system is extended to run on a cluster made up of heterogeneous computers; in this extension, the same program that runs in the homogeneous environment also runs in the heterogeneous environment.
Abstract: A parallel programming system, called MPC++, provides parallel primitives such as remote function invocation, a global pointer, and a synchronization structure using the C++ template feature. The system has run on a cluster of homogeneous computers. In this paper, the runtime system is extended to run on a cluster made up of heterogeneous computers. Unlike other distributed or parallel programming systems for heterogeneous computers, this extension allows the same program that runs in the homogeneous environment to run in the heterogeneous environment.


Proceedings ArticleDOI
01 May 1999
TL;DR: A novel data predictor, called the Cyclic Dependence based data Predictor (CDP), is proposed; simulation shows that significant reductions in memory latency can be obtained, especially for applications that use complex pointer-based data structures.
Abstract: Current work in data prediction and prefetching is mainly focused on one individual predictor for each class of data references. Despite the increasing complexity of these hybrid predictors, their prefetch coverage is still very limited. To reduce the complexity of the predictors and to expand the coverage of data prediction, we propose a novel data predictor, called the Cyclic Dependence based data Predictor (CDP), in this paper. Based on a runtime analysis of value dependence, registers used in the address calculation of a memory access instruction in a loop are classified into cyclic dependent registers and acyclic dependent registers. The complexity and predictability of data references are determined by the path length of the cycle that contains the index pointer registers. To further illustrate its importance, we generalize our previously proposed Reference Value Prediction Cache [5] with this CDP predictor. Simulation shows that a significant reduction in memory latency can be obtained, especially for applications that use complex pointer-based data structures.
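
The CDP is a hardware predictor and is not given in code in the abstract; the Java sketch below only illustrates the classification step the paper relies on, treating the register dependences in a loop body as a small graph and calling a register cyclic dependent when a dependence chain leads back to it. The representation and example registers are invented for the illustration.

```java
import java.util.*;

// Illustrative sketch (not the hardware CDP): classify registers used in a loop's
// address calculations as cyclic- or acyclic-dependent by looking for a dependence
// cycle through the register. An edge u -> v means "the new value of v is computed
// from u".
public class CyclicDependenceSketch {
    private final Map<String, Set<String>> deps = new HashMap<>();

    void addDependence(String from, String to) {
        deps.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** True if some dependence chain leads from reg back to itself. */
    boolean isCyclicDependent(String reg) {
        Deque<String> work = new ArrayDeque<>(deps.getOrDefault(reg, Set.of()));
        Set<String> seen = new HashSet<>();
        while (!work.isEmpty()) {
            String r = work.pop();
            if (r.equals(reg)) return true;                    // found a cycle through reg
            if (seen.add(r)) work.addAll(deps.getOrDefault(r, Set.of()));
        }
        return false;
    }

    public static void main(String[] args) {
        CyclicDependenceSketch g = new CyclicDependenceSketch();
        g.addDependence("r1", "r1");   // r1 = r1 + 8     (induction pointer: cyclic)
        g.addDependence("r1", "r2");   // r2 = load [r1]  (value derived from r1 each iteration)
        g.addDependence("r2", "r3");   // r3 = r2 + 16    (acyclic: no chain back to r3)
        System.out.println("r1 cyclic? " + g.isCyclicDependent("r1"));   // true
        System.out.println("r3 cyclic? " + g.isCyclicDependent("r3"));   // false
    }
}
```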