
Showing papers by "Carl Kesselman published in 2007"


Proceedings ArticleDOI
25 Jun 2007
TL;DR: This work uses trace-based simulations to compare application performance and cost under the provisioned and best-effort approaches, using a number of artificially generated workflow-structured applications and a seismic hazard application from the earthquake science community.
Abstract: Resource availability in Grids is generally unpredictable due to the autonomous and shared nature of Grid resources and the stochastic nature of the workload, resulting in a best-effort quality of service. Resource providers optimize for throughput and utilization, whereas users optimize for application performance. We present a cost-based model where the providers advertise resource availability to the user community. We also present a multi-objective genetic algorithm formulation for selecting the set of resources to be provisioned that optimizes the application performance while minimizing the resource costs. We use trace-based simulations to compare application performance and cost under the provisioned and best-effort approaches, using a number of artificially generated workflow-structured applications and a seismic hazard application from the earthquake science community. The provisioned approach shows promising results when the resources are under high utilization and/or the applications have significant resource requirements.
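As a rough, hypothetical illustration of the multi-objective selection described above, the Python sketch below scores candidate sets of advertised availability slots on two objectives, estimated makespan and provisioning cost, and retains the Pareto-nondominated candidates that a genetic algorithm would evolve further. The Slot model, the makespan estimate, and all numbers are assumptions for illustration, not the formulation evaluated in the paper.

    import random
    from dataclasses import dataclass

    @dataclass
    class Slot:
        """An advertised availability slot: processors for a time window at a price."""
        procs: int
        hours: float
        cost_per_proc_hour: float

    def cost(selection):
        # Total provisioning cost of the selected slots (hypothetical cost model).
        return sum(s.procs * s.hours * s.cost_per_proc_hour for s in selection)

    def estimate_makespan(selection, workflow_work_hours):
        # Crude stand-in for the trace-based workflow simulation used in the paper:
        # assume the workflow parallelizes perfectly over the provisioned processors.
        procs = sum(s.procs for s in selection) or 1
        return workflow_work_hours / procs

    def dominates(a, b):
        # a dominates b if it is no worse in both objectives and strictly better in one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def pareto_front(population, workflow_work_hours):
        scored = [((estimate_makespan(p, workflow_work_hours), cost(p)), p) for p in population]
        return [p for s, p in scored if not any(dominates(other, s) for other, _ in scored)]

    # Toy usage: random subsets of advertised slots, filtered to the Pareto front.
    advertised = [Slot(procs=random.randint(8, 64), hours=4.0, cost_per_proc_hour=0.1)
                  for _ in range(20)]
    population = [random.sample(advertised, k=random.randint(1, 5)) for _ in range(50)]
    front = pareto_front(population, workflow_work_hours=500.0)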

108 citations


Book ChapterDOI
01 Jan 2007
TL;DR: The Southern California Earthquake Center (SCEC) is a community of more than 400 scientists from over 54 research organizations that conducts geophysical research in order to develop a physics-based understanding of earthquake processes and to reduce the hazard from earthquakes in the Southern California region.
Abstract: The Southern California Earthquake Center (SCEC) is a community of more than 400 scientists from over 54 research organizations that conducts geophysical research in order to develop a physics-based understanding of earthquake processes and to reduce the hazard from earthquakes in the Southern California region [377].

79 citations


Journal Article
TL;DR: Grid technology, an informatics approach to securely federate independently operated computing, storage, and data management resources at the global scale over public networks, meets these core requirements.
Abstract: The Digital Imaging and Communications in Medicine (DICOM) standard defines Radiology medical device interoperability and image data exchange between modalities, image databases - Picture Archiving and Communication Systems (PACS) - and image review end-points. However, the scope of DICOM and PACS technology is currently limited to the trusted and static environment of the hospital. In order to meet the demand for ad-hoc tele-radiology and image-guided medical procedures within the global healthcare enterprise, a new technology must provide mobility, security, flexible scale of operations, and rapid responsiveness for DICOM medical devices and subsequently medical image data. Grid technology, an informatics approach to securely federate independently operated computing, storage, and data management resources at the global scale over public networks, meets these core requirements. Here we present an approach to federate DICOM and PACS devices for large-scale medical image workflows within a global healthcare enterprise. The Globus MEDICUS (Medical Imaging and Computing for Unified Information Sharing) project uses the standards-based Globus Toolkit Grid infrastructure to vertically integrate a new service for DICOM devices - the DICOM Grid Interface Service (DGIS). This new service translates between DICOM and Grid operations and thus transparently extends DICOM to Globus-based Grid infrastructure. This Grid image workflow paradigm has been designed to provide not only solutions for global image communication, but also fault tolerance and disaster recovery using Grid data replication technology. An actual use case of 40 MEDICUS Grid-connected international hospitals of the Children's Oncology Group and the Neuroblastoma Cancer Foundation, as well as further clinical applications, is discussed. The open-source Globus MEDICUS project is available at http://dev.globus.org/wiki/Incubator/MEDICUS.
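As a hedged sketch of the kind of translation the DICOM Grid Interface Service performs, the Python fragment below shows one plausible handling of an incoming DICOM C-STORE: the object is staged into Grid storage and its header attributes are registered in a metadata catalog so that later C-FIND-style queries can be answered from the Grid. The grid_storage and metadata_catalog interfaces are hypothetical placeholders, not the actual Globus MEDICUS APIs.

    import hashlib

    class DicomGridInterfaceService:
        """Hypothetical sketch of a DGIS-style bridge between DICOM and Grid services."""

        def __init__(self, grid_storage, metadata_catalog):
            self.storage = grid_storage      # assumed interface to a GridFTP-backed store
            self.catalog = metadata_catalog  # assumed interface to a metadata catalog service

        def on_c_store(self, dicom_bytes, header):
            # Derive a stable Grid-wide identifier for the DICOM object.
            object_id = hashlib.sha1(dicom_bytes).hexdigest()
            grid_url = self.storage.put(object_id, dicom_bytes)  # stage object into Grid storage
            # Register searchable DICOM attributes so later queries (the C-FIND analogue)
            # can locate the object without contacting the originating PACS.
            self.catalog.register(object_id, {
                "PatientID": header.get("PatientID"),
                "StudyInstanceUID": header.get("StudyInstanceUID"),
                "SeriesInstanceUID": header.get("SeriesInstanceUID"),
                "Modality": header.get("Modality"),
                "location": grid_url,
            })
            return object_id

        def on_c_find(self, query):
            # Translate a DICOM query into a catalog lookup over the registered attributes.
            return self.catalog.search(query)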

66 citations


Proceedings ArticleDOI
19 Sep 2007
TL;DR: In this paper, the authors suggest adaptive pricing as an alternative for allowing reservation of resources, where the price charged for allowing a reservation is based directly on the impact that the reservation has on other users sharing the resource.
Abstract: Application scheduling studies on large-scale shared resources have advocated the use of resource provisioning in the form of advance reservations for providing predictable and deterministic quality of service to applications. Resource scheduling studies, however, have shown the adverse impact of advance reservations in the form of reduced utilization and increased response time of the resources. Thus, resource providers either disallow reservations or impose restrictions such as minimum notice periods, which reduces the effectiveness of reservations as a means of allocating desired resources at a desired time. In this paper, we suggest adaptive pricing as an alternative for allowing reservation of resources. The price charged for allowing a reservation is based directly on the impact that the reservation has on other users sharing the resource. Using trace-based simulations, we show that adaptive pricing allows users to make reservations at the desired time while making reservations more expensive than best-effort service. Thus, users are induced to make the correct choice between reservations and best-effort service based on their real needs. Moreover, this pricing scheme is more cost effective and sensitive to the system load than a flat pricing scheme and encourages load balancing across resources.
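The pricing idea can be made concrete with a small, hypothetical calculation: charge for a reservation in proportion to the delay it imposes on jobs already queued on the resource. The single-queue FIFO delay model and the rate constants below are illustrative assumptions, not the scheme evaluated against the traces in the paper.

    def delay_imposed(queued_jobs, res_start, res_hours):
        # Extra waiting time (hours) pushed onto queued jobs by an advance reservation.
        # queued_jobs: list of (expected_start_hour, runtime_hours) under best effort, FIFO order.
        total_delay, shift = 0.0, 0.0
        for start, runtime in queued_jobs:
            start += shift
            # A job that would overlap the reserved window is pushed past its end.
            if start < res_start + res_hours and start + runtime > res_start:
                pushed_to = res_start + res_hours
                total_delay += pushed_to - start
                shift += pushed_to - start
        return total_delay

    def reservation_price(queued_jobs, res_start, res_hours, base_rate=1.0, impact_rate=0.5):
        # Flat component plus an adaptive component proportional to the impact on other users.
        impact = delay_imposed(queued_jobs, res_start, res_hours)
        return base_rate * res_hours + impact_rate * impact

    # Example: three queued jobs on one processor, reservation from hour 2 lasting 4 hours.
    queue = [(0.0, 3.0), (3.0, 2.0), (5.0, 1.0)]
    print(reservation_price(queue, res_start=2.0, res_hours=4.0))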

24 citations


Journal ArticleDOI
01 Jul 2007
TL;DR: These tools include data placement services for the reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment; tools and techniques for the construction, operation, and provisioning of scalable science services; and tools for the detection and diagnosis of failures in end-to-end data placement and distributed application hosting configurations.
Abstract: Petascale science is an end-to-end endeavour, involving not only the creation of massive datasets at supercomputers or experimental facilities, but also the subsequent analysis of that data by a user community that may be distributed across many laboratories and universities. The new SciDAC Center for Enabling Distributed Petascale Science (CEDPS) is developing tools to support this end-to-end process. These tools include data placement services for the reliable, high-performance, secure, and policy-driven placement of data within a distributed science environment; tools and techniques for the construction, operation, and provisioning of scalable science services; and tools for the detection and diagnosis of failures in end-to-end data placement and distributed application hosting configurations. In each area, we build on a strong base of existing technology and have made useful progress in the first year of the project. For example, we have recently achieved order-of-magnitude improvements in transfer times for large numbers of small files and implemented asynchronous data staging capabilities; demonstrated dynamic deployment of complex application stacks for the STAR experiment; and designed and deployed end-to-end troubleshooting services. We look forward to working with SciDAC application and technology projects to realize the promise of petascale science.
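One common way to attack the small-file transfer problem mentioned above is to bundle many small files into a single archive per transfer, so that per-file protocol overhead is paid once per batch rather than once per file. The sketch below illustrates that idea only; the transfer callable is a hypothetical stand-in for the actual CEDPS data placement services.

    import io
    import tarfile
    from pathlib import Path

    def stage_in_batches(paths, transfer, batch_bytes=64 * 1024 * 1024):
        # Group small files into tar archives of roughly batch_bytes each and hand
        # every archive to transfer(name, data), a stand-in for the real data mover.
        batch, size, batch_no = [], 0, 0

        def flush():
            nonlocal batch, size, batch_no
            if not batch:
                return
            buf = io.BytesIO()
            with tarfile.open(fileobj=buf, mode="w") as tar:
                for p in batch:
                    tar.add(p, arcname=Path(p).name)
            transfer(f"batch-{batch_no:04d}.tar", buf.getvalue())
            batch, size, batch_no = [], 0, batch_no + 1

        for p in paths:
            batch.append(p)
            size += Path(p).stat().st_size
            if size >= batch_bytes:
                flush()
        flush()  # send any remaining files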

13 citations


Proceedings Article
25 Jun 2007
TL;DR: This year's HPDC features an exciting, high-quality single track program with topics across the spectrum of areas of interest to the high-performance distributed computing community and reflects the tireless efforts of the program co-chairs.
Abstract: Welcome to High Performance Distributed Computing 2007. Once again, this year's HPDC features an exciting, high-quality single-track program with topics across the spectrum of areas of interest to the high-performance distributed computing community. The quality of the program reflects the tireless efforts of the program co-chairs, Dr. Jack Dongarra from the University of Tennessee and Dr. David Walker from Cardiff University. Each paper received a careful and thorough review by members of the program committee, and I wish to thank them for the effort and thought that they put into the reviewing process. I would also like to thank Dr. Ian Foster and the University of Chicago for hosting the program committee meeting. Over the years, the workshop program has become an important part of HPDC, and this year is no exception. The workshops this year were organized by Dr. Ann Chervenak; the excellent range of workshops that we have is due in no small part to her efforts. I would also like to thank all of the workshop chairs for the time and effort they have put into organizing their respective workshop programs and proceedings. I would also like to thank Kelly Sutton at the University of Arizona for her efforts in coordinating conference arrangements, including the hotel and publications. This year marks the sixteenth year that HPDC has been held. Both the conference and HPDC as a field have come a long way from the first meeting held in Syracuse, New York. We are now at a point where there are production high-performance computing infrastructures operating at a global level, and HPDC is playing a critical role in solving some of the most challenging and important problems facing us today. I believe that the knowledge exchanged and the community built by HPDC over the past sixteen years have played a significant role in these advances. On a personal note, the past year saw the passing of two prominent members of the HPDC community. Both Ken and Jim made significant contributions to high-performance distributed computing, and they will be missed.

12 citations


Journal ArticleDOI
01 Jul 2007
TL;DR: The recent release of the Intergovernmental Panel on Climate Change (IPCC) 4th Assessment Report (AR4) has generated significant media attention, and ESG is leading the effort in the climate community towards standardization of the metadata, security, and data services required to federate, analyze, and access data worldwide.
Abstract: The recent release of the Intergovernmental Panel on Climate Change (IPCC) 4th Assessment Report (AR4) has generated significant media attention. Much has been said about the U.S. role in this report, which included significant support from the Department of Energy through the Scientific Discovery through Advanced Computing (SciDAC) and other Department of Energy (DOE) programs for climate model development and the production execution of simulations. The SciDAC-supported Earth System Grid Center for Enabling Technologies (ESG-CET) also played a major role in the IPCC AR4: all of the simulation data that went into the report was made available to climate scientists worldwide exclusively via the ESG-CET. At the same time as the IPCC AR4 database was being developed, the National Center for Atmospheric Research (NCAR), a leading U.S. climate science laboratory and an ESG participant, began publishing model runs from the Community Climate System Model (CCSM), and its predecessor the Parallel Climate Model (PCM), through ESG. In aggregate, ESG-CET provides seamless access to over 180 terabytes of distributed climate simulation data to over 6,000 registered users worldwide, who have taken delivery of more than 250 terabytes from the archive. Not only does this represent a substantial advance in scientific knowledge, it is also a major step forward in how we conduct the research process on a global scale. Moving forward, the next IPCC assessment report, AR5, will demand multi-site metadata federation for data discovery and cross-domain identity management for single sign-on of users in a more diverse federation enterprise environment. Towards this aim, ESG is leading the effort in the climate community towards standardization of the metadata, security, and data services required to federate, analyze, and access data worldwide.

8 citations


Journal ArticleDOI
TL;DR: This paper describes the functional imaging laboratory (funcLAB/G), which uses Statistical Parametric Mapping (SPM) for image processing, and its extension to secure sharing and availability for the community using standards-based Grid technology (the Globus Toolkit).

5 citations


Posted Content
TL;DR: The Earth System Grid (ESG) as mentioned in this paper is a collaborative interdisciplinary project aimed at addressing the challenge of enabling management, discovery, access, and analysis of these critically important datasets in a distributed and heterogeneous computational environment.
Abstract: Understanding the earth's climate system and how it might be changing is a preeminent scientific challenge. Global climate models are used to simulate past, present, and future climates, and experiments are executed continuously on an array of distributed supercomputers. The resulting data archive, spread over several sites, currently contains upwards of 100 TB of simulation data and is growing rapidly. Looking toward mid-decade and beyond, we must anticipate and prepare for distributed climate research data holdings of many petabytes. The Earth System Grid (ESG) is a collaborative interdisciplinary project aimed at addressing the challenge of enabling management, discovery, access, and analysis of these critically important datasets in a distributed and heterogeneous computational environment. The problem is fundamentally a Grid problem. Building upon the Globus Toolkit and a variety of other technologies, ESG is developing an environment that addresses authentication, authorization for data access, large-scale data transport and management, services and abstractions for high-performance remote data access, mechanisms for scalable data replication, cataloging with rich semantic and syntactic information, data discovery, distributed monitoring, and Web-based portals for using the system.
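To make the replica management and data discovery pieces slightly more concrete, here is a small, hypothetical sketch of replica selection: a logical dataset name is resolved to its physical copies via a replica catalog, and the copy with the best measured bandwidth to the requesting site is chosen. The catalog contents, host names, and bandwidth figures are invented for illustration and do not describe the actual ESG services.

    from urllib.parse import urlparse

    # Hypothetical replica catalog: logical dataset name -> physical replica URLs.
    replica_catalog = {
        "ccsm3.run1.monthly.tas": [
            "gsiftp://datanode1.example.org/esg/ccsm3/run1/tas.nc",
            "gsiftp://datanode2.example.org/esg/ccsm3/run1/tas.nc",
        ],
    }

    # Hypothetical bandwidth estimates (MB/s) from the requesting site to each host.
    measured_bandwidth = {
        "datanode1.example.org": 40.0,
        "datanode2.example.org": 110.0,
    }

    def select_replica(logical_name):
        # Return the replica expected to transfer fastest to the requesting site.
        replicas = replica_catalog.get(logical_name, [])
        if not replicas:
            raise KeyError(f"no replicas registered for {logical_name}")
        return max(replicas, key=lambda url: measured_bandwidth.get(urlparse(url).hostname, 0.0))

    print(select_replica("ccsm3.run1.monthly.tas"))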

4 citations



01 Dec 2007
TL;DR: In this article, the authors developed a methodology that explicitly incorporates deterministic source and wave propagation effects within seismic hazard calculations through the use of physics-based 3D ground motion simulations.
Abstract: Deterministic source and wave propagation effects such as rupture directivity and basin response can have a significant impact on near-fault ground motion levels, particularly at longer shaking periods. CyberShake, as part of the Southern California Earthquake Center’s (SCEC) Community Modeling Environment, is developing a methodology that explicitly incorporates these effects within seismic hazard calculations through the use of physics-based 3D ground motion simulations. To calculate a waveform-based probabilistic hazard curve for a site of interest, we begin with Uniform California Earthquake Rupture Forecast, Version 2 (UCERF2) and identify all ruptures (excluding background seismicity) within 200 km of the site of interest. We convert the UCERF2 rupture definition into multiple rupture variations with differing hypocenter location and slip distribution, which results in about 400,000 rupture variations per site. Strain Green Tensors are calculated for the site of interest using the SCEC Community Velocity Model, Version 4 (CVM4), and then, using reciprocity, we calculate synthetic seismograms for each rupture variation. Peak intensity measures (e.g., spectral acceleration) are then extracted from these synthetics and combined with the original rupture probabilities to produce probabilistic seismic hazard curves for the site. Thus far, we have produced hazard curves for spectral acceleration at a suite of periods ranging from 3 to 10 seconds at about 20 sites in the Los Angeles region, with the ultimate goal being the production of full hazard maps. Our results indicate that the combination of rupture directivity and basin response effects can lead to an increase in the hazard level for some sites, relative to that given by a conventional Ground Motion Prediction Equation (GMPE). Additionally, and perhaps more importantly, we find that the physics-based hazard results are much more sensitive to the assumed magnitude-area relations and magnitude uncertainty estimates used in the definition of the ruptures than is found in the traditional GMPE approach. This reinforces the need for continued development of a better understanding of earthquake source characterization and the constitutive relations that govern the earthquake rupture process.
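The final combination step described above can be sketched compactly: for each spectral-acceleration level, the probability of exceedance is assembled from the rupture probabilities and the fraction of each rupture's variations whose simulated intensity exceeds that level, assuming the variations are equally likely and the ruptures act independently. The data layout and numbers below are illustrative, not CyberShake inputs or outputs.

    def hazard_curve(ruptures, sa_levels):
        # ruptures: list of (rupture_probability, [peak SA in g for each rupture variation]).
        # Returns (level, probability of exceedance) pairs, treating variations as
        # equally likely and ruptures as independent.
        curve = []
        for level in sa_levels:
            p_no_exceed = 1.0
            for p_rup, variation_sas in ruptures:
                frac_exceeding = sum(sa > level for sa in variation_sas) / len(variation_sas)
                p_no_exceed *= 1.0 - p_rup * frac_exceeding
            curve.append((level, 1.0 - p_no_exceed))
        return curve

    # Toy example: two ruptures with a handful of variations each.
    toy_ruptures = [
        (0.02, [0.35, 0.50, 0.28, 0.61]),  # rupture probability 0.02 over the forecast period
        (0.05, [0.12, 0.22, 0.18, 0.30]),
    ]
    for level, p in hazard_curve(toy_ruptures, sa_levels=[0.1, 0.2, 0.4]):
        print(f"P(SA > {level:.1f} g) = {p:.4f}")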