Proceedings Article

Chimera: a virtual data system for representing, querying, and automating data derivation

TLDR
The Chimera virtual data system is developed, which combines a virtual data catalog for representing data derivation procedures and derived data with a virtual data language interpreter that translates user requests into data definition and query operations on the database.
Abstract
Much scientific data is not obtained from measurements but rather derived from other data by the application of computational procedures. We hypothesize that explicit representation of these procedures can enable documentation of data provenance, discovery of available methods, and on-demand data generation (so-called "virtual data"). To explore this idea, we have developed the Chimera virtual data system, which combines a virtual data catalog for representing data derivation procedures and derived data with a virtual data language interpreter that translates user requests into data definition and query operations on the database. We couple the Chimera system with distributed "data grid" services to enable on-demand execution of computation schedules constructed from database queries. We have applied this system to two challenge problems, with promising results: the reconstruction of simulated collision event data from a high-energy physics experiment, and the search of digital sky survey data for galactic clusters.
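The idea in the abstract (record how each dataset is derived, query that record for provenance, and generate data only when it is requested) can be illustrated with a small sketch. This is not Chimera's virtual data language or its catalog schema; the class `VirtualDataCatalog`, its method names, and the toy procedures are all hypothetical, chosen only to make the concept concrete in memory, without a database or grid services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Derivation:
    """Records that `output` is produced by applying `procedure` to `inputs`."""
    output: str
    procedure: str
    inputs: Tuple[str, ...]


class VirtualDataCatalog:
    """Hypothetical in-memory stand-in for a virtual data catalog."""

    def __init__(self) -> None:
        self.procedures: Dict[str, Callable[..., object]] = {}
        self.derivations: Dict[str, Derivation] = {}
        self.materialized: Dict[str, object] = {}

    def register_procedure(self, name: str, func: Callable[..., object]) -> None:
        self.procedures[name] = func

    def add_raw(self, name: str, value: object) -> None:
        # Measured (non-derived) data is stored directly.
        self.materialized[name] = value

    def add_derivation(self, output: str, procedure: str, inputs: Sequence[str]) -> None:
        # Only the recipe is recorded; nothing is computed yet.
        self.derivations[output] = Derivation(output, procedure, tuple(inputs))

    def provenance(self, name: str) -> List[Derivation]:
        """Return the derivation steps behind `name`, dependencies first."""
        steps: List[Derivation] = []
        seen = set()

        def visit(dataset: str) -> None:
            step = self.derivations.get(dataset)
            if step is None or dataset in seen:
                return
            seen.add(dataset)
            for parent in step.inputs:
                visit(parent)
            steps.append(step)

        visit(name)
        return steps

    def materialize(self, name: str) -> object:
        """Return the dataset, computing it on demand if it does not exist yet."""
        if name in self.materialized:
            return self.materialized[name]
        if name not in self.derivations:
            raise KeyError(f"no data and no derivation recorded for {name!r}")
        step = self.derivations[name]
        args = [self.materialize(parent) for parent in step.inputs]
        value = self.procedures[step.procedure](*args)
        self.materialized[name] = value
        return value


catalog = VirtualDataCatalog()
catalog.register_procedure("square_all", lambda xs: [x * x for x in xs])
catalog.register_procedure("total", sum)
catalog.add_raw("raw", [1, 2, 3])
catalog.add_derivation("squares", "square_all", ["raw"])
catalog.add_derivation("sum_of_squares", "total", ["squares"])

plan = [step.output for step in catalog.provenance("sum_of_squares")]
result = catalog.materialize("sum_of_squares")
print(plan, result)
```

In this sketch the provenance query returns the ordered steps needed to build a dataset, which plays the role of the "computation schedule" mentioned above, and `materialize` runs those steps locally rather than dispatching them to distributed resources.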


Citations
Proceedings Article

Cloud Computing and Grid Computing 360-Degree Compared

TL;DR: In this article, the authors compare and contrast cloud computing with grid computing from various angles, give insights into the essential characteristics of both technologies, and compare their advantages.
Journal Article

Pegasus: A framework for mapping complex scientific workflows onto distributed systems

TL;DR: Results are presented on improving application performance through workflow restructuring, which clusters multiple tasks in a workflow into single entities.
Journal Article

A survey of data provenance in e-science

TL;DR: The main aspect of the taxonomy categorizes provenance systems based on why they record provenance, what they describe, how they represent and store provenance, and ways to disseminate it.
Journal Article

Workflows and e-Science: An overview of workflow system features and capabilities

TL;DR: The taxonomy gives end users a mechanism for assessing the suitability of workflow systems in general and for using these features to make an informed choice about which workflow system fits their particular application.
Posted Content

A Taxonomy of Workflow Management Systems for Grid Computing

TL;DR: A taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids is proposed; it highlights the design and engineering similarities and differences among state-of-the-art Grid workflow systems and identifies the areas that need further research.
References
Book

The Grid 2: Blueprint for a New Computing Infrastructure

TL;DR: The Globus Toolkit as discussed by the authors is a toolkit for high-throughput resource management for distributed supercomputing applications, focusing on real-time wide-distributed instrumentation systems.
Journal Article

The GRID: Blueprint for a New Computing Infrastructure

TL;DR: The main purpose is to update the designers and users of parallel numerical algorithms with the latest research in the field and present the novel ideas, results and work in progress and advancing state-of-the-art techniques in the area of parallel and distributed computing for numerical and computational optimization problems in scientific and engineering application.
Proceedings Article

Condor - a hunter of idle workstations

TL;DR: The design, implementation, and performance of the Condor scheduling system, which operates in a workstation environment, are presented and a performance profile of the system is presented that is based on data accumulated from 23 stations during one month.
Proceedings Article

Storing and querying ordered XML using a relational database system

TL;DR: This paper shows that XML's ordered data model can indeed be efficiently supported by a relational database system, and proposes three order encoding methods that can be used to represent XML order in the relational data model, and also proposes algorithms for translating ordered XPath expressions into SQL using these encoding methods.
Proceedings Article

Condor-G: a computation management agent for multi-institutional grids

TL;DR: It is asserted that Condor-G can serve as a general-purpose interface to Grid resources, for use by both end users and higher-level program development tools.