Journal Article
MetaStore: an adaptive metadata management framework for heterogeneous metadata models
TL;DR: MetaStore is an adaptive metadata management framework based on a NoSQL database and an RDF triple store that automatically segregates the different categories of metadata in their corresponding data models to maximize the utilization of the data models supported by NoSQL databases.
Abstract:
In this paper, we present MetaStore, a metadata management framework for scientific data repositories. Scientific experiments are generating a deluge of data, and the handling of associated metadata is critical, as it enables the discovery, analysis, reuse, and sharing of scientific data. Moreover, metadata produced by scientific experiments are heterogeneous and subject to frequent changes, demanding a flexible data model. Existing metadata management systems provide a broad range of features for handling scientific metadata. However, the principal limitation of these systems is their architecture, which is restricted to a single or at most a few standard metadata models. Support for handling different types of metadata models, i.e., administrative, descriptive, structural, and provenance metadata, including community-specific metadata models, is not possible with these systems. To address this challenge, we present MetaStore, an adaptive metadata management framework based on a NoSQL database and an RDF triple store. MetaStore provides a set of core functionalities to handle heterogeneous metadata models by automatically generating the necessary software code (services) and extending the functionality of the framework on the fly. To handle dynamic metadata and to control metadata quality, MetaStore also provides an extended set of functionalities, such as enabling annotation of images and text by integrating the Web Annotation Data Model, allowing communities to define discipline-specific vocabularies using the Simple Knowledge Organization System, and providing advanced search and analytical capabilities by integrating Elasticsearch. To maximize the utilization of the data models supported by NoSQL databases, MetaStore automatically segregates the different categories of metadata in their corresponding data models. Complex provenance graphs and dynamic metadata are modeled and stored in an RDF triple store, whereas the static metadata is stored in a NoSQL database. To enable large-scale harvesting (sharing) of metadata using the METS standard over the OAI-PMH protocol, MetaStore is designed to be OAI-compliant. Finally, to show the practical usability of the MetaStore framework and that the requirements from the research communities have been realized, we describe our experience in the adoption of MetaStore for three communities.
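The segregation rule described in the abstract (provenance graphs and dynamic metadata go to the RDF triple store, static metadata goes to the NoSQL database) can be illustrated with a minimal sketch. This is not MetaStore's actual API: the names `Category`, `MetadataRecord`, `InMemoryStore`, and `choose_store`, as well as the `dynamic` flag, are hypothetical and exist only for illustration, and the in-memory dictionaries stand in for real database backends.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    """The four metadata categories named in the abstract."""
    ADMINISTRATIVE = "administrative"
    DESCRIPTIVE = "descriptive"
    STRUCTURAL = "structural"
    PROVENANCE = "provenance"


@dataclass
class MetadataRecord:
    # Hypothetical record shape, assumed for this sketch only.
    identifier: str
    category: Category
    payload: dict
    dynamic: bool = False  # True for metadata expected to change, e.g. annotations


@dataclass
class InMemoryStore:
    """Stand-in for a real backend (NoSQL database or RDF triple store)."""
    name: str
    records: dict = field(default_factory=dict)

    def put(self, record):
        self.records[record.identifier] = record


def choose_store(record, document_store, triple_store):
    # Provenance graphs and dynamic metadata go to the triple store;
    # everything else is treated as static and goes to the document store.
    if record.category is Category.PROVENANCE or record.dynamic:
        return triple_store
    return document_store


document_store = InMemoryStore("nosql")
triple_store = InMemoryStore("rdf")

records = [
    MetadataRecord("rec-1", Category.DESCRIPTIVE, {"title": "Sample dataset"}),
    MetadataRecord("rec-2", Category.PROVENANCE, {"wasGeneratedBy": "workflow-7"}),
    MetadataRecord("rec-3", Category.DESCRIPTIVE, {"annotation": "region of interest"}, dynamic=True),
]
for record in records:
    choose_store(record, document_store, triple_store).put(record)
```

In this sketch the routing decision is a single predicate, so supporting an additional category would only require extending `choose_store`; how the framework itself makes this decision is not specified in the abstract.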
Citations
Posted Content
Scientific Data Management in the Coming Decade
Jim Gray,David T. Liu,Maria Nieto-Santisteban,Alexander S. Szalay,David J. DeWitt,Gerd Heber +5 more
TL;DR: Analyzing this data to find the subtle effects missed by previous studies requires algorithms that can simultaneously deal with huge datasets and that can find very subtle effects --- finding both needles in the haystack and finding very small haystacks that were undetected in previous measurements.
SKOS Core: Simple Knowledge Organisation for the Web
TL;DR: The main purpose of this paper is to provide an initial basis for establishing clear recommendations for the use of SKOS Core and DCMI Metadata Terms in combination.
EUDAT: A New Cross-Disciplinary Data Infrastructure For Science
TL;DR: The EUDAT project is a pan-European data initiative that aims to build a sustainable cross-disciplinary and cross-national data infrastructure that provides a set of shared services for accessing and preserving research data.
Proceedings Article
OCR-D: An end-to-end open source OCR framework for historical printed documents
Clemens Neudecker,Konstantin Baierer,Maria Federbusch,Matthias Boenig,Kay-Michael Würzner,Volker Hartmann,Elisa Herrmann +6 more
TL;DR: The background of OCR-D is introduced, the main challenges and shortcomings in the availability of open tools and resources for OCR of historical printed documents are outlined, and the various software modules and related components that are being made available through OCR-D are discussed.
Dissertation
Methodology to sustain common information spaces for research collaborations
TL;DR: A methodology to sustain CIS and a conceptual framework that has its foundations on a set of agreed Core Concepts forming a Canonical Core (CC) are introduced that leverages and promotes reuse of existing standards: EPOS-DCAT-AP.
References
Web Services Business Process Execution Language Version 2.0
Charlton Barreto,Vaughn Bullard,Thomas Erl,John Evdemon,Diane Jordan,Khanderao Kand,Dieter König,Simon Moser,Ralph Stout,Ron Ten-Hove,Ivana Trickovic,Danny van der Rijn,Alex Yiu +12 more
TL;DR: The continuity of the basic conceptual model between Abstract and Executable Processes in WSBPEL makes it possible to export and import the public aspects embodied in Abstract Processes as process or role templates while maintaining the intent and structure of the observable behavior.
Proceedings Article
Relational Databases for Querying XML Documents: Limitations and Opportunities
Jayavel Shanmugasundaram,Kristin Tufte,Chun Zhang,Gang He,David J. DeWitt,Jeffrey F. Naughton +5 more
TL;DR: It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases.
Journal Article
The Open Provenance Model core specification (v1.1)
Luc Moreau,Ben Clifford,Juliana Freire,Joe Futrelle,Yolanda Gil,Paul Groth,Natalia Kwasnikowska,Simon Miles,Paolo Missier,James D. Myers,Beth Plale,Yogesh Simmhan,Eric G. Stephan,Jan Van den Bussche +13 more
TL;DR: This document contains the specification of the Open Provenance Model (v1.1) resulting from a community effort to achieve inter-operability in the Provenance Challenge series.
Journal Article
The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud
Katherine Wolstencroft,Robert Haines,Donal Fellows,Alan Williams,David Withers,Stuart Owen,Stian Soiland-Reyes,Ian Dunlop,Aleksandra Nenadic,Paul R. Fisher,Jiten Bhagat,Khalid Belhajjame,Finn Bacall,Alex Hardisty,Abraham Nieva de la Hidalga,Maria Paula Balcázar Vargas,Shoaib Sufi,Carole Goble +17 more
TL;DR: An update to the Taverna tool suite is provided, highlighting new features and developments in the workbench and the Taverna Server.