Topic

Meta Data Services

About: Meta Data Services is a research topic. Over its lifetime, 2,564 publications have been published within this topic, receiving 40,102 citations.


Papers
Patent
Tong Li, Xin Sheng Mao, Jia Tan, Bo Yang
28 Apr 2014
TL;DR: A technique for deploying an application in a cloud computing environment includes collecting metadata and instructions on deploying the application, the metadata comprising service metadata, application metadata and topology metadata.
Abstract: A technique for deploying an application in a cloud computing environment includes: collecting, when a user is deploying an application, metadata and instructions on deploying the application, the metadata comprising service metadata, application metadata and topology metadata, wherein the service metadata comprise metadata on a service required for deploying the application, the application metadata comprise metadata on the application, and the topology metadata comprise metadata indicative of a relationship between the service and the application; and storing the collected metadata and instructions as a model for re-deploying the application.
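The stored model described in this abstract can be pictured as a small data structure. The following is a minimal sketch only, not the patent's implementation: every class and field name (ServiceMetadata, TopologyLink, artifact, and so on) is a hypothetical choice made for illustration, and JSON is assumed as the storage format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ServiceMetadata:
    """Metadata on a service required for deploying the application (hypothetical fields)."""
    name: str
    version: str


@dataclass
class ApplicationMetadata:
    """Metadata on the application itself (hypothetical fields)."""
    name: str
    artifact: str


@dataclass
class TopologyLink:
    """Relationship between the application and one required service (hypothetical fields)."""
    application: str
    service: str


@dataclass
class DeploymentModel:
    """Collected metadata plus deployment instructions, kept for re-deployment."""
    services: List[ServiceMetadata]
    application: ApplicationMetadata
    topology: List[TopologyLink]
    instructions: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recursively converts nested dataclasses to plain dicts.
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "DeploymentModel":
        data = json.loads(text)
        return cls(
            services=[ServiceMetadata(**s) for s in data["services"]],
            application=ApplicationMetadata(**data["application"]),
            topology=[TopologyLink(**t) for t in data["topology"]],
            instructions=list(data["instructions"]),
        )
```

Saving the output of to_json() at first deployment and loading it with from_json() later is one way to obtain "a model for re-deploying the application"; how the instructions are actually replayed is outside the scope of this sketch.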

22 citations

Proceedings Article
02 Sep 2013
TL;DR: Results indicate that Dryad's overall workflow builds metadata capital, with the total metadata reuse at 50% or greater for 8 of 12 metadata properties, and 5 of these 8 properties showing reuse at 80% or higher.
Abstract: This paper reports on a study exploring 'metadata capital' acquired via metadata reuse. Collaborative modeling and content analysis methods were used to study metadata capital in the Dryad data repository. A sample of 20 cases for two Dryad metadata workflows (Case A and Case B) consisting of 100 (60 metadata objects, 40 metadata activities) instantiations was analyzed. Results indicate that Dryad's overall workflow builds metadata capital, with the total metadata reuse at 50% or greater for 8 of 12 metadata properties, and 5 of these 8 properties showing reuse at 80% or higher. Metadata reuse is frequent for basic bibliographic properties (e.g., author, title, subject), although it is limited or absent for more complex scientific properties (e.g., taxonomic, spatial, and temporal information). This paper provides background context, reports the research approach and findings. Research implications and system design priorities that may contribute to metadata capital are also considered.

22 citations

Proceedings Article
01 Jan 2008
TL;DR: This paper presents how to compare the expressiveness and richness of a metadata schema for an application and to what extent music application domains and metadata schemas make use of the metadata field clusters.
Abstract: With the ever-increasing amount of digitized music becoming available, metadata is a key driver for different music related application domains. A service that combines different metadata sources should be aware of the existence of different schemas to store and exchange music metadata. The user of a metadata provider could benefit from knowledge about the metadata needs for different music application domains. In this paper, we present how we can compare the expressiveness and richness of a metadata schema for an application. To cope with different levels of granularity in metadata fields we defined clusters of semantically related metadata fields. Similarly, application domains were defined to tackle the fine-grained functionality space in music applications. Next, we show to what extent music application domains and metadata schemas make use of the metadata field clusters. Finally, we link the metadata schemas with the application domains. A decision table is presented that assists the user of a metadata provider in choosing the right metadata schema for his application.

22 citations

Journal ArticleDOI
TL;DR: The intuition that underpins cleaning by clustering is that dividing keys into different clusters resolves the scalability issues for data observation and cleaning, and keys in the same cluster with duplicates and errors can easily be found.
Abstract: The ability to efficiently search and filter datasets depends on access to high quality metadata. While most biomedical repositories require data submitters to provide a minimal set of metadata, some, such as the Gene Expression Omnibus (GEO), allow users to specify additional metadata in the form of textual key-value pairs (e.g. sex: female). However, since there is no structured vocabulary to guide the submitter regarding the metadata terms to use, the 44,000,000+ key-value pairs in GEO suffer from numerous quality issues including redundancy, heterogeneity, inconsistency, and incompleteness. Such issues hinder the ability of scientists to home in on datasets that meet their requirements and point to a need for accurate, structured and complete description of the data. In this study, we propose a clustering-based approach to address data quality issues in biomedical, specifically gene expression, metadata. First, we present three different kinds of similarity measures to compare metadata keys. Second, we design a scalable agglomerative clustering algorithm to cluster similar keys together. Our agglomerative cluster algorithm identified metadata keys that were similar to each other, based on (i) name, (ii) core concept and (iii) value similarities, and grouped them together. We evaluated our method using a manually created gold standard in which 359 keys were grouped into 27 clusters based on six types of characteristics: (i) age, (ii) cell line, (iii) disease, (iv) strain, (v) tissue and (vi) treatment. As a result, the algorithm generated 18 clusters containing 355 keys (four clusters with only one key were excluded). Within these 18 clusters, keys were correctly identified as related to their cluster, except for 13 keys that were not related to the cluster they were placed in. We compared our approach with four other published methods. Our approach significantly outperformed them for most metadata keys and achieved the best average F-Score (0.63). Our algorithm identified keys that were similar to each other and grouped them together. The intuition that underpins cleaning by clustering is that dividing keys into different clusters resolves the scalability issues for data observation and cleaning, and keys in the same cluster with duplicates and errors can easily be found. Our algorithm can also be applied to other biomedical data types.
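To make the idea of clustering similar metadata keys concrete, here is a minimal sketch of threshold-based agglomerative clustering on key names only. It is not the authors' algorithm: the abstract does not give their similarity formulas, so the name similarity below uses difflib.SequenceMatcher from the Python standard library as a stand-in, the single-linkage merge rule and the 0.7 threshold are assumptions, and the core-concept and value similarities are left out entirely.

```python
from difflib import SequenceMatcher
from typing import List


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure, an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_keys(keys: List[str], threshold: float = 0.7) -> List[List[str]]:
    """Greedy single-linkage agglomerative clustering of metadata keys.

    Starts with one cluster per key and repeatedly merges the two most
    similar clusters until no pair reaches the threshold.
    """
    clusters = [[k] for k in keys]
    while True:
        best = None  # (similarity, index_i, index_j) of the best pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster similarity is the best member pair.
                sim = max(
                    name_similarity(a, b)
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if sim >= threshold and (best is None or sim > best[0]):
                    best = (sim, i, j)
        if best is None:
            return clusters
        _, i, j = best
        # j > i, so popping j does not shift index i.
        clusters[i].extend(clusters.pop(j))


if __name__ == "__main__":
    for cluster in cluster_keys(["age", "Age", "tissue", "tissues", "Tissue", "strain"]):
        print(cluster)
```

This pairwise search is quadratic in the number of clusters per merge and is only meant for small illustrations; the paper describes a scalable algorithm, which this sketch does not attempt to reproduce.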

22 citations

Proceedings ArticleDOI
01 Jan 2009
TL;DR: The Telematikplattform für Medizinische Forschungsnetze, an umbrella organization for medical research in Germany, aims at supporting and improving this process with a metadata repository, covering the variables and value lists used in databases of registers and trials.
Abstract: The planning of case report forms (CRFs) in clinical trials or databases in registers is mostly an informal process starting from scratch involving domain experts, biometricians, and documentation specialists. The Telematikplattform für Medizinische Forschungsnetze, an umbrella organization for medical research in Germany, aims at supporting and improving this process with a metadata repository, covering the variables and value lists used in databases of registers and trials. The use cases for the metadata repository range from a specification of case report forms to the harmonization of variable collections, variables, and value lists through a formal review. The warehouse used for the storage of the metadata should at least fulfill the definition of part 3 "Registry metamodel and basic attributes" of ISO/IEC 11179 Information technology - Metadata registries. An implementation of the metadata repository should offer an import and export of metadata in the Operational Data Model standard of the Clinical Data Interchange Standards Consortium. It will facilitate the creation of CRFs and data models, improve the quality of CRFs and data models, support the harmonization of variables and value lists, and support the mapping of metadata and data.

22 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Metadata
43.9K papers, 642.7K citations
82% related
Web service
57.6K papers, 989K citations
80% related
Ontology (information science)
57K papers, 869.1K citations
78% related
User interface
85.4K papers, 1.7M citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    13
2022    61
2021    2
2020    2
2019    6
2018    8