Topic

Metadata repository

About: Metadata repository is a research topic. Over its lifetime, 5,841 publications have been published within this topic, receiving 121,778 citations.


Papers
Posted Content
TL;DR: This paper describes a system for automatically classifying web sites into industry categories and presents performance results for different combinations of text features and training data. The system can serve as the basis for a generalized framework for automated metadata creation.
Abstract: In this paper we discuss several issues related to automated text classification of web sites. We analyze the nature of web content and metadata in relation to requirements for text features. We find that HTML metatags are a good source of text features, but are not in wide use despite their role in search engine rankings. We present an approach for targeted spidering including metadata extraction and opportunistic crawling of specific semantic hyperlinks. We describe a system for automatically classifying web sites into industry categories and present performance results based on different combinations of text features and training data. This system can serve as the basis for a generalized framework for automated metadata creation.

91 citations
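The abstract above names HTML metatags as a good source of text features. A minimal sketch of that extraction step follows, using only Python's standard library. The class name, the function name, and the choice of which metatag names to keep are assumptions made for illustration; the paper's actual spider and classifier are not reproduced here.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects the <title> text and selected <meta name=...> contents."""

    # Which metatag names to keep is an assumption made for this sketch.
    WANTED = {"description", "keywords"}

    def __init__(self):
        super().__init__()
        self.features = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Attribute values can be None (e.g. <meta name>), so default to "".
            attr = {key: (value or "") for key, value in attrs}
            name = attr.get("name", "").lower()
            if name in self.WANTED:
                self.features[name] = attr.get("content", "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.features["title"] = self.features.get("title", "") + data.strip()


def extract_features(html):
    """Return a dict of text features found in one HTML document."""
    parser = MetaTagExtractor()
    parser.feed(html)
    return parser.features
```

The resulting dictionary could be fed to any text classifier; pages without metatags simply yield fewer keys, which matches the abstract's remark that metatags are not in wide use.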

Journal ArticleDOI
06 Nov 2017
TL;DR: A key element of this work is the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations, which was agreed upon by two teams and is presented in this perspective paper.
Abstract: With big-data driven materials research, the new paradigm of materials science, sharing and wide accessibility of data are becoming crucial aspects. Obviously, a prerequisite for data exchange and big-data analytics is standardization, which means using consistent and unique conventions for, e.g., units, zero base lines, and file formats. There are two main strategies to achieve this goal. One accepts the heterogeneous nature of the community, which comprises scientists from physics, chemistry, bio-physics, and materials science, by complying with the diverse ecosystem of computer codes and thus develops “converters” for the input and output files of all important codes. These converters then translate the data of each code into a standardized, code-independent format. The other strategy is to provide standardized open libraries that code developers can adopt for shaping their inputs, outputs, and restart files, directly into the same code-independent format. In this perspective paper, we present both strategies and argue that they can and should be regarded as complementary, if not even synergetic. The represented appropriate format and conventions were agreed upon by two teams, the Electronic Structure Library (ESL) of the European Center for Atomic and Molecular Computations (CECAM) and the NOvel MAterials Discovery (NOMAD) Laboratory, a European Centre of Excellence (CoE). A key element of this work is the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations.

91 citations
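The "converter" strategy in the abstract above can be illustrated with a small sketch: each code gets its own converter, and all converters emit the same hierarchical, code-independent record with one unit convention. The two program names, their raw output keys, the record layout, and the choice of eV as the common unit are hypothetical; they are not the format agreed upon by ESL and NOMAD.

```python
HARTREE_TO_EV = 27.211386245988  # CODATA 2018 Hartree energy in eV


def _record(program, energy_ev):
    """Build the shared hierarchical record (layout is an assumption)."""
    return {
        "run": {
            "program": {"name": program},
            "calculation": {"energy_total": {"value": energy_ev, "unit": "eV"}},
        }
    }


def convert_code_a(raw):
    # Hypothetical code A reports its total energy in Hartree under "etot".
    return _record("code_a", raw["etot"] * HARTREE_TO_EV)


def convert_code_b(raw):
    # Hypothetical code B already reports eV under "total_energy_ev".
    return _record("code_b", raw["total_energy_ev"])


CONVERTERS = {"code_a": convert_code_a, "code_b": convert_code_b}


def to_common_format(program, raw):
    """Dispatch to the converter registered for the given program."""
    return CONVERTERS[program](raw)
```

Under the second strategy described in the abstract, a code would call a shared library to write such a record directly, so no per-code converter would be needed.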

Patent
13 Oct 2006
TL;DR: A method for displaying and capturing document metadata within result presentations in information access or search systems, in which a metadata server stores metadata associated with any searchable document.
Abstract: In a method for displaying and capturing metadata of documents within result presentations in information access or search systems, a metadata server is used for storing metadata associated with any searchable document, and end users are given the opportunity to view and edit metadata associated with documents returned from the metadata server, which is capable of automatically creating metadata objects associated with any combination of document query and document position in a result set for a given query. A search engine capable of implementing the method comprises a metadata server as part of or connected with its core search engine.

90 citations
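One possible reading of the metadata server in the abstract above is a store that creates a metadata object on first access for a given document, query, and result position, and lets users edit it afterwards. The sketch below follows that reading; the class, its method names, and the record fields are assumptions for illustration and are not taken from the patent.

```python
class MetadataServer:
    """In-memory stand-in for a metadata server (illustrative only)."""

    def __init__(self):
        # Key: (doc_id, query, position) -> editable metadata dict.
        self._store = {}

    def get(self, doc_id, query, position):
        """Return the metadata object, creating it automatically if absent."""
        key = (doc_id, query, position)
        return self._store.setdefault(key, {"doc_id": doc_id, "tags": []})

    def edit(self, doc_id, query, position, **fields):
        """Apply user edits to the metadata object and return it."""
        record = self.get(doc_id, query, position)
        record.update(fields)
        return record
```

A search front end could call `get` while rendering each hit and `edit` when a user changes a field in the result presentation.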

Patent
25 Apr 2002
TL;DR: A data repository abstraction layer provides a logical view of the underlying data repository that is independent of the particular manner of data representation; a query abstraction layer is built on top of it.
Abstract: The present invention generally is directed to a system, method and article of manufacture for accessing data independent of the particular manner in which the data is physically represented. In one embodiment, a data repository abstraction layer provides a logical view of the underlying data repository that is independent of the particular manner of data representation. In one embodiment, the data repository abstraction layer specifies a location of data in a repository and a method for accessing the data. A query abstraction layer is also provided and is based on the data repository abstraction layer. A runtime component performs translation of an abstract query into a form that can be used against a particular physical data representation.

90 citations
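The abstract above describes logical fields that map to physical locations, plus a runtime component that translates an abstract query into a concrete one. A minimal sketch of that idea, assuming a relational backend and a single table, might look as follows. The field names, table and column names, and the `translate` function are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Physical location of one logical field (assumed: a table column)."""
    table: str
    column: str


# Hypothetical abstraction layer: logical field name -> physical location.
ABSTRACTION_LAYER = {
    "name": FieldSpec("person", "full_nm"),
    "age": FieldSpec("person", "age_yrs"),
}

ALLOWED_OPERATORS = {"=", "!=", "<", "<=", ">", ">="}


def translate(select, where=()):
    """Translate an abstract query into parameterized SQL plus its parameters.

    `select` is a list of logical field names; `where` is a sequence of
    (logical_field, operator, value) triples combined with AND.
    """
    columns = [ABSTRACTION_LAYER[field] for field in select]
    conditions = []
    for field, op, value in where:
        if op not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        conditions.append((ABSTRACTION_LAYER[field], op, value))
    tables = {s.table for s in columns} | {s.table for s, _, _ in conditions}
    if len(tables) != 1:
        raise NotImplementedError("joins are outside the scope of this sketch")
    table = tables.pop()
    sql = "SELECT " + ", ".join(f"{s.table}.{s.column}" for s in columns)
    sql += f" FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(
            f"{s.table}.{s.column} {op} ?" for s, op, _ in conditions
        )
    return sql, [value for _, _, value in conditions]
```

Callers only ever see the logical names, so renaming a physical column would require changing the mapping and nothing else.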

Proceedings ArticleDOI
21 Jun 2004
TL;DR: This paper presents a data model that can capture the complexity of the data publication and discovery process through the use of descriptive metadata, and identifies a set of interfaces and operations that need to be provided to support metadata management.
Abstract: Data sets being managed in grid environments today are growing at a rapid rate, expected to reach hundreds of petabytes in the near future. Managing such large data sets poses challenges for efficient data access, data publication and data discovery. In this paper we focus on the data publication and discovery process through the use of descriptive metadata. This metadata describes the properties of individual data items and collections. We discuss issues of metadata services in service-rich environments, such as the grid. We describe the requirements and the architecture for such services in the context of grid and the available grid services. We present a data model that can capture the complexity of the data publication and discovery process. Based on that model we identify a set of interfaces and operations that need to be provided to support metadata management. We present a particular implementation of a grid metadata service, basing it on existing grid services technologies. Finally we examine alternative implementations of that service.

90 citations
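The publication and discovery operations mentioned in the abstract above can be sketched as a tiny in-memory catalog in which data items carry descriptive attributes and may belong to collections. The class, its `publish` and `discover` methods, and the attribute names in the usage example are assumptions for illustration; they are not the interfaces defined in the paper.

```python
class MetadataCatalog:
    """Toy metadata service: publish items, then discover them by attribute."""

    def __init__(self):
        self._items = {}        # logical name -> attribute dict
        self._collections = {}  # collection name -> set of logical names

    def publish(self, name, attributes, collection=None):
        """Register a data item with its descriptive metadata."""
        self._items[name] = dict(attributes)
        if collection is not None:
            self._collections.setdefault(collection, set()).add(name)

    def discover(self, collection=None, **criteria):
        """Return sorted names of items whose attributes match all criteria."""
        if collection is None:
            candidates = self._items
        else:
            candidates = self._collections.get(collection, set())
        return sorted(
            name
            for name in candidates
            if all(self._items[name].get(k) == v for k, v in criteria.items())
        )
```

For example, after `catalog.publish("run42.dat", {"instrument": "ligo"}, collection="2004")`, a call to `catalog.discover(instrument="ligo")` would return that logical name without the caller knowing where the file is stored.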


Network Information
Related Topics (5)
Information system
107.5K papers, 1.8M citations
85% related
User interface
85.4K papers, 1.7M citations
81% related
Software
130.5K papers, 2M citations
80% related
Mobile computing
51.3K papers, 1M citations
80% related
Support vector machine
73.6K papers, 1.7M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    32
2022    79
2021    13
2020    11
2019    21
2018    24