Topic

Metadata repository

About: Metadata repository is a research topic. Over its lifetime, 5,841 publications have been published within this topic, receiving 121,778 citations.

Papers

Posted Content

HopsFS: Scaling Hierarchical File System Metadata Using NewSQL Databases

Salman Niazi¹, Mahmoud Ismail¹, Steffen Grohsschmiedt, Mikael Ronström², Seif Haridi¹, Jim Dowling¹

Royal Institute of Technology¹, Oracle Corporation²

06 Jun 2016 - arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: HopsFS is a next-generation distribution of the Hadoop Distributed File System (HDFS) that replaces HDFS's single-node in-memory metadata service with a distributed metadata service built on a NewSQL database, enabling clusters that are an order of magnitude larger and have higher throughput than HDFS.

Abstract: Recent improvements in both the performance and scalability of shared-nothing, transactional, in-memory NewSQL databases have reopened the research question of whether distributed metadata for hierarchical file systems can be managed using commodity databases. In this paper, we introduce HopsFS, a next generation distribution of the Hadoop Distributed File System (HDFS) that replaces HDFS' single node in-memory metadata service, with a distributed metadata service built on a NewSQL database. By removing the metadata bottleneck, HopsFS enables an order of magnitude larger and higher throughput clusters compared to HDFS. Metadata capacity has been increased to at least 37 times HDFS' capacity, and in experiments based on a workload trace from Spotify, we show that HopsFS supports 16 to 37 times the throughput of Apache HDFS. HopsFS also has lower latency for many concurrent clients, and no downtime during failover. Finally, as metadata is now stored in a commodity database, it can be safely extended and easily exported to external systems for online analysis and free-text search.
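
As a rough illustration of what it can mean to keep hierarchical file system metadata in database tables, the sketch below stores inodes as rows keyed by parent and name, and resolves a path one component at a time. This is not HopsFS code: the table layout, the helper names (`make_store`, `add_inode`, `resolve`), and the use of in-memory SQLite in place of a distributed NewSQL database are all assumptions made for illustration.

```python
import sqlite3

def make_store():
    # In-memory SQLite stands in for the NewSQL database (assumption).
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE inodes ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " is_dir INTEGER NOT NULL,"
        " UNIQUE (parent_id, name))"
    )
    # Root directory: id 1, with parent_id 0 meaning "no parent".
    db.execute("INSERT INTO inodes VALUES (1, 0, '', 1)")
    return db

def add_inode(db, parent_id, name, is_dir):
    """Insert one file or directory row and return its inode id."""
    cur = db.execute(
        "INSERT INTO inodes (parent_id, name, is_dir) VALUES (?, ?, ?)",
        (parent_id, name, int(is_dir)),
    )
    return cur.lastrowid

def resolve(db, path):
    """Return the inode id for an absolute path, or None if it is missing."""
    inode_id = 1  # start at the root
    for part in [p for p in path.split("/") if p]:
        row = db.execute(
            "SELECT id FROM inodes WHERE parent_id = ? AND name = ?",
            (inode_id, part),
        ).fetchone()
        if row is None:
            return None
        inode_id = row[0]
    return inode_id

db = make_store()
user_dir = add_inode(db, 1, "user", True)
logs_dir = add_inode(db, user_dir, "logs", True)
file_id = add_inode(db, logs_dir, "a.txt", False)
```

Because the metadata sits in ordinary tables in this sketch, an ad hoc query such as `SELECT COUNT(*) FROM inodes WHERE is_dir = 0` can be run directly against it, which is in the spirit of the abstract's point about exporting metadata for analysis.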

47 citations

Journal Article

Metadata Topic Harmonization and Semantic Search for Linked‐Data‐Driven Geoportals: A Case Study Using ArcGIS Online

Yingjie Hu¹, Krzysztof Janowicz¹, Sathya Prasad², Song Gao¹

University of California, Santa Barbara¹, Esri²

01 Jun 2015 - Transactions in GIS

TL;DR: A natural language processing method is employed, namely Labeled Latent Dirichlet Allocation (LLDA), and a regression model is trained via a human participants experiment to address the topic heterogeneity brought by multiple metadata standards and the lack of established semantic search in Linked‐Data‐driven geoportals.

Abstract: Geoportals provide integrated access to geospatial resources, and enable both authorities and the general public to contribute and share data and services. An essential goal of geoportals is to facilitate the discovery of the available resources. This process relies heavily on the quality of metadata. While multiple metadata standards have been established, data contributors may adopt different standards when sharing their data via the same geoportal. This is especially the case for user-generated content, where various terms and topics can be introduced to describe similar datasets. While this heterogeneity provides a wealth of perspectives, it also complicates resource discovery. With the fast development of Semantic Web technologies, there is a rise of Linked-Data-driven portals. Although these novel portals open up new ways of organizing metadata and retrieving resources, they lack effective semantic search methods. This paper addresses the two challenges discussed above, namely the topic heterogeneity brought by multiple metadata standards as well as the lack of established semantic search in Linked-Data-driven geoportals. To harmonize the metadata topics, we employ a natural language processing method, namely Labeled Latent Dirichlet Allocation (LLDA), and train it using standardized metadata from Data.gov. With respect to semantic search, we construct thematic and geographic matching features from the textual metadata descriptions, and train a regression model via a human participants experiment. We evaluate our methods by examining their performance in addressing the two issues. Finally, we implement a semantics-enabled and Linked-Data-driven prototypical geoportal using a sample dataset from Esri’s ArcGIS Online.

47 citations

Patent

Interest-driven business intelligence systems and methods of data analysis using interest-driven data pipelines

John Glenn Eshleman, Benjamin Mark Werther, Kevin Scott Beyer, Brian Babcock, Yewei Zhang

28 Feb 2013

TL;DR: An interest-driven data pipeline is automatically compiled to generate reporting data from raw data, based upon reporting data requirements automatically derived from at least one report specification defined using the metadata.

Abstract: Interest-driven Business Intelligence (BI) systems in accordance with embodiments of the invention are illustrated. In one embodiment of the invention, a data processing system includes raw data storage containing raw data, metadata storage containing metadata that describes the raw data, and an interest-driven data pipeline that is automatically compiled to generate reporting data using the raw data, wherein the interest-driven data pipeline is compiled based upon reporting data requirements automatically derived from at least one report specification defined using the metadata.

46 citations

Patent

Data management server, data management system, data management method, and program

Gen Hamada¹, Kazuto Horimatsu¹

Sony Broadcast & Professional Research Laboratories¹

18 Apr 2011

TL;DR: A data management server is provided that is connectable to a plurality of content servers, which store content data and metadata that includes content data attribute information, and to a client device that acquires the content data based on the metadata.

Abstract: There is provided a data management server that is connectable to a plurality of content servers that store content data and metadata that includes content data attribute information and to a client device that acquires the content data based on the metadata. The data management server includes a data collection portion, a data processing portion, and a transmission portion. The data collection portion collects the metadata from each of the plurality of the content servers. The data processing portion hierarchically structures the metadata that the data collection portion collected, based on the attribute information that is included in the metadata. The transmission portion, in response to a request from the client device, transmits to the client device the metadata that was hierarchically structured by the data processing portion.
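
The collect, structure, and transmit flow in this abstract can be sketched in a few lines. The example below is only an illustration under assumptions: the record fields (`title`, `genre`, `year`), the server names, and the functions `collect` and `build_hierarchy` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

def collect(content_servers):
    """Gather metadata records from every content server into one list."""
    records = []
    for server_name, items in content_servers.items():
        for item in items:
            # Remember which server holds the content each record describes.
            records.append({**item, "server": server_name})
    return records

def build_hierarchy(records, attribute_order):
    """Nest records by successive attribute values, e.g. genre, then year."""
    if not attribute_order:
        return [record["title"] for record in records]
    head, rest = attribute_order[0], attribute_order[1:]
    groups = defaultdict(list)
    for record in records:
        groups[record.get(head, "unknown")].append(record)
    return {value: build_hierarchy(group, rest) for value, group in groups.items()}

# Hypothetical metadata held by two content servers.
servers = {
    "server_a": [{"title": "clip1", "genre": "news", "year": 2010}],
    "server_b": [
        {"title": "clip2", "genre": "news", "year": 2011},
        {"title": "clip3", "genre": "sports", "year": 2010},
    ],
}

# The structured result is what a server of this kind might return to a client.
tree = build_hierarchy(collect(servers), ["genre", "year"])
```

Here the attribute order is passed in by the caller; the abstract does not say how the hierarchy levels are chosen, so that is a design choice of the sketch.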

46 citations

Proceedings Article

Developing practical automatic metadata assignment and evaluation tools for internet resources

Gordon W. Paynter¹

University of California, Riverside¹

07 Jun 2005

TL;DR: The paper describes the form and function of common metadata fields, identifies appropriate performance measures for these fields, and measures the performance of the automatic metadata assignment tools in the iVia virtual library software.

Abstract: This paper describes the development of practical automatic metadata assignment tools to support automatic record creation for virtual libraries, metadata repositories and digital libraries, with particular reference to library-standard metadata. The development process is incremental in nature, and depends upon an automatic metadata evaluation tool to objectively measure its progress. The evaluation tool is based on and informed by the metadata created and maintained by librarian experts at the INFOMINE Project, and uses different metrics to evaluate different metadata fields. In this paper, we describe the form and function of common metadata fields, and identify appropriate performance measures for these fields. The automatic metadata assignment tools in the iVia virtual library software are described, and their performance is measured. Finally, we discuss the limitations of automatic metadata evaluation, and cases where we choose to ignore its evidence in favor of human judgment.
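
To make the idea of using different metrics for different metadata fields concrete, here is a minimal sketch that scores a single-valued field by normalized exact match and a multi-valued field by set-based precision and recall against an expert-created record. The abstract does not say which metrics the paper pairs with which fields, so the field names, the metric choices, and the sample records are assumptions.

```python
def exact_match(predicted, expected):
    """Score 1.0 when a single-valued field matches after normalization."""
    return float(predicted.strip().lower() == expected.strip().lower())

def precision_recall(predicted, expected):
    """Set-based precision and recall for a multi-valued field."""
    pred, gold = set(predicted), set(expected)
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

def evaluate(record, reference):
    """Compare an automatically assigned record with an expert record."""
    precision, recall = precision_recall(record["keywords"], reference["keywords"])
    return {
        "title": exact_match(record["title"], reference["title"]),
        "keywords_precision": precision,
        "keywords_recall": recall,
    }

# Hypothetical automatic output and expert-created reference record.
auto = {"title": "Soil Science Portal ", "keywords": ["soil", "agriculture", "maps"]}
expert = {
    "title": "soil science portal",
    "keywords": ["soil", "agriculture", "erosion", "geology"],
}
scores = evaluate(auto, expert)
```

In this example two of the three assigned keywords appear in the expert record, and two of the four expert keywords were found, so precision and recall differ even though the title matches.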

46 citations

Metrics

Papers: 5,953

Citations: 123,769

No. of papers in the topic in previous years
Year	Papers
2023	32
2022	79
2021	13
2020	11
2019	21
2018	24
