
Showing papers on "Meta Data Services published in 2015"


Journal ArticleDOI
TL;DR: The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse.

90 citations


Patent
31 Mar 2015
TL;DR: Metadata management convergence platforms, systems, and methods are proposed to organize a community of users' data records by managing metadata records related to content housed in unique, disparate, or federated holdings in centralized or distributed environments, including vehicle fleet information systems, government document holdings, insurance and underwriting information holdings, academic library collections, and entertainment archives.
Abstract: Metadata management convergence platforms, systems, and methods organize a community of users' data records. More specifically, the methods manage metadata records related to content housed in unique, disparate, or federated holdings in centralized or distributed environments. Also described are systems and methods for creating and managing metadata records using domain-specific language, vocabulary, and metadata schemas accepted by a community of users of unique, disparate, or federated databases in centralized or distributed environments. Such environments can include content repositories including, but not limited to: vehicle fleet information systems; government document holdings; insurance and underwriting information holdings; academic library collections; and entertainment archives.

84 citations


Proceedings ArticleDOI
29 Oct 2015
TL;DR: The key insight is to collect and use more metadata about all elements of the analytic ecosystem by means of an architecture and user experience that reduce the cost of contributing such metadata.
Abstract: Open data analysis platforms are being adopted to support collaboration in science and business. Studies suggest that analytic work in an enterprise occurs in a complex ecosystem of people, data, and software working in a coordinated manner. These studies also point to friction between the elements of this ecosystem that reduces user productivity and quality of work. LabBook is an open, social, and collaborative data analysis platform designed explicitly to reduce this friction and accelerate discovery. Its goal is to help users leverage each other's knowledge and experience to find the data, tools and collaborators they need to integrate, visualize, and analyze data. The key insight is to collect and use more metadata about all elements of the analytic ecosystem by means of an architecture and user experience that reduce the cost of contributing such metadata. We demonstrate how metadata can be exploited to improve the collaborative user experience and facilitate collaborative data integration and recommendations. We describe a specific use case and discuss several design issues concerning the capture, representation, querying and use of metadata.

49 citations


Journal ArticleDOI
TL;DR: A natural language processing method is employed, namely Labeled Latent Dirichlet Allocation (LLDA), and a regression model is trained via a human participants experiment to address the topic heterogeneity brought by multiple metadata standards and the lack of established semantic search in Linked‐Data‐driven geoportals.
Abstract: Geoportals provide integrated access to geospatial resources, and enable both authorities and the general public to contribute and share data and services. An essential goal of geoportals is to facilitate the discovery of the available resources. This process relies heavily on the quality of metadata. While multiple metadata standards have been established, data contributors may adopt different standards when sharing their data via the same geoportal. This is especially the case for user-generated content, where various terms and topics can be introduced to describe similar datasets. While this heterogeneity provides a wealth of perspectives, it also complicates resource discovery. With the fast development of Semantic Web technologies, Linked-Data-driven portals are on the rise. Although these novel portals open up new ways of organizing metadata and retrieving resources, they lack effective semantic search methods. This paper addresses the two challenges discussed above, namely the topic heterogeneity brought by multiple metadata standards and the lack of established semantic search in Linked-Data-driven geoportals. To harmonize the metadata topics, we employ a natural language processing method, namely Labeled Latent Dirichlet Allocation (LLDA), and train it using standardized metadata from Data.gov. With respect to semantic search, we construct thematic and geographic matching features from the textual metadata descriptions, and train a regression model via a human participants experiment. We evaluate our methods by examining their performance in addressing the two issues. Finally, we implement a semantics-enabled and Linked-Data-driven prototypical geoportal using a sample dataset from Esri’s ArcGIS Online.
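The core idea of harmonizing heterogeneous metadata topics can be sketched in code. The paper uses LLDA trained on standardized Data.gov metadata; as a simpler stand-in (LLDA needs a dedicated implementation), the sketch below trains a supervised text classifier over TF-IDF features. The sample records and topic labels are illustrative assumptions only.

```python
# Stand-in for LLDA topic harmonization: a TF-IDF + logistic regression
# classifier that maps free-form metadata descriptions to standardized topics.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical standardized training metadata (description -> topic label).
train_texts = [
    "Mean annual precipitation and temperature grids for the continental US",
    "Road centerlines and highway network for the state of Washington",
    "Census block population counts and housing unit estimates",
]
train_labels = ["climatology", "transportation", "society"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

# A user-contributed record with non-standard wording gets a harmonized topic.
user_description = ["Mean annual rainfall and precipitation measurements contributed by volunteers"]
print(model.predict(user_description))  # likely ['climatology'], given the shared vocabulary
```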

47 citations



Journal ArticleDOI
01 Dec 2015
TL;DR: The metadata schema was extensively revised based on the evaluation results, and the new element definitions from the revised schema are presented in this article.
Abstract: Despite increasing interest in and acknowledgment of the significance of video games, current descriptive practices are not sufficiently robust to support searching, browsing, and other access behaviors from diverse user groups. To address this issue, the Game Metadata Research Group at the University of Washington Information School, in collaboration with the Seattle Interactive Media Museum, worked to create a standardized metadata schema. This metadata schema was empirically evaluated using multiple approaches: collaborative review, schema testing, semi-structured user interviews, and a large-scale survey. Reviewing and testing the schema revealed issues and challenges in sourcing the metadata for particular elements, determining the level of granularity for data description, and describing digitally distributed games. The findings from the user studies suggest that users value various subject and visual metadata, information about how games are related to each other, and data regarding game expansions/alterations such as additional content and networked features. The metadata schema was extensively revised based on the evaluation results, and we present the new element definitions from the revised schema in this article. This work will serve as a platform and catalyst for advances in the design and use of video game metadata.

37 citations


Patent
28 Jul 2015
TL;DR: A computing device receives from a peer device a set of metadata identifying media playback resources, determines one or more filters for the set, and selects metadata from the set based on those filters.
Abstract: A computing device operates to receive, from at least a first peer device, a set of metadata that includes one or more identifiers to media playback resources. The computing device operates to determine one or more filters for the set of metadata. Metadata from the set is selected based on the one or more filters. A search request for a media playback resource is provided to a network service based on the selected metadata.

34 citations


Journal ArticleDOI
TL;DR: The paper defines a metadata and data description format, called "Togo Metabolome Data" (TogoMD), with an ID system required for unique access to each level of the tree-structured metadata, such as study purpose, sample, analytical method, and data analysis.
Abstract: Metabolomics - technology for comprehensive detection of small molecules in an organism - lags behind the other "omics" in terms of publication and dissemination of experimental data. Among the reasons for this are the difficulty of precisely recording information about complicated analytical experiments (metadata), the existence of various databases with their own metadata descriptions, and the low reusability of published data, leaving submitters (the researchers who generate the data) insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called "Togo Metabolome Data" (TogoMD), with an ID system required for unique access to each level of the tree-structured metadata, such as study purpose, sample, analytical method, and data analysis. Separating the management of metadata from that of data, and permitting related information to be attached to the metadata, provides advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers' understanding and use of data but also submitters' motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitates the construction of novel databases by database developers. A permission system that allows publication of immature metadata, together with feedback from readers, also helps submitters improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata records for analyzed data obtained from 35 biological species are currently published. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/.

33 citations


Patent
27 Feb 2015
TL;DR: In this paper, the authors propose a platform for data management that leverages a metadata repository, which tracks and manages all aspects of the data lifecycle, including status information (load dates, quality exceptions, access rights, etc.), definitions (business meaning, technical formats, etc.).
Abstract: An analytical computing environment for large data sets comprises a software platform for data management. The platform provides various automation and self-service features to enable those users to rapidly provision and manage an agile analytics environment. The platform leverages a metadata repository, which tracks and manages all aspects of the data lifecycle. The repository maintains various types of platform metadata including, for example, status information (load dates, quality exceptions, access rights, etc.), definitions (business meaning, technical formats, etc.), lineage (data sources and processes creating a data set, etc.), and user data (user rights, access history, user comments, etc.). Within the platform, the metadata is integrated with all platform services, such as load processing, quality controls and system use. As the system is used, the metadata gets richer and more valuable, supporting additional automation and quality controls.

27 citations


Proceedings ArticleDOI
08 Sep 2015
TL;DR: This paper explores several alternative design strategies to efficiently support the execution of existing workflow engines across multi-site clouds, by reducing the cost of metadata operations in a 2-level metadata partitioning hierarchy that combines distribution and replication.
Abstract: With their globally distributed datacenters, clouds now provide an opportunity to run complex large-scale applications on dynamically provisioned, networked, and federated infrastructures. However, there is a lack of tools supporting data-intensive applications across geographically distributed sites. For instance, scientific workflows that handle many small files can easily saturate state-of-the-art distributed filesystems based on centralized metadata servers (e.g. HDFS, PVFS). In this paper, we explore several alternative design strategies to efficiently support the execution of existing workflow engines across multi-site clouds, by reducing the cost of metadata operations. These strategies leverage workflow semantics in a 2-level metadata partitioning hierarchy that combines distribution and replication. The system was validated on the Microsoft Azure cloud across 4 EU and US datacenters. The experiments were conducted on 128 nodes using synthetic benchmarks and real-life applications. We observe as much as a 28% gain in execution time for a parallel, geo-distributed real-world application (Montage) and up to 50% for a metadata-intensive synthetic benchmark, compared to a baseline centralized configuration.
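The combination of distribution and replication in a 2-level hierarchy can be illustrated with a small sketch: frequently shared workflow-level metadata is replicated to every site, while per-file metadata is hash-distributed across sites. The site names and the hot/cold split heuristic below are illustrative assumptions, not the paper's exact design.

```python
# Sketch of a 2-level metadata partitioning scheme: replicate workflow-level
# entries everywhere, hash-distribute per-file entries to one owning site.
import hashlib

SITES = ["eu-west", "eu-north", "us-east", "us-west"]  # 4 datacenters, as in the evaluation

def site_for(key: str) -> str:
    """Distribute per-file metadata by a stable hash of its key."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SITES[h % len(SITES)]

def placements(key: str, is_workflow_level: bool) -> list[str]:
    """Workflow-level entries are replicated to all sites; file entries go to one."""
    return list(SITES) if is_workflow_level else [site_for(key)]

print(placements("workflow/montage/stage2", True))        # replicated to all 4 sites
print(placements("file/montage/tile_0042.fits", False))   # single owning site
```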

21 citations


Journal ArticleDOI
TL;DR: Investigation of research journal publications as a potential source for identifying descriptive metadata about methods for research data indicates that journal articles provide rich descriptive content that can be sufficiently mapped to existing metadata standards with methods-related elements, resulting in a mapping of the data production process for a study.
Abstract: Understanding the methods and processes implemented by data producers to generate research data is essential for fostering data reuse. Yet, producing the metadata that describes these methods remains a time-intensive activity that data producers do not readily undertake. In particular, researchers in the long tail of science often lack the financial support or tools for metadata generation, thereby limiting future access and reuse of data produced. The present study investigates research journal publications as a potential source for identifying descriptive metadata about methods for research data. Initial results indicate that journal articles provide rich descriptive content that can be sufficiently mapped to existing metadata standards with methods-related elements, resulting in a mapping of the data production process for a study. This research has implications for enhancing the generation of robust metadata to support the curation of research data for new inquiry and innovation.

Patent
18 Dec 2015
TL;DR: In this paper, the metadata and hash values stored on the client device are moved from a metadata database to a resynchronization database, and the data in the metadata database is deleted.
Abstract: Resynchronization of folders shared among multiple client devices over a network is provided. Metadata and hash values stored on the client device are moved from a metadata database to a resynchronization database, and the data in the metadata database is deleted. Metadata is created for locally stored synchronized files. For each file, the created metadata is compared to the metadata stored in the resynchronization database. If the metadata matches, hash values are retrieved from the resynchronization database and stored with the created metadata in the metadata database. If the metadata does not match, hashes for the file are created and stored with the created metadata in the metadata database. A synchronization operation may be performed which consists of comparing the files stored on the client to the synchronized versions on a host server and updating or adding files that are not present or not up to date.
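The hash-reuse step described above is easy to sketch: for each local file, freshly created metadata (size, mtime) is compared with the entry saved in the resynchronization database; on a match the stored content hash is reused, otherwise the file is rehashed. The dict-based "databases" are an illustrative simplification.

```python
# Sketch of the resynchronization pass: reuse stored hashes when metadata
# matches, recompute them otherwise, and rebuild the metadata database.
import hashlib, os

def file_metadata(path: str) -> tuple[int, float]:
    st = os.stat(path)
    return (st.st_size, st.st_mtime)

def rehash(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def resynchronize(paths, resync_db):
    metadata_db = {}
    for p in paths:
        meta = file_metadata(p)
        saved = resync_db.get(p)
        if saved and saved["meta"] == meta:
            digest = saved["hash"]   # metadata matches: reuse the stored hash
        else:
            digest = rehash(p)       # mismatch or new file: recompute the hash
        metadata_db[p] = {"meta": meta, "hash": digest}
    return metadata_db
```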


Journal Article
TL;DR: This work proposes an application profile of the IEEE LOM standard with special focus on the field of distance learning, presents the proposed Educational Metadata Profile (EMP) together with its ontological binding (the EMP ontology), and evaluates this ontology model.
Abstract: Introduction Metadata are "machine-readable information about electronic resources or other things" (Berners-Lee, 1997) and are used to describe the features of a resource, making its management and retrieval easier. A set of metadata elements combined so as to serve a specific purpose constitutes a metadata schema. Although the adoption of a single metadata standard would assure reusability of resources and interoperability among applications, no metadata schema yet exists that is appropriate to fulfil the requirements and needs of every application. Some schemas focus on technical metadata, others on educational metadata, and still others on more specialized elements. When existing approaches are not sufficient to cover the special requirements of an institution or organization, the use of application profiles is suggested. According to Heery & Patel (2000), an application profile is an aggregation of metadata elements selected from one or more different schemas and combined into a new compound schema. Particularly in the case of educational resources, the set of metadata used to describe their characteristics should be able to capture their educational and pedagogical aspects. Therefore, apart from author, title or type--fields that are common in all metadata schemas--an educational metadata schema should also include information regarding the resource's particular learning type, its intended end users, the instructional context and more. A kind of educational resource that has been increasingly used by educational institutions in recent years is the Learning Object (LO). According to Nikolopoulos, Solomou, Pierrakeas & Kameas (2012), LOs are pieces of educational material that directly correlate the knowledge they convey with specific objectives (learning outcomes) of the learning process. But although LOs constitute a common trend in organizing educational material and have been utilized by many modern e-learning systems (Schreurs & Al-Zoubi, 2007), they cannot be used effectively because no metadata schema exists that is capable of capturing all of their characteristics. This insufficiency becomes even greater in the case of LOs designed for use in distance learning courses, where the proper handling and dissemination of LOs is crucial for the success of the learning process because, in most cases, contrary to what happens in a classroom, no human tutor is continuously available to monitor students' path or progress through the educational process. The Hellenic Open University (HOU) is a higher education institute specialized in distance and lifelong learning that for the last two years has sought to reorganize its material and to provide its students with advanced services for delivering knowledge. Such services require the consumption of adequately characterized LOs, using a metadata schema capable of capturing as many of their pedagogical aspects as possible, especially those considered important according to distance learning principles. To the best of our knowledge, no such schema or application profile exists that is able to satisfy these requirements. Consequently, through this work we propose an application profile of the IEEE LOM standard with special focus on the field of distance learning.
After reviewing existing approaches for describing educational resources, as well as several binding methods (section Background), we move on to the presentation of our proposed Educational Metadata Profile (EMP) in section EMP: New Elements and Modifications. Its ontological binding is given in the subsequent section (The EMP ontology), whereas section Evaluation of the EMP ontology presents an evaluation of this ontology model through its application to characterizing real LO instances. Conclusions follow in our last section. Background In the literature, several metadata standards and profiles have been proposed, each serving different purposes and needs. …

Journal ArticleDOI
TL;DR: This paper analyses the three most common types of change within metadata records as well as their subcategories and discusses the possible implications of such changes within and beyond the metadata records.
Abstract: Evolving user needs and relevance require continuous change and reform. A good digital collection has mechanisms to accommodate the differing uses being made of the digital library system. In a metadata management context, change could mean to transform, substitute, or make the content of a metadata record different from what it is or from what it would be if left alone. In light of the evolving compliance requirements, this paper analyses the three most common types of change within metadata records as well as their subcategories and discusses the possible implications of such changes within and beyond the metadata records.

Book ChapterDOI
14 Sep 2015
TL;DR: A robust multidimensional metadata quality evaluation model that measures metadata quality based on five metrics and by taking into account contextual parameters concerning metadata generation and use is proposed.
Abstract: Good-quality metadata records become a necessity given the large quantities of digital content available through digital repositories and the increasing number of web services that use this content. The context in which metadata are generated and used affects the problem in question, and therefore a flexible metadata quality evaluation model that can be easily and widely used has yet to be presented. This paper proposes a robust multidimensional metadata quality evaluation model that measures metadata quality based on five metrics, while taking into account contextual parameters concerning metadata generation and use. An implementation of this metadata quality evaluation model is presented and tested against a large number of real metadata records from the humanities domain and for different applications.
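A contextual, multi-metric quality score of this general shape can be sketched briefly. The abstract does not name the five metrics, so the sketch below scores just two commonly used ones (completeness and vocabulary conformance) and expresses context as metric weights; the fields, vocabulary, and weights are illustrative assumptions.

```python
# Sketch of a context-weighted metadata quality score over two example metrics.
REQUIRED_FIELDS = ["title", "creator", "date", "subject", "rights"]
SUBJECT_VOCAB = {"history", "literature", "archaeology"}  # toy controlled vocabulary

def completeness(record: dict) -> float:
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS)

def conformance(record: dict) -> float:
    subjects = record.get("subject", [])
    if not subjects:
        return 0.0
    return sum(1 for s in subjects if s in SUBJECT_VOCAB) / len(subjects)

def quality(record: dict, weights: dict) -> float:
    """Context (e.g. aggregation vs. preservation use) is expressed via weights."""
    score = weights["completeness"] * completeness(record)
    score += weights["conformance"] * conformance(record)
    return score / sum(weights.values())

rec = {"title": "Attic vase", "creator": "unknown", "subject": ["archaeology", "pottery"]}
print(quality(rec, {"completeness": 2.0, "conformance": 1.0}))
```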

Patent
27 Aug 2015
TL;DR: The authors present a system for data collection and processing in a network, including one or more sensors disposed in a network interface and configured to collect raw signal traffic data, where each sensor is further configured to parse the raw signal traffic data into network protocols; split the network protocols into content data and metadata; derive contextual metadata from the content data; compile the metadata and the derived metadata to produce anonymized metadata; encrypt the anonymized metadata; and transmit the encrypted anonymized metadata to a unified data server.
Abstract: Systems and methods for data collection and processing in a network, including one or more sensors disposed in a network interface and configured to collect raw signal traffic data, where each sensor is further configured to parse the raw signal traffic data into network protocols; split the network protocols into content data and metadata; derive contextual metadata from the content data; compile the metadata and the derived metadata to produce anonymized metadata; encrypt the anonymized metadata; and transmit the encrypted anonymized metadata to a unified data server.
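A toy sketch of that pipeline follows: split a parsed record into content and metadata, derive contextual metadata from the content, anonymize identifiers by hashing, then encrypt before transmission. Field names are illustrative assumptions; encryption uses the `cryptography` package's Fernet recipe.

```python
# Toy sensor pipeline: split, derive, anonymize, encrypt.
import hashlib, json
from cryptography.fernet import Fernet

def anonymize(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()[:16]

def process(record: dict, fernet: Fernet) -> bytes:
    content = record.pop("payload")               # split content from metadata
    derived = {"payload_bytes": len(content)}     # contextual metadata from content
    meta = {
        "src": anonymize(record["src_ip"]),       # anonymized identifiers
        "dst": anonymize(record["dst_ip"]),
        "protocol": record["protocol"],
        **derived,
    }
    return fernet.encrypt(json.dumps(meta).encode())  # ready for the unified server

fernet = Fernet(Fernet.generate_key())
token = process({"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9",
                 "protocol": "http", "payload": "GET /index.html"}, fernet)
print(len(token))
```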

Proceedings ArticleDOI
01 Jul 2015
TL;DR: The metadata quality metric helps collection administrators detect and improve weaknesses in their metadata, and helps harvesters locate the most problematic collections, in terms of metadata quality, and prompt their administrators to improve them.
Abstract: The quality of the data and metadata affects the interoperability of the collections and the quality of all processing. Our metadata quality metric helps collection administrators detect and improve the weaknesses of their metadata, and helps harvesters locate the most problematic collections, in terms of metadata quality, and prompt their administrators to improve them. We extended and used an adaptive quantitative metadata quality metric and a tool that implements it. For controlled values, the value distribution is considered; for free-text values, the length of the description. Moreover, we also consider additional information in the OAI-PMH XML responses that is not normally mapped to metadata elements but still contains metadata information, such as XML attributes. We used the tool to make quality observations, to examine collections for patterns and irregularities, and to produce appropriate advice for collection administrators. Some of these observations are demonstrated here. We compared the reported quality over a 3-year period to get a general quantitative and qualitative sense of the diversity in the record descriptions and of the changes in their quality during their lifetime. We verified the assumption that quality increases over time: usually by a tiny amount in every collection, and by a lot in a small number of collections. Also, the lower-quality collections are the ones that stop responding and vanish.
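The two scoring rules named in the abstract can be sketched directly: controlled fields scored by the diversity of their value distribution across a collection (here via normalized Shannon entropy), free-text fields by description length. The entropy choice, thresholds, and field examples are illustrative assumptions, not the paper's exact formulas.

```python
# Sketch: distribution-based scoring for controlled values, length-based
# scoring for free-text values.
import math
from collections import Counter

def controlled_score(values: list[str]) -> float:
    """Normalized entropy: 0 when one value dominates everywhere, 1 when uniform."""
    counts = Counter(values)
    if len(counts) <= 1:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))

def free_text_score(text: str, full_credit_at: int = 200) -> float:
    """Longer descriptions score higher, capped at a target length."""
    return min(len(text), full_credit_at) / full_credit_at

types = ["Text", "Text", "Image", "Text", "Sound"]  # a 'type' field over 5 records
print(controlled_score(types))
print(free_text_score("A short note."))
```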

Patent
19 Nov 2015
TL;DR: In this article, the authors present a system and method for metadata processing that can be used to encode an arbitrary number of security policies for code running on a stored-program processor, such that metadata is unbounded and software programmable to be applicable to a wide range of metadata processing policies.
Abstract: A system and method for metadata processing that can be used to encode an arbitrary number of security policies for code running on a stored-program processor. This disclosure adds metadata to every word in the system and adds a metadata processing unit that works in parallel with data flow to enforce an arbitrary set of policies, such that metadata is unbounded and software programmable to be applicable to a wide range of metadata processing policies. This instant disclosure is applicable to a wide range of uses including safety, security, and synchronization.
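The word-level tagging idea can be illustrated conceptually: every value carries a tag, and a software-defined policy is consulted alongside each operation. The taint-propagation policy shown is one classic example of such a policy; the hardware details of the patent are not modeled here.

```python
# Conceptual sketch of per-word metadata with a software-defined policy
# consulted on every operation (a taint-propagation policy as the example).
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    value: int
    tag: frozenset  # unbounded, software-defined metadata

def taint_policy(op: str, a: Word, b: Word) -> frozenset:
    """Example policy: the result of any op carries the union of input taints."""
    return a.tag | b.tag

def alu_add(a: Word, b: Word, policy=taint_policy) -> Word:
    return Word(a.value + b.value, policy("add", a, b))

secret = Word(42, frozenset({"SECRET"}))
public = Word(1, frozenset())
result = alu_add(secret, public)
assert "SECRET" in result.tag  # the policy propagated the label
```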

Book
08 Aug 2015
TL;DR: The theory provides the conceptual underpinnings for a new approach that moves away from expert-defined standardised metadata to a user-driven approach with users as metadata co-creators, changing the current focus on metadata simplicity and efficiency to one of metadata enriching, a continuous and evolving process of data linking.
Abstract: An Emergent Theory of Digital Library Metadata is a reaction to the current digital library landscape, which is being challenged by growing online collections and changing user expectations. The theory provides the conceptual underpinnings for a new approach that moves away from expert-defined standardised metadata to a user-driven approach with users as metadata co-creators. Moving away from definitive, authoritative metadata to a system that reflects the diversity of users' terminologies, it changes the current focus on metadata simplicity and efficiency to one of metadata enriching, which is a continuous and evolving process of data linking: from predefined description to information conceptualised, contextualised and filtered at the point of delivery. By presenting this shift, this book provides a coherent structure in which future technological developments can be considered.
* Metadata is valuable when continuously enriched by experts and users
* Metadata enriching results from ubiquitous linking
* Metadata is a resource that should be linked openly
* The power of metadata is unlocked when enriched metadata is filtered for users individually

Book ChapterDOI
31 May 2015
TL;DR: Roomba, a tool to validate, correct, and generate dataset metadata, is developed; the automatic corrections made by Roomba increase the overall quality of dataset metadata and highlight the need for manual effort to correct some important missing information.
Abstract: Linked Open Data (LOD) has emerged as one of the largest collections of interlinked datasets on the web. In order to benefit from this mine of data, one needs to access descriptive information about each dataset, or metadata. However, the heterogeneous nature of data sources reflects directly on the data quality, as these sources often contain inconsistent as well as misinterpreted and incomplete metadata information. Considering the significant variation in size, the languages used, and the freshness of the data, one realizes that finding useful datasets without prior knowledge is increasingly complicated. We have developed Roomba, a tool that enables users to validate, correct, and generate dataset metadata. In this paper, we present the results of running this tool on parts of the LOD cloud accessible via the datahub.io API. The results demonstrate that the general state of the datasets needs more attention, as most of them suffer from bad-quality metadata and lack some informative metrics that are needed to facilitate dataset search. We also show that the automatic corrections done by Roomba increase the overall quality of the dataset metadata, and we highlight the need for manual effort to correct some important missing information.
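The validate-and-correct pass can be sketched over CKAN-style package metadata of the kind the datahub.io API exposes. The field list and the default-filling rule below are illustrative assumptions, not Roomba's actual rule set.

```python
# Sketch of a validation/correction pass over CKAN-style dataset metadata.
INFORMATIVE_FIELDS = ["title", "notes", "license_id", "author", "tags", "resources"]

def validate(package: dict) -> list[str]:
    """Report missing or empty fields that hinder dataset search."""
    return [f for f in INFORMATIVE_FIELDS if not package.get(f)]

def correct(package: dict) -> dict:
    """Apply safe automatic corrections; the rest needs manual effort."""
    fixed = dict(package)
    if not fixed.get("license_id") and "license_title" in fixed:
        fixed["license_id"] = fixed["license_title"].lower().replace(" ", "-")
    return fixed

pkg = {"title": "City budgets", "notes": "", "license_title": "CC BY"}
print(validate(pkg))               # ['notes', 'license_id', 'author', 'tags', 'resources']
print(correct(pkg)["license_id"])  # 'cc-by'
```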

Journal ArticleDOI
TL;DR: A novel framework, 'MaaS—Metadata as a Service,' enables efficient retrieval of data from cloud data servers using metadata and outperforms existing methods in reducing latency during data retrieval.
Abstract: In the cloud era, as the data stored is enormous, efficient retrieval of data with reduced latency plays a major role. In the cloud, owing to the size of the stored data and the lack of locality information among the stored files, metadata is a suitable method of keeping track of the storage. This paper describes a novel framework for efficient retrieval of data from cloud data servers using metadata in a short amount of time. The performance of queries, driven by the availability of files for query processing, can be greatly improved by the efficient use and analysis of metadata. Hence this paper proposes a generic approach to using metadata in the cloud, named 'MaaS—Metadata as a Service.' The proposed approach exploits various methodologies to reduce latency during data retrieval. This paper investigates the issues of metadata creation, metadata management, and metadata analysis in a cloud environment for fast retrieval of data. A cloud Bloom filter, a probabilistic data structure used for efficient retrieval of metadata, is stored across various metadata servers dispersed geographically. We have implemented the model in a cloud environment, and the experimental results show that the methodology is efficient in increasing throughput and in handling a large number of queries with reduced latency. The efficacy of the approach is tested through experimental studies using the KDD Cup 2003 dataset. In the experimental results, the proposed 'MaaS' outperformed other existing methods.
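A minimal Bloom filter of the kind distributed across the metadata servers can be sketched in a few lines: a membership probe with no false negatives, so a server can be skipped cheaply when the probe says "absent". Sizes and hash count below are illustrative.

```python
# Minimal Bloom filter sketch for cheap metadata-presence probes.
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, key: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key: str):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_contains(self, key: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
bf.add("/data/user42/results.csv")
assert bf.maybe_contains("/data/user42/results.csv")  # no false negatives
print(bf.maybe_contains("/data/other/file.bin"))      # almost certainly False
```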

14 Jul 2015
TL;DR: This paper designs dynamically provisioned parallel TCP streams and a non-blocking concurrent in-memory cache to improve system performance, and implements these mechanisms in a cloud-hosted metadata retrieval, caching, and prefetching system called DLS.
Abstract: Due to the diverse approaches and characteristics of scientific research, valuable resources within most important scientific disciplines exist in heterogeneous storage systems in distributed environments, as they are often created and maintained by different information providers. Considering that more than half of all data access operations in traditional storage systems are metadata access operations, it is important to design a framework that can effectively query and access valuable metadata information in distributed environments. In this paper, we present a highly efficient caching and prefetching mechanism tailored to reduce metadata access latency and improve responsiveness in wide-area data transfers. We designed dynamically provisioned parallel TCP streams and a non-blocking concurrent in-memory cache to improve system performance. We have implemented these mechanisms in a cloud-hosted metadata retrieval, caching, and prefetching system called the directory listing service (DLS) and have evaluated its performance in both local-area and wide-area settings.
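The caching-plus-prefetching idea can be sketched as follows: serve a directory listing from an in-memory cache when possible, and concurrently prefetch sibling directories likely to be requested next. The `fetch_remote` function stands in for the wide-area metadata call and is a hypothetical placeholder, not DLS's API.

```python
# Sketch of cache-on-miss plus background prefetching of likely-next listings.
from concurrent.futures import ThreadPoolExecutor

cache: dict[str, list[str]] = {}
pool = ThreadPoolExecutor(max_workers=8)  # stands in for parallel streams

def fetch_remote(path: str) -> list[str]:
    """Placeholder for a remote, high-latency directory listing call."""
    return [f"{path}/entry{i}" for i in range(3)]

def _prefetch(path: str):
    if path not in cache:
        cache[path] = fetch_remote(path)

def listing(path: str, siblings: list[str]) -> list[str]:
    if path not in cache:                 # cache miss: fetch synchronously
        cache[path] = fetch_remote(path)
    for s in siblings:                    # prefetch without blocking the caller
        pool.submit(_prefetch, s)
    return cache[path]

print(listing("/archive/2015", siblings=["/archive/2014", "/archive/2016"]))
```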

Proceedings ArticleDOI
07 Sep 2015
TL;DR: This work shows how to use the different formats and types of metadata in order to validate the legal argument for relevant evidence in legal cases.
Abstract: Metadata is not visible when viewing data in a number of forms, such as a Word document or an image. It is, however, an important consideration in the discovery of information for use in digital forensic investigations. Different types of documents and files have a number of formats and types of metadata, which can be used to discover the properties of a file, document, or network activity. Moreover, metadata is useful in many circumstances, where it can provide evidence of collaboration between groups of people, because some of them are not aware of what type of information is stored within their documents. Thus, the digital forensics investigator can access this hidden document information. In legal cases, the identification of relevant digital evidence is crucial for supporting the case, verification, and the examination of existing legal argument forms. In this work, we show how to use the different formats and types of metadata in order to validate the legal argument for relevant evidence.
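As a small illustration of the hidden metadata an investigator can pull from a file without opening it in its native application: filesystem timestamps, size, and a content hash that anchors the item in an evidence record. Document-format internals (e.g. DOCX author fields or image EXIF) would need format-specific parsers and are not shown.

```python
# Sketch: collect filesystem metadata plus a content hash for an evidence record.
import hashlib, os, time

def evidence_record(path: str) -> dict:
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "path": path,
        "size_bytes": st.st_size,
        "modified": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(st.st_mtime)),
        "accessed": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(st.st_atime)),
        "sha256": digest,  # fixes the content at collection time
    }
```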

Proceedings ArticleDOI
01 Sep 2015
TL;DR: A new highly reliable policy called MAMS (multiple actives, multiple standbys) is introduced to ensure reliable metadata service in file systems; results confirm that the MAMS policy achieves faster transparent fault tolerance in different error scenarios with less influence on metadata operations.
Abstract: Most mass data processing applications nowadays need long, continuous, and uninterrupted data access. Parallel/distributed file systems often use multiple metadata servers to manage the global namespace and provide a reliability guarantee. With the rapid increase in data volume and system scale, the probability of hardware or software failures keeps increasing, which easily leads to multiple points of failure. Metadata service reliability has become a crucial issue, as it affects file and directory operations in the event of failures. Existing reliable metadata management mechanisms can provide fault tolerance but have disadvantages in system availability, state consistency, and performance overhead. This paper introduces a new highly reliable policy called MAMS (multiple actives, multiple standbys) to ensure the reliability of multiple metadata services in file systems. Different from traditional strategies, MAMS divides metadata servers into different replica groups and maintains more than one standby node for failover in each group. Combining a global view with distributed protocols, MAMS achieves automatic state transition and service takeover. We have implemented the MAMS policy in a prototype file system and conducted extensive tests to validate and evaluate it. The experimental results confirm that the MAMS policy can achieve faster transparent fault tolerance in different error scenarios with less influence on metadata operations. Compared with typical designs in the Hadoop Avatar, Hadoop HA, and Boom-FS file systems, the mean time to recovery (MTTR) with MAMS was reduced by 80.23%, 65.46%, and 28.13%, respectively.
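The replica-group idea can be sketched as a toy state machine: each group has one active and several standbys, and on failure detection the group promotes a standby automatically. Leader election, state replication, and the distributed protocols of the real system are not modeled.

```python
# Toy sketch of a MAMS-style replica group with automatic standby promotion.
class ReplicaGroup:
    def __init__(self, name: str, nodes: list[str]):
        self.name = name
        self.active = nodes[0]
        self.standbys = list(nodes[1:])  # more than one standby per group

    def on_failure(self, node: str) -> str:
        if node == self.active:
            if not self.standbys:
                raise RuntimeError(f"group {self.name}: no standby left")
            self.active = self.standbys.pop(0)  # transparent takeover
        elif node in self.standbys:
            self.standbys.remove(node)
        return self.active

group = ReplicaGroup("namespace-0", ["mds1", "mds2", "mds3"])
print(group.on_failure("mds1"))  # 'mds2' takes over the metadata service
```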

Proceedings ArticleDOI
01 Oct 2015
TL;DR: This paper presents an approach for creating semantic metadata from relational educational data in an ontology using D2RQ mapping files, so as to present meaningful information based on users' needs.
Abstract: Current techniques for retrieving content and usage information from educational data are based on keywords, including string combinations. This technique is limited in its ability to capture the learning conceptualization associated with the results. To address this issue, this paper presents an approach for creating semantic metadata from relational educational data in an ontology using D2RQ mapping files. The selected relational data are converted to the defined metadata template in the ontology, such as instances or data properties. Thereafter, the learning conceptualization stored in the ontology knowledge base enriches the metadata used to retrieve meaning-based information, which is then used for further analysis. The retrieval results are expected to present meaningful information based on users' needs.
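The paper does the relational-to-ontology mapping via D2RQ mapping files; the sketch below shows the same row-to-triple idea done directly with rdflib, which keeps the example self-contained. The table columns and the example.org namespace are illustrative assumptions.

```python
# Sketch: relational rows mapped to ontology instances and data properties
# (rdflib stands in for the D2RQ mapping step).
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/edu#")

rows = [  # stand-in for a relational 'course_activity' table
    {"student_id": "s01", "course": "databases", "score": 88},
    {"student_id": "s02", "course": "databases", "score": 73},
]

g = Graph()
g.bind("ex", EX)
for row in rows:
    enrolment = EX[f"enrolment-{row['student_id']}-{row['course']}"]
    g.add((enrolment, RDF.type, EX.Enrolment))             # ontology instance
    g.add((enrolment, EX.student, EX[row["student_id"]]))  # object property
    g.add((enrolment, EX.score, Literal(row["score"])))    # data property

print(g.serialize(format="turtle"))
```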

Journal ArticleDOI
06 Nov 2015
TL;DR: This paper presents the results of an exploratory analysis of the representation of dates in over 8 million metadata records from one of the largest digital aggregators, the Digital Public Library of America (DPLA), and compares them to EDTF specifications.
Abstract: Considering the value of dates in the life cycle of a digital resource, capturing and storing date metadata in a structured way can have a significant impact on information retrieval. There are a number of format conventions in common use for encoding date and time values; the Extended Date/Time Format (EDTF) is one of the most expressive. This paper presents results of an exploratory analysis of the representation of dates in over 8 million metadata records from one of the largest digital aggregators, the Digital Public Library of America (DPLA), and compares it to EDTF specifications. This benchmark study provides empirical data -- at both the individual provider level and the group level (content hubs or service hubs) -- about the overall level and patterns of application of date metadata in DPLA metadata records in relation to EDTF.
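The kind of pattern check used to compare date values against EDTF can be sketched with a few regular expressions covering some Level 0/1 patterns (year, year-month, full date, intervals, '?' uncertain / '~' approximate qualifiers). This is deliberately partial, not a full EDTF validator.

```python
# Sketch: test date strings against a handful of EDTF Level 0/1 patterns.
import re

EDTF_PATTERNS = [
    r"\d{4}",              # 1984
    r"\d{4}-\d{2}",        # 1984-05
    r"\d{4}-\d{2}-\d{2}",  # 1984-05-02
    r"\d{4}[?~]",          # 1984?  or 1984~
    r"\d{4}/\d{4}",        # 1984/1986 interval
]
EDTF_RE = re.compile("^(" + "|".join(EDTF_PATTERNS) + ")$")

def looks_like_edtf(value: str) -> bool:
    return bool(EDTF_RE.match(value.strip()))

samples = ["1923", "circa 1920", "1984~", "05/06/1999"]
print({s: looks_like_edtf(s) for s in samples})
# {'1923': True, 'circa 1920': False, '1984~': True, '05/06/1999': False}
```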

Patent
14 May 2015
TL;DR: In this paper, a communication data source file is parsed into conversation-specific files that include message content and metadata, and the message contents and metadata are displayed on a computing device operated by a reviewer.
Abstract: Systems and methods enable convenient and accurate searching, filtering, reviewing, and classification of electronic documents without the loss of metadata. A communication data source file is parsed into conversation-specific files that include message content and metadata. The message content and metadata are displayed on a computing device operated by a reviewer. To streamline the review process, the reviewer can filter display of the message content according to various metadata categories as well as search conversation-specific files using the metadata categories.
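The parsing step can be sketched with Python's standard mailbox module as the example communication source, grouping messages into conversation-specific records that keep both content and metadata. Using the normalized Subject line as the conversation key is a naive illustrative assumption (real threading would follow Message-ID/References headers), and single-part messages are assumed.

```python
# Sketch: parse an mbox source into conversation-specific records that
# preserve message metadata alongside content.
import mailbox
from collections import defaultdict

def split_conversations(mbox_path: str) -> dict[str, list[dict]]:
    conversations = defaultdict(list)
    for msg in mailbox.mbox(mbox_path):
        subject = (msg["Subject"] or "").removeprefix("Re: ").strip()
        conversations[subject].append({
            "from": msg["From"],       # metadata preserved for filtering/search
            "date": msg["Date"],
            "content": msg.get_payload(),  # single-part messages assumed
        })
    return conversations
```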

Journal ArticleDOI
TL;DR: An overview of types of metadata standards and schemas is presented, and the issues and challenges in metadata creation, management, interoperability, and resource discovery are discussed.
Abstract: Information resources are available in various kinds of media and forms. To describe them, a number of diverse metadata standards and schemas exist. Metadata is crucial for preservation and archiving, organisation, resource discovery, and information retrieval across platforms. As no single metadata standard can apply to all emerging media and document formats, a combination of them is used. In this context, the present paper presents an overview of types of metadata standards and schemas, and discusses the issues and challenges in metadata creation, management, interoperability, and resource discovery.

Proceedings Article
01 Sep 2015
TL;DR: Dublin Core remains the most commonly used metadata schema, followed by MARC 21, METS, and MODS; the number of repositories that use or provide metadata application profiles is 13, which the authors consider very low.
Abstract: Shows the results of a survey by questionnaire sent to the managers of 2,165 digital repositories registered at OpenDOAR. Its purpose was to identify the existence and use of application profiles and related metadata schemas. Of this total, 431 questionnaires were completed. The survey enabled the identification of metadata application profiles, as well as the schemas and metadata elements/properties used within these repositories. According to the results, the number of repositories that use or provide metadata application profiles is 13, which we consider very low. Dublin Core remains the most commonly used metadata schema, followed by MARC 21, METS, and MODS. The dataset that resulted from the survey is openly available at Repositori UM, the institutional repository of the University of Minho.