
Showing papers on "Metadata repository" published in 2012


Patent
20 Feb 2012
TL;DR: CDN content servers run a request identification and parsing process to locate object metadata and handle the request accordingly; where several types of metadata exist for an object, metadata in a configuration file is overridden by metadata in a response header or request string, with the request string taking precedence.
Abstract: To serve content through a content delivery network (CDN), the CDN must have some information about the identity, characteristics and state of its target objects. Such additional information is provided in the form of object metadata, which according to the invention can be located in the request string itself, in the response headers from the origin server, in a metadata configuration file distributed to CDN servers, or in a per-customer metadata configuration file. CDN content servers execute a request identification and parsing process to locate object metadata and to handle the request in accordance therewith. Where different types of metadata exist for a particular object, metadata in a configuration file is overridden by metadata in a response header or request string, with metadata in the request string taking precedence.

504 citations
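
The precedence rule in the abstract above (configuration file, then response header, then request string) can be sketched as a layered lookup. This is a minimal illustration, not the patented process: the function name `resolve_metadata` and the metadata keys (`ttl`, `cache`, `compress`) are hypothetical, and the per-customer configuration file is omitted for brevity.

```python
from collections import ChainMap


def resolve_metadata(config_file, response_headers, request_string):
    """Merge three metadata sources; the request string wins, then headers.

    ChainMap searches its maps left to right, so the first map that
    contains a key supplies the value.
    """
    return dict(ChainMap(request_string, response_headers, config_file))


# Hypothetical metadata for one object, from lowest to highest precedence.
config_file = {"ttl": "3600", "cache": "yes", "compress": "no"}
response_headers = {"ttl": "600", "compress": "yes"}
request_string = {"ttl": "60"}

effective = resolve_metadata(config_file, response_headers, request_string)
print(effective)
```

Here `ttl` comes from the request string, `compress` from the response header, and `cache` falls through to the configuration file.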


Patent
02 Mar 2012
TL;DR: In this paper, a metabase formed from metadata can be used for various data management operations, such as enhanced data management, enhanced data identification, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data.
Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.

238 citations


Patent
27 Mar 2012
TL;DR: A system and method for capturing objects and balancing system resources in a capture system are described: an object is captured, metadata associated with the object is generated, and the object and metadata are stored.
Abstract: A system and method for capturing objects and balancing system resources in a capture system are described. An object is captured, metadata associated with the object is generated, and the object and metadata are stored.

214 citations


Patent
05 Nov 2012
TL;DR: In this paper, a method and system for processing network metadata is described, where metadata may be processed by dynamically instantiated executable software modules which make policy-based decisions about the character of the network metadata and about presentation of the metadata to consumers.
Abstract: A method and system for processing network metadata is described. Network metadata may be processed by dynamically instantiated executable software modules which make policy-based decisions about the character of the network metadata and about presentation of the network metadata to consumers of the information carried by the network metadata. The network metadata may be type classified and each subclass within a type may be mapped to a definition by a unique fingerprint value. The fingerprint value may be used for matching the network metadata subclasses against relevant policies and transformation rules. For template-based network metadata such as NetFlow v9, an embodiment of the invention can constantly monitor network traffic for unknown templates, capture template definitions, and inform administrators about templates for which custom policies and conversion rules do not exist. Conversion modules can efficiently convert selected types and/or subclasses of network metadata into alternative metadata formats.

157 citations


Patent
25 Jul 2012
TL;DR: In this paper, a system and method are disclosed for controlling metadata associated with content on an electronic device that includes displaying interface screens for user entry of metadata control instructions, accepting user instructions, modifying metadata of applicable content, and associating the modified metadata with the applicable content.
Abstract: A system and method are disclosed for controlling metadata associated with content on an electronic device that includes displaying interface screens for user entry of metadata control instructions, accepting user instructions, modifying metadata of applicable content, and associating the modified metadata with the applicable content. The system can export and/or store the applicable content along with modified metadata. The system can automatically modify metadata according to one or more profiles. Relevant profiles can be determined based on the export mechanism, destination or type of content. The system can add watermarks to indicate metadata modification. The system can display metadata for user modification. The content can include photos, videos or other content. The system can display metadata and geolocation indicators for each content item that indicate whether that item has associated metadata and geolocation information. The user can selectively modify metadata of selected content.

118 citations


Patent
02 Feb 2012
TL;DR: In this paper, a system and methods are provided that enable a data and information repository with a semantic engine that enables users to easily capture information in various formats from various devices along with rich metadata relating to that information.
Abstract: System and methods are provided that enable a data and information repository with a semantic engine that enables users to easily capture information in various formats from various devices along with rich metadata relating to that information. The information repository can be configured to query the captured information and any metadata to extrapolate new meaning, including semantic meaning, and to perform various tasks, including but not limited to sharing of the information and metadata. In some embodiments, the information repository is configured to generate recommendations to users based on analysis of the captured information.

107 citations


Patent
03 Jul 2012
TL;DR: A method protects data in a first volume by storing a copy in a second volume, tracks changes between data locations in the two volumes using a delta marking stream (DMS), and mirrors metadata intended to be committed to the DMS at a third data protection appliance.
Abstract: In one aspect, a method includes providing data protection to data in a first volume at a first data protection appliance by storing a copy of the data in a second volume using a second data protection appliance, tracking changes between data locations in the first volume and the second volume using a delta marking stream (DMS) and receiving, at the first data protection appliance, metadata. The metadata is intended to be committed to the DMS. The method further includes mirroring the metadata at a third data protection appliance.

105 citations


Journal ArticleDOI
TL;DR: Six key recommendations for libraries and standards agencies are provided, including rising to the challenges and embracing the opportunities presented by current technological trends, adopting minimal requirements of Linked Data principles, developing ontologies, deciding on what needs to be retained from current library models, becoming part of the Linked Data cloud, and developing mixed-metadata approaches.
Abstract: Contemporary metadata principles and standards have tended to result in metadata that is document-centric rather than data-centric and human-readable rather than machine-processable. In order for libraries to create and harness shareable, mashable and re-usable metadata, a conceptual shift can be achieved by adjusting current library models such as Resource Description and Access (RDA) and Functional Requirements for Bibliographic Records (FRBR) to models based on Linked Data principles. In relation to technical formats, libraries can leapfrog to Linked Data technical formats such as the Resource Description Framework (RDF), without disrupting current library metadata operations. This paper provides six key recommendations for libraries and standards agencies. These include rising to the challenges and embracing the opportunities presented by current technological trends, adopting minimal requirements of Linked Data principles, developing ontologies, deciding on what needs to be retained from current library models, becoming part of the Linked Data cloud, and developing mixed-metadata (standards-based and socially-constructed) approaches. Finally, the paper concludes by identifying and discussing five major benefits of such metadata re-conceptualisation. The benefits include metadata openness and sharing, serendipitous discovery of information resources, identification of zeitgeist and emergent metadata, facet-based navigation and metadata enriched with links.

94 citations


Journal ArticleDOI
TL;DR: The paper shows that using metadata with the appropriate metadata architecture can yield considerable benefits for LOD publication and use, including improving findability, accessibility, storing, preservation, analysing, comparing, reproducing, finding inconsistencies, correct interpretation, visualizing, linking data, assessing and ranking the quality of data and avoiding unnecessary duplication of data.
Abstract: Public and private organizations increasingly release their data to gain benefits such as transparency and economic growth. The use of these open data can be supported and stimulated by providing considerable metadata (data about the data), including discovery, contextual and detailed metadata. In this paper we argue that metadata are key enablers for the effective use of Linked Open Data (LOD). We illustrate the potential of metadata by 1) presenting an overview of advantages and disadvantages of metadata derived from literature, 2) presenting metadata requirements for LOD architectures derived from literature, workshops and a questionnaire, 3) describing a LOD metadata architecture that meets the requirements and 4) showing examples of the application of this architecture in the ENGAGE project. The paper shows that using metadata with the appropriate metadata architecture can yield considerable benefits for LOD publication and use, including improving findability, accessibility, storing, preservation, analysing, comparing, reproducing, finding inconsistencies, correct interpretation, visualizing, linking data, assessing and ranking the quality of data and avoiding unnecessary duplication of data. The Common European Research Information Format (CERIF) can be used to build the metadata architecture and achieve the advantages.

87 citations


Book ChapterDOI
27 May 2012
TL;DR: In this article, the authors present a transparent and interactive methodology for ingesting, converting and linking cultural heritage metadata into Linked Data, which is designed to maintain the richness and detail of the original metadata.
Abstract: Within the cultural heritage field, proprietary metadata and vocabularies are being transformed into public Linked Data. These efforts have mostly been at the level of large-scale aggregators such as Europeana where the original data is abstracted to a common format and schema. Although this approach ensures a level of consistency and interoperability, the richness of the original data is lost in the process. In this paper, we present a transparent and interactive methodology for ingesting, converting and linking cultural heritage metadata into Linked Data. The methodology is designed to maintain the richness and detail of the original metadata. We introduce the XMLRDF conversion tool and describe how it is integrated in the ClioPatria semantic web toolkit. The methodology and the tools have been validated by converting the Amsterdam Museum metadata to a Linked Data version. In this way, the Amsterdam Museum became the first 'small' cultural heritage institution with a node in the Linked Data cloud.

78 citations


Proceedings Article
08 Jul 2012
TL;DR: A method to automatically label multi-lingual data with named entity tags is proposed and it is shown how to effectively combine the weak annotations stemming from Wikipedia metadata with information obtained through English-foreign language parallel Wikipedia sentences.
Abstract: In this paper we propose a method to automatically label multi-lingual data with named entity tags. We build on prior work utilizing Wikipedia metadata and show how to effectively combine the weak annotations stemming from Wikipedia metadata with information obtained through English-foreign language parallel Wikipedia sentences. The combination is achieved using a novel semi-CRF model for foreign sentence tagging in the context of a parallel English sentence. The model outperforms both standard annotation projection methods and methods based solely on Wikipedia metadata.

Patent
13 Sep 2012
TL;DR: A management console integrates disparate backup devices behind a unified interface, displays a hierarchical view of the client devices and/or their data, and uses a data structure with fields for both common metadata and value-added metadata to mine or process the data of the disparate client devices.
Abstract: Systems and methods integrate disparate backup devices with a unified interface. In certain examples, a management console manages data from various backup devices, while retaining such data in its native format. The management console can display a hierarchical view of the client devices and/or their data and can further provide utilities for processing the various data formats. A data structure including fields for storing both metadata common to the client device data and value-added metadata can be used to mine or process the data of the disparate client devices. The unified single platform and interface reduces the need for multiple data management products and/or customized data utilities for each individual client device and provides a single pane of glass view into data management operations. Integrating the various types of storage formats and media allows a user to retain existing storage infrastructures and further facilitates scaling to meet long-term management needs.

Patent
13 Sep 2012
TL;DR: A computer system such as a set-top box keeps a collection of metadata describing each file on its data storage medium in memory during operation, periodically stores a snapshot of that collection to the storage medium, and recovers the metadata from the snapshot after a reboot.
Abstract: Systems and methods allow for reliably and efficiently managing files stored on a data storage medium associated with a computer system such as a set-top box. The computer system manages a collection of metadata describing each of the files stored on the data storage medium in a memory during operation of the computer system. A current snapshot of the collection of metadata is periodically or otherwise stored to the data storage medium. Following a reboot of the computer system, the collection of metadata can be recovered to the memory from the snapshot of the collection of metadata stored on the data storage medium.
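
The snapshot-and-recover cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the class name `MetadataStore`, the JSON snapshot format, and the write-then-rename step are all choices made for this example.

```python
import json
import os
import tempfile


class MetadataStore:
    """Keeps file metadata in memory and snapshots it to a storage path."""

    def __init__(self, snapshot_path):
        self.snapshot_path = snapshot_path
        self.metadata = {}  # file name -> attribute dict, held in memory

    def update(self, filename, **attrs):
        self.metadata.setdefault(filename, {}).update(attrs)

    def snapshot(self):
        # Assumption: write to a temporary file and rename it, so a crash
        # mid-write does not leave a truncated snapshot behind.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.metadata, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)

    def recover(self):
        # After a reboot, reload the last snapshot into memory if one exists.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.metadata = json.load(f)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "metadata.snapshot")
    store = MetadataStore(path)
    store.update("recording-1.ts", size=1024, kind="video")
    store.snapshot()
    store.update("recording-2.ts", size=1)  # changed after the snapshot

    rebooted = MetadataStore(path)  # simulates a restart with empty memory
    rebooted.recover()
    recovered = rebooted.metadata
```

Only the state captured at the last snapshot survives the simulated reboot, which is why the snapshot interval bounds how much metadata can be lost.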

Journal ArticleDOI
01 Nov 2012
TL;DR: This work presents an interoperability solution for sharing data among heterogeneous data sources, and proposes a metadata management framework for medical multimedia content including X-ray, ECG, MRI, and ultrasound images.
Abstract: e-Health systems provide a collaborative platform for sharing patients' medical data typically stored in distributed autonomous healthcare data sources. Each autonomous source stores its medical and multimedia data without following any global structure. This causes heterogeneity in the underlying sources with respect to the data and storage structure. Therefore, a data interoperability mechanism is required for sharing the data among the heterogeneous sources. A proper metadata structure is also necessary to represent multimedia content in the sources to enable efficient query processing. Considering these needs, we present an interoperability solution for sharing data among heterogeneous data sources. We also propose a metadata management framework for medical multimedia content including X-ray, ECG, MRI, and ultrasound images. The framework identifies features, generates and represents metadata, and produces identifiers for the medical multimedia content to facilitate efficient query processing. The framework has been tested with various user queries and the accuracy of the query results evaluated by means of precision, recall, and user feedback methods. The results confirm the effectiveness of the proposed approach.

01 Jan 2012
TL;DR: An analysis of the adoption of metadata standards on the Web, based on a large crawl of the Web, looks at what forms of syntax and vocabularies publishers are using to mark up data inside HTML pages.
Abstract: We provide an analysis of the adoption of metadata standards on the Web based on a large crawl of the Web. In particular, we look at what forms of syntax and vocabularies publishers are using to mark up data inside HTML pages. We also describe the process that we have followed and the difficulties involved in web data extraction.

Patent
19 Jun 2012
TL;DR: A server aggregates webpage processing and rendering results reported by web browsers on different computing devices, generates metadata from the aggregate, and transmits it to at least one computing device, which renders a webpage using at least a portion of that metadata.
Abstract: Methods and devices include a server and at least two web browsers operable on at least two different computing devices. Each browser reports results of processing and rendering of webpages to the server. The server aggregates the data. The server generates metadata from the aggregated browsers. The server transmits the generated metadata to at least one computing device. The computing device renders a webpage using at least a portion of the provided metadata. The metadata may identify portions of JavaScript that can be processed in parallel. The metadata may identify a library portion that does not have to be loaded. The metadata may identify a portion of the webpage that may be rendered first before a second portion of the webpage. Returning metadata to the computing device can assist the computing device in parsing, analyzing or executing the request for the webpage.

Patent
Ruth M. Amaru, Joshua Fox, Benjamin Halberstadt, Boris Melamed, Zvi Schreiber
15 Mar 2012
TL;DR: A metadata management system for importing, integrating and federating metadata includes a configurable metamodel, a metadata repository whose structure reflects the metamodel, at least one external metadata source that persists metadata according to a meta-schema, a mapping module that maps the meta-schema to the metamodel, and a transformation module that translates metadata between the external source and the repository.
Abstract: A metadata management system for importing, integrating and federating metadata, including a configurable metamodel, a metadata repository for storing metadata whose structure reflects the metamodel, at least one external metadata source, which is able to persist metadata in accordance with the structure of a meta-schema, a mapping module for mapping the meta-schema to the metamodel, and a transformation module, operatively coupled to the metadata mapping module, for translating specific metadata from the at least one external metadata source to the metadata repository, for use in import, export or synchronization of metadata between the external metadata source and the metadata repository. A method and a computer-readable storage medium are also described.

Proceedings Article
01 May 2012
TL;DR: This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing.
Abstract: This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborates on the distinction between minimal and maximal versions thereof, briefly presents the integrated environment supporting the LRs description and search and retrieval processes and concludes with work to be done in the future for the improvement of the model.

Patent
17 Apr 2012
TL;DR: Metadata describing the artifacts of a computer application is received; a reference in that metadata is used to determine at least one service the application is configured to access, and the artifacts of both the service and the application are selected for installation.
Abstract: In one aspect, a metadata of an application is received. The metadata describes a number of artifacts of the computer application. Based on a reference in the application metadata, at least one service that the application is configured to access is determined. In another aspect, additional metadata describing artifacts associated with the at least one service are identified. The artifacts associated with the at least one service and the artifacts of the computer application are selected for installation of the computer application.

Patent
Yohsuke Ishii, Shoji Kodama
13 Jul 2012
TL;DR: A search device extracts an alias or a metadata name from a search request, converts the alias to a metadata name using metadata schema management information, and then derives the index field name from the metadata name using schema mapping management information.
Abstract: A search device receives a search request, extracts at least one of an alias or a metadata name from the search request, converts the alias to a metadata name by referring to metadata schema management information for managing in an inclusive manner a namespace alias and a metadata name for the retrieval device to identify a metadata schema definition defining the structure of a retrieval-target file that includes metadata, and specifies a field name from the metadata name by referring to schema mapping management information for managing the corresponding relationship between a metadata name of metadata schema definition information and a field name of the retrieval index schema definition.
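
The two lookups in the abstract above (alias to metadata name, then metadata name to index field name) can be sketched with two dictionaries. The table contents and the function name `resolve_field` are hypothetical, chosen only for illustration; the Dublin Core namespace is used as a familiar example and does not come from the patent.

```python
# Hypothetical "metadata schema management information": alias -> namespace.
SCHEMA_ALIASES = {
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical "schema mapping management information":
# full metadata name -> field name in the retrieval index.
FIELD_MAP = {
    "http://purl.org/dc/elements/1.1/title": "dc_title_s",
    "http://purl.org/dc/elements/1.1/creator": "dc_creator_s",
}


def resolve_field(term):
    """Return the index field name for an aliased or full metadata name."""
    prefix, separator, local_name = term.partition(":")
    if separator and prefix in SCHEMA_ALIASES:
        # Step 1: expand the alias into the full metadata name.
        metadata_name = SCHEMA_ALIASES[prefix] + local_name
    else:
        # The term is already a full metadata name.
        metadata_name = term
    # Step 2: map the metadata name to the index field name.
    # An unknown name raises KeyError.
    return FIELD_MAP[metadata_name]


aliased = resolve_field("dc:title")
full = resolve_field("http://purl.org/dc/elements/1.1/creator")
```

Keeping the two tables separate means a new alias can be registered without touching the index schema mapping, and vice versa.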

Patent
19 Jun 2012
TL;DR: A computer program product, method, and system migrate metadata and storage management of data from a first storage environment to a second: the first metadata is copied into and incorporated with the second metadata, then modified to indicate the first server information that a second server uses to reach the migrated data, which remains on the first storage media.
Abstract: Provided are a computer program product, method, and system for migration of metadata and storage management of data in a first storage environment to a second storage environment. A migration request is processed to migrate metadata and storage management of data in a first storage environment to a second storage environment. First metadata for the first storage environment is copied to the second storage environment to incorporate with second metadata. The first metadata incorporated into the second metadata is modified to indicate first server information used by a second server to communicate with a first server to access the migrated data from the first storage media. The migration request is completed in response to incorporating the first metadata into the second metadata, wherein the first data objects remain in the first storage media after completing the migration request.

Proceedings ArticleDOI
01 Dec 2012
TL;DR: The paper shows that state-of-the-art image file forensic methods can realistically be fooled using available software tools, and introduces the analysis of ordered data structures, illustrated on JPEG file formats and the EXIF metadata format, as a countermeasure.
Abstract: JPEG file format standards define only a limited number of mandatory data structures and leave room for interpretation. Differences between implementations employed in digital cameras, image processing software, and software to edit metadata provide valuable clues for basic authentication of digital images. We show that there exists a realistic chance to fool state-of-the-art image file forensic methods using available software tools and introduce the analysis of ordered data structures on the example of JPEG file formats and the EXIF metadata format as countermeasure. The proposed analysis approach enables basic investigations of image authenticity and documents a much better trustworthiness of EXIF metadata than commonly accepted. Manipulations created with the renowned metadata editor ExifTool and various image processing software can be reliably detected. Analysing the sequence of elements in complex data structures is not limited to JPEG files and might be a general principle applicable to different multimedia formats.

Patent
04 May 2012
TL;DR: In this article, the metadata associated with data stored in a non-relational database is generated, and a query is executed using the generated metadata to generate a metadata result set.
Abstract: A method, computer program product, and computer system for a database system and method. In some embodiments, metadata associated with data stored in a non-relational database is generated. The metadata is based upon, at least in part, at least one of a location of the data, a state of data, and the data. The metadata is stored in a data structure in memory. A query for data stored in the non-relational database is received. The query is executed using the generated metadata to generate a metadata result set. A result set including data in the non-relational database is generated using the generated metadata result set.

Patent
08 Jun 2012
TL;DR: A hybrid data management/storage system combines two or more integrated or connected data management systems behind a unified interface, directs incoming raw data to one of them based on predefined characteristics of the data object such as size and/or data type, and may store metadata for all incoming objects in one particular data store.
Abstract: A hybrid data management/storage system is provided which includes two or more integrated or connected data management systems. An external application and/or user interacts with the hybrid data management/storage system using a unified interface. Incoming raw data may be directed to be stored in any of a plurality of data management systems based on the incoming data object having one or more of a number of predefined characteristics, including for example size and/or data type. Metadata corresponding to all incoming data objects may be stored in a particular data store, regardless of whether the incoming object's raw data is stored in a different one of the plurality of data stores.
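
The routing idea above can be sketched with in-memory dictionaries standing in for the underlying data management systems. Everything named here is an assumption for illustration: the store names, the 1 MB threshold, and the rule that video goes to the blob store do not come from the patent.

```python
LARGE_OBJECT_BYTES = 1_000_000  # hypothetical size threshold


def choose_store(size_bytes, data_type):
    """Pick a backing store from predefined characteristics of the object."""
    if data_type.startswith("video/") or size_bytes >= LARGE_OBJECT_BYTES:
        return "blob_store"
    return "document_store"


class HybridStore:
    """Unified interface over several stores plus one metadata store."""

    def __init__(self):
        self.stores = {"blob_store": {}, "document_store": {}}
        self.metadata = {}  # metadata for every object lives in one place

    def put(self, key, raw, data_type):
        target = choose_store(len(raw), data_type)
        self.stores[target][key] = raw
        self.metadata[key] = {
            "store": target,
            "size": len(raw),
            "type": data_type,
        }
        return target

    def get(self, key):
        # The metadata record says which store holds the raw data.
        return self.stores[self.metadata[key]["store"]][key]


hybrid = HybridStore()
small_target = hybrid.put("note-1", b'{"text": "hello"}', "application/json")
video_target = hybrid.put("clip-1", b"\x00" * 10, "video/mp4")
```

Callers use only `put` and `get`; where the raw bytes end up is recorded in the shared metadata store rather than exposed through the interface.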

12 Nov 2012
TL;DR: A novel approach of characterization and extraction of semantic metadata through the analysis of sensor data raw observations is proposed, using approximations to represent the raw sensor measurements, and building a classification scheme to automatically infer sensor metadata like the type of observed property.
Abstract: Sensor network deployments have become a primary source of big data about the real world that surrounds us, measuring a wide range of physical properties in real time. With such large amounts of heterogeneous data, a key challenge is to describe and annotate sensor data with high-level metadata, using and extending models, for instance with ontologies. However, to automate this task there is a need for enriching the sensor metadata using the actual observed measurements and extracting useful meta-information from them. This paper proposes a novel approach of characterization and extraction of semantic metadata through the analysis of sensor data raw observations. This approach consists in using approximations to represent the raw sensor measurements, based on distributions of the observation slopes, building a classification scheme to automatically infer sensor metadata like the type of observed property, integrating the semantic analysis results with existing sensor networks metadata.

Patent
22 Jun 2012
TL;DR: In this paper, real-time metadata tracks recorded to media streams allow search and analysis operations in a variety of contexts and allow insertion of content specific advertising during appropriate portions of a media stream based on the content of the metadata tracks.
Abstract: Real-time metadata tracks recorded to media streams allow search and analysis operations in a variety of contexts. Search queries can be performed using information in real-time metadata tracks such as closed captioning, sub-title, statistical tracks, miscellaneous data tracks. Media streams can also be augmented with additional tracks. The metadata tracks not only allow efficient searching and indexing, but also allow insertion of content specific advertising during appropriate portions of a media stream based on the content of the metadata tracks.

Patent
24 Feb 2012
TL;DR: A system catalogs content metadata from a variety of sources and provides metadata to client devices by assigning confidence scores to metadata fields from each data record and using these scores to select the metadata that is transmitted to the client device.
Abstract: Systems and methods are provided for cataloging content metadata from a variety of sources and providing metadata to client devices. A processing device receives inconsistent data records representative of a common content element, with different values for a metadata field descriptive of a common attribute of the content element. The processor assigns confidence scores to metadata fields from each data record and uses these confidence scores to select the metadata that is transmitted to the client device.
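
Selecting among inconsistent records by confidence score can be sketched as below. How the scores are assigned is not specified in the abstract, so this example assumes a hypothetical lookup table keyed by (source, field); the source names, field names, and values are invented for illustration.

```python
def select_metadata(records, confidence):
    """For each field, keep the value whose (source, field) score is highest.

    Ties keep the value from the record seen first.
    """
    best = {}  # field -> (score, value)
    for record in records:
        for field, value in record["fields"].items():
            score = confidence.get((record["source"], field), 0.0)
            if field not in best or score > best[field][0]:
                best[field] = (score, value)
    return {field: value for field, (score, value) in best.items()}


# Two inconsistent records describing the same content element.
records = [
    {"source": "provider_feed",
     "fields": {"title": "Example Movie", "year": 2011}},
    {"source": "community_db",
     "fields": {"title": "Example Movie (film)", "year": 2012,
                "runtime_min": 95}},
]
confidence = {
    ("provider_feed", "title"): 0.9,
    ("provider_feed", "year"): 0.5,
    ("community_db", "title"): 0.4,
    ("community_db", "year"): 0.7,
    ("community_db", "runtime_min"): 0.6,
}

selected = select_metadata(records, confidence)
```

Scoring per field rather than per record lets the result mix sources: here the title comes from one record and the year from the other.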

Patent
29 May 2012
TL;DR: The disclosure covers apparatus and methods for providing data integrity: sectors of data to be written to memory devices receive appended first metadata, in a particular format, that includes first integrity data; second integrity data for at least one sector is generated for second metadata in a second format; and third integrity data, including error data covering the second integrity data and that sector, is also placed in the second metadata.
Abstract: The present disclosure includes apparatus (e.g., computing systems, memory systems, controllers, etc.) and methods for providing data integrity. One or more methods can include, for example: receiving a number of sectors of data to be written to a number of memory devices; appending first metadata corresponding to the number of sectors and including first integrity data to the number of sectors, the first metadata has a particular format; generating second integrity data to be provided in second metadata, the second integrity data corresponding to at least one of the number of sectors (wherein the second metadata has a second format); and generating third integrity data to be provided in the second metadata, the third integrity data including error data corresponding to the second integrity data and the at least one of the number of sectors.

Patent
31 May 2012
TL;DR: In this article, a structured query language (SQL) compiler is used to identify a call to a table valued user defined function (TVUDF) within a SQL statement that includes an insert statement, identify metadata associated with the TVUDF, validate and resolve a subclass type of the TVUDF based on the metadata and the insert statement, and generate a data loading plan to retrieve and load data from an external data source into a table of a database based on the subclass type.
Abstract: Data loading with user defined functions is described in various implementations. An example system for data loading may include a structured query language (SQL) compiler to identify a call to a table valued user defined function (TVUDF) within a SQL statement that includes an insert statement; identify metadata associated with the TVUDF; validate and resolve a subclass type of the TVUDF based on the metadata and the insert statement; and generate a data loading plan to retrieve and load data from an external data source into a table of a database based on the subclass type of the TVUDF. The system may also include a data loading engine in the database to execute the data loading plan, the data loading plan including the TVUDF to retrieve data from the external data source, and load the retrieved data into the table of the database in accordance with the data loading plan.

Proceedings ArticleDOI
10 Jun 2012
TL;DR: This work proposes a novel approach for overview-first exploration of data collections based on user-selected metadata properties, and applies the method on real data sets from the earth observation community, showing its applicability and usefulness.
Abstract: Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements, or similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented data. We propose a novel approach for overview-first exploration of data collections based on user-selected metadata properties. In a 2D layout, entities of the selected property are laid out based on their similarity with respect to the underlying data content. The display is enhanced by compact summarizations of underlying data elements, and forms the basis for exploratory navigation of users in the data space. The approach is proposed as an interface for visual exploration, leading the user to discover interesting relationships between data items relying on content-based similarity between data items and their respective metadata labels. We apply the method on real data sets from the earth observation community, showing its applicability and usefulness.