Showing papers on "Metadata repository" published in 2009

Patent•

Associating and linking compact disc metadata

Ted Dunning, Bradley D. Kindig, Sean Cornell Joshlin, Christopher P. Archibald

08 Jun 2009

TL;DR: Improved techniques for enhancing, associating, and linking various sources of metadata for music files. They allow commercially generated metadata to be integrated with user-entered metadata and ensure that the metadata provided to the user is of the highest quality and accuracy available, even when it comes from disparate sources with different levels of credibility.

Abstract: Improved techniques for enhancing, associating, and linking various sources of metadata for music files, to allow integration of commercially generated metadata with user-entered metadata, and to ensure that metadata provided to the user is of the highest quality and accuracy available, even when the metadata comes from disparate sources having different levels of credibility. The invention further provides improved techniques for identifying approximate matches when querying metadata databases, and also provides improved techniques for accepting user submissions of metadata, for categorizing user submissions according to relative credibility, and for integrating user submissions with existing metadata.

221 citations

Patent•

Accessing media data using metadata repository

Walter Chang¹, Michael J. Welch¹•Institutions (1)

Adobe Systems¹

13 Nov 2009

TL;DR: A computer-implemented method parses a user query to determine whether it assigns a field to a term, producing a parsed query that conforms to a predefined format. It then searches a metadata repository of triplets generated from multiple modes of metadata for video content, identifies a set of candidate scenes, and ranks them.

Abstract: A computer-implemented method includes receiving, in a computer system, a user query comprising at least a first term, parsing the user query to at least determine whether the user query assigns a field to the first term, the parsing resulting in a parsed query that conforms to a predefined format, performing a search in a metadata repository using the parsed query, the metadata repository embodied in a computer readable medium and including triplets generated based on multiple modes of metadata for video content, the search identifying a set of candidate scenes from the video content, ranking the set of candidate scenes according to a scoring metric into a ranked scene list, and generating an output from the computer system that includes at least part of the ranked scene list, the output generated in response to the user query.
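
The parse, search, and rank steps in this abstract can be illustrated with a minimal sketch. The `field:term` query syntax, the `(scene_id, field, value)` triplet layout, and the match-count scoring metric are all assumptions made for illustration; the patent abstract does not specify them.

```python
from collections import Counter

# Hypothetical triplet store: (scene_id, field, value). The field names and
# the "field:term" query syntax are illustrative assumptions.
TRIPLES = [
    ("scene1", "speaker", "alice"),
    ("scene1", "transcript", "budget"),
    ("scene2", "transcript", "alice"),
    ("scene2", "transcript", "budget"),
    ("scene3", "speaker", "bob"),
]

def parse_query(query):
    """Turn a raw query into (field, term) pairs; field is None when unassigned."""
    parsed = []
    for token in query.lower().split():
        field, sep, term = token.partition(":")
        parsed.append((field, term) if sep else (None, token))
    return parsed

def search(parsed, triples):
    """Score each scene by the number of query terms it matches, best first."""
    scores = Counter()
    for field, term in parsed:
        matched = {s for s, f, v in triples
                   if v == term and (field is None or f == field)}
        for scene in matched:
            scores[scene] += 1
    # Sort by descending score, then by scene id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch a term with an assigned field only matches triplets in that field, while an unassigned term matches any field, so `speaker:alice budget` ranks the scene where Alice speaks about the budget above a scene that merely mentions her.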

165 citations

Journal Article•DOI•

Metadata Quality in Digital Repositories: A Survey of the Current State of the Art

Jung-ran Park¹•Institutions (1)

Drexel University¹

03 Apr 2009-Cataloging & Classification Quarterly

TL;DR: Results of the study indicate a pressing need for the building of a common data model that is interoperable across digital repositories.

Abstract: This study presents the current state of research and practice on metadata quality through focus on the functional perspective on metadata quality, measurement, and evaluation criteria coupled with mechanisms for improving metadata quality. Quality metadata reflect the degree to which the metadata in question perform the core bibliographic functions of discovery, use, provenance, currency, authentication, and administration. The functional perspective is closely tied to the criteria and measurements used for assessing metadata quality. Accuracy, completeness, and consistency are the most common criteria used in measuring metadata quality in the literature. Guidelines embedded within a Web form or template perform a valuable function in improving the quality of the metadata. Results of the study indicate a pressing need for the building of a common data model that is interoperable across digital repositories.

147 citations

Proceedings Article•

Spyglass: fast, scalable metadata search for large-scale storage systems

Andrew W. Leung¹, Minglong Shao, Timothy Bisson, Shankar Pasupathy, Ethan L. Miller¹•Institutions (1)

University of California, Santa Cruz¹

24 Feb 2009

TL;DR: Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties, including Snapshot-based metadata collection, which is up to 10× faster than existing approaches.

Abstract: The scale of today's storage systems has made it increasingly difficult to find and manage files. To address this, we have developed Spyglass, a file metadata search system that is specially designed for large-scale storage systems. Using an optimized design, guided by an analysis of real-world metadata traces and a user study, Spyglass allows fast, complex searches over file metadata to help users and administrators better understand and manage their files. Spyglass achieves fast, scalable performance through the use of several novel metadata search techniques that exploit metadata search properties. Flexible index control is provided by an index partitioning mechanism that leverages namespace locality. Signature files are used to significantly reduce a query's search space, improving performance and scalability. Snapshot-based metadata collection allows incremental crawling of only modified files. A novel index versioning mechanism provides both fast index updates and "back-in-time" search of metadata. An evaluation of our Spyglass prototype using our real-world, large-scale metadata traces shows search performance that is 1-4 orders of magnitude faster than existing solutions. The Spyglass index can quickly be updated and typically requires less than 0.1% of disk space. Additionally, metadata collection is up to 10× faster than existing approaches.
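
The combination of index partitioning and signature files can be sketched roughly as follows. This is not the Spyglass implementation: the `Partition` class, the 64-bit signature width, and the CRC32-based hashing are illustrative assumptions. The idea shown is only that a compact per-partition bit signature lets a query skip partitions that cannot contain a match.

```python
import zlib

SIG_BITS = 64  # illustrative signature width, not a value from the paper

def bit_for(attr, value):
    """Map an (attribute, value) pair to one bit position in the signature."""
    return zlib.crc32(f"{attr}={value}".encode()) % SIG_BITS

class Partition:
    """A group of files from one namespace subtree, plus a summary signature."""

    def __init__(self, name):
        self.name = name
        self.files = []      # list of metadata dicts
        self.signature = 0   # integer used as a bit array

    def add(self, meta):
        self.files.append(meta)
        for attr, value in meta.items():
            self.signature |= 1 << bit_for(attr, value)

    def may_contain(self, attr, value):
        # False means definitely absent; True may be a false positive.
        return bool((self.signature >> bit_for(attr, value)) & 1)

def search(partitions, attr, value):
    """Scan only the partitions whose signature admits the queried value."""
    scanned, hits = [], []
    for p in partitions:
        if not p.may_contain(attr, value):
            continue  # the signature rules this partition out
        scanned.append(p.name)
        hits += [f for f in p.files if f.get(attr) == value]
    return hits, scanned
```

Because different values can hash to the same bit, a signature can report false positives but never false negatives, so skipping a partition never loses a result.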

146 citations

Patent•

Editing metadata in a social network

Jeffrey Scott Boston¹, Bernice E. Rogowitz¹, Mercan Topkara¹, Stephen P. Wood¹•Institutions (1)

IBM¹

15 Jan 2009

TL;DR: A system and server-side method for editing metadata in a file, the method including the steps of: receiving from a user a request to edit the metadata in the file; presenting a window for display on the user's screen, wherein the window displays properties of the metadata; receiving from the user an edit to the metadata properties; and updating the metadata properties with the edit received from the user, producing updated metadata.

Abstract: A system and method for server-side method for editing metadata in a file, the method including steps of: receiving from a user a request for editing the metadata in the file; presenting a window to the user for display on a user's screen wherein the window displays properties of the metadata; receiving from the user an edit to the metadata properties; and updating the metadata properties with the edit received from the user, for producing an updated metadata.

134 citations

Journal Article•DOI•

The spectral database SPECCHIO for improved long-term usability and data sharing

Andreas Hueni¹, Jens Nieke², Jürg Schopfer¹, Mathias Kneubühler¹, Klaus I Itten¹•Institutions (2)

University of Zurich¹, European Space Research and Technology Centre²

01 Mar 2009-Computers & Geosciences

TL;DR: The recently redesigned SPECCHIO system stores spectral and metadata in a relational database based on a non-redundant data model and offers efficient data import, automated metadata generation, editing and retrieval via a Java application.

133 citations

Proceedings Article•DOI•

Improving metadata management for small files in HDFS

Grant Mackey¹, Saba Sehrish¹, Jun Wang¹•Institutions (1)

University of Central Florida¹

16 Oct 2009

TL;DR: This work proposes a mechanism to store small files in HDFS efficiently and improve the space utilization for metadata, and provides for new job functionality to allow for in-job archival of directories and files so that running MapReduce programs may complete without being killed by the JobTracker due to quota policies.

Abstract: Scientific applications are adapting HDFS/MapReduce to perform large scale data analytics. One of the major challenges is that an overabundance of small files is common in these applications, and HDFS manages all its files through a single server, the Namenode. It is anticipated that small files can significantly impact the performance of Namenode. In this work we propose a mechanism to store small files in HDFS efficiently and improve the space utilization for metadata. Our scheme is based on the assumption that each client is assigned a quota in the file system, for both the space and number of files. In our approach, we utilize the compression method ‘harballing’, provided by Hadoop, to better utilize the HDFS. We provide for new job functionality to allow for in-job archival of directories and files so that running MapReduce programs may complete without being killed by the JobTracker due to quota policies. This approach leads to better functionality of metadata operations and more efficient usage of the HDFS. Our analysis results show that we can reduce the metadata footprint in main memory by a factor of 42.

128 citations

Patent•

System, method and apparatus for enterprise policy management

Jeff G. Bone, Laura Arbilla¹, Keith Zoellner, Bradley Might, Jeremy Kaplan, Morry Belkin, Peter A. Lee, Brett A. Funderburg, A. Paul Jimenez•Institutions (1)

IBM¹

01 Oct 2009

TL;DR: Systems, methods, and apparatuses for managing objects (files and directories) in network file systems according to policies. Each policy may have one or more rules, each of which ties a condition to an action.

Abstract: Disclosed are systems, methods and apparatuses for managing objects (files and directories) in network file systems according to policies. Each policy may have one or more rules, each of which ties a condition to an action. Each condition can be expressed in terms of metadata harvested across file systems and stored in a metadata repository. The actions are user-programmable. Users can apply and/or enforce a policy by manipulating the metadata stored in the metadata repository. For example, suppose a policy prohibits storing MP3 files in corporate storage, a user can specify a rule that ties the condition “no MP3 files in volumes A-Z” to an action “delete MP3 files from volumes A-Z.” A file management application may apply a filter to the metadata repository to produce metadata records having values that meet the specified condition and take the corresponding action on managed objects associated with those metadata records.
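
The condition-to-action rule structure can be sketched as a filter over metadata records followed by a callback. The `Rule` dataclass, the record fields (`volume`, `path`), and the choice to collect paths instead of deleting anything are hypothetical; the abstract only states that conditions are expressed over harvested metadata and that actions are user-programmable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]  # one harvested metadata record per managed object

@dataclass
class Rule:
    condition: Callable[[Record], bool]  # predicate over a metadata record
    action: Callable[[Record], None]     # user-programmable action

def apply_policy(rules: List[Rule], repository: List[Record]) -> int:
    """Filter the repository by each rule's condition and run its action."""
    applied = 0
    for rule in rules:
        for record in [r for r in repository if rule.condition(r)]:
            rule.action(record)
            applied += 1
    return applied

# Hypothetical repository contents for the "no MP3 files" example.
repository = [
    {"volume": "A", "path": "/a/report.doc"},
    {"volume": "B", "path": "/b/song.MP3"},
    {"volume": "C", "path": "/c/mix.mp3"},
]
to_delete: List[str] = []  # the action only records paths; nothing is deleted
no_mp3 = Rule(
    condition=lambda r: r["path"].lower().endswith(".mp3"),
    action=lambda r: to_delete.append(r["path"]),
)
applied = apply_policy([no_mp3], repository)
```

Keeping the condition and the action as separate callables mirrors the abstract's split between querying the metadata repository and acting on the managed objects the matching records describe.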

113 citations

Journal Article•DOI•

Automatic evaluation of metadata quality in digital repositories

Xavier Ochoa¹, Erik Duval²•Institutions (2)

Escuela Superior Politecnica del Litoral¹, Université catholique de Louvain²

01 Dec 2009-International Journal on Digital Libraries

TL;DR: A set of scalable quality metrics for metadata, based on the Bruce & Hillman framework for metadata quality control, is presented. Several metrics, especially Text Information Content, correlate well with human evaluation, and the average of all the metrics is roughly as effective as people at flagging low-quality instances.

Abstract: Owing to the recent developments in automatic metadata generation and interoperability between digital repositories, the production of metadata is now vastly surpassing manual quality control capabilities. Abandoning quality control altogether is problematic, because low-quality metadata compromise the effectiveness of services that repositories provide to their users. To address this problem, we present a set of scalable quality metrics for metadata based on the Bruce & Hillman framework for metadata quality control. We perform three experiments to evaluate our metrics: (1) the degree of correlation between the metrics and manual quality reviews, (2) the discriminatory power between metadata sets and (3) the usefulness of the metrics as low-quality filters. Through statistical analysis, we found that several metrics, especially Text Information Content, correlate well with human evaluation and that the average of all the metrics is roughly as effective as people at flagging low-quality instances. The implications of this finding are discussed. Finally, we propose possible applications of the metrics to improve tools for the administration of digital repositories.
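
To make the idea of an automatically computed metric used as a low-quality filter concrete, here is a minimal sketch of a simple completeness score. It is not one of the paper's metric definitions: the field list, the equal weighting of fields, and the 0.6 threshold are assumptions chosen for illustration.

```python
# Assumed schema; a real repository would use its own element set.
FIELDS = ["title", "creator", "description", "date", "subject"]

def completeness(record, fields=FIELDS):
    """Fraction of the expected fields that hold a non-empty value."""
    filled = sum(1 for f in fields if str(record.get(f) or "").strip())
    return filled / len(fields)

def flag_low_quality(records, threshold=0.6):
    """Return the records whose completeness falls below the threshold."""
    return [r for r in records if completeness(r) < threshold]
```

A filter like this scales to any number of records because it needs no human review; the paper's experiments address the separate question of how well such scores agree with human judgments.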

111 citations

Patent•

Conflict management during data object synchronization between client and server

David Braginsky¹, Justin M. Rosenstein¹, Eric Joseph Uhrhane¹, David Jeske¹•Institutions (1)

Google¹

04 May 2009

TL;DR: A client stores client metadata entries corresponding to a plurality of data objects, identifies received server metadata entries that conflict with them during synchronization, asks the user to select from a predefined set of conflict resolutions, and performs an action in accordance with the selected resolution.

Abstract: A client stores client metadata entries corresponding to a plurality of data objects. During a first phase of a synchronization process, the client sends one or more client metadata entries to a server. Each client metadata entry sent corresponds to a data object for which at least one metadata parameter has changed since a prior execution of the synchronization process. During a second phase of the synchronization process, the client receives from the server one or more server metadata entries, each having at least one parameter that has changed since a prior execution of the synchronization process. The client identifies any received server metadata entry that conflicts with a corresponding client metadata entry, requests a user to select from among a predefined set of conflict resolutions to resolve the conflict, and then performs an action in accordance with the conflict resolution selected by the user.
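
The conflict-detection step can be sketched as comparing the two sets of entries changed since the last synchronization. The entry layout (a dict with a `modified` counter), the function names, and the two resolution choices `keep_client` and `keep_server` are hypothetical; the abstract only says that a predefined set of resolutions exists.

```python
def changed_since(entries, last_sync):
    """Entries whose metadata changed after the previous synchronization."""
    return {k: e for k, e in entries.items() if e["modified"] > last_sync}

def find_conflicts(client_entries, server_changes, last_sync):
    """A received server entry conflicts if the client also changed that object."""
    client_changes = changed_since(client_entries, last_sync)
    return sorted(k for k in server_changes
                  if k in client_changes and server_changes[k] != client_changes[k])

def resolve(key, client_entries, server_changes, choice):
    """Apply one of a predefined set of resolutions chosen by the user."""
    if choice == "keep_server":
        client_entries[key] = dict(server_changes[key])
    elif choice != "keep_client":
        raise ValueError(f"unknown resolution: {choice}")
    return client_entries[key]
```

An object changed only on the server is not a conflict in this sketch; the client can simply accept the server entry, and only objects changed on both sides are put to the user.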

110 citations

Patent•

Storing log data efficiently while supporting querying

Wei Huang¹, Yizheng Zhou¹, Bin Yu¹, Wenting Tang¹, Christian F. Beedgen¹•Institutions (1)

Hewlett-Packard¹

04 Sep 2009

TL;DR: A logging system includes an event receiver and a storage manager. The receiver receives log data, processes it, and outputs column-based data "chunks"; each chunk includes a metadata structure that acts as a search index when querying event data.

Abstract: A logging system includes an event receiver and a storage manager. The receiver receives log data, processes it, and outputs a column-based data “chunk.” The manager receives and stores chunks. The receiver includes buffers that store events and a metadata structure that stores metadata about the contents of the buffers. Each buffer is associated with a particular event field and includes values from that field from one or more events. The metadata includes, for each “field of interest,” a minimum value and a maximum value that reflect the range of values of that field over all of the events in the buffers. A chunk is generated for each buffer and includes the metadata structure and a compressed version of the buffer contents. The metadata structure acts as a search index when querying event data. The logging system can be used in conjunction with a security information/event management (SIEM) system.
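
How minimum and maximum values stored with each chunk can act as a search index is shown in the sketch below. The chunk layout (a dict with `min`, `max`, and a zlib-compressed JSON payload) and the function names are assumptions for illustration, not the patented format.

```python
import json
import zlib

def make_chunk(events, field):
    """Build a chunk for one field: min/max metadata plus compressed values."""
    values = [e[field] for e in events]
    return {
        "field": field,
        "min": min(values),
        "max": max(values),
        "payload": zlib.compress(json.dumps(values).encode()),
    }

def query_range(chunks, low, high):
    """Return values in [low, high], decompressing only chunks that may match."""
    result, decompressed = [], 0
    for chunk in chunks:
        if chunk["max"] < low or chunk["min"] > high:
            continue  # the metadata alone proves the chunk has no match
        decompressed += 1
        values = json.loads(zlib.decompress(chunk["payload"]))
        result += [v for v in values if low <= v <= high]
    return result, decompressed
```

The saving comes from the skipped chunks: a range query over timestamps, for example, never pays the decompression cost for chunks whose value range lies entirely outside the query.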

Patent•

Method and system of generating reference variations for directory assistance data

Kyle Oppenheim, David Mitby, Nick Kibre

02 Nov 2009

TL;DR: In this paper, a digital directory comprising listings is accessed and metadata information associated with individual listings describing the individual listings is modified to generate transformed metadata information for aiding in an automated user input recognition process.

Abstract: Methods and systems of performing user input recognition are disclosed. A digital directory comprising listings is accessed. Metadata information is associated with individual listings describing the individual listings. The metadata information is modified to generate transformed metadata information. Therefore, the transformed metadata information is generated as a function of context information relating to a typical user interaction with the listings. Information is generated for aiding in an automated user input recognition process based on the transformed metadata information.

Collaborative and Content-based Filtering for Item Recommendation on Social Bookmarking Websites

A.M. Bogers¹, A.P.J. van den Bosch•Institutions (1)

Tilburg University¹

01 Jan 2009

TL;DR: This paper focuses on the task of item recommendation for social bookmarking websites, i.e. predicting which unseen bookmarks a user might like based on his or her profile, and examines how to incorporate the tags and other metadata into a nearest-neighbor collaborative filtering (CF) algorithm.

Abstract: Social bookmarking websites allow users to store, organize, and search bookmarks of web pages. Users of these services can annotate their bookmarks by using informal tags and other metadata, such as titles, descriptions, etc. In this paper, we focus on the task of item recommendation for social bookmarking websites, i.e. predicting which unseen bookmarks a user might like based on his or her profile. We examine how we can incorporate the tags and other metadata into a nearest-neighbor collaborative filtering (CF) algorithm, by replacing the traditional usage-based similarity metrics by tag overlap, and by fusing tag-based similarity with usage-based similarity. In addition, we perform experiments with content-based filtering by using the metadata content to recommend interesting items. We generate recommendations directly based on Kullback-Leibler divergence of the metadata language models, and we explore the use of this metadata in calculating user and item similarities. We perform our experiments on three data sets from two different domains: Delicious, CiteULike and BibSonomy.
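
Two of the similarity notions named in this abstract can be sketched in a few lines. The Jaccard form of tag overlap and the add-alpha smoothing of the unigram language models are assumptions; the abstract does not say which overlap measure or smoothing scheme the authors used.

```python
import math
from collections import Counter

def tag_overlap(tags_a, tags_b):
    """Jaccard overlap between two tag sets (one possible overlap measure)."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def language_model(text, vocab, alpha=1.0):
    """Unigram model over a shared vocabulary, with add-alpha smoothing."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """D(p || q) for two models defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)
```

Smoothing matters here because the Kullback-Leibler divergence is undefined when the second model assigns zero probability to a word the first model uses; a lower divergence between a user profile model and an item model would indicate a better candidate for recommendation.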

Patent•

Hash join using collaborative parallel filtering in intelligent storage with offloaded bloom filters

Dmitry Potapov¹, Yiu Woon Lau¹, Hakan Jakobsson¹, Umesh Panchaksharaiah¹, Poojan Kumar¹•Institutions (1)

Business International Corporation¹

18 Sep 2009

TL;DR: In this article, a storage system analyzes the raw data based on join metadata, removing a certain amount of data that is guaranteed to be irrelevant to the join operation, then returns filtered data to the database server.

Abstract: Processing resources at a storage system for a database server are utilized to perform aspects of a join operation that would conventionally be performed by the database server. When requesting a range of data units from a storage system, the database server includes join metadata describing aspects of the join operation for which the data is being requested. The join metadata may be, for instance, a bloom filter. The storage system reads the requested data from disk as normal. However, prior to sending the requested data back to the database server, the storage system analyzes the raw data based on the join metadata, removing a certain amount of data that is guaranteed to be irrelevant to the join operation. The storage system then returns filtered data to the database server. The database system thereby avoids the unnecessary transfer of certain data between the storage system and the database server.
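
The division of work between the database server and the storage system can be sketched with a small Bloom filter. The class, the filter size, the SHA-256-based hashing, and the function names `storage_scan` and `hash_join` are illustrative assumptions, not details from the patent.

```python
import hashlib

class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # integer used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

def storage_scan(rows, join_column, bloom):
    """Storage side: drop rows whose join key is definitely not in the filter."""
    return [r for r in rows if bloom.might_contain(r[join_column])]

def hash_join(build_rows, probe_rows, key):
    """Database side: ordinary hash join over whatever rows were returned."""
    table = {}
    for r in build_rows:
        table.setdefault(r[key], []).append(r)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]
```

Since the filter never rejects a key that was added, the join over the filtered rows gives the same result as the join over all rows; any false positives that slip through are discarded by the hash join itself.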

Patent•

Repository including installation metadata for executable applications

Richard Offer¹•Institutions (1)

VMware¹

15 Jan 2009

TL;DR: An application specific runtime environment is defined by an application environment specification to include a minimal or reduced set of software resources required for execution of the application. These resources are optionally stored in a resource repository that includes resources associated with a plurality of operating systems and executable applications.

Abstract: Systems and methods of executing and/or provisioning an application in an application specific runtime environment are disclosed. The application specific runtime environment is defined by an application environment specification to include a minimal or reduced set of software resources required for execution of the application. These software resources are optionally stored in a resource repository that includes resources associated with a plurality of operating systems and/or executable applications. Various embodiments of the invention include the development of hierarchical resource metadata configured to characterize the various files, packages and file families included in the resource repository. In some embodiments this metadata is used to select between files and different versions of files when provisioning an application specific runtime environment.

Journal Article•DOI•

A Metadata Best Practice for a Scientific Data Repository

Jane Greenberg¹, Hollie White¹, Sarah Carrier¹, Ryan Scherle²•Institutions (2)

University of North Carolina at Chapel Hill¹, National Evolutionary Synthesis Center²

10 Dec 2009-Journal of Library Metadata

TL;DR: The Dryad repository's metadata best practice, which balances immediate operational needs against long-term project goals, is presented, and the conclusion summarizes limitations and advantages of the two prongs underlying Dryad's metadata effort.

Abstract: Digital data repositories ought to support immediate operational needs and long-term project goals. This paper presents the Dryad repository's metadata best practice balancing of these two needs. The paper reviews background work exploring the meaning of science, characterizing data, and highlighting data curation metadata challenges. The Dryad repository is introduced, and the initiative's metadata best practice and underlying rationales are described. Dryad's metadata approach includes two prongs: one addressing the long-term goal to align with the Semantic Web via a metadata application profile; and another addressing the immediate need to make content available in DSpace via an extensible markup language (XML) schema. The conclusion summarizes limitations and advantages of the two prongs underlying Dryad's metadata effort.

Patent•

Dynamic prefetching method and system for metadata

Cetin Goerkem, Guemues Uygar, Kuepuesoglu Oguz

18 Nov 2009

TL;DR: A method and system for reducing the latency of metadata retrieval by a user application. Because metadata is prefetched ahead of the user's actual request, it is immediately available, which aids a seamless browsing experience.

Abstract: The present invention relates generally to interactive communication systems and, more particularly, to a method and system for reducing latencies of retrieving metadata by a user application. The method comprises providing a browsing interface for the user to browse through; and prefetching metadata likely to be soon accessed in advance of an actual user request for access to said metadata; wherein said prefetching of said metadata is performed dynamically based on prediction of at least one next available user action with respect to a current view of said browsing interface being directed towards fetching said metadata. The present invention aids a seamless browsing experience as the metadata is immediately available, being already prefetched, to the user upon his actual request.

Patent•

System and method for building virtual appliances using a repository metadata server and a dependency resolution service

Matthew William Barringer¹•Institutions (1)

Novell¹

11 Feb 2009

TL;DR: In this paper, a system and method for building virtual appliances using a repository metadata server and a dependency resolution service is provided, whereby remote clients may follow a simple and repeatable process for creating virtual appliances.

Abstract: A system and method for building virtual appliances using a repository metadata server and a dependency resolution service is provided. In particular, a hosted web service may provide a collaborative environment for managing origin repositories and software dependencies, whereby remote clients may follow a simple and repeatable process for creating virtual appliances. For example, the repository metadata server may cache and parse metadata associated with an origin repository, download software from the origin repository, and generate resolution data that can be used by the dependency resolution service. The dependency resolution service may then use the resolution data to resolve dependencies for a package selected for an appliance, wherein the dependencies may include packages that are required, recommended, suggested, banned, or otherwise a dependency for the selected package.
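
The dependency resolution step can be sketched as a breadth-first walk over resolution data. The `RESOLUTION_DATA` layout, the package names, and the handling of only `requires` and `bans` (leaving out recommended and suggested packages) are assumptions made for illustration.

```python
from collections import deque

# Hypothetical resolution data, as the metadata server might produce it
# after parsing an origin repository's metadata.
RESOLUTION_DATA = {
    "webapp": {"requires": ["httpd", "python"], "bans": ["legacy-httpd"]},
    "httpd": {"requires": ["openssl"]},
    "python": {"requires": ["openssl"]},
    "openssl": {},
}

def resolve(package, data):
    """Return the selected package plus everything it transitively requires."""
    selected, banned = [], set()
    queue = deque([package])
    while queue:
        name = queue.popleft()
        if name in selected:
            continue  # already resolved through another path
        if name not in data:
            raise KeyError(f"no metadata for package: {name}")
        selected.append(name)
        banned.update(data[name].get("bans", []))
        queue.extend(data[name].get("requires", []))
    conflicts = banned.intersection(selected)
    if conflicts:
        raise ValueError(f"banned packages selected: {sorted(conflicts)}")
    return selected
```

Separating the cached resolution data from the resolver reflects the split the abstract describes between the repository metadata server and the dependency resolution service.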

Patent•

Opinion search engine

Jian-Tao Sun¹, Xiaochuan Ni¹, Peng Xu¹, Gang Wang¹, Ke Tang¹, Zheng Chen¹•Institutions (1)

Microsoft¹

29 Sep 2009

TL;DR: In this article, a computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a computer, cause the computer to implement an opinion search engine is defined.

Abstract: A computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a computer, cause the computer to implement an opinion search engine. The instructions to implement an opinion search engine cause the computer to collect opinion data about one or more objects from the Internet, extract metadata about the opinion data from the opinion data, remove duplicate metadata from the metadata to generate a resulting metadata, categorize the resulting metadata for similar objects according to one or more taxonomies from one or more websites on the Internet and rank the similar objects based on the categorized metadata.

Patent•

System and method of leveraging proximity data in a web-based socially-enabled knowledge networking environment

Scott White¹, Nova T. Spivack•Institutions (1)

Radar Networks¹

07 May 2009

TL;DR: Systems and methods for leveraging proximity data in a web-based socially-enabled information networking environment are disclosed, including a method, which may be implemented on a system, of semantic advertising via semantic profiles.

Abstract: Systems and methods for leveraging proximity data in a web-based socially-enabled information networking environment are disclosed. In one aspect, embodiments of the present disclosure include a method, which may be implemented on a system, of semantic advertising via semantic profiles. One embodiment can include receiving a model profile from an advertiser, enforcing a set of rules that govern accessibility of the web content, parsing the model profile to obtain a first set of model user metadata associated with the ideal set of user characteristics, comparing model user metadata of the first set of model user metadata with user metadata of a set of user metadata of a semantic user profile of a user, and generating a correlation index to indicate a degree of correlation between the model profile and the semantic user profile.

Patent•

Flash management using separate metadata storage

Kevin L. Kilzer, Robert W. Ellis, Rudolph J. Sterbenz

15 Apr 2009

TL;DR: In this paper, techniques for flash memory management, including storing metadata and/or error correcting information separately from payload data, are discussed, where metadata and error correction information are stored in a random access memory within a solid state drive.

Abstract: Disclosed are techniques for flash memory management, including storing metadata and/or error correcting information separately from payload data. In various embodiments, metadata and/or error correcting information are stored in a random access memory within a solid state drive.

Patent•

Method and system for applying metadata to data sets of file objects

Stephen R. Germann, Ryan C. Germann, Steven Cooper, Eric Mah

03 Jul 2009

TL;DR: In this paper, the methods and systems for developing, specifying, and assigning descriptive information relating to the contents of a file (i.e., metadata) are described, and user interface controls on a computer screen implement a dynamically changing display which responds to user input by presenting new categories of choices.

Abstract: The present invention generally relates to the methods and systems for developing, specifying, and assigning descriptive information relating to the contents of a file (i.e., metadata). User interface controls on a computer screen implement a dynamically changing display which responds to user input by presenting new categories of choices. Additional controls allow optimization of the process of specifying and assigning descriptive metadata.

Journal Article•DOI•

Automatic metadata generation using associative networks

Marko A. Rodriguez¹, Johan Bollen¹, Herbert Van de Sompel¹•Institutions (1)

Los Alamos National Laboratory¹

09 Mar 2009-ACM Transactions on Information Systems

TL;DR: In this article, the authors proposed an automatic metadata generation system that leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources.

Abstract: In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis which can be costly and inefficient. The automatic metadata generation system proposed in this article leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because of its independence from content analysis, it can be applied to a wide variety of resource media types and is shown to be computationally inexpensive. The proposed method operates through two distinct phases. Occurrence and cooccurrence algorithms first generate an associative network of repository resources leveraging existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This article discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and the results of an experiment and validation of the proposed method using a standard bibliographic dataset.
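
The two phases described in this abstract can be sketched in simplified form. This is not the article's algorithm: the co-occurrence weighting (one unit per shared value), the fixed number of steps, and the decay factor are assumptions, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_network(resources, link_field):
    """Phase 1: link resources that share a value of link_field (co-occurrence)."""
    by_value = defaultdict(set)
    for rid, meta in resources.items():
        for value in meta.get(link_field, []):
            by_value[value].add(rid)
    weights = defaultdict(float)
    for rids in by_value.values():
        for a, b in combinations(sorted(rids), 2):
            weights[(a, b)] += 1.0
            weights[(b, a)] += 1.0
    return weights

def propagate(resources, weights, field, steps=2, decay=0.5):
    """Phase 2: spread each known field value outward as decaying energy."""
    neighbors = defaultdict(dict)
    for (a, b), w in weights.items():
        neighbors[a][b] = w
    scores = defaultdict(lambda: defaultdict(float))
    for source, meta in resources.items():
        for value in meta.get(field, []):
            frontier = {source: 1.0}
            for _ in range(steps):
                nxt = defaultdict(float)
                for node, energy in frontier.items():
                    total = sum(neighbors[node].values())
                    for nb, w in neighbors[node].items():
                        nxt[nb] += decay * energy * w / total
                for node, energy in nxt.items():
                    if not resources[node].get(field):  # metadata-poor only
                        scores[node][value] += energy
                frontier = nxt
    return {node: dict(values) for node, values in scores.items()}
```

In this sketch a resource that lacks the target field accumulates a score for each value reachable through the network, and the highest-scoring values are the candidates to assign; no content analysis is involved, which is the property the abstract emphasizes.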

Patent•

Techniques to consume content and metadata

James E. Allard¹•Institutions (1)

Microsoft¹

23 Jan 2009

TL;DR: Content and its associated metadata are provided to multiple users: the content is shown on a display device, while the metadata is transmitted to a remote device corresponding to a receiving user.

Abstract: Content and metadata associated with the content may be provided to a number of users. The content may be displayed on a display device while the metadata may be transmitted to a remote device corresponding to a receiving user. The user may further request desired information or metadata pertaining to the content and the requested information or metadata may be transmitted to the user's remote device. Different users may request different information on the same or different objects being displayed or presented on a display device. Each requesting user may receive requested information on the same or different objects via corresponding remote devices.

Proceedings Article•DOI•

Metadata Extraction from PDF Papers for Digital Library Ingest

Simone Marinai

26 Jul 2009

TL;DR: A package that is designed to extract basic metadata from PDF documents is described, based on a suitable combination of several techniques that include PDF parsing, low level document image processing, and layout analysis.

Abstract: In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract basic metadata from these documents. The package is used in combination with a digital library software suite to easily build personal digital libraries. The proposed software is based on a suitable combination of several techniques that include PDF parsing, low level document image processing, and layout analysis. In addition, we use the information gathered from a widely known citation database (DBLP) to assist the tool in the difficult task of author identification. The system is tested on some paper collections selected from recent conference proceedings.

Patent•

Synchronization of metadata

Charles F. Kocsis¹, Pamela V. Bergstrom¹, Brian Van Pelt¹, Srinivas Turlapati¹•Institutions (1)

Harris Corporation¹

02 Mar 2009

TL;DR: In this paper, a system to synchronize metadata for a plurality of applications is presented, in which a rules engine is programmed to apply at least a first set of content administration rules to a metadata record received from a first application of the plurality of applications, to control updating of the corresponding metadata stored in a master database.

Abstract: A system to synchronize metadata for a plurality of applications. The system includes content administration rules programmed to define policies for updating metadata in the master database and policies for propagating updates in the metadata to the plurality of applications. The metadata describes at least one asset represented as data residing in at least one of the plurality of applications. A rules engine is programmed to apply at least a first set of the content administration rules to a metadata record received from a first application of the plurality of applications to control updating corresponding metadata stored in a master database. Changes in the corresponding metadata made to the master database can be propagated to at least one second application of the plurality of applications according to a second set of the content administration rules predefined for each of the at least one second application.
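
The update-then-propagate flow in this abstract can be sketched as follows. This is an illustrative assumption rather than the patented design: rules are modelled as per-application sets of field names, and the class name `SyncEngine`, its attributes, and the application names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SyncEngine:
    # First rule set (assumed form): source application -> fields it may
    # update in the master database.
    update_rules: dict
    # Second rule set (assumed form): target application -> fields it
    # should receive when they change in the master database.
    propagate_rules: dict
    # Master database: asset id -> metadata record.
    master: dict = field(default_factory=dict)

    def receive(self, app, asset_id, record):
        """Apply a record from `app`; return the updates owed to other apps."""
        allowed = self.update_rules.get(app, set())
        current = self.master.get(asset_id, {})
        # Keep only permitted fields whose value actually changes.
        changes = {k: v for k, v in record.items()
                   if k in allowed and current.get(k) != v}
        if not changes:
            return {}
        self.master.setdefault(asset_id, {}).update(changes)
        outbound = {}
        for target, fields in self.propagate_rules.items():
            if target == app:
                continue  # do not echo a change back to its source
            subset = {k: v for k, v in changes.items() if k in fields}
            if subset:
                outbound[target] = subset
        return outbound


engine = SyncEngine(
    update_rules={"editor": {"title", "rights"}, "scheduler": {"air_date"}},
    propagate_rules={"scheduler": {"title"}, "archive": {"title", "rights"}},
)
# The editor may not set air_date, so only the title reaches the master.
outbound = engine.receive("editor", "asset-1",
                          {"title": "Evening News", "air_date": "2009-03-02"})
```

Keeping the two rule sets separate mirrors the abstract's distinction between policies for updating the master database and policies for propagating changes to each second application.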

Patent•

Metadata server and metadata management method

Myung Hoon Cha¹, Hong Yeon Kim¹, Ki Sung Jin¹, Young-Kyun Kim¹, Han Namgoong¹•Institutions (1)

Electronics and Telecommunications Research Institute¹

16 Jul 2009

TL;DR: In this paper, a metadata server includes a directory hierarchy storage unit, which stores all directory hierarchies held in the metadata server cluster, a metadata storage unit, and a search unit.

Abstract: Provided are a metadata server cluster and a metadata management method thereof, which distribute metadata for a file to a cluster including a plurality of metadata servers to replicate the metadata. The metadata server includes a directory hierarchy storage unit, a metadata storage unit, and a search unit. The directory hierarchy storage unit stores all directory hierarchies which are stored in the metadata server cluster. The metadata storage unit stores metadata for a data file. The search unit searches the directory hierarchies and the metadata.
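
A minimal sketch of the three units named in this abstract is given below. The abstract does not say how file metadata is placed on servers, so the CRC32-based placement, the replica count, and the names `MetadataServer`, `Cluster`, and `holders` are assumptions for illustration only.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    # Directory hierarchy storage unit: every server holds all directories.
    directories: set = field(default_factory=lambda: {"/"})
    # Metadata storage unit: only the file metadata assigned to this server.
    file_meta: dict = field(default_factory=dict)

    def search(self, path):
        """Search unit: look in the directory hierarchy, then file metadata."""
        if path in self.directories:
            return {"type": "dir"}
        return self.file_meta.get(path)


class Cluster:
    def __init__(self, n_servers, replicas=2):
        self.servers = [MetadataServer() for _ in range(n_servers)]
        self.replicas = replicas

    def mkdir(self, path):
        # Directory entries are written to every server in the cluster.
        for server in self.servers:
            server.directories.add(path)

    def holders(self, path):
        # Assumed placement: a stable hash picks the first replica and the
        # following servers (wrapping around) hold the remaining copies.
        start = zlib.crc32(path.encode()) % len(self.servers)
        return [self.servers[(start + i) % len(self.servers)]
                for i in range(self.replicas)]

    def create_file(self, path, meta):
        for server in self.holders(path):
            server.file_meta[path] = {"type": "file", **meta}


cluster = Cluster(n_servers=4, replicas=2)
cluster.mkdir("/data")
cluster.create_file("/data/a.bin", {"size": 10})
```

Because each server holds the full directory hierarchy, any server can resolve a directory path locally, while file metadata lookups go to the servers returned by `holders`.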

Journal Article•DOI•

Towards a semantics-based approach in the development of geographic portals

Nikolaos Athanasis¹, Kostas Kalabokidis¹, Michail Vaitis¹, Nikolaos Soulakellis¹•Institutions (1)

University of the Aegean¹

01 Feb 2009-Computers & Geosciences

TL;DR: A new methodology for knowledge discovery in geographic portals is presented based on the Semantic Web, which exploits the Resource Description Framework (RDF) in order to describe the geoportal's information with ontology-based metadata.

Patent•

Information search method and information provision method based on user's intention

Hee Sung Chung, 정희성

11 Dec 2009

TL;DR: In this paper, an information search method and an information provision method based on the user's intentions are provided, in which the searcher's intentions are ascertained by analyzing the searched words and an editing device matched to those intentions is provided for entering metadata.

Abstract: Provided are an information search method and an information provision method based on the user's intentions. The information search method comprises: providing an editing device matched to the searcher's intentions ascertained using the results of analysis of searched words; and searching contents having metadata relating to metadata input through the editing device. In this way, the searcher's intentions can be ascertained from the information input by the searcher, detailed metadata input can be derived based on the ascertained intentions, and a search can be carried out using the input metadata.

Patent•

Systems and methods for using metadata to enhance data management operations

Anand Prahlad, Jeremy A. Schwartz, David Ngo, Brian Brockway, Marcus S. Muller

02 Nov 2009

TL;DR: In this paper, a metabase formed from metadata can be used for various data management operations, such as enhanced data management, enhanced data identification, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data.

Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
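
The idea of building a metabase from a journal of data interactions can be sketched as below. This is a toy model under stated assumptions, not the patented system: the journal is a list of `(operation, object id)` pairs, the object store is an in-memory dict, and the function name `update_metabase` and the metadata fields `size` and `owner` are hypothetical.

```python
def update_metabase(metabase, journal, objects):
    """Refresh the metabase using only the objects named in the journal.

    journal: list of (operation, object_id) pairs in the order recorded.
    objects: the object store, used here only to read metadata for
             objects the journal says were created or modified.
    """
    for operation, object_id in journal:
        if operation == "delete":
            metabase.pop(object_id, None)
        elif object_id in objects:
            obj = objects[object_id]
            metabase[object_id] = {"size": len(obj["data"]),
                                   "owner": obj["owner"]}
    return metabase


# Hypothetical store: object "c" was created and deleted before indexing.
objects = {"a": {"data": b"hello", "owner": "ann"}}
journal = [("create", "a"), ("create", "c"), ("delete", "c")]
metabase = update_metabase({}, journal, objects)

# Later queries consult only the metabase, never the object store.
owned_by_ann = [oid for oid, meta in metabase.items() if meta["owner"] == "ann"]
```

Storing the metabase apart from the data objects is what allows the final query to run without touching the objects or the file system structure, as the abstract notes.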
