
Showing papers on "Metadata repository published in 2002"


Book ChapterDOI
01 Oct 2002
TL;DR: With the help of Amilcare, OntoMat-Annotizer extracts knowledge structures from web pages using knowledge extraction rules, which are the result of a learning cycle based on already annotated pages.
Abstract: Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, S-CREAM, that allows for creation of metadata and is trainable for a specific domain. Annotating web documents is one of the major techniques for creating metadata on the web. The implementation of S-CREAM, OntoMat-Annotizer, now supports the semi-automatic annotation of web pages. This semi-automatic annotation is based on the information extraction component Amilcare. With the help of Amilcare, OntoMat-Annotizer extracts knowledge structures from web pages through the use of knowledge extraction rules. These rules are the result of a learning cycle based on already annotated pages.

355 citations


Patent
05 Apr 2002
TL;DR: A watermark reader device reads a watermark embedded into media content and forwards the watermark information to a router, which then uses the information to find a metadata database identifier and sends the related metadata to the reader device as discussed by the authors.
Abstract: A method of performing digital asset management of media content. In this method, a watermark reader device reads a watermark embedded into media content. The watermark conveys watermark information, such as a content identifier and creator identifier. The reader forwards the watermark information to a router. The router then uses the watermark information to find a metadata database identifier. It then sends a request for metadata along with the watermark information to the metadata database identified by the metadata database identifier. The metadata database uses the watermark information to find related metadata for the media content and sends the related metadata to the reader device.

329 citations
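
The lookup chain in the abstract above (reader, router, metadata database) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the class names, the field names, and the choice to route on the creator identifier are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatermarkInfo:
    """Information conveyed by the watermark (field names are assumptions)."""
    content_id: str
    creator_id: str


class MetadataDatabase:
    """Hypothetical in-memory stand-in for one metadata database."""

    def __init__(self, records):
        self._records = records  # content_id -> metadata dict

    def lookup(self, info):
        return self._records.get(info.content_id)


class Router:
    """Maps watermark information to a database identifier, then forwards the request."""

    def __init__(self, routing_table, databases):
        self._routing_table = routing_table  # creator_id -> database identifier (assumed key)
        self._databases = databases          # database identifier -> MetadataDatabase

    def resolve(self, info):
        db_id = self._routing_table.get(info.creator_id)
        if db_id is None:
            return None  # no database registered for this creator
        return self._databases[db_id].lookup(info)


databases = {"db-1": MetadataDatabase({"c42": {"title": "Example clip"}})}
router = Router({"studio-a": "db-1"}, databases)
metadata = router.resolve(WatermarkInfo(content_id="c42", creator_id="studio-a"))
print(metadata)
```

The reader device only needs to know the router; which database holds the metadata stays a routing concern.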


Patent
02 Mar 2002
TL;DR: In this article, a push-pull model for efficient low-latency video-content distribution over a network is proposed, where metadata is used as a vehicle and mechanism to enable intelligent decisions to be made on content distribution system operation.
Abstract: Method, system, computer program and computer program product for a metadata enabled push-pull model and method for efficient low-latency video-content distribution over a network. Metadata is used as a vehicle and mechanism to enable intelligent decisions to be made on content distribution system operation. Metadata is data that contains information about the actual content, and in some cases, the metadata may also contain portions of the content or a low-resolution preview of the content. Aspects of the invention are directed toward the distribution of metadata throughout the network in a way that facilitates efficient system operation as well as optionally but advantageously providing a set of services such as tracking, reporting, personalization, and the like.

315 citations


Proceedings ArticleDOI
07 May 2002
TL;DR: This work provides a framework, CREAM, that allows for creation of metadata, and describes its implementation, viz. Ont-O-Mat.
Abstract: Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, CREAM, that allows for creation of metadata. While the annotation mode of CREAM allows the creation of metadata for existing web pages, the authoring mode lets authors create metadata, almost for free, while putting together the content of a page. As a particularity of our framework, CREAM allows the creation of relational metadata, i.e. metadata that instantiate interrelated definitions of classes in a domain ontology rather than a comparatively rigid template-like schema such as Dublin Core. We discuss some of the requirements one has to meet when developing such an ontology-based framework, e.g. the integration of a metadata crawler, inference services, document management and a meta-ontology, and describe its implementation, viz. Ont-O-Mat, a component-based, ontology-driven Web page authoring and annotation tool.

261 citations


Book ChapterDOI
19 Aug 2002
TL;DR: The definition of a set of similarity measures for comparing ontology-based metadata and an application study using these measures within a hierarchical clustering algorithm are proposed.
Abstract: The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Recently, different applications based on this vision have been designed, e.g. in the fields of knowledge management, community web portals, e-learning, multimedia retrieval, etc. It is obvious that the complex metadata descriptions generated on the basis of pre-defined ontologies serve as perfect input data for machine learning techniques. In this paper we propose an approach for clustering ontology-based metadata. Main contributions of this paper are the definition of a set of similarity measures for comparing ontology-based metadata and an application study using these measures within a hierarchical clustering algorithm.

217 citations
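
To make the idea of clustering metadata by similarity concrete, here is a small sketch. It does not reproduce the paper's similarity measures, which the abstract does not specify: Jaccard similarity over (property, value) pairs is swapped in as a stand-in, and the instance data, the average-linkage rule, and the threshold are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; a stand-in measure, not the paper's."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def average_linkage(c1, c2, items):
    """Mean pairwise similarity between the members of two clusters."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(items[i], items[j]) for i, j in pairs) / len(pairs)


def cluster(items, threshold):
    """Agglomerative clustering: merge the most similar pair until none reaches the threshold."""
    clusters = [frozenset([name]) for name in sorted(items)]
    while len(clusters) > 1:
        best = max(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], items))
        if average_linkage(best[0], best[1], items) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return set(clusters)


# Hypothetical ontology instances described by (property, value) pairs.
instances = {
    "alice": {("type", "Researcher"), ("worksOn", "SemanticWeb")},
    "bob": {("type", "Researcher"), ("worksOn", "SemanticWeb"), ("memberOf", "GroupA")},
    "paper1": {("type", "Publication"), ("topic", "Clustering")},
}
result = cluster(instances, threshold=0.5)
print(result)
```

An ontology-aware measure would also account for class hierarchy and relations between instances, which a flat set comparison ignores.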


Patent
18 Dec 2002
TL;DR: In this article, a method, system, and program for query processing is described, and a data structure stored in the computer-readable medium includes data for use by the program.
Abstract: Disclosed is a method, system, and program for query processing. Metadata for a facts metadata object and one or more dimension metadata objects that are associated with the facts metadata object is stored. A view with columns for one or more measures in the facts metadata object and one or more attributes in the one or more dimension metadata objects is constructed. Additional metadata that describes roles of columns in the fact and dimension metadata objects is generated. Also disclosed is a computer-readable medium for storing data for access by a program. A data structure stored in the computer-readable medium includes data for use by the program. The data includes a cube model metadata object that includes a facts metadata object, one or more dimension metadata objects, and one or more join metadata objects that describe how one or more tables in the facts metadata object and one or more tables in the one or more dimension metadata objects are joined. The data also includes a cube metadata object that represents a subset of the cube model metadata object and comprises a view with columns for one or more measures of one of the facts metadata objects and one or more attributes of one or more of the dimension metadata objects and a document that describes roles of columns in the facts metadata object and the one or more dimension metadata objects.

215 citations


Proceedings Article
01 Jan 2002
TL;DR: The design of a Metadata Catalog Service (MCS) is presented that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes and a scalability study of the service is presented.
Abstract: Advances in computational, storage and network technologies as well as middleware such as the Globus Toolkit allow scientists to expand the sophistication and scope of data-intensive applications. These applications produce and analyze terabytes and petabytes of data that are distributed in millions of files or objects. To manage these large data sets efficiently, metadata or descriptive information about the data needs to be managed. There are various types of metadata, and it is likely that a range of metadata services will exist in Grid environments that are specialized for particular types of metadata cataloguing and discovery. In this paper, we present the design of a Metadata Catalog Service (MCS) that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes. We describe our experience in using the MCS with several applications and present a scalability study of the service.

177 citations
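
Querying for data items by desired attributes, as the abstract above describes, might look like the following toy catalog. This is not the interface of the actual MCS: the class, its methods, and the attribute names are hypothetical and exist only to illustrate attribute-based lookup.

```python
class MetadataCatalog:
    """Hypothetical in-memory catalog; not the interface of the actual MCS."""

    def __init__(self):
        self._items = {}  # logical name -> attribute dict

    def add(self, logical_name, **attributes):
        self._items[logical_name] = dict(attributes)

    def query(self, **desired):
        """Return the logical names whose attributes match every desired value."""
        return sorted(
            name
            for name, attributes in self._items.items()
            if all(attributes.get(key) == value for key, value in desired.items())
        )


catalog = MetadataCatalog()
catalog.add("run-001.dat", experiment="climate", year=2001, instrument="sensor-a")
catalog.add("run-002.dat", experiment="climate", year=2002, instrument="sensor-b")
catalog.add("run-003.dat", experiment="physics", year=2002, instrument="sensor-a")
matches = catalog.query(experiment="climate", year=2002)
print(matches)
```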


Patent
28 Oct 2002
TL;DR: In this paper, an apparatus and method for conducting exclusive and inclusive metadata searches to identify and select multimedia programs is described, which comprises a metadata search controller that compares user specified search words with metadata words to find programs that meet user specified criteria.
Abstract: There is disclosed an apparatus and method for conducting exclusive and inclusive metadata searches to identify and select multimedia programs. The apparatus of the invention comprises a metadata search controller that compares user specified search words with metadata words to find programs that meet user specified search criteria. The metadata search controller executes an inclusive metadata search to search for matches between a user specified search word and a metadata word that is related to the user specified search word in a word pair contained within a word pair database. The metadata search controller calculates a rank value for each program that is found by a metadata search and creates a ranked list of such programs.

176 citations
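
The inclusive search and ranking described above can be sketched as follows. The abstract does not say how the rank value is calculated, so the weights here (2 for a literal match, 1 for a match through a related word) are assumptions, as are the function names and sample data.

```python
def related_words(word, word_pairs):
    """Words paired with `word` in the word-pair database (pairs are unordered)."""
    related = set()
    for a, b in word_pairs:
        if a == word:
            related.add(b)
        elif b == word:
            related.add(a)
    return related


def rank_programs(search_words, programs, word_pairs, inclusive=True):
    """Return (title, rank) pairs sorted by descending rank; unmatched programs are dropped."""
    ranked = []
    for title, metadata_words in programs.items():
        rank = 0
        for word in search_words:
            if word in metadata_words:
                rank += 2  # literal match; the weight is an assumption
            elif inclusive and related_words(word, word_pairs) & metadata_words:
                rank += 1  # match through a related word; the weight is an assumption
        if rank > 0:
            ranked.append((title, rank))
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


word_pairs = [("soccer", "football"), ("film", "movie")]
programs = {
    "Match of the Week": {"football", "sports"},
    "Soccer Tonight": {"soccer", "sports"},
    "Late Movie": {"movie", "drama"},
}
inclusive_hits = rank_programs(["soccer"], programs, word_pairs)
literal_hits = rank_programs(["soccer"], programs, word_pairs, inclusive=False)
print(inclusive_hits)
print(literal_hits)
```

With `inclusive=False` only literal matches count, so the program tagged "football" drops out of the result.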


Patent
16 Jan 2002
TL;DR: In this article, a method for emergent and flexible workflow management is described in which users communicate through a message-based system with embedded information management resources; message metadata, including at least one of a deadline, a reminder, a deferral and an obligation, is updated, displayed and tracked by the system.
Abstract: A method for providing emergent and flexible workflow management to a user. The user communicates with other users, using a message-based system having embedded information management resources. The message-based system has a viewer. A message is generated at the message-based system. The message has metadata. The metadata of the message is updated using the embedded information management resources. The metadata include at least one of the group of a deadline, a reminder, a deferral and an obligation. The message is sent to the other users. Some of the metadata of the message are displayed on the viewer of the message-based system. Some of the metadata of the message are tracked at the message-based system.

164 citations


Patent
07 Mar 2002
TL;DR: In this article, the authors propose a system for managing and updating metadata associated with an asset, where an asset provider can associate the asset with metadata. Alternatively, the metadata can be stored by a metadata administrator (60, 80) and sent to distribution endpoints when the metadata administrator receives a request for the metadata from the distribution endpoints.
Abstract: In a system (100) for managing and updating metadata associated with an asset, an asset provider (5) can associate the asset with metadata. The asset provider (5) can send an asset bundle comprising the metadata and the asset to the distribution endpoints via a satellite (40). In the event the asset provider (5) modifies or updates the metadata associated with the asset, the asset provider (5) can identify which distribution endpoints are to receive the updated metadata. After identifying which distribution endpoints are to receive the updated metadata, the asset provider (5) can send the updated metadata without manual intervention to each of these distribution endpoints. Alternatively, the metadata may be stored by a metadata administrator (60, 80), and the metadata may be sent to distribution endpoints upon the metadata administrator (60,80) receiving a request for the metadata from the distribution endpoints.

154 citations


Patent
Adam Yeh1, Abhijit Kundu1
28 Jun 2002
TL;DR: In this paper, a system and method for a reporting information service using metadata to communicate with databases and a user interface is presented, which includes software in a data access component, a report component, and an interface component for populating, maintaining and dispatching reports responsive to user requests for the reports via metadata.
Abstract: A system and method for a reporting information service using metadata to communicate with databases and a user interface. The invention includes software in a data access component, a report component, and a user interface component for populating, maintaining, and dispatching reports responsive to user requests for the reports via metadata. The data access component provides a logical view of data in a database via data access metadata. The report component populates, maintains, and dispatches reports via report metadata characterizing the reports. The user interface component renders the report dispatched from the report component via user interface metadata specifying rendering attributes for the report.

Patent
26 Sep 2002
TL;DR: In this article, a system and method for managing voicemails using metadata is presented, which includes an audible introduction which a user records and associates the audible introduction to a voicemail.
Abstract: A system and method for managing voicemails using metadata is presented. The metadata includes an audible introduction which a user records and associates the audible introduction to a voicemail. The user is also able to associate other types of metadata to a voicemail, such as a description data flag, a reminder flag, and a retention flag. Once the user associates the metadata with a voicemail, the user is able to retrieve the audible introduction and the voicemail in a sequential manner or through search criteria. The user is able to customize metadata based upon the user's requirements.

Journal ArticleDOI
TL;DR: The MusicBrainz project is a large database of music metadata, and even though it's only in beta testing right now, it already contains over 300,000 tracks, providing what some have termed the "cornucopia of the commons."
Abstract: Music has always caught the public's imagination. From dreams of a giant "jukebox in the sky" over the Information Superhighway to the debate about Napster, music has always been the "killer app" used to describe new technologies. Of course, these dreams have never quite come about as planned. Instead of a smart machine seeking out music tuned to my tastes, I still have only a small number of choices on my radio dial. And ever since Napster started filtering, sharing music on the Internet has become increasingly difficult. One thing that underlies these ideas is their dependency on metadata, or data about data. Metadata provides information about artists, song titles, and so on. All that information is attached to the music, but isn't part of it. The music world suffers from a lack of standardization in terms of metadata formats, as well as a paucity of public metadata. The MusicBrainz project hopes to change this situation. It's a large database of music metadata, and even though it's only in beta testing right now, it already contains over 300,000 tracks. MusicBrainz information is all user-contributed, providing what some have termed the "cornucopia of the commons." Unlike many situations, where each user decreases the value of the shared space (the so-called "tragedy of the commons"), the easy duplication of electronic information creates a situation where each user makes the system more valuable.

Patent
25 Apr 2002
TL;DR: In this paper, the authors present a method and system for associating metadata with user data in a storage array in a manner that provides independence between metadata management and a storage controller's cache block size.
Abstract: The present invention is a method and system for associating metadata with user data in a storage array in a manner that provides independence between metadata management and a storage controller's cache block size. Metadata may be associated with user data according to multiple fashions in order to provide a desired performance benefit. In one example, the metadata may be associated according to a segment basis to maximize random I/O performance and may be associated according to a stripe basis to maximize sequential I/O performance.

Proceedings ArticleDOI
14 Jul 2002
TL;DR: A system for question answering using semi-structured metadata, QuASM (pronounced "chasm"), which aims to answer factual questions by exploiting the structure inherent in documents found on the World Wide Web.
Abstract: This paper describes a system for question answering using semi-structured metadata, QuASM (pronounced "chasm"). Question answering systems aim to improve search performance by providing users with specific answers, rather than having users scan retrieved documents for these answers. Our goal is to answer factual questions by exploiting the structure inherent in documents found on the World Wide Web (WWW). Based on this structure, documents are indexed into smaller units and associated with metadata. Transforming table cells into smaller units associated with metadata is an important part of this task. In addition, we report on work to improve question classification using language models. The domain used to develop this system is documents retrieved from a crawl of www.fedstats.gov.

Patent
12 Jun 2002
TL;DR: In this article, a metadata management system (MDS) may include partitioned migratable metadata, where metadata may be stored in multiple metadata partitions (102-0 to 102-11) and each metadata partition may be assigned to a particular system resource.
Abstract: According to one embodiment, a metadata management system (MDS) may include partitioned migratable metadata. Metadata may be stored in multiple metadata partitions (102-0 to 102-11). Each metadata partition may be assigned to a particular system resource (104-0 to 104-5). According to predetermined policies, such as metadata aging, metadata stored in one metadata partition may be migrated to a different metadata partition. A forwarding object can be placed in the old metadata partition to indicate the new location of the migrated metadata. Metadata partitions (102-0 to 102-11) may be reassigned to different resources, split and/or merged allowing a high degree of scalability, as well as flexibility in meeting storage system needs.
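
The forwarding-object mechanism mentioned in this abstract can be illustrated with a small sketch. The class and method names are hypothetical, and partitions are plain dictionaries rather than anything resembling the patented system.

```python
class Forward:
    """Forwarding object left behind when metadata migrates to another partition."""

    def __init__(self, partition_id):
        self.partition_id = partition_id


class MetadataStore:
    """Toy store with named partitions; a sketch, not the patented design."""

    def __init__(self, partition_ids):
        self.partitions = {pid: {} for pid in partition_ids}

    def put(self, partition_id, key, value):
        self.partitions[partition_id][key] = value

    def migrate(self, key, source, target):
        """Move an entry and leave a forwarding object in the old partition."""
        self.partitions[target][key] = self.partitions[source][key]
        self.partitions[source][key] = Forward(target)

    def get(self, partition_id, key):
        """Follow forwarding objects until the actual metadata is found."""
        entry = self.partitions[partition_id][key]
        while isinstance(entry, Forward):
            entry = self.partitions[entry.partition_id][key]
        return entry


store = MetadataStore(["p0", "p1", "p2"])
store.put("p0", "file-7", {"owner": "alice", "size": 1024})
store.migrate("file-7", "p0", "p1")  # e.g. triggered by an aging policy
store.migrate("file-7", "p1", "p2")
found = store.get("p0", "file-7")    # a stale reference still resolves
print(found)
```

Clients holding an old partition reference keep working after a migration, at the cost of one extra hop per forwarding object.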

Patent
Katashi Nagao1
07 May 2002
TL;DR: In this article, the authors present methods, apparatus and systems to embed pointer information for metadata in content using a method that will not delete the information, so that metadata correlated with content can be correctly obtained, even after the contents have been edited.
Abstract: The present invention provides methods, apparatus and systems to embed pointer information for metadata in content using a method that will not delete the information, so that metadata correlated with content can be correctly obtained, even after the contents have been edited. In an example embodiment, a user terminal for reproducing multimedia content comprises: a pointer information detector, for detecting pointer information that is embedded in the content and that points to the location of metadata in which information concerning the content is written; a network interface, for employing the pointer information to obtain the metadata via a network; and an index information generator, for employing the metadata to generate index information that is correlated with the data structure of the digital content.

Patent
12 Apr 2002
TL;DR: In this article, a method for executing searches for resources that span more than one private resource repository in a restricted-access resource sharing system is disclosed, where each of the peer nodes is allowed to indicate to the server that the metadata vocabularies associated with the resources are designated as private.
Abstract: A method for executing searches for resources that span more than one private resource repository in a restricted-access resource sharing system is disclosed. The system includes at least one server node (12) and multiple peer nodes (16a-16d) connected to a network. Resources, such as digital images, may be retrieved from the nodes by issuing queries containing terms matching the metadata associated with the resources. The method includes maintaining storage of resources and associated metadata on respective peer nodes, wherein the associated metadata is based on at least one metadata vocabulary. Each of the peer nodes is allowed to indicate to the server that the metadata vocabularies associated with the resources are designated as private, thereby becoming a restricted access peer node. If a first restricted access peer node specifies to the server which metadata vocabularies the first restricted access peer node supports, a first level of privacy is provided whereby search queries received by the server that use the specified metadata vocabularies are passed to the first respective restricted access peer nodes for processing, while searches that do not use the specified vocabularies are processed by the server. If the first restricted access peer node does not specify to the server which metadata vocabularies the first restricted access peer node supports, a second level of privacy is provided whereby search queries received by the server are passed to the first respective restricted access peer nodes for processing.

Patent
Daniel Fuchs1
23 Dec 2002
TL;DR: In this paper, a mapping engine is presented that receives descriptions of manageable software objects in a first language, generates management information in a second language, and generates a set of mapping metadata corresponding to the generated management information.
Abstract: A mapping engine, capable of receiving descriptions of manageable software objects in a first language, for generating management information in a second language. The mapping engine is further capable of generating a set of mapping metadata, corresponding to the management information as generated. The mapping engine may be further responsive to user input. In another embodiment, a metadata compiler is provided, capable of receiving management information in a second language, and corresponding mapping metadata, for generating compiled metadata, applicable when using said management information in a first language. The metadata compiler may be used in connection with the above first aspect.

Patent
25 Apr 2002
TL;DR: In this article, a data repository abstraction layer provides a logical view of the underlying data repository that is independent of the particular manner of data representation, and a query abstraction layer is also provided.
Abstract: The present invention generally is directed to a system, method and article of manufacture for accessing data independent of the particular manner in which the data is physically represented. In one embodiment, a data repository abstraction layer provides a logical view of the underlying data repository that is independent of the particular manner of data representation. In one embodiment, the data repository abstraction layer specifies a location of data in a repository and a method for accessing the data. A query abstraction layer is also provided and is based on the data repository abstraction layer. A runtime component performs translation of an abstract query into a form that can be used against a particular physical data representation.
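
As a rough illustration of translating an abstract query into a concrete one, the sketch below maps logical field names to a physical table and column and emits a parameterized SQL string. The mapping, the field names, and the single-table restriction are assumptions for the example; the abstract does not describe the actual translation in this detail.

```python
# Data repository abstraction: logical field -> (table, column).
# All names below are hypothetical.
ABSTRACTION = {
    "patient_name": ("patients", "full_name"),
    "patient_age": ("patients", "age_years"),
}


def translate(select_fields, where_field, abstraction=ABSTRACTION):
    """Translate an abstract query into a parameterized SQL string."""
    tables = {abstraction[field][0] for field in select_fields + [where_field]}
    if len(tables) != 1:
        raise ValueError("this sketch handles single-table queries only")
    table = tables.pop()
    columns = ", ".join(f"{abstraction[f][1]} AS {f}" for f in select_fields)
    return f"SELECT {columns} FROM {table} WHERE {abstraction[where_field][1]} = ?"


sql = translate(["patient_name"], "patient_age")
print(sql)
```

The caller only ever names logical fields, so the physical schema can change by editing the mapping alone.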

Patent
27 Feb 2002
TL;DR: In this article, the authors present a mechanism for associating metadata with network resources, and for locating and communicating with the network resources by providing a telephone number associated with a network resource.
Abstract: Mechanisms for associating metadata with network resources, and for locating and communicating with the network resources are disclosed. Owners of network resources define metadata that describes each network resource. The metadata includes a telephone number related to the network resource, its location, its language, its region or intended audience, and other descriptive information. The owners register the metadata in a registry. To locate a selected network resource, a client provides the telephone number to a resolver process. The resolver process provides to the client the network resource location corresponding to the telephone number. Accordingly, network resources can be located and communications with the resource can proceed merely by providing the telephone number associated with the network resource.
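
The register-then-resolve flow in this abstract can be sketched as a small registry keyed by telephone number. The class, the digit-only normalization step, and the sample number and URL are assumptions added for illustration.

```python
def normalize(number):
    """Keep digits only so formatting differences do not matter (an assumption)."""
    return "".join(ch for ch in number if ch.isdigit())


class Registry:
    """Hypothetical registry of resource metadata keyed by telephone number."""

    def __init__(self):
        self._by_number = {}

    def register(self, telephone_number, location, **descriptive):
        self._by_number[normalize(telephone_number)] = {"location": location, **descriptive}

    def resolve(self, telephone_number):
        """Resolver step: return the resource location for a number, or None."""
        record = self._by_number.get(normalize(telephone_number))
        return record["location"] if record else None


registry = Registry()
registry.register("+1-202-555-0100", "https://example.com/", language="en", region="US")
location = registry.resolve("1 202 555 0100")
print(location)
```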

Patent
Suresh P. Babu1
27 Jun 2002
TL;DR: In this paper, a television server generates a metadata map that represents relationships among media content description data based on viewing patterns of multiple viewers and sends the metadata map to a client device.
Abstract: A television server generates a metadata map that represents relationships among media content description data based on viewing patterns of multiple viewers. The television server sends the metadata map to a client device. The client device targets advertisements to a viewer based on the metadata map and a recent viewing history of the viewer.

Patent
20 Aug 2002
TL;DR: In this paper, the overhead associated with operations such as collecting, refining, retrieving and maintaining metadata can be offloaded from the optimizer instances, often accelerating individual cost estimation calculations by optimizer instances and facilitating reuse of metadata calculations and refinements.
Abstract: A metadata manager is used in a database management system to collect and maintain metadata associated with a database. Multiple optimizer instances are permitted to access the metadata maintained by the metadata manager, often eliminating the need for individual optimizer instances to retrieve and process metadata directly from the database. As such, the overhead associated with operations such as collecting, refining, retrieving and/or maintaining of metadata can be off-loaded from the optimizer instances, often accelerating individual cost estimation calculations by optimizer instances, facilitating reuse of metadata calculations and refinements, and improving metadata consistency between multiple related cost estimates.
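
One way to picture the shared metadata manager is as a cache sitting between the optimizer instances and the database. The sketch below is a simplification under that assumption: the class names, the `row_count` statistic, and the linear cost formula are all hypothetical.

```python
class MetadataManager:
    """Collects table statistics once and shares them across optimizer instances."""

    def __init__(self, collect):
        self._collect = collect  # callable that reads statistics from the database
        self._cache = {}
        self.collections = 0     # how often the database was actually consulted

    def statistics(self, table):
        if table not in self._cache:
            self._cache[table] = self._collect(table)
            self.collections += 1
        return self._cache[table]


class Optimizer:
    """Hypothetical optimizer instance that asks the manager instead of the database."""

    def __init__(self, manager):
        self._manager = manager

    def estimate_scan_cost(self, table, cost_per_row=1.0):
        return self._manager.statistics(table)["row_count"] * cost_per_row


def collect_from_database(table):
    # Stand-in for an expensive catalog query; the numbers are made up.
    return {"orders": {"row_count": 5000}, "customers": {"row_count": 200}}[table]


manager = MetadataManager(collect_from_database)
first, second = Optimizer(manager), Optimizer(manager)
costs = (first.estimate_scan_cost("orders"), second.estimate_scan_cost("orders", 0.5))
print(costs, manager.collections)
```

Both estimates read the same cached statistics, which is also what keeps related cost estimates consistent with each other.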

Patent
28 Feb 2002
TL;DR: In this article, an automated metadata discovery, assignment, and submission system is described, which includes a photosharing service coupled to a network through a server, where the server stores metadata fields.
Abstract: An automated metadata discovery, assignment, and submission system is disclosed. The system includes a photosharing service coupled to a network through a server, where the server stores metadata fields. The system also includes at least one client computer capable of communicating with the server over the network, where the client computer stores a plurality of digital files and an automation application. When executed, the automation application establishes communication with the photosharing service and downloads the metadata fields. The content of a first file is then automatically analyzed and one or more metadata values are assigned to the downloaded metadata fields based on the analysis. In addition, the automation application automatically discovers any pre-existing metadata values associated with the file and uses the metadata values to populate corresponding downloaded metadata fields. Both the pre-existing and automatically assigned metadata values are then displayed to the user for viewing and editing. The metadata values assigned to the file are recoded for use with a next image, and the file and the metadata values are uploaded to the photosharing service for storage.

Patent
07 Nov 2002
TL;DR: In this article, a method and apparatus for obtaining metadata from multiple information sources in real-time is described, which includes receiving a user request pertaining to one or more of source metadata objects residing in multiple source metadata repositories.
Abstract: A method and apparatus for obtaining metadata from multiple information sources in real time are described. According to one aspect, the method includes receiving a user request pertaining to one or more of source metadata objects residing in multiple source metadata repositories. Each source metadata repository is maintained by a specific data management application. The method further includes responding to the user request in real time by identifying a data management application that corresponds to the source metadata objects associated with the user request and retrieving the source metadata objects using an application program interface (API) with the corresponding data management application.

Patent
Robert J. Curran1, Roger L. Haskin1
23 May 2002
TL;DR: In this paper, a storage gateway is employed as part of a security enhancing protocol in a data processing system which includes at least one metadata controller node and one application node which is granted a time limited access to files in a shared storage system.
Abstract: A storage gateway is employed as part of a security enhancing protocol in a data processing system which includes at least one metadata controller node and at least one application node which is granted a time limited access to files in a shared storage system. The gateway is provided with information as to data blocks to which access is to be allowed and also with information concerning the duration of special access granted to a requesting application node. This insures that metadata cannot be improperly used, changed or corrupted by users operating on an application node.

Patent
24 Oct 2002
TL;DR: In this paper, a method of updating a database (211) comprising a fingerprint of, and an associated set of metadata for, each of a number of multimedia objects is described, in which a multimedia object and a set of metadata for the multimedia object are downloaded from a file sharing client (101-105).
Abstract: A method of updating a database (211) comprising a fingerprint of and an associated set of metadata for each of a number of multimedia objects. A multimedia object and a set of metadata for the multimedia object are downloaded from a file sharing client (101-105). A fingerprint is computed from the multimedia object, and the computed fingerprint and the obtained set of metadata are included in the database (211). The database (211) can be maintained by a central server (210), or be maintained in a distributed fashion by servers (404) running on the file sharing clients (101-105). The database (211) in this way accumulates plural sets of metadata associated with one particular fingerprint. When a sufficient number of sets has been collected, a definite set can be determined using filtering techniques.
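
The accumulate-then-filter idea can be sketched as below. Two substitutions are made and should be read as assumptions: a SHA-256 digest stands in for a real content fingerprint (which would survive re-encoding, unlike a cryptographic hash), and a per-field majority vote stands in for the unspecified "filtering techniques". The class name and the `min_sets` threshold are also invented.

```python
import hashlib
from collections import Counter, defaultdict


def fingerprint(data: bytes) -> str:
    """Stand-in fingerprint: a SHA-256 digest, not a perceptual fingerprint."""
    return hashlib.sha256(data).hexdigest()


class FingerprintDatabase:
    """Accumulates metadata sets per fingerprint; a sketch, not the patented design."""

    def __init__(self, min_sets=3):
        self._sets = defaultdict(list)  # fingerprint -> list of metadata dicts
        self._min_sets = min_sets       # "sufficient number" is an assumed threshold

    def add(self, data, metadata):
        self._sets[fingerprint(data)].append(metadata)

    def definite_metadata(self, data):
        """Majority vote per field once enough sets have been collected, else None."""
        collected = self._sets.get(fingerprint(data), [])
        if len(collected) < self._min_sets:
            return None
        fields = {key for metadata in collected for key in metadata}
        result = {}
        for key in sorted(fields):
            values = Counter(m[key] for m in collected if key in m)
            result[key] = values.most_common(1)[0][0]
        return result


db = FingerprintDatabase(min_sets=3)
track = b"fake audio bytes"
db.add(track, {"title": "Song A", "artist": "X"})
db.add(track, {"title": "Song A", "artist": "Y"})
pending = db.definite_metadata(track)  # only two sets so far
db.add(track, {"title": "song a (live)", "artist": "X"})
definite = db.definite_metadata(track)
print(pending, definite)
```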

Proceedings ArticleDOI
14 Jul 2002
TL;DR: The first phase of the interoperability infrastructure, including the metadata repository, search and discovery services, rights management services, and user interface portal facilities, is described.
Abstract: We describe the core components of the architecture for the National Science Digital Library (NSDL). Over time the NSDL will include heterogeneous users, content, and services. To accommodate this, a design for a technical and organization infrastructure has been formulated based on the notion of a spectrum of interoperability. This paper describes the first phase of the interoperability infrastructure including the metadata repository, search and discovery services, rights management services, and user interface portal facilities.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: The experimental results using soccer video suggest that extracting high level semantic information from existing metadata can be used effectively (80% precision and 85% recall using cross validation) in generating personalized video digests.
Abstract: We present a new framework for generating personalized video digests from detailed event metadata. In the new approach high level semantic features (e.g., number of offensive events) are extracted from an existing metadata signal using time windows (e.g., features within 16 sec. intervals). Personalized video digests are generated using a supervised learning algorithm which takes as input examples of important/unimportant events. Window-based features are extracted from the metadata and used to train the system and build a classifier that, given metadata for a new video, classifies segments into important and unimportant, according to a specific user, to generate personalized video digests. Our experimental results using soccer video suggest that extracting high level semantic information from existing metadata can be used effectively (80% precision and 85% recall using cross validation) in generating personalized video digests.
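
The window-based feature extraction step might be sketched as follows, using the 16-second interval mentioned in the abstract. The event types, the tuple layout of the metadata, and the decision to count events per type are assumptions; the classifier trained on these features is not shown.

```python
from collections import Counter

WINDOW_SECONDS = 16  # interval length mentioned in the abstract


def window_features(events, duration, window=WINDOW_SECONDS):
    """Count events of each type per fixed-length window.

    `events` is a list of (timestamp_in_seconds, event_type) pairs and
    `duration` is the video length in whole seconds.
    """
    n_windows = -(-duration // window)  # ceiling division
    features = [Counter() for _ in range(n_windows)]
    for timestamp, event_type in events:
        index = int(timestamp // window)
        if 0 <= index < n_windows:
            features[index][event_type] += 1
    return features


# Hypothetical event metadata for a 40-second soccer clip.
events = [(3, "pass"), (10, "shot"), (14, "shot"), (20, "foul"), (35, "shot")]
features = window_features(events, duration=40)
print(features)
```

Each window's counts would then serve as one training example, labeled important or unimportant by the user.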

Book ChapterDOI
30 Oct 2002
TL;DR: A number of candidate interpretations of annotation are identified, and the impact these interpretations may have on Semantic Web applications is discussed.
Abstract: Semantic metadata will play a significant role in the provision of the Semantic Web. Agents will need metadata that describes the content of resources in order to perform operations, such as retrieval, over those resources. In addition, if rich semantic metadata is supplied, those agents can then employ reasoning over the metadata, enhancing their processing power. Key to this approach is the provision of annotations, both through automatic and human means. The semantics of these annotations, however, in terms of the mechanisms through which they are interpreted and presented to the user, are sometimes unclear. In this paper, we identify a number of candidate interpretations of annotation, and discuss the impact these interpretations may have on Semantic Web applications.