
Showing papers on "Metadata repository" published in 1999


Patent
22 Jan 1999
TL;DR: An authoring system for interactive video has two or more authoring stations for providing authored metadata to be related to a main video data stream and a multiplexer for relating authored metadata from the authoring sources to the main video data stream.
Abstract: An authoring system for interactive video has two or more authoring stations for providing authored metadata to be related to a main video data stream and a multiplexer for relating authored metadata from the authoring sources to the main video data stream. The authoring stations annotate created metadata with presentation time stamps (PTS) from the main video stream, and the multiplexer relates the metadata to the main video stream by the PTS signatures. In analog streams PTS may be created and integrated. In some embodiments there may be multiple and cascaded systems, and some sources may be stored sources. Various methods are disclosed for monitoring and compensating time differences among sources to ensure time coordination in end product. In different embodiments transport of metadata to an end user station is provided by Internet streaming, VBI insertion or by Internet downloading. User equipment is enhanced with hardware and software to coordinate and present authored material with the main data stream.
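The PTS-based coordination this abstract describes could be sketched, very loosely, as a merge of authored records onto video frames keyed by time stamp. All names below are invented for illustration; the actual system multiplexes into a broadcast or streamed transport.

```python
# Sketch: relate authored metadata to a main video stream by PTS.
# Hypothetical data shapes; a real multiplexer works on encoded streams.

def multiplex(video_frames, authored_metadata):
    """Pair each authored-metadata record with the video frame whose
    presentation time stamp (PTS) it was annotated with."""
    by_pts = {}
    for item in authored_metadata:
        by_pts.setdefault(item["pts"], []).append(item)
    # Emit frames in presentation order, attaching matching metadata.
    return [
        {"pts": frame["pts"], "frame": frame["data"],
         "metadata": by_pts.get(frame["pts"], [])}
        for frame in sorted(video_frames, key=lambda f: f["pts"])
    ]

frames = [{"pts": 200, "data": "B"}, {"pts": 100, "data": "A"}]
meta = [{"pts": 100, "note": "hotspot"}, {"pts": 100, "note": "link"}]
muxed = multiplex(frames, meta)
```

Because the join key is the PTS alone, metadata authored at any of the stations lines up with the stream regardless of when or where it was created, which is the point of stamping at authoring time.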

404 citations


Patent
01 Oct 1999
TL;DR: In this paper, the authors propose an extensible framework for the automatic extraction and transformation of metadata into logical annotations, where metadata embedded within a media file is extracted by a type-specific parsing module which is loaded and executed based on the mimetype of the media file being described.
Abstract: An extensible framework for the automatic extraction and transformation of metadata into logical annotations. Metadata embedded within a media file is extracted by a type-specific parsing module which is loaded and executed based on the mimetype of the media file being described. A content processor extracts information, typically in the form of time-based samples, from the media content. An auxiliary processing step is performed to collect additional metadata describing the media file from sources external to the file. All of the metadata thus collected is combined into a set of logical annotations, which may be supplemented by summary data generated from the metadata already collected. The annotations are then formatted into a standardized form, preferably XML, which is then mapped into a database schema. The database object also stores the source XML data as well as the original media file in addition to the annotation metadata. The system provides unified metadata repositories, which can then be used for indexing and searching.
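The mimetype-driven dispatch could look roughly like the following sketch. The registry, the stub parser, and the auxiliary step are hypothetical stand-ins for the patented modules, not their actual interfaces.

```python
# Sketch of mimetype-keyed parser dispatch (illustrative only).

PARSERS = {}

def register(mimetype):
    """Decorator that registers a parsing module for one mimetype."""
    def deco(fn):
        PARSERS[mimetype] = fn
        return fn
    return deco

@register("audio/mpeg")
def parse_mp3(path):
    # A real parser would read embedded tags; this returns a stub.
    return {"source": path, "type": "audio", "samples": []}

def extract(path, mimetype):
    parser = PARSERS.get(mimetype)
    if parser is None:
        raise ValueError(f"no parser registered for {mimetype}")
    annotations = parser(path)
    # Auxiliary step: fold in metadata from outside the file itself.
    annotations["external"] = {"catalogued": True}
    return annotations
```

New media types are then supported by registering one more parser, which is the extensibility the framework is after.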

343 citations


Patent
29 Jul 1999
TL;DR: In this article, the authors present a real-time content-based analysis function in the capture device to extract metadata from the digital signals, which is then combined with the digital content in a container format such as MPEG-7, QuickTime, or FlashPix.
Abstract: One aspect of the invention is directed to a system and method for a digital capture system, such as a digital encoder, having an embedded real-time content-based analysis function in the capture device to extract metadata from the digital signals. In one embodiment, metadata (descriptive information about the digital content) is formatted and stored separately from the content. In another embodiment, the metadata may be formatted and combined with the digital content in a container format such as MPEG-7, QuickTime, or FlashPix. Digital encoding mechanisms, both pure hardware and hardware/software combinations, convert an analog video signal into a digitally encoded video bitstream. The system extracts metadata in real-time during the encoding process.

300 citations


Journal Article
01 Jan 1999-Online
TL;DR: This article discusses the variety of emerging and often conflicting projects for standardizing electronic resources and how an organization actually applies a metadata scheme to its own corporate intranet.
Abstract: Editor's Note: Be sure to take a look at this article's companion piece, also by Jessica and Susan, entitled "Metadata Projects and Standards," for an overview of the variety of emerging and often conflicting projects for standardizing electronic resources. Also see the extensive list of metadata project and resource links referenced in the text. For a look into how an organization actually applies a metadata scheme to its own corporate intranet, read Kelly Doran's piece, "Metadata for a Corporate Intranet," in this issue.

163 citations


Patent
27 Jan 1999
TL;DR: In this article, a document management system is adapted to operate with a process management system for recording and viewing metadata of a document, which is used to record resources referenced while the content of the document is developed.
Abstract: A document management system is adapted to operate with a process management system for recording and viewing metadata of a document. The document metadata is used to record resources referenced while the content of the document is developed. The resources are recorded in the document metadata to identify relationships between the resources and the content of the document. The document content includes links that reference the document metadata stored on a remote server. The resources recorded in the document metadata and the content of the document are simultaneously displayed on a user interface of the document management system to provide context for understanding document history. By storing document metadata and document content separately, the metadata of the document remains consistent even when multiple copies of the document are distributed over a network.

139 citations


Patent
22 Jan 1999
TL;DR: A data management system and method for creating, viewing, and interacting with object metadata directly from a computer system's operating system is presented in this article, where an object profile is defined by selecting metadata fields, such that at least one metadata value can be supplied to correspond with each metadata field.
Abstract: A data management system and method is provided for creating, viewing, and interacting with object metadata directly from a computer system's operating system. In a preferred embodiment, an object profile is defined by selecting metadata fields, such that at least one metadata value can be supplied to correspond with each metadata field. The metadata corresponding to each object can then be viewed by interfacing directly with the computer's operating system. The viewing of metadata can further be customized by selecting at least one metadata field within the operating system, such that metadata corresponding to an object and the at least one metadata field is displayed. The metadata fields and the metadata can further be used to search for objects stored on the computer system, either locally or remotely. The resulting objects can be retrieved and scaled to be displayed on the computer system's output device, regardless of whether the objects comprise single or multiple files.
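The profile-then-search flow could be sketched minimally as below. The function names and data shapes are invented; the patent operates at the operating-system level, not in application code.

```python
# Illustrative sketch: object profiles, attached metadata, field search.

def make_profile(fields):
    """An object profile is just the set of required metadata fields."""
    return {"fields": list(fields)}

def attach_metadata(profile, obj, values):
    missing = [f for f in profile["fields"] if f not in values]
    if missing:
        raise ValueError(f"missing metadata for fields: {missing}")
    return {"object": obj, "metadata": dict(values)}

def search(objects, field, value):
    """Find objects whose metadata matches on the given field."""
    return [o for o in objects if o["metadata"].get(field) == value]

profile = make_profile(["author", "project"])
objs = [
    attach_metadata(profile, "spec.doc", {"author": "kim", "project": "x"}),
    attach_metadata(profile, "plan.doc", {"author": "lee", "project": "x"}),
]
```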

124 citations


Patent
Sridhar Srinivasa Iyengar
30 Jun 1999
TL;DR: In this paper, a method for effecting data interchange among software tools and repositories in a distributed heterogeneous environment in a computer system having at least one repository of a first type and at least another software modeling tool of a second type is disclosed.
Abstract: A method is disclosed for effecting data interchange among software tools and repositories in a distributed heterogeneous environment in a computer system having at least one repository of a first type and at least one software modeling tool of a second type. The method includes the steps of registering and storing metadata describing a meta model in the repository. Next, a set of rules and streams of data are generated based on the rules, and then documents conforming to each of the metamodels are generated by reading the set of rules. An importer is written for use in importing into the repository the streams of data; and, an exporter is written for use in exporting from the repository the streams of data.

105 citations


Patent
01 Jun 1999
TL;DR: In this paper, the format agent fulfills the portions of the client's request regarding metadata attributes included in the associated format of the file system, and the requested metadata attribute data is then returned to the client.
Abstract: In a computer, a system and a method handle requests from a client for accessing metadata attributes from at least one file system having an associated format containing specific metadata attributes. A format agent manages the file system. A client's request is received at an interface and forwarded to a dispatcher. The dispatcher routes the request to the format agent. The format agent fulfills the portions of the client's request regarding metadata attributes included in the associated format of the file system. If the client's request contains a metadata attribute that is not part of the file system's associated format, the format agent accesses a metadata attribute store to retrieve the metadata attribute data needed to fulfill the request. The requested metadata attribute data is then returned to the client. Multiple instances of the metadata attribute data are accessible by the client, the instances selected and/or assigned by the client and/or the system.
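The dispatch-and-fallback flow in that abstract can be condensed into a few lines. The class and dictionary names below are hypothetical; the patent's interface, dispatcher, and agents are separate components.

```python
# Sketch: a format agent answers from the format's native attributes,
# falling back to a separate metadata attribute store.

class FormatAgent:
    def __init__(self, native_attrs, attribute_store):
        self.native = native_attrs    # attrs the file-system format holds
        self.store = attribute_store  # side store for everything else

    def get(self, attr):
        if attr in self.native:
            return self.native[attr]
        # Attribute not part of the format: consult the attribute store.
        return self.store.get(attr)

def dispatch(agents, filesystem, attr):
    """The dispatcher routes a client request to the right agent."""
    return agents[filesystem].get(attr)

agents = {
    "fat": FormatAgent({"size": 1024}, {"author": "kim"}),
}
```

The point of the fallback is that clients see one uniform attribute space even when the underlying format stores only a subset of it natively.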

95 citations


Patent
19 Nov 1999
TL;DR: In this paper, the authors present a system and method for organizing, storing, retrieving and searching through binary representations of information in many forms and formats, where data is stored in its original file format, while maintaining metadata about the data items in a relational database.
Abstract: The present invention introduces a system and method for organizing, storing, retrieving and searching through binary representations of information in many forms and formats. Data is stored in its original file format, while maintaining metadata about the data items in a relational database. During searches the system utilizes the metadata to invoke data translators of the appropriate type to present data to the search engine itself. In addition, the system utilizes profiles and access control lists to restrict access to data to authorized users.
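A minimal sketch of that design, store files as-is, keep type metadata relationally, and pick a translator per type at search time, might look like this. The catalog rows, translators, and access-control form are all invented for illustration.

```python
# Sketch: metadata-driven translator selection plus an access list.

TRANSLATORS = {
    "text/plain": lambda blob: blob.decode("utf-8"),
    "application/x-upper": lambda blob: blob.decode("utf-8").lower(),
}

CATALOG = [
    {"id": 1, "mimetype": "text/plain", "blob": b"metadata repository"},
    {"id": 2, "mimetype": "application/x-upper", "blob": b"METADATA STORE"},
]

def search(term, acl=None):
    """Translate each item via its type's translator, then match."""
    hits = []
    for row in CATALOG:
        if acl is not None and row["id"] not in acl:
            continue  # access control: skip items the user may not see
        text = TRANSLATORS[row["mimetype"]](row["blob"])
        if term in text:
            hits.append(row["id"])
    return hits
```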

84 citations


Journal ArticleDOI
TL;DR: The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes, and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.
Abstract: Within a digital library, collections may range from an ad hoc set of objects that serve a temporary purpose to established library collections intended to persist through time. The objects in these collections vary widely, from library and data center holdings to pointers to real-world objects, such as geographic places, and the various metadata schemas that describe them. The key to integrated use of such a variety of collections in a digital library is collection metadata that represents the inherent and contextual characteristics of a collection. The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes: in XML form, the collection metadata "registers" the collection with the user interface client; in HTML form, it is used for user documentation; eventually, it will be used to describe the collection to network search agents; and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.

76 citations


01 Jan 1999
TL;DR: This model uses a uniform representation approach based on the Unified Modeling Language (UML) to integrate technical and semantic metadata and their interdependencies.
Abstract: Due to the increasing complexity of data warehouses, a centralized and declarative management of metadata is essential for data warehouse administration, maintenance and usage. Metadata are usually divided into technical and semantic metadata. Typically, current approaches only support subsets of these metadata types, such as data movement metadata or multidimensional metadata for OLAP. In particular, the interdependencies between technical and semantic metadata have not yet been investigated sufficiently. The representation of these interdependencies forms an important prerequisite for the translation of queries formulated at the business concept level to executable queries on physical data. Therefore, we suggest a uniform and integrative model for data warehouse metadata. This model uses a uniform representation approach based on the Unified Modeling Language (UML) to integrate technical and semantic metadata and their interdependencies.

Journal ArticleDOI
17 May 1999
TL;DR: This paper describes how an automatic classifier, that classifies HTML documents according to Dewey Decimal Classification, can be used to extract context sensitive metadata which is then represented using RDF.
Abstract: Automatic metadata generation may provide a solution to the problem of inconsistent, unreliable metadata describing resources on the Web. The Resource Description Framework (RDF) provides a domain-neutral foundation on which extensible element sets can be defined and expressed in a standard notation. This paper describes how an automatic classifier, that classifies HTML documents according to Dewey Decimal Classification, can be used to extract context sensitive metadata which is then represented using RDF. The process of automatic classification is described and an appropriate metadata element set is identified comprising those elements that can be extracted during classification. An RDF data model and an RDF schema are defined representing the element set and the classifier is configured to output the elements in RDF syntax according to the defined schema.
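Serializing classifier output as RDF/XML could look roughly like the sketch below. The property names and the `ex:` schema URI are invented placeholders, not the element set or schema the paper defines; only the `rdf:` namespace is the standard one.

```python
# Loose sketch: wrap a Dewey class plus extracted elements in RDF/XML.

def to_rdf(url, dewey_class, elements):
    props = "\n".join(
        f"    <ex:{name}>{value}</ex:{name}>" for name, value in elements
    )
    return (
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
        '         xmlns:ex="http://example.org/schema#">\n'
        f'  <rdf:Description rdf:about="{url}">\n'
        f'    <ex:classification>{dewey_class}</ex:classification>\n'
        f"{props}\n"
        "  </rdf:Description>\n"
        "</rdf:RDF>"
    )

doc = to_rdf("http://example.org/page.html", "025.04",
             [("title", "Metadata on the Web")])
```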

01 Jan 1999
TL;DR: A data warehouse process model is proposed to capture the dynamics of a data warehouse and the evolution operators are linked to the corresponding architecture components and quality factors they affect.
Abstract: Data warehouses are complex systems consisting of many components which store highly aggregated data for decision support. Due to the role of the data warehouses in the daily business work of an enterprise, the requirements for the design and the implementation are dynamic and subjective. Therefore, data warehouse design is a continuous process which has to reflect the changing environment of a data warehouse, i.e. the data warehouse must evolve in reaction to the enterprise’s evolution. Based on existing meta models for the architecture and quality of a data warehouse, we propose in this paper a data warehouse process model to capture the dynamics of a data warehouse. The evolution of a data warehouse is represented as a special process and the evolution operators are linked to the corresponding architecture components and quality factors they affect. We show the application of our model on schema evolution in data warehouses and its consequences on data warehouse views. The models have been implemented in the metadata repository ConceptBase which can be used to analyze the result of evolution operations and to monitor the quality of a data warehouse.

Journal ArticleDOI
01 Oct 1999
TL;DR: This work introduces metadata implantation and stepwise evolution techniques to interrelate database elements in different databases, and to resolve conflicts on the structure and semantics of database elements (classes, attributes, and individual instances).
Abstract: A key aspect of interoperation among data-intensive systems involves the mediation of metadata and ontologies across database boundaries. One way to achieve such mediation between a local database and a remote database is to fold remote metadata into the local metadata, thereby creating a common platform through which information sharing and exchange becomes possible. Schema implantation and semantic evolution, our approach to the metadata folding problem, is a partial database integration scheme in which remote and local (meta)data are integrated in a stepwise manner over time. We introduce metadata implantation and stepwise evolution techniques to interrelate database elements in different databases, and to resolve conflicts on the structure and semantics of database elements (classes, attributes, and individual instances). We employ a semantically rich canonical data model, and an incremental integration and semantic heterogeneity resolution scheme. In our approach, relationships between local and remote information units are determined whenever enough knowledge about their semantics is acquired. The metadata folding problem is solved by implanting remote database elements into the local database, a process that imports remote database elements into the local database environment, hypothesizes the relevance of local and remote classes, and customizes the organization of remote metadata. We have implemented a prototype system and demonstrated its use in an experimental neuroscience environment.

Book
05 Sep 1999
TL;DR: This book covers data warehousing project management, managing large data volumes (including system-controlled storage), and the design and construction of the data warehouse, including two-tier and three-tier architectures.
Abstract: I. FUNDAMENTAL COMMITMENTS. 1. Basic Data Warehousing Distinctions. An Architecture, Not A Product. The One Fundamental Question. The One Question-The Thousand and One Answers.... The First Distinction: Transaction and Decision Support System. Data Warehouse Sources of Data. Dimensions. The Data Warehouse Fact. The Data Warehouse Model of the Business: Alignment. The Data Cube. Aggregation. Data Warehouse Professional Roles. The Data Warehouse Process Model. Summary. 2. A Short History of Data. In the Beginning.... Fast Forward to Modern Times. The Very Idea of Decision Support. From Mainframes To PCs. The Promise of the Relational Database. Data Every Which Way. From Client-Server to Thin Client Computing. Why Will Things Be Different This Time? The More Things Change, the More They Stay the Same. Model of Technology Dynamics. Summary. 3. Justifying Data Warehousing. Competition for Limited Resources. An Integrated Business and Technology Solution. Economic Value, Not Business Benefits. Selling the Data Warehouse. The Reporting Data Warehouse: Running Fewer Errands. The Supply Chain Warehouse. The Cross-Selling Warehouse. The Total Quality Management Data Warehouse. The Profitability Warehouse. Data Warehousing Case Vignettes in the Press. Summary. 4. Data Warehousing Project Management. Simulating a Rational Design Process. Managing Project Requirements. Managing the Development of Architecture. Managing Project Schedule. Managing Project Quality. Managing Project Risks. Managing Project Documentation. Managing the Project Development Team. Managing Project Management. Summary. II. DESIGN AND CONSTRUCTION. 5. Business Design: The Unified Representations of The Customer and Product. The Critical Path: Alignment. A Unified Representation of the Customer. Data Scrubbing. The Cross-Functional Team. Hierarchical Structure. Customer Demographics. A Unified Representation of the Product. Data Marts: Between Prototype and Retrotype. Summary. 6. 
Total Data Warehouse Quality. The Information Product. Data Quality as Data Integrity. Intrinsic Qualities. Ambiguity. Timeliness and Consistency in Time. Security. Secondary Qualities. Credibility. Quality Data, Quality Reports. Information Quality, System Quality. Performance. Availability. Scalability. Functionality. Maintainability. Reinterpreting the Past. Summary. 7. Data Warehousing Technical Design. Use case Scenarios. Abstract Data Types and Concrete Data Dimensions. Data Normalization: Relevance and Limitations. Dimensions and Facts. Primary and Foreign Keys. Design for Performance: Technical Interlude. Summary. 8. Data Warehouse Construction Technologies: SQL. The Relational Database: A Dominant Design. Twelve Principles. Thinking in Sets: Declarative and Procedural Approaches. Data Definition Language. Indexing: B-Tree. Indexing: Hashing. Indexing: Bitmap. Indexing Rules of Thumb. Data Manipulation Language. Data Control Language. Stored Procedures. User-Defined Functions. Summary. 9. Data Warehouse Construction Technologies: Transaction Management. The Case For Transaction Management: The ACID Test. The Logical Unit of Work. Two-tier and Three-tier Architectures. Distributed Architecture. Middleware: Remote Procedure Call Model. Middleware: Message-Oriented Middleware. The Long Transaction. Summary. III. OPERATIONS AND TRANSFORMATIONS. 10. Data Warehouse Operation Technologies: Data Management. Database Administration. Backing Up the Data (in the Ever-Narrowing Backup Window). Recovering the Database: Crash Recovery. Recovering the Database: Version (Point-in-Time) Recovery. Recovering the Database: Roll-Forward Recovery. Managing Lots of Data: Acres of Disk. Managing Lots of Data: System-Controlled Storage. Managing Lots of Data: Automated Tape Robots. RAID Configurations. Summary. 11. Data Warehousing Performance. Performance Parameters. Denormalization for Performance. Aggregation For Performance. Buffering For Performance. 
Partitioning For Performance. Parallel Processing: Shared Memory. Parallel Processing: Shared Disk. Parallel Processing: Shared Nothing. Data Placement: Colocated Join. Summary. 12. Data Warehousing Operations: The Information Supply Chain. A Process, Not an Application. The Great Chain of Data. Partitioning: Divide and Conquer. Determining Temporal Granularity. Aggregate Up To the Data Warehouse. Aggregates in the Data Warehouse. The Debate about the Data Warehouse Data Model. The Presentation Layer. Integrated Decision Support Processes. Summary. 13. Metadata and Metaphor. Metaphors Alter Our Perceptions. A New Technology, a New Metaphor. Metadata are Metaphorical. Semantics. Forms of Data Normalization and Denormalization. Metadata Architecture. Metadata Repository. Models and Metamodels. Metadata Interchange Specification (MDIS). Metadata: A Computing Grand Challenge. Summary. 14. Aggregation. On-line Aggregation, Real-Time Aggravation. The Manager's Rule of Thumb. A Management Challenge. Aggregate Navigation. Information Density. Canonical Aggregates. Summary. IV. APPLICATIONS AND SPECULATIONS. 15. OLAP Technologies. OLAP Architecture. Cubes, Hypercubes, and Multicubes. OLAP Features. The Strengths of OLAP. Limitations. Summary. 16. Data Warehousing and the Web. The Business Case. The Web as a Delivery System. Key Internet Technologies. Web Harvesting: The Web as the Ultimate Data Store. The Business Intelligence Portal. Summary. 17. Data Mining. Data Mining and Data Warehousing. Data Mining Enabling Technologies. Data Mining Methods. Data Mining: Management Perspective. Summary. 18. Breakdowns: What Can Go Wrong. The Short List. The Leaning Cube of Data. The Data Warehouse Garage Sale. Will the Future be Like the Past? Model Becomes Obsolete. Missing Variables. Obsessive Washing. Combinatorial Explosion. Technology and Business Misalignment. Becoming a Commodity. Summary. 19. Future Prospects. Enterprise Server Skills to be in High Demand. 
The Cross-Fictional, Oops, -Functional Team. Governance. The Operational Data Warehouse. Request for Update. The Web Opportunity: Agent Technology. The Future of Data Warehousing. Summary. Glossary. References. Index.

Book
30 Sep 1999
TL;DR: This book introduces the concept of metadata, common factors affecting data quality, data flexibility and responsiveness to business change, active information management, metadata entity types, and the enterprise metamodel.
Abstract: The concept of metadata; common factors affecting data quality; data flexibility and responsiveness to business change; active information management; metadata entity types; introduction to the enterprise metamodel; the challenges of information management; recognizing the fallacy of software engineering; distinguishing between data and information; Occam's dilemma - recognizing necessary complexity; establishing a common basic metamodel; metadata in business; managing the metadata; the role of metadata in application development and support; metadata in data warehousing and business intelligence; the role of metadata on the Internet; the basics of metamodelling; design and management of metadatabases; interaction between metamodels.

Proceedings Article
01 Jan 1999
TL;DR: The key components of the SmartPush architecture have been implemented, and the focus in the project is shifting towards a pilot implementation and testing the ideas in practice.
Abstract: In the SmartPush project professional editors add semantic metadata to information flow when the content is created. This metadata is used to filter the information flow to provide the end users with a personalized news service. Personalization and delivery process is modeled as software agents, to whom the user delegates the task of sifting through incoming information. The key components of the SmartPush architecture have been implemented, and the focus in the project is shifting towards a pilot implementation and testing the ideas in practice.

Journal ArticleDOI
TL;DR: The query preview interfaces make visible the problems or gaps in the metadata that are undetectable with classic form fill-in interfaces, which will have a long-term beneficial effect on the quality of the metadata as data providers will be compelled to produce more complete and accurate metadata.
Abstract: The Human-Computer Interaction Laboratory (HCIL) of the University of Maryland and NASA have collaborated over three years to refine and apply user interface research concepts developed at HCIL in order to improve the usability of NASA data services. The research focused on dynamic query user interfaces, visualization, and overview + preview designs. An operational prototype, using query previews, was implemented with NASA's Global Change Master Directory (GCMD), a directory service for earth science datasets. Users can see the histogram of the data distribution over several attributes and choose among attribute values. A result bar shows the cardinality of the result set, thereby preventing users from submitting queries that would have zero hits. Our experience confirmed the importance of metadata accuracy and completeness. The query preview interfaces make visible the problems or gaps in the metadata that are undetectable with classic form fill-in interfaces. This could be seen as a problem, but we think that it will have a long-term beneficial effect on the quality of the metadata as data providers will be compelled to produce more complete and accurate metadata. The adaptation of the research prototype to the NASA data required revised data structures and algorithms.
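The query-preview idea, show attribute-value counts and the result-set cardinality before the query is submitted, reduces to a small amount of counting. The dataset records and attribute names below are invented for illustration.

```python
# Sketch of a query preview: counts per attribute value drive the
# histogram, and a conjunctive count drives the result bar, letting
# users avoid zero-hit queries.
from collections import Counter

DATASETS = [
    {"topic": "ocean", "year": 1998},
    {"topic": "ocean", "year": 1999},
    {"topic": "land", "year": 1999},
]

def histogram(attribute):
    """Value distribution for one attribute (the preview histogram)."""
    return Counter(d[attribute] for d in DATASETS)

def preview_count(**constraints):
    """Cardinality of the result set under the chosen constraints."""
    return sum(
        all(d[k] == v for k, v in constraints.items()) for d in DATASETS
    )
```

A `preview_count` of zero is exactly the case the result bar surfaces, and a value that should be nonzero but isn't points at a gap in the underlying metadata.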

Journal ArticleDOI
TL;DR: A set of document content description tags, or metadata encodings, that can be used to promote disciplined search access to Internet medical documents to facilitate document retrieval by Internet search engines is defined.

Patent
Alison Lennon
11 Jun 1999
TL;DR: In this paper, a method and apparatus for generating a metadata object having links to temporal and spatial extents in a time-sequential digital signal is disclosed, which includes the following steps: identifying an object of interest and defining a link entity between metadata in the metadata object and the identified object.
Abstract: A method and apparatus for generating a metadata object having links to temporal and spatial extents in a time-sequential digital signal is disclosed. The method includes the following steps. Firstly, identifying an object of interest in the time-sequential digital signal. Secondly, defining a link entity between metadata in the metadata object and the identified object, the link entity forming part of the metadata object. Thirdly, tracking the identified object in the time-sequential digital signal and updating the link entity in the metadata object to include the identified object's new temporal and spatial extent in the time-sequential digital signal. Finally, associating the generated metadata object with the time-sequential digital signal.
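The link-entity bookkeeping in those steps could be sketched as below. The structure is hypothetical; the patent defines the link entity as part of a metadata object bound to the digital signal itself.

```python
# Sketch: a link entity accumulating an object's temporal and spatial
# extents as tracking proceeds through the time-sequential signal.

def make_link(object_id):
    return {"object": object_id, "extents": []}

def track(link, pts, bbox):
    """Record the object's spatial extent (bounding box) at time pts."""
    link["extents"].append({"pts": pts, "bbox": bbox})

link = make_link("player-7")
track(link, 0, (10, 10, 40, 80))   # frame 0: initial identification
track(link, 1, (12, 10, 42, 80))   # frame 1: object moved slightly
```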

Journal ArticleDOI
TL;DR: This project's goal is to develop a catalog for a digitized collection of historical fashion objects held at the Kent State University Museum and to analyze and evaluate how well existing metadata formats can be applied to a fashion collection.
Abstract: This project's goal is to develop a catalog for a digitized collection of historical fashion objects held at the Kent State University Museum and to analyze and evaluate how well existing metadata formats can be applied to a fashion collection. The project considered the known and anticipated uses of the collection and the identification of the metadata elements that would be needed to support these uses. From a set of 90 museum accession records, 42 fashion objects were selected for cataloging. Three metadata treatments were created for these 42 items using (a) the Anglo-American Cataloguing Rules (AACR) in use with the United States MAchine-Readable Cataloging (USMARC) formats, (b) the Dublin Core set of elements designed for minimal level cataloging, and (c) the Visual Resources Association (VRA) Core Categories for Visual Resources created for developing local databases and cataloging records for visual resources collections. Comparison and analysis of the formats resulted in the adoption of a modified VRA metadata format to catalog the entire digitized historical fashion collection.

Patent
29 Dec 1999
TL;DR: In this paper, the authors present a system that retrieves metadata from a memory within a server, so that the server does not have to access a database in order to retrieve the metadata.
Abstract: One embodiment of the present invention provides a system that retrieves metadata from a memory within a server, so that the server does not have to access a database in order to retrieve the metadata. The system operates by receiving a request from a client, which causes an operation to be performed on data within the database. In response to the request, the system retrieves the metadata through a metadata object, which retrieves the metadata from a random access memory in the server. Note that this metadata specifies how the data is stored within the database. The system then performs the operation on the data within the database by using the metadata to determine how the data is stored within the database. Note that this metadata object can be used to service requests from a plurality of clients. Hence, client sessions can share the same metadata, which can greatly reduce the amount of memory used by client sessions. In one embodiment of the present invention, the metadata object contains static metadata specifying how tables and views are organized within the database. In one embodiment of the present invention, the system accesses the metadata object through a generic object on the server.
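The memory-saving argument can be made concrete with a small cache sketch. The class and loader below are illustrative stand-ins, not the patented metadata object.

```python
# Sketch: one in-memory metadata object shared by all client sessions,
# so the database is consulted at most once per table.

class MetadataObject:
    def __init__(self, loader):
        self._loader = loader   # falls back to the database
        self._cache = {}
        self.db_hits = 0

    def describe(self, table):
        if table not in self._cache:
            self.db_hits += 1   # only the first request pays this cost
            self._cache[table] = self._loader(table)
        return self._cache[table]

def fake_db_loader(table):
    """Stand-in for a real database round trip."""
    return {"table": table, "columns": ["id", "value"]}

shared = MetadataObject(fake_db_loader)
# Two client sessions consult the same shared metadata object:
a = shared.describe("orders")
b = shared.describe("orders")
```

Because both sessions receive the same cached description, memory use and database load stay flat as the number of clients grows.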

Journal ArticleDOI
TL;DR: This paper reviews eight important metadata structures in current use or development in order to develop an alternative that is simpler, yet inclusive and globally applicable.
Abstract: Digital spatial data is an attempt to model and describe the real world for use in computer analysis and graphic display of information. To ensure that data is not misused, the assumptions and limitations affecting the collection of the data must be fully documented. As geo-spatial data producers and users handle more and more data, proper documentation will provide them with a keener knowledge of their holdings and allow them to better manage data production, storage, updating and reuse.

Patent
28 Oct 1999
TL;DR: In this paper, a simple script request containing a data source object name is sent to the appropriate remote nodes where respective agent processes respond to automatically access the appropriate data and to automatically execute the specified program.
Abstract: Heterogeneous data at a plurality of remote nodes is accessed automatically in parallel at high speed from a user site using a simple script request containing a data source object name wherein the heterogeneous data is treated as a single data source object, the script further containing code representing a user-defined program to be executed on the data source object. The user-defined program may be represented by an embedded script, a user-defined script or an executable program designation. A user site agent breaks the user-generated script into new scripts appropriate for execution at the remote nodes. A messenger process transmits the new scripts to the appropriate remote nodes where respective agent processes respond to automatically access the appropriate data and to automatically execute the specified program. If the program is a user-defined script or executable, the respective agent processes access a metadata repository to obtain the specified program.
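The decomposition step described above, where a user-site agent breaks one script into per-node scripts using a metadata repository, might look roughly like the following. The repository contents, node names, and program path are all hypothetical.

```python
# Illustrative sketch: a "user site agent" splits one request over a named
# data-source object into per-node sub-requests (all names hypothetical).

# Hypothetical metadata repository: which remote nodes hold pieces of each
# data-source object, and where named user programs are stored.
METADATA = {
    "sales_1999": {"nodes": ["node_a", "node_b", "node_c"]},
}
PROGRAMS = {"summarize": "/repo/bin/summarize"}

def split_request(source_object, program_name):
    """Break one user script into scripts appropriate for each remote node."""
    entry = METADATA[source_object]
    program = PROGRAMS[program_name]   # looked up in the metadata repository
    return [
        {"node": node, "source": source_object, "run": program}
        for node in entry["nodes"]
    ]

subscripts = split_request("sales_1999", "summarize")
# A messenger process would now ship each sub-script to its node, where a
# local agent accesses the data and executes the program in parallel.
```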

30 Aug 1999
TL;DR: The most important standards for representation and interchange of metadata, commercial products and research projects are presented and discussed for both the general case and the particular case of data warehousing.
Abstract: This report gives an overview of metadata management in general (Part I) and of the role of metadata for data warehousing (Part II). Because of the complexity and extensive applicability of metadata, a compact, precise definition of the notion can hardly be provided. Therefore, we explain metadata by illustrating the use and the forms it may take within various application areas. In the case of data warehousing, we present a classification of metadata along certain dimensions and we discuss significant aspects of metadata management that have to be considered for the construction of a data warehouse system. Furthermore, this report provides a comprehensive survey and analysis of the state of the art of metadata management in industry and research. The most important standards for representation and interchange of metadata, commercial products and research projects are presented and discussed (as far as the available information allows) for both the general case and the particular case of data warehousing.

Proceedings ArticleDOI
01 Nov 1999
TL;DR: In this paper, the authors examine metadata and data-structure issues for the Historical Newspaper Digital Library and propose a framework for the logical structure and physical layout of metadata relevant to the image processing and to the historians who will use this collection.
Abstract: We examine metadata and data-structure issues for the Historical Newspaper Digital Library. This project proposes to digitize, and then perform OCR and linguistic processing on, several years' worth of historical newspapers. Newspapers are very complex information objects, so developing a rich description of their content is challenging. In addition to frameworks for the logical structure and physical layout, we propose metadata relevant to the image processing and to the historians who will use this collection. Finally, we consider how the metadata infrastructure might be managed as it evolves with improved text-processing capabilities and how an infrastructure might be developed to support a community of users.
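The separation of logical structure, physical layout, and processing metadata that the paper proposes can be sketched as a layered record; every field name and value below is invented for illustration, not drawn from the project's actual schema.

```python
# Hypothetical sketch of a two-layer newspaper description: logical
# structure (issue/article) kept separate from physical layout (page
# zones), linked by zone identifiers, plus image-processing provenance.

article = {
    "logical": {
        "issue_date": "1899-07-04",
        "title": "Harbor Improvements Approved",
        "zones": ["p1_z3", "p1_z4"],        # links into the layout layer
    },
    "physical": {
        "p1_z3": {"page": 1, "bbox": (120, 340, 610, 900)},
        "p1_z4": {"page": 1, "bbox": (640, 340, 1130, 760)},
    },
    "processing": {
        # provenance, so metadata can evolve as OCR capabilities improve
        "ocr_engine": "engine-v1",
        "mean_word_confidence": 0.87,
    },
}

def zone_boxes(record):
    """Resolve an article's logical zones to physical bounding boxes."""
    return [record["physical"][z]["bbox"] for z in record["logical"]["zones"]]
```

Keeping the processing layer separate is what lets the collection be re-OCRed later without rewriting the logical description of each article.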

Patent
02 Feb 1999
TL;DR: In this paper, the authors describe a mechanism for associating metadata with network resources and for locating the network resources in a language-independent manner, including a natural language name of the network resource, its location, its language, its region or intended audience, and other descriptive information.
Abstract: Mechanisms for associating metadata with network resources, and for locating the network resources in a language-independent manner, are disclosed. The metadata may include a natural language name of the network resource, its location, its language, its region or intended audience, and other descriptive information. Resource owners register the metadata in a registry (10). A copy of the metadata is stored on a server (60) associated with a group of the network resources and in a registry that is indexed at a central location (32). A crawler service (24) periodically updates the registry by polling the information on each server associated with registered metadata. To locate a selected network resource, a client (70) provides the name of the network resource to a resolver process. The resolver process provides to the client the network resource location corresponding to the network resource name. Multiple metadata mappings can be established for the same network resource.
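The registry-and-resolver mechanism can be sketched as a name-to-metadata table consulted by a resolver; the registry contents, names, and URL below are hypothetical examples, not data from the patent.

```python
# Minimal sketch of the registry/resolver idea (all data hypothetical):
# owners register metadata about a resource under a natural-language name,
# and clients resolve names to locations without knowing location syntax.

REGISTRY = {
    # natural-language name -> metadata record
    "annual report": {
        "location": "http://example.com/reports/1999.html",
        "language": "en",
        "region": "US",
    },
    "rapport annuel": {   # a second name mapped to the same resource
        "location": "http://example.com/reports/1999.html",
        "language": "fr",
        "region": "FR",
    },
}

def resolve(name):
    """Return the network location registered for a resource name."""
    record = REGISTRY.get(name.lower())
    return record["location"] if record else None

loc_en = resolve("Annual Report")
loc_fr = resolve("rapport annuel")
# Multiple names, here in different languages, map to one resource,
# which is what makes lookup language-independent for the client.
```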

Proceedings ArticleDOI
01 Nov 1999
TL;DR: CI3, a corporate information integrator, is presented, which applies XML as a tool to facilitate data mediation and integration amongst heterogeneous sources in the context of financial analysts creating corporate profiles.
Abstract: The proliferation of electronically available data within large organizations as well as publicly available data (e.g. over the World Wide Web) poses challenges for users who wish to efficiently interact with and integrate multiple heterogeneous sources. This paper presents CI3, a corporate information integrator, which applies XML as a tool to facilitate data mediation and integration amongst heterogeneous sources in the context of financial analysts creating corporate profiles. Sources include Lotus Notes, relational databases, and the World Wide Web. CI3 applies a unified XML data model to automate integration. By preserving metadata about the source of each datum in the integrated result set, CI3 supports source attribution. Users may trace the attribution metadata from the result back to the underlying sources and leverage their expertise in interpreting the data and, if necessary, use their judgment in assessing the authenticity and veracity of results. We present a functional overview of CI3, its system architecture including the XML data model, and the integration procedures. We conclude by reflecting on lessons learned.
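Source attribution as described above amounts to tagging each datum in the integrated XML result with metadata about its origin. The sketch below shows the general idea using the standard library's ElementTree; the element and attribute names are invented for illustration and are not CI3's actual schema.

```python
# Sketch of source attribution in an integrated XML result set, in the
# spirit of CI3 (element and attribute names invented for illustration).
import xml.etree.ElementTree as ET

profile = ET.Element("corporate_profile", company="Acme Corp")

# Each integrated datum keeps metadata recording where it came from.
rev = ET.SubElement(profile, "revenue", source="relational_db",
                    retrieved="1999-11-01")
rev.text = "12.4M"
news = ET.SubElement(profile, "headline", source="world_wide_web",
                     retrieved="1999-11-02")
news.text = "Acme announces merger"

def attribution(elem, field):
    """Trace a field in the result back to its underlying source."""
    child = elem.find(field)
    return child.get("source")

xml_text = ET.tostring(profile, encoding="unicode")
```

An analyst reading the integrated profile can call `attribution` on any field and judge its authenticity based on which underlying source supplied it.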

01 Dec 1999
TL;DR: The Dublin Core [DC1] is a small set of metadata elements for describing information resources expressed using the META and LINK tags of HTML [HTML4.0].
Abstract: The Dublin Core [DC1] is a small set of metadata elements for describing information resources. This document explains how these elements are expressed using the META and LINK tags of HTML [HTML4.0]. A sequence of metadata elements embedded in an HTML file is taken to be a description of that file. Examples illustrate conventions allowing interoperation with current software that indexes, displays, and manipulates metadata, such as [SWISH-E], [freeWAIS-sf2.0], [GLIMPSE], [HARVEST], [ISEARCH], etc., and the Perl [PERL] scripts in the appendix.
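A small generator can illustrate the convention the document describes, where each Dublin Core element becomes a META tag named `DC.<element>` alongside a LINK tag identifying the schema. The element values and the schema URL here are illustrative; consult the document itself for the authoritative form.

```python
# Sketch: emitting Dublin Core elements as HTML META tags, following the
# DC.<element> naming convention described in the document (values and
# the schema URL here are illustrative assumptions).

def dc_meta(elements):
    """Render a dict of Dublin Core elements as HTML META/LINK tags."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc">']
    for name, value in elements.items():
        lines.append('<meta name="DC.%s" content="%s">' % (name, value))
    return "\n".join(lines)

head = dc_meta({
    "Title": "Metadata in HTML",
    "Creator": "A. Author",
    "Date": "1999-12-01",
})
# The resulting tags would be placed in the HEAD of the HTML file they
# describe, where indexing software such as Harvest can pick them up.
```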

Proceedings ArticleDOI
02 Sep 1999
TL;DR: The purpose of the paper is to propose a design aid which helps in the definition of a user view (or query) from its schema and some integrity constraints, and defines a solution space which provides the set of potential queries that correspond to the user view.
Abstract: A Multi-Source Information System is composed of a set of independent data sources and a set of views defined as queries over these data sources This problem is dramatically increased in evolving information systems, such as data warehouses and Web systems, where several views are daily defined or modified by users who are not aware of the detailed metadata describing data sources and their interrelationships The purpose of the paper is to propose a design aid which helps in the definition of a user view (or query) from its schema and some integrity constraints Our approach defines a solution space which provides the set of potential queries that correspond to the user view The approach is based on the existence of metadata describing individual sources, on semantic assertions describing inter-source similarities between concepts, and on some heuristics which reduce the size of the solution space