
Showing papers on "Metadata repository" published in 1995


Patent
07 Jun 1995
TL;DR: In this patent, natural language processing is used to determine matches between the query and the stored metadata; images corresponding to the matches are then viewed, and desired images are selected for licensing.
Abstract: Digitized images are associated with English language captions and other data, collectively known as the metadata associated with the images. A natural language processing database removes ambiguities from the metadata, and the images and the metadata are stored in databases. A user formulates a search query, and natural language processing is used to determine matches between the query and the stored metadata. Images corresponding to the matches are then viewed, and desired images are selected for licensing. The license terms for selected images are displayed, and a subset of the selected images are ordered as desired by the user.

277 citations
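
To make the matching idea concrete, here is a minimal sketch in Python, assuming a simple term-overlap scorer in place of the patent's full natural language processing pipeline; the class and function names are invented for illustration.

    # Hypothetical sketch: match a query against image caption metadata
    # by normalized term overlap (a stand-in for the patent's NLP matching).
    def tokenize(text):
        return {w.strip('.,').lower() for w in text.split()}

    class ImageIndex:
        def __init__(self):
            self.captions = {}  # image_id -> set of caption tokens

        def add(self, image_id, caption):
            self.captions[image_id] = tokenize(caption)

        def search(self, query, limit=10):
            q = tokenize(query)
            scored = sorted(((len(q & toks), iid)
                             for iid, toks in self.captions.items()), reverse=True)
            return [iid for score, iid in scored if score > 0][:limit]

    idx = ImageIndex()
    idx.add("img-001", "A lighthouse on a rocky coast at sunset")
    idx.add("img-002", "City skyline at night")
    print(idx.search("sunset over the coast"))  # -> ['img-001']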


Book ChapterDOI
12 Jun 1995
TL;DR: The InfoHarness system is aimed at providing integrated and rapid access to huge amounts of heterogeneous information independent of its type, representation, and location by extracting metadata and associating it with the original information.
Abstract: The InfoHarness system is aimed at providing integrated and rapid access to huge amounts of heterogeneous information independent of its type, representation, and location. This is achieved by extracting metadata and associating it with the original information. The metadata extraction methods ensure rapid and largely automatic creation of information repositories. A stable hierarchy of abstract classes is proposed to organize the processing and representation needs of different kinds of information. An extensible hierarchy of terminal classes simplifies support for new information types and utilization of new indexing technologies. InfoHarness repositories may be accessed through Mosaic or any other HyperText Transfer Protocol (HTTP) compliant browser.

194 citations
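
The abstract/terminal class split might be sketched as follows; the class names and metadata fields are assumptions, not InfoHarness's actual interfaces.

    # Hypothetical sketch: a stable abstract class defines the extraction
    # contract, and small terminal classes add support for new information types.
    from abc import ABC, abstractmethod

    class InfoObject(ABC):
        """Abstract class: common metadata contract for any information unit."""
        def __init__(self, location):
            self.location = location

        @abstractmethod
        def extract_metadata(self):
            """Return type, representation, and location metadata."""

    class TextFileObject(InfoObject):
        def extract_metadata(self):
            return {"type": "text", "representation": "plain", "location": self.location}

    class ManPageObject(InfoObject):
        def extract_metadata(self):
            return {"type": "manpage", "representation": "troff", "location": self.location}

    for obj in (TextFileObject("/docs/readme.txt"), ManPageObject("/man/ls.1")):
        print(obj.extract_metadata())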


Patent
27 Oct 1995
TL;DR: In this patent, a method is presented, in a repository coupled to a computer system, for generating OLE automation and Interface Definition Language (IDL) interfaces from metadata (i.e., information about data).
Abstract: A method in a repository coupled to a computer system that generates OLE automation and Interface Definition Language ("IDL") interfaces from metadata (i.e., information about data). The Visual Basic programming language is used to develop a tool that generates the automation binding from the metadata in the repository. The method extends the C++ programming language binding across networks, independent of the Object Request Broker (ORB) being used. A schema is provided that maps the types and features of a repository to OLE automation and member functions for Windows. The method also integrates the Application Programming Interface (API) of a repository with Windows scripting languages through Object Linking and Embedding (OLE).

100 citations
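
Purely as an illustration of generating interface definitions from repository metadata (the patent's tool is written in Visual Basic; this Python sketch and the metadata shape it assumes are invented):

    # Hypothetical sketch: emit a CORBA-style IDL interface from
    # repository metadata describing a type's features.
    def emit_idl(type_meta):
        lines = [f"interface {type_meta['name']} {{"]
        for feat in type_meta["features"]:
            lines.append(f"    {feat['returns']} {feat['name']}();")
        lines.append("};")
        return "\n".join(lines)

    meta = {"name": "Employee",
            "features": [{"name": "getName", "returns": "string"},
                         {"name": "getSalary", "returns": "double"}]}
    print(emit_idl(meta))

The real binding generator would also emit the OLE automation glue; the sketch shows only the metadata-to-IDL direction.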


Proceedings ArticleDOI
29 Oct 1995
TL;DR: A software architecture is presented that integrates a database management system with data visualization; the underlying elements of this approach are described, specifically the visual database exploration model and the metadata objects that support it.
Abstract: A software architecture is presented to integrate a database management system with data visualization. One of its primary objectives, the retention of user-data interactions, is detailed. By storing all queries over the data along with high-level descriptions of the query results and the associated visualization, the processes by which a database is explored can be analyzed. This approach can lead to important contributions in the development of user models as "data explorers", metadata models for scientific databases, intelligent assistants and data exploration services. We describe the underlying elements of this approach, specifically the visual database exploration model and the metadata objects that support the model.

46 citations
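
A toy version of retaining user-data interactions might look like the following; the names and the shape of the result summaries are hypothetical.

    # Hypothetical sketch: record each query with a high-level result summary
    # and the visualization applied, so exploration sessions can be analyzed.
    import datetime

    class ExplorationLog:
        def __init__(self):
            self.entries = []

        def record(self, query, result_summary, visualization):
            self.entries.append({
                "time": datetime.datetime.now().isoformat(),
                "query": query,
                "result": result_summary,
                "visualization": visualization,
            })

        def history(self):
            return [(e["query"], e["visualization"]) for e in self.entries]

    log = ExplorationLog()
    log.record("SELECT temp FROM readings WHERE site = 'A'",
               {"rows": 1440, "range": (12.1, 29.8)}, "line chart")
    print(log.history())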



Journal ArticleDOI
30 Apr 1995
TL;DR: The proposed InfoHarness Repository Definition Language (IRDL) aims to simplify the metadata generation process and provides high flexibility in associating typed logical information units with portions of physical data and in defining relationships between these units.
Abstract: The objective of InfoHarness is to provide integrated and rapid access to huge amounts of heterogeneous legacy information through WWW browsers. This is achieved with the help of metadata that contains information about the type, representation, and location of physical data. The proposed InfoHarness Repository Definition Language (IRDL) aims to simplify the metadata generation process. It provides high flexibility in associating typed logical information units with portions of physical data and in defining relationships between these units. The proposed stable abstract class hierarchy provides support for statements of the language that introduce new data types, as well as new indexing technologies.

17 citations
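
IRDL's syntax is not given in the abstract, so the mini-language below is invented solely to illustrate the idea of binding typed logical information units to portions of physical data.

    # Hypothetical sketch: parse IRDL-like statements that associate a typed
    # logical unit with a byte range of a physical file. The syntax here is
    # invented for illustration, not actual IRDL.
    import re

    STMT = re.compile(r"unit (\w+) type=(\w+) file=(\S+) range=(\d+)-(\d+)")

    def parse(text):
        units = []
        for line in text.splitlines():
            m = STMT.match(line.strip())
            if m:
                name, utype, path, lo, hi = m.groups()
                units.append({"name": name, "type": utype,
                              "file": path, "bytes": (int(lo), int(hi))})
        return units

    src = """
    unit abstract1 type=text file=/data/report.txt range=0-512
    unit figure1 type=image file=/data/report.img range=0-20480
    """
    print(parse(src))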



01 Dec 1995
TL;DR: This paper describes how IAFA Templates have been used in a real archive to store the metadata of many different types of documents and software, and to derive WWW, gopher, and text indices from them.
Abstract: Recently there has been a growing need for a metadata standard for the Internet. The files that are available on ftp and WWW sites can be difficult to search if they are enclosed in a container format (e.g. tar), and bibliographic data can be deeply embedded in documentation. This paper describes how IAFA Templates have been used in a real archive to store the metadata of many different types of documents and software, and to derive WWW, gopher, and text indices from them.

8 citations
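
IAFA Templates are blocks of attribute-value lines separated by blank lines; a minimal parser sketch (the field names are examples) could look like this.

    # Hypothetical sketch: parse IAFA-style templates, blocks of
    # "Field: value" lines separated by blank lines, into dicts from
    # which WWW/gopher/text indices could then be derived.
    def parse_templates(text):
        records, current = [], {}
        for line in text.splitlines():
            if not line.strip():
                if current:
                    records.append(current)
                    current = {}
            elif ":" in line:
                field, _, value = line.partition(":")
                current[field.strip()] = value.strip()
        if current:
            records.append(current)
        return records

    sample = """Template-Type: DOCUMENT
    Title: Sample paper
    Author-Name: A. Author

    Template-Type: SOFTWARE
    Title: Sample tool"""
    print(parse_templates(sample))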


01 Jan 1995
TL;DR: The goal was to produce a design document that could be used both by the users of MOSF relational databases and by users of other relational databases who would like to develop a similar RMMS metadata repository for their own set of databases.
Abstract: The RAND Metadata Management System (RMMS) is a system that manages "metadata." Metadata denotes definitional and descriptional information about databases, simulation models, and procedures. Databases, such as those maintained in the INGRES database management system (DBMS) by the Military Operations Simulation Facility (MOSF), are prevalent throughout RAND. Similarly, many prominent simulation models are exercised regularly in the MOSF and require input data extracted from INGRES databases. However, most of these databases have little documentation or other descriptional information to go along with them. The absence of such information leaves users at a loss for understanding the definitions, abbreviations, acronyms, and descriptions of the pieces of data stored and maintained in a DBMS. This report presents the design of the RMMS, a metadata management system for relational databases. Our goal was to produce a design document that could be used both by the users of MOSF relational databases (who may be future users of RMMS) and by users of other relational databases who would like to develop a similar RMMS metadata repository for their own set of databases. This work was motivated by the proliferation of databases stored and used in the MOSF and by a realization by the MOSF database management staff that such metadata is at least as important as, if not more important than, the actual data values. Although a "management system" typically includes facilities for user interaction and maintenance, in this document we focus primarily on the metadata storage structures within RMMS. Detailed discussion of user interface and maintenance designs is beyond the scope of this report.

4 citations
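
Since the report focuses on metadata storage structures, one plausible relational shape for an RMMS-like repository is sketched below; the table and column names are assumptions.

    # Hypothetical sketch: relational storage structures for an RMMS-like
    # repository, holding definitions and descriptions of the tables and
    # columns of other databases.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE db_table (
            table_name  TEXT PRIMARY KEY,
            description TEXT
        );
        CREATE TABLE db_column (
            table_name  TEXT REFERENCES db_table(table_name),
            column_name TEXT,
            definition  TEXT,
            abbrev_for  TEXT,
            PRIMARY KEY (table_name, column_name)
        );
    """)
    conn.execute("INSERT INTO db_table VALUES ('sorties', 'Simulated air sorties')")
    conn.execute("INSERT INTO db_column VALUES ('sorties', 'ac_type', "
                 "'Aircraft type code', 'aircraft type')")
    for row in conn.execute("SELECT * FROM db_column"):
        print(row)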


Proceedings ArticleDOI
11 Sep 1995
TL;DR: The KBMS is a component of an intelligent information system based upon a federated architecture, also including a database management system for time-series-oriented data and a visualization system.
Abstract: Over the last few years, dramatic increases and advances in mass storage, both secondary and tertiary, have made it possible to handle very large amounts of data (for example, satellite data, complex scientific experiments, and so on). However, to make full use of these advances, two challenges are still considered central by the information-science community when dealing with large databases: providing metadata for data analysis and interpretation, and managing and accessing large datasets through intelligent and efficient methods. Scientific data must be analyzed and interpreted with the help of metadata, which plays a descriptive role for the underlying data. Metadata can be partly defined a priori, according to the domain of discourse under consideration (for example, atmospheric chemistry) and the conceptualization of the information system to be built. It may also be extracted from time-series measurement and observation data using learning methods. In this paper, a knowledge-based management system (KBMS) is presented for the extraction and management of metadata, in order to bridge the gap between data and information. The KBMS is a component of an intelligent information system based upon a federated architecture, which also includes a database management system for time-series-oriented data and a visualization system.

3 citations
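
As a trivial stand-in for the metadata-extraction step (the paper's KBMS uses learning methods; the statistics below are only one guess at what descriptive metadata might include):

    # Hypothetical sketch: derive simple descriptive metadata from a
    # time series; a real KBMS would apply learning methods instead.
    def describe(series, sample_rate_hz):
        n = len(series)
        return {
            "count": n,
            "min": min(series),
            "max": max(series),
            "mean": sum(series) / n,
            "duration_s": n / sample_rate_hz,
        }

    measurements = [3.1, 3.4, 2.9, 3.8, 3.3]
    print(describe(measurements, sample_rate_hz=1.0))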


Journal ArticleDOI
TL;DR: The types of metadata encountered and the problems associated with dealing with them are discussed, and an alternative approach based on textual markup rather than, for example, the relational model is described.
Abstract: With many types of scientific data, the amount of descriptive and qualifying information associated with the data values is quite variable and potentially large compared with the number of actual data values. This problem has been found to be particularly acute when dealing with data about the nutrient composition of foods, and a system—based on textual markup rather than, for example, the relational model—has been developed to deal with it. This paper discusses the types of metadata encountered and the problems associated with dealing with them, and then describes this alternative approach. The approach described has been installed in several locations around the world, and is in preliminary use as a tool for interchanging data among different databases as well as local database management.
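
A minimal sketch of the textual-markup idea, with tag syntax invented for illustration: qualifying information travels inline with each value rather than sitting in fixed relational columns.

    # Hypothetical sketch: a data value carrying inline markup for its
    # qualifying metadata, instead of fixed relational columns.
    import re

    TAG = re.compile(r"<(\w+)=([^>]*)>")

    def parse_value(text):
        """E.g. '12.3 <unit=mg/100g> <method=HPLC>' -> value plus qualifiers."""
        value = TAG.sub("", text).strip()
        qualifiers = dict(TAG.findall(text))
        return value, qualifiers

    print(parse_value("12.3 <unit=mg/100g> <method=HPLC>"))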


01 Nov 1995
TL;DR: Results of performance tests show the superior performance of the developed algorithms and reveal that, to build high-performance MDDSs, it is imperative to adopt approaches that exploit the data and transaction characteristics.
Abstract: We address several important issues that arise in the development of Massive Digital Database Systems (MDDSs), in which data is added continuously and on which users pose queries on the fly. News-on-demand and document retrieval systems are examples of systems with these characteristics. Given the size of the data, metadata such as index structures become even more important in these systems: data is accessed only after processing the metadata, and both reside on tertiary storage. The focus of this paper is on query and transaction processing in such systems, with emphasis on metadata management. Performance in these systems can be measured in terms of the response time of queries and the recency, or age, of the items retrieved; both need to be minimized. The key to satisfying the performance requirements is to exploit the characteristics of the metadata as well as of the queries and updates that access it. After analyzing the functionality and correctness properties of updates, we develop an efficient scheme for executing queries concurrently with updates such that the queries have short response times and are guaranteed to return the most recent articles. Secondly, we address logging and recovery issues and propose techniques for efficiently migrating metadata updates from disk to tape. Thirdly, considering the tape access needs of queries, we develop new tape scheduling techniques for multiple queries that reduce query response time. Results of performance tests on a prototype system show the superior performance of the developed algorithms and reveal that, to build high-performance MDDSs, it is imperative to adopt approaches that exploit the data and transaction characteristics.
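
One of the properties above, queries running concurrently with updates while still returning the most recent items, can be illustrated with a toy snapshot index; the paper's actual algorithms, logging, and tape scheduling are far more involved.

    # Hypothetical sketch: an append-only index where a query copies a
    # consistent snapshot under a short lock, so it sees every item
    # appended before it began, then filters without blocking writers.
    import threading

    class AppendOnlyIndex:
        def __init__(self):
            self._items = []
            self._lock = threading.Lock()

        def append(self, item):
            with self._lock:
                self._items.append(item)

        def query(self, predicate):
            with self._lock:              # short critical section: copy the view
                snapshot = list(self._items)
            return [x for x in snapshot if predicate(x)]  # runs without the lock

    idx = AppendOnlyIndex()
    idx.append({"id": 1, "topic": "news"})
    idx.append({"id": 2, "topic": "sports"})
    print(idx.query(lambda x: x["topic"] == "news"))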

01 Jan 1995
TL;DR: MP is a program for validating the syntactical structure of formal metadata, testing the structure against the Content Standards for Digital Geospatial Metadata devised by the Federal Geographic Data Committee (FGDC).
Abstract: MP is a program for validating the syntactical structure of formal metadata, testing the structure against the Content Standards for Digital Geospatial Metadata devised by the Federal Geographic Data Committee (FGDC). MP is described as a compiler because it contains not only a lexical parser but also code to analyze the tree that the parser generates, and code to output the metadata in several different formats. It is written in Standard C (i.e. ANSI C) and runs on Linux, UNIX, and all versions of Microsoft Windows (95 and later, including Windows 10). MP generates a textual report indicating errors in the metadata, primarily in the structure but also in the values of some of the scalar elements (i.e. those whose values are restricted by the standard). Output formats include text (the same as the input format), Hypertext Markup Language (HTML), Standard Generalized Markup Language (SGML), Extensible Markup Language (XML) and Directory Interchange Format (DIF). MP has the ability to recognize and process elements that are not part of the FGDC standard, provided these elements are properly described in a local file.
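
MP itself is an ANSI C compiler-style tool; purely as an illustration of structural validation against a content standard, a sketch might check required elements (the element names below are FGDC-like examples, not the full standard).

    # Hypothetical sketch: check a metadata tree for elements required by
    # a content standard; MP performs far deeper structural checks.
    REQUIRED = {
        "idinfo": ["citation", "descript", "timeperd"],
        "metainfo": ["metd", "metc"],
    }

    def validate(tree):
        errors = []
        for section, children in REQUIRED.items():
            if section not in tree:
                errors.append(f"missing section: {section}")
                continue
            for child in children:
                if child not in tree[section]:
                    errors.append(f"{section}: missing element {child}")
        return errors

    record = {"idinfo": {"citation": "...", "descript": "..."}}
    print(validate(record))  # flags missing timeperd and missing metainfo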

01 Jun 1995
TL;DR: A skeleton model for the design of a repository to store spatial metadata is proposed and an object oriented modelling approach is adopted in preference to an entity relationship approach because of its ability to model functional and dynamic aspects of the repository.
Abstract: The design of spatial information systems has traditionally been carried out independently of mainstream database developments. It is contended that the adoption of mainstream database design techniques is important to progress in the spatial information systems development field. An accepted approach to the development of information systems is through an integrated development environment with a design repository at its core. This paper proposes a skeleton model for the design of a repository to store spatial metadata. An object oriented modelling approach is adopted in preference to an entity relationship approach because of its ability to model functional and dynamic aspects of the repository.
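
In the spirit of the object-oriented approach argued for above, a skeleton might attach behaviour to metadata objects, something an entity-relationship model alone would not capture; all class names are assumptions.

    # Hypothetical sketch: an object-oriented skeleton for a spatial
    # metadata repository, modelling behaviour alongside structure.
    class SpatialMetadata:
        def __init__(self, dataset, crs, bbox):
            self.dataset = dataset  # dataset name
            self.crs = crs          # coordinate reference system
            self.bbox = bbox        # (min_x, min_y, max_x, max_y)

        def covers(self, x, y):
            """Dynamic aspect: behaviour attached to the metadata object."""
            min_x, min_y, max_x, max_y = self.bbox
            return min_x <= x <= max_x and min_y <= y <= max_y

    class Repository:
        def __init__(self):
            self._entries = []

        def register(self, meta):
            self._entries.append(meta)

        def find_covering(self, x, y):
            return [m.dataset for m in self._entries if m.covers(x, y)]

    repo = Repository()
    repo.register(SpatialMetadata("otago_roads", "EPSG:27200", (0, 0, 100, 100)))
    print(repo.find_covering(50, 50))  # -> ['otago_roads']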