Book Chapter
Importing Documents and Metadata into Digital Libraries: Requirements Analysis and an Extensible Architecture
Ian H. Witten, David Bainbridge, Gordon W. Paynter, Stefan J. Boddie +3 more
- pp 390-405
TLDR
This paper analyzes the requirements of the import process for general digital libraries and argues that they are best met by an extensible architecture that facilitates the addition of new document formats and metadata facilities to existing digital library systems.
Abstract:
Flexible digital library systems need to be able to accept, or "import," documents and metadata in a variety of forms, and associate metadata with the appropriate documents. This paper analyzes the requirements of the import process for general digital libraries. The requirements include (a) format conversion for source documents, (b) the ability to incorporate existing conversion utilities, (c) provision for metadata to be specified in the document files themselves and/or in separate metadata files, (d) format conversion for metadata files, (e) provision for metadata to be computed from the document content, and (f) flexible ways of associating metadata with documents or sets of documents. We argue that these requirements are so open-ended that they are best met by an extensible architecture that facilitates the addition of new document formats and metadata facilities to existing digital library systems. An implementation of this architecture is briefly described.
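As a rough illustration of what such an extensible import architecture might look like, the sketch below registers document plugins, metadata-file plugins, and content-based extractors, then associates metadata with documents by filename pattern. It is not the implementation described in the paper: every name here (`Document`, `TextPlugin`, `MetadataFilePlugin`, `word_count`, `import_collection`) and the "glob key=value" metadata file format are hypothetical, chosen only to make requirements (a) and (c) to (f) concrete. Requirement (b), wrapping existing conversion utilities, is not shown.

```python
import fnmatch
from dataclasses import dataclass, field


@dataclass
class Document:
    """Common internal form that every document plugin converts to."""
    name: str
    text: str
    metadata: dict = field(default_factory=dict)


class TextPlugin:
    """Hypothetical document plugin: claims *.txt files (requirement a)."""
    pattern = "*.txt"

    def convert(self, name, raw):
        return Document(name=name, text=raw)


class MetadataFilePlugin:
    """Hypothetical metadata plugin: parses 'glob key=value' lines
    from a separate metadata file (requirements c, d, f)."""
    pattern = "*.meta"

    def parse(self, raw):
        rules = []
        for line in raw.splitlines():
            if not line.strip():
                continue
            glob, pair = line.split(None, 1)
            key, value = pair.split("=", 1)
            rules.append((glob, key.strip(), value.strip()))
        return rules


def word_count(doc):
    """Hypothetical extractor: metadata computed from content (requirement e)."""
    return {"WordCount": len(doc.text.split())}


def import_collection(files, doc_plugins, meta_plugins, extractors):
    """Offer each file to the plugins in order, then attach metadata."""
    docs, rules = [], []
    for name, raw in files.items():
        for plugin in doc_plugins:
            if fnmatch.fnmatch(name, plugin.pattern):
                docs.append(plugin.convert(name, raw))
                break
        else:
            # No document plugin claimed the file: try the metadata plugins.
            for plugin in meta_plugins:
                if fnmatch.fnmatch(name, plugin.pattern):
                    rules.extend(plugin.parse(raw))
                    break
    for doc in docs:
        # A rule applies to every document whose name matches its glob,
        # so one line can cover a single document or a whole set.
        for glob, key, value in rules:
            if fnmatch.fnmatch(doc.name, glob):
                doc.metadata[key] = value
        for extract in extractors:
            doc.metadata.update(extract(doc))
    return docs


# Usage with in-memory "files" (names and contents are made up).
files = {
    "a.txt": "hello digital library world",
    "b.txt": "second document",
    "coll.meta": "*.txt Creator=Unknown\nb.txt Title=Second",
}
docs = import_collection(
    files, [TextPlugin()], [MetadataFilePlugin()], [word_count]
)
```

Under these assumptions, supporting a new document format or metadata source means adding one more plugin object to the relevant list, leaving `import_collection` unchanged.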
Citations
Journal Article
Towards a digital library theory: a formal digital library ontology
TL;DR: A formal ontology for DLs is proposed that defines the fundamental concepts, relationships, and axiomatic rules that govern the DL domain, therefore providing a frame of reference for the discussion of essential concepts of DL design and construction.
Journal Article
Automatic Scientific Document Clustering Using Self-organized Multi-objective Differential Evolution
TL;DR: The effectiveness of the proposed approach, namely self-organizing map based multi-objective document clustering technique (SMODoc_clust) is shown in automatic classification of some scientific articles and web-documents.
Proceedings Article
Visual collaging of music in a digital library
TL;DR: A prototype system is described that combines images located through textual metadata with a visualisation technique known as collaging to provide a leisurely, undirected interaction with a music collection.
Proceedings Article
A new framework for building digital library collections
TL;DR: This paper introduces a new framework for building digital library collections, contrasts it with existing systems, and demonstrates its flexibility by showing how digital library collections can be extended and altered to satisfy new requirements.
Patent
Creating variations when transforming data into consumable content
Jennifer P. Michelstein, David Benjamin Lee, Katrika Morris, Christopher H. Pratley, Sarah Faulkner, Steven Richard Hollasch, Nathaniel George Freier, Hai Liu, Chad Garrett Waldman, Brett D. Brewer +9 more
TL;DR: In this article, a computing device can execute a transformation engine for transforming data into consumable content, which can be configured to analyze the data to identify relationships among data elements or other portions of the data, and to identify any possible approaches to transforming the data ("worlds") based upon the relationships and the data.
References
Proceedings Article
Inductive learning algorithms and representations for text categorization
TL;DR: Five different automatic learning algorithms for text categorization are compared in terms of learning speed, real-time classification speed, and classification accuracy.
Proceedings Article
Domain-specific keyphrase extraction
TL;DR: This paper shows that a simple procedure for keyphrase extraction based on the naive Bayes learning scheme performs comparably to the state of the art, and explains how this procedure's performance can be boosted by automatically tailoring the extraction process to the particular document collection at hand.
Journal Article
The Santa Fe Convention of the Open Archives Initiative
TL;DR: The convention presents a simple technical and organizational framework to support basic interoperability among e-print archives and participants have expressed the intention of implementing this framework to allow for interoperability experiments in the course of the year 2000.
Proceedings Article
Using compression to identify acronyms in text
TL;DR: This article used several PPMD models to encode the acronym in terms of its definition, including whether the acronym occurred before or after its definition (direction), distance between the acronym and the definition (first-word offset), the pattern of words in the definition with letters in the acronym (subsequent-word offsets), and the number of letters taken from each of those words.
Proceedings Article
Power to the people: end-user building of digital library collections
TL;DR: An interface that makes it easy for people to build their own library collections and an interface for the administrative user who is responsible for maintaining a digital library installation are described.