
Showing papers presented at "ACM international conference on Digital libraries in 2000"


Proceedings ArticleDOI
01 Jun 2000
TL;DR: This paper develops a scalable evaluation methodology and metrics for the task, and presents a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.
Abstract: Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use for answering precise queries or running data mining tasks. We explore a technique for extracting such tables from document collections that requires only a handful of training examples from users. These examples are used to generate extraction patterns, which in turn result in new tuples being extracted from the document collection. We build on this idea and present our Snowball system. Snowball introduces novel strategies for generating patterns and extracting tuples from plain-text documents. At each iteration of the extraction process, Snowball evaluates the quality of these patterns and tuples without human intervention, and keeps only the most reliable ones for the next iteration. In this paper we also develop a scalable evaluation methodology and metrics for our task, and present a thorough experimental evaluation of Snowball and comparable techniques over a collection of more than 300,000 newspaper documents.
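Snowball's bootstrapping loop can be illustrated with a minimal sketch. The toy corpus, the organization/location relation, and the simple middle-context patterns below are illustrative assumptions; the real system uses much richer pattern representations and confidence scoring over 300,000 documents.

```python
import re

# Toy corpus; the real Snowball ran over 300,000 newspaper documents.
docs = [
    "Microsoft's headquarters in Redmond is expanding.",
    "Exxon's headquarters in Irving employs thousands.",
    "Boeing's headquarters in Seattle moved later.",
]

# A handful of user-supplied seed tuples starts the process.
seeds = {("Microsoft", "Redmond")}

def learn_patterns(docs, tuples):
    """Turn each seed occurrence into a middle-context pattern string."""
    patterns = set()
    for org, loc in tuples:
        for d in docs:
            m = re.search(re.escape(org) + r"(.*?)" + re.escape(loc), d)
            if m:
                patterns.add(m.group(1))  # e.g. "'s headquarters in "
    return patterns

def extract_tuples(docs, patterns):
    """Apply each learned pattern to harvest new (org, location) pairs."""
    found = set()
    for p in patterns:
        for d in docs:
            for m in re.finditer(r"(\w+)" + re.escape(p) + r"(\w+)", d):
                found.add((m.group(1), m.group(2)))
    return found

# One bootstrapping iteration: seeds -> patterns -> new tuples.
patterns = learn_patterns(docs, seeds)
tuples = extract_tuples(docs, patterns)
```

In the full system each iteration would also score the new patterns and tuples and keep only the reliable ones before looping again.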

1,399 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: This work describes a content-based book recommending system that utilizes information extraction and a machine-learning algorithm for text categorization; initial experimental results demonstrate that this approach can produce accurate recommendations.
Abstract: Recommender systems improve access to relevant products and information by making personalized suggestions based on previous examples of a user's likes and dislikes. Most existing recommender systems use collaborative filtering methods that base recommendations on other users' preferences. By contrast, content-based methods use information about an item itself to make suggestions. This approach has the advantage of being able to recommend previously unrated items to users with unique interests and to provide explanations for its recommendations. We describe a content-based book recommending system that utilizes information extraction and a machine-learning algorithm for text categorization. Initial experimental results demonstrate that this approach can produce accurate recommendations.

1,330 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A simplified two-dimensional display that uses categorical and hierarchical axes, called hieraxes, applied to a digital video library of science topics used by middle school teachers, a legal information system, and a technical library using the ACM Computing Classification System.
Abstract: Digital library search results are usually shown as a textual list, with 10-20 items per page. Viewing several thousand search results at once on a two-dimensional display with continuous variables is a promising alternative. Since these displays can overwhelm some users, we created a simplified two-dimensional display that uses categorical and hierarchical axes, called hieraxes. Users appreciate the meaningful and limited number of terms on each hieraxis. At each grid point of the display we show a cluster of color-coded dots or a bar chart. Users see the entire result set and can then click on labels to move down a level in the hierarchy. Handling broad hierarchies and arranging for imposed hierarchies led to additional design innovations. We applied hieraxes to a digital video library of science topics used by middle school teachers, a legal information system, and a technical library using the ACM Computing Classification System. Feedback from usability testing with 32 subjects revealed strengths and weaknesses.

183 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A web server for acronym and abbreviation lookup, containing a collection of acronyms and their expansions gathered from a large number of web pages by a heuristic extraction process, which has the potential to be much more inclusive as data from more web pages are processed.
Abstract: We implemented a web server for acronym and abbreviation lookup, containing a collection of acronyms and their expansions gathered from a large number of web pages by a heuristic extraction process. Several different extraction algorithms were evaluated and compared. The corpus resulting from the best algorithm is comparable to a high-quality hand-crafted site, but has the potential to be much more inclusive as data from more web pages are processed.
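The kind of heuristic extraction described above can be sketched as a simple initial-matching rule: a parenthesized uppercase token whose letters match the initials of the preceding words. This is a hypothetical simplification; the paper evaluates several more elaborate algorithms.

```python
import re

def find_acronyms(text):
    """Heuristic: a parenthesized uppercase token whose letters match
    the initials of the words immediately preceding it."""
    pairs = {}
    for m in re.finditer(r"\(([A-Z]{2,})\)", text):
        acro = m.group(1)
        # Take as many preceding words as the acronym has letters.
        words = text[:m.start()].split()[-len(acro):]
        if len(words) == len(acro) and all(
            w[0].upper() == c for w, c in zip(words, acro)
        ):
            pairs[acro] = " ".join(words)
    return pairs

sample = "The World Wide Web (WWW) and Digital Object Identifier (DOI) ..."
print(find_acronyms(sample))
# → {'WWW': 'World Wide Web', 'DOI': 'Digital Object Identifier'}
```

Running such a rule over many crawled pages yields a lookup corpus that grows as more pages are processed, which is the inclusiveness argument the abstract makes.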

151 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: The Greenstone digital library software is described, a comprehensive, open-source system for the construction and presentation of information collections that offers effective full-text searching and metadata-based browsing facilities that are attractive and easy to use.
Abstract: This paper describes the Greenstone digital library software, a comprehensive, open-source system for the construction and presentation of information collections. Collections built with Greenstone offer effective full-text searching and metadata-based browsing facilities that are attractive and easy to use. Moreover, they are easily maintainable and can be augmented and rebuilt entirely automatically. The system is extensible: software "plugins" accommodate different document and metadata types.

137 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A technique for capturing an accurate 3D representation of library materials which can be integrated directly into current digitization setups will allow digitization efforts to provide patrons with more realistic digital facsimiles of library materials.
Abstract: Significant efforts are being made to digitize rare and valuable library materials, with the goal of providing patrons and historians digital facsimiles that capture the "look and feel" of the original materials. This is often done by digitally photographing the materials and making high resolution 2D images available. The underlying assumption is that the objects are flat. However, older materials may not be flat in practice, being warped and crinkled due to decay, neglect, accident and the passing of time. In such cases, 2D imaging is insufficient to capture the "look and feel" of the original. For these materials, 3D acquisition is necessary to create a realistic facsimile. This paper outlines a technique for capturing an accurate 3D representation of library materials which can be integrated directly into current digitization setups. This will allow digitization efforts to provide patrons with more realistic digital facsimiles of library materials.

125 citations


Proceedings ArticleDOI
H. Kawano1
13 Nov 2000
TL;DR: This work describes the Japanese Web search engine "Mondou (RCAAU)", one of the first generation of Web search engines, and introduces the concept of an integrated query mechanism for different search engines based on KQML agents.
Abstract: As the volume of Web pages on the Internet increases rapidly, it is becoming hard for users to discover valuable Web resources. It is especially difficult for naive users to discover informative pages with popular Web search engines, since they lack background and domain knowledge about the status of Web systems. Therefore, many kinds of Web search engines have been developed to support the processes of Web information retrieval. We are developing the Japanese Web search engine "Mondou (RCAAU)". Though our engine is one of the first generation of Web search engines, we have tried to implement rapidly emerging data mining technologies in our search engine since 1995. We are also implementing Java applets based on information visualization. The author presents technical overviews of the Mondou Web search engine. One of the most important techniques is the text mining algorithm based on primitive association rules. Mondou provides highly relevant feedback keywords to users in order to support search steps. Using the associative keywords, users can modify the combination of keywords in the initial query. We also introduce the concept of an integrated query mechanism for different search engines based on KQML agents. Furthermore, in order to visualize the characteristics of search results, we are developing Java applets to display the ROC graph and clusters of specific documents. We are also trying to improve the Web robots for the Mondou system from the viewpoint of data cleaning. Finally, we discuss the effectiveness and performance of our Web search engine.

95 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A system, based on a novel spatial/visual knowledge principle, for extracting metadata from scientific papers stored as PostScript files that embeds the general knowledge about the graphical layout of a scientific paper to guide the metadata extraction process.
Abstract: Automatic document metadata extraction is an important task in a world where thousands of documents are just one "click" away; powerful indices are necessary to support effective retrieval. The upcoming XML standard represents an important step in this direction, as its semistructured representation conveys document metadata together with the text of the document. For example, retrieval of scientific papers by authors or affiliations would be a straightforward task if papers were stored in XML. Unfortunately, today, the large majority of documents on the web are available in forms that do not carry additional semantics. Converting existing documents to a semistructured representation is time consuming and no automatic process can be easily applied. In this paper we discuss a system, based on a novel spatial/visual knowledge principle, for extracting metadata from scientific papers stored as PostScript files. Our system embeds general knowledge about the graphical layout of a scientific paper to guide the metadata extraction process, and can effectively assist automatic index creation for digital libraries.

91 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: A flexible and dynamic mediator infrastructure that allows mediators to be composed from a set of modules ("blades"), each of which implements a particular mediation function, such as protocol translation, query translation, or result merging.
Abstract: Digital library mediators allow interoperation between diverse information services. In this paper we describe a flexible and dynamic mediator infrastructure that allows mediators to be composed from a set of modules ("blades"). Each module implements a particular mediation function, such as protocol translation, query translation, or result merging. All the information used by the mediator, including the mediator logic itself, is represented by an RDF graph. We illustrate our approach using a mediation scenario involving a Dienst and a Z39.50 server, and we discuss the potential advantages and weaknesses of our framework.

82 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: Results indicate that annotations improve recall of emphasized items, influence how specific arguments in the source materials are perceived, and decrease students' tendencies to unnecessarily summarize; implications for the design and implementation of digitally annotated materials are discussed.
Abstract: Recent research on annotations has focused on how readers annotate texts, ignoring the question of how reading annotations might affect subsequent readers of a text. This paper reports on a study of persuasive essays written by 123 undergraduates receiving primary source materials annotated in various ways. Findings indicate that annotations improve recall of emphasized items, influence how specific arguments in the source materials are perceived, and decrease students' tendencies to unnecessarily summarize. Of particular interest is that students' perceptions of the annotator appeared to greatly influence how they responded to the annotated material. Using this study as a basis, I discuss implications for the design and implementation of digitally annotated materials.

81 citations


Proceedings ArticleDOI
01 Jun 2000
TL;DR: This paper describes the preservation approach adopted in the Victorian Electronic Record Strategy (VERS) which is currently being trialed within the Victorian government, one of the states of Australia.
Abstract: Well within our lifetime we can expect to see most information being created, stored and used digitally. Despite the growing importance of digital data, the wider community pays almost no attention to the problems of preserving this digital information for the future. Even within the archival and library communities most work on digital preservation has been theoretical, not practical, and highlights the problems rather than giving solutions. Physical libraries have to preserve information for long periods and this is no less true of their digital equivalents. This paper describes the preservation approach adopted in the Victorian Electronic Record Strategy (VERS) which is currently being trialed within the Victorian government, one of the states of Australia. We review the various preservation approaches that have been suggested and describe in detail encapsulation, the approach which underlies the VERS format. A key difference between the VERS project and previous digital preservation projects is the focus within VERS on the construction of actual systems to test and implement the proposed technology. VERS is not a theoretical study in preservation.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: The Open Citation project is described, which will focus on linking papers held in freely accessible eprint archives such as the Los Alamos physics archives and other distributed archives, and which will build on the work of the Open Archives initiative to make the data held in such archives available to compliant services.
Abstract: The rapid growth of scholarly information resources available in electronic form and their organisation by digital libraries is proving fertile ground for the development of sophisticated new services, of which citation linking will be one indispensable example. Many new projects, partnerships and commercial agreements have been announced to build citation linking applications. This paper describes the Open Citation (OpCit) project, which will focus on linking papers held in freely accessible eprint archives such as the Los Alamos physics archives and other distributed archives, and which will build on the work of the Open Archives initiative to make the data held in such archives available to compliant services. The paper emphasises the work of the project in the context of emerging digital library information environments, explores how a range of new linking tools might be combined and identifies ways in which different linking applications might converge. Some early results of linked pages from the OpCit project are reported.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site and the ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing.
Abstract: Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site. Phrases are extracted from the full text using a novel combination of rudimentary syntactic processing and sequential grammar induction techniques. The interface is simple, robust and easy to use. To convey a feeling for the quality of the phrases that are generated automatically, a thesaurus used by the organization responsible for the Web site is studied and its degree of overlap with the phrases in the hierarchy is analyzed. Our ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing: the latter provides an authoritative domain vocabulary and the former augments coverage in areas the thesaurus does not reach.
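A crude stand-in for the phrase extraction step is counting recurring word n-grams; the sample text below is hypothetical, and the paper's actual method combines rudimentary syntactic processing with sequential grammar induction rather than raw counting.

```python
from collections import Counter
import re

def repeated_phrases(text, n=2, min_count=2):
    """Count word n-grams that recur; recurring phrases can seed a
    browsing hierarchy (the paper's grammar-induction step is far
    more sophisticated than this)."""
    words = re.findall(r"[a-z]+", text.lower())
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {" ".join(g): c for g, c in grams.items() if c >= min_count}

text = ("forest management plan; forest management practice; "
        "management plan review")
print(repeated_phrases(text))
# → {'forest management': 2, 'management plan': 2}
```

Nesting such phrases by shared prefixes ("forest management" above "forest management plan") then yields the hierarchy the interface browses.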

Proceedings ArticleDOI
01 Jun 2000
TL;DR: The Compus visualization system that assists in the exploration and analysis of structured document corpora encoded in XML, providing a synoptic visualization of a corpus and allowing for dynamic queries and structural transformations, assists researchers in finding regularities or discrepancies leading to a higher level analysis of historic source.
Abstract: This article describes the Compus visualization system that assists in the exploration and analysis of structured document corpora encoded in XML. Compus has been developed for and applied to a corpus of 100 French manuscript letters of the 16th century, transcribed and encoded for scholarly analysis using the recommendations of the Text Encoding Initiative. By providing a synoptic visualization of a corpus and allowing for dynamic queries and structural transformations, Compus assists researchers in finding regularities or discrepancies, leading to a higher-level analysis of historic sources. Compus can be used with other richly encoded text corpora as well.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: The MatchDetectReveal (MDR) system is capable of identifying overlapping and plagiarised documents; its matching engine uses a modified suffix tree representation to identify exact overlapping chunks, and its performance is also presented.
Abstract: In this paper we introduce the MatchDetectReveal (MDR) system, which is capable of identifying overlapping and plagiarised documents. Each component of the system is briefly described. The matching-engine component uses a modified suffix tree representation that identifies exact overlapping chunks; its performance is also presented.
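The core matching task, finding the longest exact overlapping chunk between two documents, can be sketched with Python's standard library. Here difflib stands in for MDR's modified suffix tree, which solves the same problem with better scaling; the two sample documents are hypothetical.

```python
from difflib import SequenceMatcher

def longest_overlap(a, b):
    """Return the longest exact overlapping chunk between two texts.
    SequenceMatcher is a stand-in for a suffix-tree matcher: it finds
    the same chunk, just with worse asymptotic behaviour."""
    m = SequenceMatcher(None, a, b, autojunk=False)
    match = m.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]

doc1 = "the quick brown fox jumps over the lazy dog"
doc2 = "he saw the quick brown fox jump away"
print(longest_overlap(doc1, doc2))  # → "the quick brown fox jump"
```

A plagiarism detector would repeat this over all document pairs and flag pairs whose overlapping chunks exceed some length threshold.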

Proceedings ArticleDOI
13 Nov 2000
TL;DR: A framework of requirements, covering the design space of customizable Web applications, is suggested, and existing approaches for developing customizable Web applications are surveyed; general shortcomings are identified, pointing the way to next-generation modeling methods.
Abstract: The Web is more and more used as a platform for full-fledged increasingly complex applications, where a huge amount of change-intensive data is managed by underlying database systems. From a software engineering point of view, the development of Web applications requires proper modeling methods in order to ensure architectural soundness and maintainability. Existing modeling methods for Web applications, however, fall short on considering a major requirement posed on today's Web applications, namely customization. Web applications should be customizable with respect to various context factors comprising different user preferences, device capabilities and locations in mobile scenarios, to mention just a few. The goal of this paper is twofold. First, a framework of requirements, covering the design space of customizable Web applications is suggested. Second, on the basis of this framework, existing approaches for developing customizable Web applications are surveyed and general shortcomings are identified pointing the way to next-generation modeling methods.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A general framework for reverse engineering the underlying structures, i.e. the DTD, from a collection of similarly structured XML documents that share some common but unknown DTD.
Abstract: To realize a wide range of applications (including digital libraries) on the Web, a more structured way of accessing the Web is required, and such a requirement can be facilitated by the use of the XML standard. In this paper, we propose a general framework for reverse engineering (or re-engineering) the underlying structures, i.e., the DTD, from a collection of similarly structured XML documents that share some common but unknown DTDs. The essential data structures and algorithms for DTD generation have been developed, and experiments on real Web collections have been conducted to demonstrate their feasibility. In addition, we also propose a method of imposing a constraint on the repetitiveness of elements in a DTD rule to further simplify the generated DTD without compromising its correctness.
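A toy version of DTD inference can be sketched as follows. The element names, the two-document collection, and the naive quantifier rules are illustrative assumptions; the paper's algorithms handle far more general structures.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical mini-collection sharing an unknown DTD.
docs = [
    "<paper><title>A</title><author>X</author><author>Y</author></paper>",
    "<paper><title>B</title><author>Z</author></paper>",
]

def infer_rules(docs):
    """Collect each element's observed child sequences, then emit a
    naive rule: a child seen more than once gets '+', a child missing
    from some document gets '?' (or '*' if both apply)."""
    children = defaultdict(list)
    for d in docs:
        root = ET.fromstring(d)
        for elem in root.iter():
            children[elem.tag].append([c.tag for c in elem])
    rules = {}
    for tag, seqs in children.items():
        names = []
        for seq in seqs:
            for c in seq:
                if c not in names:
                    names.append(c)
        parts = []
        for name in names:
            counts = [s.count(name) for s in seqs]
            suffix = "+" if max(counts) > 1 else ""
            if min(counts) == 0:
                suffix = "*" if suffix else "?"
            parts.append(name + suffix)
        rules[tag] = (f"<!ELEMENT {tag} ({', '.join(parts)})>"
                      if parts else f"<!ELEMENT {tag} (#PCDATA)>")
    return rules

print(infer_rules(docs)["paper"])  # → <!ELEMENT paper (title, author+)>
```

The paper's repetitiveness constraint corresponds to collapsing long observed repetitions into a single quantified element, as the `+` does here.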

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A comprehensive digital collection of Taiwan's butterflies to provide a modern research environment on butterflies for academic institutions, as well as an interactive butterfly educational environment for the general public.
Abstract: Taiwan is renowned for its great variety of butterflies. There are about 400 species, a number of which are unique to Taiwan, across its 36,500 sq km of land. Last year we built a comprehensive digital collection of Taiwan's butterflies to provide a modern research environment on butterflies for academic institutions, as well as an interactive butterfly educational environment for the general public. Our digital museum emphasizes ease of use, and provides a number of innovative features to help the user fully utilize the information provided by the system. The digital museum is accessible through the Web at http://digimuse.nmns.edu.tw.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: Findings from a library technologies user survey and on-site mobile library access prototype testing are outlined, and future research directions derived from the results of these two studies are presented.
Abstract: Digital library research is made more robust and effective when end-user opinions and viewpoints inform the research, design and development process. A rich understanding of user tasks and contexts is especially necessary when investigating the use of mobile computers in traditional and digital library environments, since the nature and scope of the research questions at hand remain relatively undefined. This paper outlines findings from a library technologies user survey and on-site mobile library access prototype testing, and presents future research directions that can be derived from the results of these two studies.

Proceedings ArticleDOI
Katy Börner1
01 Jun 2000
TL;DR: An approach that organizes retrieval results semantically and displays them spatially for browsing is introduced, implemented to visualize retrieval results from two different databases: the Science Citation Index Expanded and theDido Image Bank.
Abstract: The paper introduces an approach that organizes retrieval results semantically and displays them spatially for browsing. Latent Semantic Analysis as well as cluster techniques are applied for semantic data analysis. A modified Boltzmann algorithm is used to lay out documents in a two-dimensional space for interactive exploration. The approach was implemented to visualize retrieval results from two different databases: the Science Citation Index Expanded and the Dido Image Bank.

Proceedings ArticleDOI
13 Nov 2000
TL;DR: This work presents a methodology, architecture, and proof-of-concept prototype for query construction and results analysis that provides the user with a ranking of choices based on the user's determination of importance.
Abstract: The World Wide Web provides access to a great deal of information on a vast array of subjects. In a typical Web search a vast amount of information is retrieved. The quantity can be overwhelming, and much of the information may be marginally relevant or completely irrelevant to the user's request. We present a methodology, architecture, and proof-of-concept prototype for query construction and results analysis that provides the user with a ranking of choices based on the user's determination of importance. The user initially designs the query with assistance from the user's profile, a thesaurus, and previously constructed queries acting as a taxonomy of the information requirements. After the query has returned its results, decision analytic methods and information source reliability information are used in conjunction with the expanded taxonomy to rank the solution candidates.


Proceedings ArticleDOI
01 Jun 2000
TL;DR: MiBiblio allows users to create virtual places the authors term personal spaces; as users find useful items in the repositories, they organize these items and keep them handy in their personal spaces for future use.
Abstract: This paper describes MiBiblio, a highly personalizable interface to large collections in digital libraries. MiBiblio allows users to create virtual places we term personal spaces. As users find useful items in the repositories, they organize these items and keep them handy in their personal spaces for future use. Personal spaces may also be updated by user agents.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A browser that provides a framework for interactive summaries of video narratives, applied to Corduroy, a children's short feature which was analyzed in detail.
Abstract: Stories may be analyzed as sequences of causally-related events and reactions to those events by the characters. We employ a notation of plot elements, similar to one developed by Lehnert, and we extend it by forming higher-level "story threads". We apply the browser to Corduroy, a children's short feature which was analyzed in detail. We provide additional illustrations with an analysis of Kiss of Death, a film noir classic. Effectively, the browser provides a framework for interactive summaries of the video narrative.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A case is presented for digital scholarship in which patrons perform all scholarly work electronically, and a proposal is made for patron-augmented digital libraries (PADLs), a class of digital libraries that supports the digital scholarship of its patrons.
Abstract: Digital library research is mostly focused on the generation of large collections of multimedia resources and state-of-the-art tools for their indexing and retrieval. However, digital libraries should provide more than advanced collection maintenance and retrieval services since the ultimate goal of any (academic) library is to serve the scholarly needs of its users. This paper begins by presenting a case for digital scholarship in which patrons perform all scholarly work electronically. A proposal is then made for patron-augmented digital libraries (PADLs), a class of digital libraries that supports the digital scholarship of its patrons. Finally, a prototype PADL (called Synchrony) providing access to video segments and associated textual transcripts is described. Synchrony allows patrons to search the library for artifacts, create annotations/original compositions, integrate these artifacts to form synchronized mixed text and video presentations and, after suitable review, publish these presentations into the digital library if desired. A study to evaluate the PADL concept and the usability of Synchrony is also discussed. The study revealed that participants were able to use Synchrony for the authoring and publishing of presentations and that attitudes toward PADLs were generally positive.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: A preliminary study was conducted to help understand the purpose of digital libraries and to investigate whether meaningful results could be obtained from small user studies of digital libraries.
Abstract: A preliminary study was conducted to help understand the purpose of digital libraries (DLs) and to investigate whether meaningful results could be obtained from small user studies of digital libraries. Results stress the importance of mental models, and of "traditional" library support.

Proceedings ArticleDOI
13 Nov 2000
TL;DR: While efforts have focused quite a bit on short-term access that occurs over the duration of a course, it is clear that significant value is added to the archive as it is tuned for long-term use.
Abstract: Since 1995, we have been researching the application of ubiquitous computing technology to support the automated capture of live university lectures so that students and teachers may later access them. With virtually no additional effort beyond that which lecturers already expend on preparing and delivering a lecture, we are able to create a repository, or digital library, of rich educational experiences that is constantly growing. The resulting archive includes a heterogeneous mix of materials presented in lectures. We discuss access issues for this digital library that cover short-term and long-term use of the repository. While our efforts have focused quite a bit on short-term access that occurs over the duration of a course, it is clear that significant value is added to the archive as it is tuned for long-term use. These long-term access issues for an experiential digital library have not yet been addressed, and we highlight some of those challenges in this paper.

Proceedings ArticleDOI
13 Nov 2000
TL;DR: The paper proposes a technique for automatically generating Qualified Dublin Core metadata (Weibel, 2000) on a Web server using a Java Servlet, structured using the Resource Description Framework (RDF) and expressed in eXtensible Markup Language (XML).
Abstract: The paper proposes a technique for automatically generating Qualified Dublin Core metadata (Weibel, 2000) on a Web server using a Java Servlet. The metadata is structured using the Resource Description Framework (RDF) and expressed in eXtensible Markup Language (XML). The descriptions cover ten of the fifteen standard Dublin Core metadata elements, and semantic precision is increased by element refinement and encoding scheme qualifiers. The servlet produces rich but interoperable metadata encompassing data from all three of the main element groups: content, instantiation and intellectual property. The generated descriptions could most obviously be used by tools for resource discovery but also by local data management applications. Sites wishing to submit content to a portal or specialised digital library could be encouraged to run such an application in order to automate resource description.
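The shape of the generated output can be sketched with Python's standard library instead of a Java Servlet. The page URL and field values below are hypothetical; the actual servlet derives them from server and document data, and adds the refinement and encoding-scheme qualifiers the abstract mentions.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core and RDF namespace URIs.
DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("dc", DC)
ET.register_namespace("rdf", RDF)

def describe(url, fields):
    """Wrap extracted page properties in an RDF/XML Dublin Core record."""
    rdf = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": url})
    for name, value in fields.items():
        ET.SubElement(desc, f"{{{DC}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")

record = describe("http://example.org/page", {
    "title": "Sample Page",       # content group
    "creator": "A. Author",       # intellectual property group
    "language": "en",             # instantiation group
})
```

Because the record uses standard namespaces, any RDF-aware harvester or portal can consume it without site-specific parsing, which is the interoperability point the abstract makes.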

Proceedings ArticleDOI
01 Jun 2000
TL;DR: User interaction and an automatically created thesaurus that maps text concepts and internal image concept representations, generated by various feature extraction algorithms, improve the query formulation process of the image retrieval system.
Abstract: Multimedia information retrieval in digital libraries is a difficult task for computers in general. Humans, on the other hand, are experts in perception, concept representation, knowledge organization and memory retrieval. Cognitive psychology and science describe how cognition works in humans, but can offer valuable clues to information retrieval researchers as well. Cognitive psychologists view the human mind as a general-purpose symbol-processing system that interacts with the world. A multimedia information retrieval system can also be regarded as a symbol-processing system that interacts with its environment, and its underlying information retrieval model can be seen as a cognitive framework. We describe the design and implementation of a combined text/image retrieval system (as an example of a multimedia retrieval system) that is inspired by cognitive theories such as Paivio's dual coding theory and Marr's theory of perception. User interaction and an automatically created thesaurus that maps text concepts to internal image concept representations, generated by various feature extraction algorithms, improve the query formulation process of the image retrieval system. Unlike most "multimedia databases" found in the literature, this image retrieval system uses the functionality provided by an extensible multimedia DBMS that is itself part of an open distributed environment.

Proceedings ArticleDOI
01 Jun 2000
TL;DR: The focus of this study was the effect of clustering techniques and query highlighting on search strategy users develop in the virtual environment, and whether position or spatial arrangement influenced user behavior.
Abstract: In this paper we present a 2x3 factorial design study evaluating the limits and differences in the behavior of 10 users when searching in a virtual reality representation that mimics the arrangement of a traditional library. The focus of this study was the effect of clustering techniques and query highlighting on the search strategies users develop in the virtual environment, and whether position or spatial arrangement influenced user behavior. We found several particularities that can be attributed to the differences in the VR environment. This study's results identify: 1) the need to co-design both spatial arrangement and interaction method; 2) a difficulty novice users faced when using clusters to identify common topics; 3) the influence of position and distance on users' selection of collection items to inspect; and 4) that users did not search until they found the best match, but only until they found a satisfactory match.