Author
Colleen Fallaw
Bio: Colleen Fallaw is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in the topics Digital library and Metadata. The author has an h-index of 3 and has co-authored 12 publications that have received 33 citations.
Papers
21 Jun 2015
TL;DR: This paper surveys the coverage of existing bibliographic ontologies in the context of meeting these scholarly needs, and provides an illustrated discussion of potential extensions that might fully realize a solution.
Abstract: Bibliographic metadata standards are a longstanding mechanism for Digital Libraries to manage records and express relationships between them. As digital scholarship, particularly in the humanities, incorporates and manipulates these records in an increasingly direct manner, existing systems are proving insufficient for providing the underlying addressability and relational expressivity required to construct and interact with complex research collections. In this paper we describe motivations for these "worksets" and the technical requirements they raise. We survey the coverage of existing bibliographic ontologies in the context of meeting these scholarly needs, and finally provide an illustrated discussion of potential extensions that might fully realize a solution.
12 citations
12 Sep 2014
TL;DR: An exploratory bibliometric study to examine and characterize music-related content in the HathiTrust Digital Library to determine in what ways the materials in HTDL could be considered to form a unique music digital library for use by musicology scholars and students.
Abstract: The HathiTrust Digital Library (HTDL) consists of digitized print materials contributed from the collections of some of the foremost research libraries of the world. The HTDL contains over 11 million volumes comprising approximately 3.9 billion pages. In this paper, we describe an exploratory bibliometric study to examine and characterize music-related content in the HTDL. Our study provides an overview of the music-related content in the HTDL as seen through the lenses of format, genre, language, and chronology. We seek to determine in what ways, if any, the materials in HTDL could be considered to form a unique music digital library for use by musicology scholars and students. We also suggest ways in which the music-related content of the HTDL holdings could be made more useful to users with musicological needs and interests.
4 citations
08 Sep 2014
TL;DR: An ongoing assessment of the utility of the MARC-based metadata underlying the HathiTrust Digital Library is reported on and the implications for advanced computational access to texts in the HathiTrust are explored.
Abstract: Print-based libraries use metadata (specifically MARC catalog records) for both bibliographic control and to support discovery through online public access catalogs. Depending on its accuracy, completeness, and detail, metadata can afford an aerial view of a collection's topical strengths, scope of coverage, and item-to-item relationships, but the view offered is in part a function of metadata design. Most MARC records were created to support management of large print collections and optimized to meet the requirements of library online public access catalogs. How well do pre-existing MARC records serve the discovery needs of scholars using a large-scale digital library hosting collections of retrospectively digitized books and serials? This paper reports on an ongoing assessment of the utility of the MARC-based metadata underlying the HathiTrust Digital Library and explores the implications for advanced computational access to texts in the HathiTrust. We consider here the utility of metadata to scholars creating worksets for analysis, examining three user scenarios, which were gleaned from an ongoing user-requirements study done for the HathiTrust Research Center: (1) using metadata fields in combination for corpus characterization and discovery; (2) relying on metadata to identify resources of interest; and (3) using bibliographies of known items to seed research worksets. Our goal is to better understand the need for metadata remediation and augmentation and assess the scope of additional work required.
3 citations
Cited by
TL;DR: An overview of the different complex matching approaches is provided and a classification of the complex matching approaches based on their specificities (i.e., type of correspondences, guiding structure) is proposed.
Abstract: Simple ontology alignments, largely studied in the literature, link a single entity of a source ontology to a single entity of a target ontology. One of the limitations of these alignments is, however, their lack of expressiveness which can be overcome by complex alignments. While diverse state-of-the-art surveys mainly review the matching approaches in general, to the best of our knowledge, there is no study about the specificities of the complex matching problem. In this paper, an overview of the different complex matching approaches is provided. It proposes a classification of the complex matching approaches based on their specificities (i.e., type of correspondences, guiding structure). The evaluation aspects and the limitations of these approaches are also discussed. Insights for future work in the field are provided.
48 citations
TL;DR: This paper examines a selection of digitized e-books in several prominent digital repositories and discusses the impact of OCR technology on e-book text file formats, metadata, and the online reading experience.
Abstract: The electronic conversion of scanned image files to readable text using optical character recognition (OCR) software and the subsequent migration of raw OCR text to e-book text file formats are key remediation or media conversion technologies used in digital repository e-book production. Despite real progress, the OCR problem of reliability and accuracy in OCR-derived e-book text and metadata persists. This paper examines a selection of digitized e-books in several prominent digital repositories and discusses the impact of OCR technology on e-book text file formats, metadata, and the online reading experience.
11 citations