Proceedings Article
Metadata and data structures for the historical newspaper digital library
Robert B. Allen, John Schalow +1 more
- pp 147-153
TLDR
This paper examines metadata and data-structure issues for the Historical Newspaper Digital Library. It proposes frameworks for logical structure and physical layout, along with metadata relevant to image processing and to the historians who will use the collection.
Abstract:
We examine metadata and data-structure issues for the Historical Newspaper Digital Library. This project proposes to digitize and then perform OCR and linguistic processing on several years' worth of historical newspapers. Newspapers are very complex information objects, so developing a rich description of their content is challenging. In addition to frameworks for the logical structure and physical layout, we propose metadata relevant to the image processing and to the historians who will use this collection. Finally, we consider how the metadata infrastructure might be managed as it evolves with improved text-processing capabilities and how an infrastructure might be developed to support a community of users.
Citations
Book
Reading and Writing the Electronic Book
TL;DR: This book begins with a brief overview of the history of electronic books, including the social and technical forces that have shaped their development, and takes a closer look at the sociality of reading: how people read in groups and how they share what they read.
Journal Article
The architecture of TrueViz: a groundTRUth/metadata editing and VIsualiZing ToolKit
Chang Ha Lee, Tapas Kanungo +1 more
TL;DR: TrueViz is implemented in the Java programming language and runs on various platforms, including Windows and Unix. It reads and stores groundtruth/metadata in XML format and reads the corresponding image from a TIFF file.
Proceedings Article
The challenge of Virginia Banks: an evaluation of named entity analysis in a 19th-century newspaper collection
Gregory Crane, Alison Jones +1 more
TL;DR: This paper evaluates automatic extraction of ten named entity classes from a 19th-century newspaper, the Civil War years of the Richmond Times Dispatch, digitized with IMLS support by the University of Richmond. It also suggests the kinds of knowledge sources that digital libraries need to assemble as part of their machine-readable reference collections to support named entity identification as a core service.
An Annotated Bibliography on Temporal and Evolution Aspects in the World Wide Web
TL;DR: This bibliography collects references on the handling of time and evolution issues in World Wide Web research, following several earlier bibliographies on time-varying information.
Patent
Method and system for forming a hyperlink reference and embedding the hyperlink reference within an electronic version of a paper
TL;DR: This patent describes a method for storing a version of a mass-produced printed paper and forming a reference within that version; the reference is associated with an operation and with at least a portion of the version.
References
Book
Handbook of Character Recognition and Document Image Analysis
Horst Bunke, Patrick S. P. Wang +1 more
TL;DR: Chapter topics include Arabic character recognition (A. Amin), automatic reading of braille documents (Antonacopoulos), and techniques for improving OCR results.
Journal Article
Pink Panther: a Complete Environment for Ground-Truthing and Benchmarking Document Page Segmentation
Berrin Yanikoglu, Luc Vincent +1 more
TL;DR: A new approach for the automatic evaluation of document page segmentation algorithms that is region-based: segmentation quality is assessed by comparing the segmentation output, described as a set of regions, to the corresponding ground-truth.
Book
The newspaper designer's handbook
TL;DR: This book takes a hands-on approach to newspaper design techniques, from basic page layout to complex infographics. It emphasizes a fundamental yet often overlooked aspect of design and includes a new section on the newspaper design report card.
Journal Article
An approach to a digital library of newspapers
TL;DR: A new application for retrieving news from a large electronic bank of newspapers is intended to manage past issues of newspapers in such a way that users are able to draw up chronicles and trends about reported topics.