
Intelligent Data Retrieval from Raster Images of Documents

TL;DR: The task is to automate the analysis of data contained in raster image documents for the purpose of intelligent information retrieval in a digital library, with contributions to the fields of document understanding and intelligent interactive information retrieval.
Abstract: Documents on conventional media, such as books, newspapers and microfiche, can be converted into the digital form of raster images by the use of scanners. A digital library is a server on a computer network that can respond to user requests by retrieving relevant data contained within raster image documents as well as in other formatted ASCII documents. The task is to automate the analysis of data contained in raster image documents for the purpose of intelligent information retrieval in a digital library, and to develop computational theories and algorithms, with contributions to the fields of document understanding and intelligent interactive information retrieval. The limitations of a technology necessary to convert books to text-searchable form, viz., optical character recognition, will be addressed. Specific research agenda items: adaptive document understanding, robust page layout analysis, table understanding, intelligent text recognition, graphics analysis, topic categorization, content-based retrieval of captioned pictures and query processing for information retrieval.
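To make the flow concrete, the sketch below illustrates the basic raster-image retrieval pipeline the abstract describes: OCR a scanned page, build an inverted index over the recognized text, and answer a keyword query. It is a minimal illustration, not the paper's implementation; the library choices (Pillow, pytesseract) and file names are assumptions.

```python
# Minimal sketch of the pipeline outlined above: OCR a scanned page image,
# build an inverted index over the recognized text, and answer a keyword
# query. Pillow/pytesseract and the file names are illustrative assumptions.
from collections import defaultdict

from PIL import Image
import pytesseract


def ocr_page(path: str) -> str:
    """Convert one raster page image to text with OCR."""
    return pytesseract.image_to_string(Image.open(path))


def build_index(pages: dict) -> dict:
    """Inverted index: term -> set of page identifiers containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def query(index: dict, terms: list) -> set:
    """Pages containing every query term (simple boolean AND)."""
    hits = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*hits) if hits else set()


if __name__ == "__main__":
    pages = {p: ocr_page(p) for p in ["page_001.png", "page_002.png"]}  # hypothetical scans
    idx = build_index(pages)
    print(query(idx, ["document", "retrieval"]))
```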
Citations
Book ChapterDOI
01 Jan 1996
TL;DR: This chapter surveys some of that work, especially work relating to the treatment of video and the use of digital video libraries for education; the proliferation of technical articles and special issues addressing these questions underscores their importance.
Abstract: The Information Age is fully upon us. A recent article noted that there are perhaps 50 million people using the Internet on a regular basis, and that “the current growth rate is about 15% per month (!) and this could well continue until almost all of those in the ‘developed world’ are connected” [FM94, p. 30]. In addition, the digital domain consists not only of text but increasingly of other media representations, from graphics images to audio to motion video. As the amount of information and number of users exponentially escalate, more attention focuses on the basic problems of information management: How do you digitize information? How can you then visualize it and find what you need? How do you use and manipulate it effectively? How is it stored and managed? The proliferation of technical articles and special issues addressing these questions underscores their importance; see for example the special issue on content-based retrieval [Nar95] or digital libraries [F+95]. This chapter will survey some of that work, especially that which relates to the treatment of video and the use of digital video libraries for education.

51 citations

Journal ArticleDOI
TL;DR: The CORE (Chemical Online Retrieval Experiment) project is a library of primary journal articles in chemistry; the methods for building the system and accumulating the database are described.
Abstract: The CORE (Chemical Online Retrieval Experiment) project is a library of primary journal articles in chemistry. Any library has an inside and an outside; in this article we describe the inside of the library and the methods for building the system and accumulating the database. A later article will describe the outside (user experiences). Among electronic-library projects, the CORE project is unusual in that it has both ASCII derived from typesetting and image data for all its pages, and among experimental electronic-library projects, it is unusually large. We describe here (a) the processes of scanning and analyzing about 400,000 pages of primary journal material, (b) the conversion of a similar amount of textual database material, (c) the linking of these two data sources, and (d) the indexing of the text material.
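As an illustration of the linking step described in (c), the sketch below (not the CORE implementation; identifiers and field names are assumed) joins the two data sources, ASCII text derived from typesetting and scanned page images, on a shared article identifier so a text hit can be mapped back to its page images.

```python
# Illustrative sketch (not the CORE implementation) of linking the ASCII text
# derived from typesetting with the scanned page images via a shared article
# identifier, so a text hit can be mapped back to its images.
from dataclasses import dataclass, field


@dataclass
class Article:
    article_id: str
    ascii_text: str = ""                             # from typesetting data
    page_images: list = field(default_factory=list)  # scanned page files


def link_sources(texts: dict, images: dict) -> dict:
    """Join the text and image collections on the article identifier."""
    linked = {}
    for aid in set(texts) | set(images):
        linked[aid] = Article(aid, texts.get(aid, ""), images.get(aid, []))
    return linked


# Hypothetical records: one article with both sources, one with images only.
texts = {"jacs-1991-0042": "Synthesis of ..."}
images = {"jacs-1991-0042": ["p1.tif", "p2.tif"], "jacs-1991-0107": ["p1.tif"]}
library = link_sources(texts, images)
print(library["jacs-1991-0042"].page_images)
```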

27 citations

Proceedings ArticleDOI
TL;DR: A 'document browser' application is being developed that allows a user to interactively specify queries on the documents in the digital library using a graphical user interface, provides feedback about the candidate documents at each stage of the retrieval process, and allows refinements of the query based on the intermediate results of the search.
Abstract: This paper describes an approach to retrieving information from document images stored in a digital library by means of knowledge-based layout analysis and logical structure derivation techniques. Queries on document image content are categorized in terms of the type of information that is desired, and are parsed to determine the type of document from which information is desired, the syntactic level of the information desired, and the level of analysis required to extract the information. Using these clauses in the query, a set of salient documents are retrieved, layout analysis and logical structure derivation are performed on the retrieved documents, and the documents are then analyzed in detail to extract the relevant logical components. A 'document browser' application, being developed based on this approach, allows a user to interactively specify queries on the documents in the digital library using a graphical user interface, provides feedback about the candidate documents at each stage of the retrieval process, and allows refinements of the query based on the intermediate results of the search. Results of a query are displayed either as an image or as formatted text.
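The staged query handling described above can be sketched roughly as follows; the field names, keyword heuristics, and collection layout are illustrative assumptions rather than the system's actual interface.

```python
# Rough sketch of the staged query handling described above: a query is
# categorized by document type, syntactic level, and level of analysis, then
# candidates are narrowed stage by stage. Field names and keyword heuristics
# are assumptions for illustration only.
from dataclasses import dataclass


@dataclass
class ParsedQuery:
    doc_type: str          # e.g. "journal article" or "any"
    syntactic_level: str   # e.g. "title" or "body"
    analysis_level: str    # e.g. "layout" or "logical structure"
    keywords: list


def parse_query(text: str) -> ParsedQuery:
    words = text.lower().split()
    doc_type = "journal article" if "article" in words else "any"
    level = "title" if "title" in words else "body"
    analysis = "logical structure" if level == "title" else "layout"
    return ParsedQuery(doc_type, level, analysis, words)


def retrieve(q: ParsedQuery, collection: dict) -> list:
    """Stage 1: filter by document type; stage 2: match keywords against the
    requested syntactic component of each remaining candidate."""
    candidates = [doc for doc, meta in collection.items()
                  if q.doc_type in ("any", meta.get("type", ""))]
    return [doc for doc in candidates
            if any(k in collection[doc].get(q.syntactic_level, "").lower()
                   for k in q.keywords)]
```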

22 citations


Cites methods from "Intelligent Data Retrieval from Ras..."

  • ...Some ideas that we have developed with respect to the retrieval of data from raster images of documents stored in a digital library have been previously described by Srihari et al. (11). Related work in the analysis and retrieval of information from document images has been ongoing at CEDAR for several years....

    [...]

Book ChapterDOI
Edward A. Fox
16 Oct 1994
TL;DR: Recommendations for research are made based on experience with a digital library for computer science and its use in improving undergraduate education, with particular interest in the application of intelligent system methods to help provide an effective solution.
Abstract: Building digital libraries is one of the most important Grand Challenge problems faced by information professionals. This paper surveys the relevant literature and includes recommendations for research, based on our experience with a digital library for computer science and its use in improving undergraduate education. Of particular interest is the application of intelligent system methods to help provide an effective solution. Mediators or agents are needed throughout the system. It is also important to improve the interface, help with searching, and in general allow the digital library to carry out many of the functions of an expert human intermediary.

13 citations


Cites methods from "Intelligent Data Retrieval from Ras..."

  • ...Many digital library efforts work with bitmap page images [35] and those demand large amounts of storage (e.g., in the TULIP project, it is estimated that an average journal would consume roughly 1/4 gigabyte each year)! Further, extensive and expensive processing is needed if we are to undertake a retrospective conversion from paper to electronic forms [66]....

    [...]

Journal ArticleDOI
TL;DR: The goal is an 'intelligent' information retrieval (IR) system that works with the user to satisfy their information needs, and the application of Artificial Intelligence (AI) techniques is proposed as a likely approach to the problem.
Abstract: The ever-changing nature of information sources, coupled with the increased demand on dwindling academic resources, led librarians and other information professionals to recognise the need for information retrieval (IR) systems that can incorporate the expertise of the information professional and gather knowledge about the user's experiences and preferences (Werckert & Cooper, 1989). The explosion of available information resources brought about by the development of the Internet and the World-Wide Web (WWW) has strengthened this need. The goal is to produce an 'intelligent' IR system which would work with the user to satisfy their information needs, so the application of Artificial Intelligence (AI) techniques seems a likely approach to the problem (Morris, 1990).

9 citations

References
Proceedings ArticleDOI
02 Nov 1986
TL;DR: This work examines several subproblems in document understanding, ranging from pixel processing issues to symbolic reasoning to global control, over page blocks such as text, line drawings, half-tone pictures, and icons.
Abstract: A document image is an optically scanned and digitized representation of a printed page that consists of blocks such as text, line drawings, half-tone pictures, and icons. Document understanding is a goal-oriented problem involving interpreting the different blocks and coordinating the interpretations to achieve an end result. We examine several subproblems in document understanding tasks; these range from pixel processing issues to symbolic reasoning to global control.
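One classic pixel-level technique from this era for grouping ink into candidate page blocks is run-length smoothing (RLSA); the sketch below is offered as an illustration of that kind of processing and is not claimed to be this paper's method.

```python
# Run-length smoothing (RLSA) sketch: fill short horizontal gaps between ink
# pixels so nearby marks merge into candidate blocks (text lines, drawings).
# Offered as an illustration of this kind of pixel-level processing, not as
# the method of the cited paper.
import numpy as np


def rlsa_horizontal(binary: np.ndarray, threshold: int = 20) -> np.ndarray:
    """binary: 2-D array with 1 = ink, 0 = background. Background gaps
    shorter than `threshold` between ink pixels on a row are filled with ink."""
    out = binary.copy()
    for row in out:
        last_ink = None
        for x, value in enumerate(row):
            if value == 1:
                if last_ink is not None and x - last_ink - 1 < threshold:
                    row[last_ink + 1:x] = 1   # close the short gap
                last_ink = x
    return out


page = np.zeros((5, 40), dtype=int)
page[2, 5:10] = 1
page[2, 14:20] = 1                      # two ink runs separated by a small gap
print(rlsa_horizontal(page)[2])         # the gap between the runs is filled
```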

90 citations

Proceedings Article
14 Jul 1991
TL;DR: This paper focuses on the implementation of a multi-stage system, PICTION, that uses captions to identify humans in an accompanying photograph, providing a computationally less expensive alternative to traditional methods of face recognition.
Abstract: It is often the case that linguistic and pictorial information are jointly provided to communicate information. In situations where the text describes salient aspects of the picture, it is possible to use the text to direct the interpretation (i.e., labelling objects) in the accompanying picture. This paper focuses on the implementation of a multi-stage system PICTION that uses captions to identify humans in an accompanying photograph. This provides a computationally less expensive alternative to traditional methods of face recognition. It does not require a pre-stored database of face models for all people to be identified. A key component of the system is the utilisation of spatial constraints (derived from the caption) in order to reduce the number of possible labels that could be associated with face candidates (generated by a face locator). A rule-based system is used to further reduce this number and arrive at a unique labelling. The rules employ spatial heuristics as well as distinguishing characteristics of faces (e.g., male versus female). The system is noteworthy since a broad range of AI techniques are brought to bear (ranging from natural-language parsing to constraint satisfaction and computer vision).
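A minimal sketch of the caption-constraint idea described above, under simplified assumptions: face candidates from a hypothetical face locator are assigned caption names only in ways consistent with spatial relations derived from the caption. The constraint format and example data are illustrative, not PICTION's actual representation.

```python
# Sketch of caption-driven labelling under simplified assumptions: face
# candidates (with x positions from a hypothetical face locator) are assigned
# caption names only in ways that satisfy spatial constraints derived from
# the caption, such as "Smith is to the left of Jones".
from itertools import permutations


def consistent_labellings(names, faces, constraints):
    """faces: {face_id: x_position}; constraints: (name_a, 'left_of', name_b).
    Returns every assignment of names to face candidates that satisfies all
    constraints."""
    results = []
    for chosen in permutations(faces, len(names)):
        assignment = dict(zip(names, chosen))
        if all(faces[assignment[a]] < faces[assignment[b]]
               for a, relation, b in constraints if relation == "left_of"):
            results.append(assignment)
    return results


# "Smith (left) and Jones ..." with three face candidates found in the photo.
faces = {"f1": 40, "f2": 120, "f3": 200}
constraints = [("Smith", "left_of", "Jones")]
print(consistent_labellings(["Smith", "Jones"], faces, constraints))
```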

41 citations

Proceedings Article
12 Jul 1992
TL;DR: The face locator developed by the authors takes a hypothesize-and-test approach to the task of finding the locations of people's faces in digitized pictures.
Abstract: The human face is an object that is easily located in complex scenes by infants and adults alike. Yet the development of an automated system to perform this task is extremely challenging. This paper is concerned with the development of a computational model for locating human faces in newspaper photographs based on cognitive research in human perceptual development. In the process of learning to recognize objects in the visual world, one could assume that natural growth favors the development of the abilities to detect the more essential features first. Hence, a study of the progress of an infant's visual abilities can be used to categorize the potential features in terms of their importance. The face locator developed by the authors takes a hypothesize-and-test approach to the task of finding the locations of people's faces in digitized pictures. Information from the accompanying caption is used in the verification phase. The system successfully located all faces in 44 of the 60 (73%) test newspaper photographs.

27 citations

Proceedings ArticleDOI
20 Oct 1993
TL;DR: Several hierarchical levels in modeling text are discussed; holistic models make decisions about entities in the context of decisions about other entities, and when analytic and holistic methods are combined, overall system performance is higher than with either alone.
Abstract: Several hierarchical levels in modeling text are discussed. At each level, the model is typically based on elements at the next lower level, e.g., characters in terms of features, words in terms of characters, phrases/sentences in terms of words, and paragraphs in terms of sentences. Such models are referred to as analytic, since they are based on step-by-step analysis. Although such models are usually sufficient, in order to achieve system robustness it is necessary to use models that skip a level, e.g., words in terms of features/pixels and sentences in terms of characters. Such models are holistic in that decisions about entities are made in the context of decisions about other entities. Holistic methods are slower than analytic methods and do not necessarily perform better. However, when analytic and holistic methods are combined, overall system performance is higher than either alone. Recognition algorithms based on different models can be combined to achieve robustness.
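As a rough illustration of combining recognizers built on different models, the sketch below merges ranked word hypotheses from an analytic (character-based) recognizer and a holistic (whole-word) recognizer; the weighted-sum scoring is an assumption, not the paper's combination method.

```python
# Illustration of combining recognizers built on different models: an analytic
# (character-based) recognizer and a holistic (whole-word) recognizer each
# score word hypotheses, and a weighted sum re-ranks the union. The scoring
# scheme is an assumption, not the combination method of the cited paper.
def combine(analytic: dict, holistic: dict, weight: float = 0.5) -> list:
    """Merge two hypothesis lists (word -> confidence in [0, 1]) and rank
    every word proposed by either recognizer."""
    words = set(analytic) | set(holistic)
    scored = {w: weight * analytic.get(w, 0.0) + (1 - weight) * holistic.get(w, 0.0)
              for w in words}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


analytic = {"form": 0.62, "farm": 0.58}   # word built from character decisions
holistic = {"farm": 0.70, "face": 0.20}   # whole-word shape match
print(combine(analytic, holistic))        # "farm" ranks first once combined
```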

15 citations