Top 108 papers published in the topic of Document processing in 1993

Journal Article•DOI•

The document spectrum for page layout analysis

01 Nov 1993-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The document spectrum (or docstrum) is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components; it yields an accurate measure of skew, within-line spacing, and between-line spacing, and locates text lines and text blocks.

Abstract: Page layout analysis is a document processing technique used to determine the format of a page. This paper describes the document spectrum (or docstrum), which is a method for structural page layout analysis based on bottom-up, nearest-neighbor clustering of page components. The method yields an accurate measure of skew, within-line, and between-line spacings and locates text lines and text blocks. It has three main advantages over many other methods: independence from skew angle, independence from different text spacings, and the ability to process local regions of different text orientations within the same image. Results of the method shown for several different page formats and for randomly oriented subpages on the same image illustrate the versatility of the method. We also discuss the differences, advantages, and disadvantages of the docstrum with respect to other layout methods.

654 citations
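
As a rough illustration of the nearest-neighbor idea in the abstract above, the following Python sketch estimates page skew by histogramming the angles from each component centroid to its k nearest neighbors. The function names, k = 5, the 1-degree bins, the brute-force neighbor search, and the synthetic test page are all assumptions made for this example; they are not details taken from the paper.

```python
import math
from collections import Counter


def neighbor_angles(points, k=5):
    """Angles (degrees, folded into [0, 180)) from each point to its k nearest neighbors."""
    angles = []
    for i, (xi, yi) in enumerate(points):
        others = []
        for j, (xj, yj) in enumerate(points):
            if i != j:
                dx, dy = xj - xi, yj - yi
                others.append((math.hypot(dx, dy), dx, dy))
        others.sort(key=lambda t: t[0])  # brute force; fine for a small sketch
        for _dist, dx, dy in others[:k]:
            # Fold opposite directions together: a line at 10 degrees and
            # one at -170 degrees describe the same text-line orientation.
            angles.append(math.degrees(math.atan2(dy, dx)) % 180.0)
    return angles


def estimate_skew(points, k=5, bin_width=1.0):
    """Return the center of the most populated angle bin, in degrees."""
    n_bins = int(round(180.0 / bin_width))
    hist = Counter(
        int(round(a / bin_width)) % n_bins for a in neighbor_angles(points, k)
    )
    if not hist:
        raise ValueError("need at least two components")
    peak_bin, _count = hist.most_common(1)[0]
    return peak_bin * bin_width


def synthetic_page(skew_deg, rows=5, cols=20, char_gap=10.0, line_gap=30.0):
    """Hypothetical test data: a grid of 'character' centroids rotated by skew_deg."""
    t = math.radians(skew_deg)
    return [
        (c * char_gap * math.cos(t) - r * line_gap * math.sin(t),
         c * char_gap * math.sin(t) + r * line_gap * math.cos(t))
        for r in range(rows)
        for c in range(cols)
    ]
```

Because characters within a line are closer together than lines are to each other, most nearest-neighbor pairs lie along the text line, so the histogram peak tracks the line orientation. The same pairs could also be used to histogram distances for spacing estimates, which this sketch leaves out.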

Patent•

Method and apparatus for processing a display document utilizing a system level document framework

David R. Anderson, Jack H. Palevich, Arnold Schaeffer, Larry S. Rosenstein, Ryoji Watanabe

12 Nov 1993

TL;DR: An object-oriented compound document architecture provides system-level support for document processing features such as collaboration, linking, eternal undo, and content-based retrieval, among others.

Abstract: An object-oriented compound document architecture provides system level support for document processing features. The object-oriented compound document framework supports a variety of document processing functions. The framework provides system level support of collaboration, linking, eternal undo, and content based retrieval, among other things. System level support is provided for document changes, annotation through model and linking, anchors, model hierarchies, enhanced copy and pasting, command objects, and a generic retrieval framework.

187 citations

Patent•

Image-based document processing system providing enhanced transaction balancing

Norman P. Kern¹•Institutions (1)

Unisys¹

17 Feb 1993

TL;DR: A method is presented for handling misplaced transaction documents in an image-based transaction processing system that performs transaction balancing. When a transaction is found to have a missing document, the free item store is accessed and one or more documents from it are displayed to permit determining whether any of these candidates corresponds to the missing document.

Abstract: A method which provides for handling misplaced transaction documents in an image based transaction processing system which performs transaction balancing. A preferred form of the method provides for displaying out-of-balance transactions such that transaction debit and credit data are displayed along with one or more document images. Using this transaction display, it is determined whether a transaction includes a misplaced document which does not belong, or whether a document is missing from the transaction. If a transaction is found to include a document which does not belong, data corresponding to the document is stored in a free item store. If a transaction is found to have a missing document, the free item store is accessed and one or more documents therefrom are displayed to permit determining whether any of these candidates correspond to the missing document.

133 citations
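
The free item store described in the abstract above can be pictured with a small Python sketch. Everything named here is hypothetical (`Document`, `Transaction`, `FreeItemStore`, the signed-cents convention), and the amount-matching filter used to shortlist candidates is an assumption of this example: the abstract only says that documents from the store are displayed so that a match can be determined.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    amount_cents: int  # assumption: credits are positive, debits negative


@dataclass
class Transaction:
    txn_id: str
    documents: list = field(default_factory=list)

    def imbalance(self):
        """Zero when debits and credits cancel out."""
        return sum(d.amount_cents for d in self.documents)


class FreeItemStore:
    """Holds documents that were removed from transactions they did not belong to."""

    def __init__(self):
        self._items = {}

    def add(self, doc):
        self._items[doc.doc_id] = doc

    def candidates(self, amount_cents):
        # Assumed shortlist rule: only show items that would restore balance.
        return [d for d in self._items.values() if d.amount_cents == amount_cents]

    def take(self, doc_id):
        return self._items.pop(doc_id)


def remove_misplaced(txn, doc_id, store):
    """Move a document that does not belong to txn into the free item store."""
    for i, doc in enumerate(txn.documents):
        if doc.doc_id == doc_id:
            store.add(txn.documents.pop(i))
            return
    raise KeyError(doc_id)


def candidates_for_missing(txn, store):
    """Free items whose amount would bring an out-of-balance txn back to zero."""
    return store.candidates(-txn.imbalance())
```

In this sketch a stray credit removed from one transaction becomes a candidate for another transaction that is short by the same amount; the final decision would still rest with whoever reviews the displayed images.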

Patent•

Real time storage/retrieval subsystem for document processing in banking operations

James D. Rogan¹, Gerhard M. Werner¹, Mark A. Stewart¹, Martin J. Danko¹•Institutions (1)

Unisys¹

06 Jan 1993

TL;DR: A storage/retrieval module apparatus for use in a document imaging system can receive digitized optical image and related information data and convert it to electrical digital data for storage on disk at high rates of speed.

Abstract: A storage/retrieval module apparatus for use in a document imaging system can receive digitized optical image and related information data and convert it to electrical digital data for storage on disk at high rates of speed. Concurrently with storage operations, the storage/retrieval module apparatus can also execute retrieval of stored data under command of a host computer or a connected workstation.

108 citations

Patent•

Electronic filing system using a mark on each page of the document for building a database with respect to plurality of multi-page documents

Fueki Kazumasa¹•Institutions (1)

Ricoh¹

13 Dec 1993

TL;DR: An electronic filing system recognizes a mark on each scanned document and builds a database for the document from that mark together with retrieval information that was input and registered before the document registration process.

Abstract: An electronic filing system files one of a plurality of documents each having a mark associated with the document for recognizing each document. The electronic filing system includes an input part for inputting retrieval information for one or a plurality of documents which are to be registered prior to a document registration process, a database part for managing the retrieval information input from the input part, an image scanner for scanning a document and outputting image data related to the scanned document, a storage for storing the image data received from the image scanner, a recognition part for recognizing the mark on the document based on the image data stored in the storage, and a controller for making a database building with respect to an arbitrary document based on the mark of the arbitrary document recognized by the recognition part and the retrieval information which is managed in the database for each document when carrying out the document registration process.

87 citations

Proceedings Article•DOI•

Handwritten Korean character image database PE92

Daijin Kim, Young-Sup Hwang, Sang-Tae Park, Eun Jung Kim, Sang-Hoon Paek, Sung-Yang Bang

20 Oct 1993

TL;DR: The purpose of the current PE92 database project is to provide a comprehensive set of character image data to a developer of a recognition system so that the developer can concentrate on developing an algorithm.

Abstract: The purpose of the current PE92 database project is twofold. One is to provide a comprehensive set of character image data to a developer of a recognition system so that the developer can concentrate on developing an algorithm. The other is to offer a means by which an evaluator can compare various algorithms objectively. The authors collected 100 sets of KS 2350 handwritten Korean character images. They tried to collect as many writing styles as possible. The first 70 sets were generated by more than 500 different writers, and each of the remaining 30 sets was written by the same person. Writers wrote down the characters in prespecified boxes and the database was created by scanning the data sheets with an image scanner. Each image is the size of 100×100 with 256 gray levels. Finally, the authors analyze the quality of the database created and calculate various statistics of the database PE92.

74 citations

Patent•

Document storage and retrieval system for storing and retrieving document image and full text data

Hiromichi Fujisawa¹, Atsushi Hatakeyama¹, Yasuaki Nakano¹, Junichi Higashino¹, Toshihiro Hananoi¹•Institutions (1)

Hitachi¹

24 Aug 1993

TL;DR: A document storage and retrieval system is provided with means for storing a document body in the form of an image, means for storing text information in the form of a character code string for retrieval, means for executing a retrieval with reference to the text information, and means for displaying the related document image on a retrieval terminal according to the retrieval result.

Abstract: A document storage and retrieval system is provided with means for storing a document body in the form of image, means for storing text information in the form of a character code string for retrieval, means for executing a retrieval with reference to the text information, and means for displaying a document image relating thereto on a retrieval terminal according to the retrieval result. Such a form of the system is available for retrieving the full contents of a document and also for displaying the document body printed in a format easy to read straight in the form of image. Accordingly, users are capable of retrieving documents with arbitrary words and also capable of reading even such a document as is complicated to include mathematical expressions and charts through a terminal in the form of image, the same as on paper. Further, the invention provides a system wherein the text information for retrieval is extracted automatically from the document image through character recognition. Since a precision of the character recognition has not been satisfactory hitherto, a visual retrieval and correction have been carried out without fail by operators. However, there is no necessity for the operators to attend therefor according to the invention. Thus, the text information for retrieval can be generated at the cost of practical time and money even in case of volumes of documents.

55 citations

Journal Article•DOI•

Structure recognition methods for various types of documents

Toyohide Watanabe¹, Qin Luo¹, Noboru Sugie¹•Institutions (1)

Nagoya University¹

01 Mar 1993

TL;DR: Experimental methods for recognizing the document structures of various types of documents in the framework of document understanding are described. The methods are effective under a knowledge-based framework and integrate the top-down (model-driven) and bottom-up (data-driven) approaches complementarily.

Abstract: In this paper, we describe experimental methods of recognizing the document structures of various types of documents in the framework of document understanding. Namely, we interpret document structures with individually characterized document knowledge. The document understanding process is divided into three procedures: the first is the recognition of document structures from a two-dimensional point of view; the second is the recognition of item relationships from a one-dimensional point of view; and the third is the recognition of characters from a zero-dimensional point of view. The procedure for recognizing structures plays the most important role in document understanding. This procedure extracts and classifies the logical item blocks from paper-based documents distinctly.

53 citations

Patent•

Integrated multifunctional document processing system for faxing, copying, printing, and scanning document information

Darwin Hu, John Joseph Ring

03 Jun 1993

TL;DR: A multifunctional document processing system is described for faxing, copying, printing or scanning document information and for transmitting and receiving document signals to and from a remote device.

Abstract: A multifunctional document processing system for faxing, copying, printing or scanning document information and for transmitting and receiving document signals to and from a remote device. The system comprises a multifunctional local device which includes a scanner for optically scanning document information, for converting the scanned document information into electrical document signals and for transmitting the document signals to the processor. The multifunctional local device also includes a recording device, such as a printer for receiving document signals from the processor and for producing a recorded form of the document information, such as a printed document based on the received document signals. A control module is interfaced between the processor and the multifunctional local device for receiving document signals from the multifunctional local device and from the remote device and for sending the received document signals to the processor. The control module also receives document signals from the processor and sends the received document signals to either the multifunctional local device or the remote device. The control module additionally generates and transmits control signals to the multifunctional local device.

51 citations

Proceedings Article•DOI•

Writer-adaptation for on-line handwritten character recognition

N. Matic¹, Isabelle Guyon¹, John S. Denker¹, Vladimir Vapnik¹•Institutions (1)

Bell Labs¹

20 Oct 1993

TL;DR: The authors have designed a writer-adaptable character recognition system for online characters entered on a touch terminal that is based on a Time Delay Neural Network that is pre-trained on examples from many writers to recognize digits and uppercase letters.

Abstract: The authors have designed a writer-adaptable character recognition system for online characters entered on a touch terminal. It is based on a Time Delay Neural Network (TDNN) that is pre-trained on examples from many writers to recognize digits and uppercase letters. The TDNN without its last layer serves as a preprocessor for an optimal hyperplane classifier that can be easily retrained to peculiar writing styles. This combination allows for fast writer-dependent learning of new letters and symbols. The system is memory and speed efficient.

50 citations

Proceedings Article•DOI•

Optical recognition of chemical graphics

Richard G. Casey¹, Stephen K. Boyer¹, P. Healey¹, Alex Miller¹, B. Oudot¹, K. Zilles¹•Institutions (1)

IBM¹

20 Oct 1993

TL;DR: A prototype system for encoding chemical structure diagrams from scanned printed documents is described, and the final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.

Abstract: A prototype system for encoding chemical structure diagrams from scanned printed documents is described. The system distinguishes a structure diagram from other printed material on a page image using size and spacing characteristics. It distinguishes line graphics from symbols in an intermediate vectorization stage. Line information is mapped into a connection diagram that represents atomic bonds. Atomic symbols are identified by means of chemical drawing conventions and optical character recognition. The final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.

Proceedings Article•

DR-LINK's linguistic conceptual approach to document detection

Elizabeth D. Liddy, Sung H. Myaeng

01 Jan 1993

TL;DR: The authors developed a system whose architecture is modular in design and which uses lexical, syntactic, semantic and discourse processing techniques to produce richer representations of documents and topic statements for more accurate matching to queries.

Abstract: There is a continuum of levels of linguistic-conceptual processing which can produce enrichments of the original text in order to explicitly represent documents at more conceptual levels for more accurate matching to queries. To produce richer representations of documents and topic statements we have developed a system whose architecture is modular in design and which uses lexical, syntactic, semantic and discourse processing techniques.

Patent•

Document processing apparatus for extracting a format from one document and using the extracted format to automatically edit another document

Eisaku Nakatani¹•Institutions (1)

Casio¹

11 Mar 1993

TL;DR: Document data stored in a document storage area are extracted line by line to analyze the structure of the document data, and the document layout information is extracted from the analysis result.

Abstract: Document data stored in a document storage area are extracted line by line to analyze the structure of the document data. The document layout information is extracted from the analysis result. The extracted layout information is stored, as learning data, in a document layout information learning area. In format conversion, the document data to be output, which is extracted in the same manner as described above, is converted on the basis of the learning data. Document data having a consistent layout is output to a CRT or a printer in accordance with the converted layout information.

Patent•

Multifunctional document processing system for receiving document signals from a local or a remote device

Darwin Hu, Keisaku Kano, John Joseph Ring

03 Jun 1993

TL;DR: A multifunctional document processing system is described which includes a scanner for optically scanning document information, for converting the scanned document information into electrical document signals, and for transmitting the document signals to the host computer.

Abstract: A multifunctional document processing system receives document signals from a local or a remote device and processes the document signals utilizing a host computer for transmission to the local or remote device. The system has a multifunctional local peripheral device which includes a scanner for optically scanning document information, for converting the scanned document information into electrical document signals and for transmitting the document signals to the host computer. The multifunctional local peripheral device also includes a recording device, such as a printer for receiving document signals from the host computer and for producing a recorded form of the document information, such as a printed document based on the received document signals. A control module is interfaced between the host computer and the multifunctional local peripheral device for receiving document signals from the multifunctional local peripheral device and from the remote device and for sending the received document signals to the host computer. The control module also receives document signals from the host computer and sends the received document signals to either the multifunctional local peripheral device or the remote device. The control module additionally generates and transmits control signals to the multifunctional local peripheral device.

Patent•

Modular networked image processing system and method therefor

Kent Pavey, David Feitelberg

23 Aug 1993

TL;DR: A modularly configured networked document processing system has a plurality of computers communicating over a local area network, and a gateway for communication to a mainframe computer.

Abstract: A modularly configured networked document processing system has a plurality of computers communicating over a local area network. The system has fax servers and scanners to convert documents into image signals, work stations to recognize character signals from the image signals, to process the character signals, and to display the unintelligible processed character signals for user intervention. In addition, the system has a gateway for communication to a mainframe computer.

Proceedings Article•DOI•

Cursive script recognition: A fast reader scheme

D. Guillevic¹, Ching Y. Suen¹•Institutions (1)

Concordia University¹

20 Oct 1993

TL;DR: A cheque processing system currently under development is described. It is based on a psychological model of the reading process for a fast reader, and its module for extracting graphical clues, implemented with the techniques of mathematical morphology, is discussed.

Abstract: A cheque processing system currently under development is described. More precisely, the cursive script recognition module for the legal amount is discussed. Commonly, systems perform recognition either on a character by character basis, or on a word level. The authors investigate the recognition at a higher level of abstraction, at the sentence level. Knowledge of context, orthography, syntax and semantics is used to supplement the information from the graphical input. The system is based on a psychological model of the reading process for a fast reader. The module for extracting graphical clues, implemented with the techniques of mathematical morphology, is discussed.

Patent•

Data processing system and method for selecting customized character recognition processes and coded data repair processes for scanned images of document forms

Timothy S. Betts¹, Valerie M. Carras¹, Lewis B. Knecht¹•Institutions (1)

IBM¹

23 Mar 1993

TL;DR: A document form processing template is provided which specifies the identity and preferred sequence for selected, customized character recognition processes and selected, customized coded data error correction processes which are reasonably likely to be needed to automatically process a selected batch of document forms whose fields have certain, anticipated, uniform characteristics.

Abstract: A data processing method, system and computer program repairs character recognition errors for digital images of document forms. A document form processing template is provided which specifies the identity and preferred sequence for selected, customized character recognition processes and selected, customized coded data error correction processes which are reasonably likely to be needed to automatically process a selected batch of document forms whose fields have certain, anticipated, uniform characteristics.

Proceedings Article•DOI•

Toward a practical document understanding of table-form documents: its framework and knowledge representation

T. Watanabe¹, Q. Luo¹, Noboru Sugie¹•Institutions (1)

Nagoya University¹

20 Oct 1993

TL;DR: A framework of four-layer recognition processes is proposed for understanding documents, and a knowledge representation method adaptable to the understanding of table-form documents is addressed.

Abstract: A framework of four-layer recognition processes is proposed for understanding documents, and a knowledge representation method adaptable to the understanding of table-form documents is addressed. Although Y. Nakano et al. (1986) looked upon the recognition of multi-kinds of table-form documents as an important subject from a practical point of view, they could not report any successful approach because their knowledge was based only on the physical coordinate data. In the approach presented, this recognition issue was solved, using both the classification tree based on the physical characteristics and the structure description tree based on the logical characteristics. At least, it is not so difficult to classify various kinds of documents into appropriate document classes since table-form documents are well designed on the basis of vertical and horizontal line segments. However, it is not easy in the case of the other documents because the geometric and spatial characteristics of documents are not well specified. It is necessary to investigate the application techniques for the other documents from the viewpoint of the knowledge representation.

Patent•

Method and system to recognize encoding type in document processing language

Tetsuo Motoyama¹, Donny Tsay¹•Institutions (1)

Ricoh¹

19 Jan 1993

TL;DR: The first three tags of the document are examined to determine if they contain the object identifier having the value "28 CF 44 00 H". If one of these tags has this value, an encoding flag is set to indicate the document is a binary SPDL file and the recognition process is terminated.

Abstract: A method and system to efficiently and automatically determine whether a document to be processed and printed has been encoded in a binary or clear text representation of a page description language. The document is initially processed as if it were binary encoded and the first three tags of the document are examined to determine if they contain the object identifier having the value "28 CF 44 00 H". If one of the first three tags has this value, an encoding flag is set to indicate the document is a binary SPDL file and the recognition process is terminated. If the document is determined not to be a binary SPDL file it is examined to see if it is clear text SPDL file. The beginning of the document is examined to determine if it contains zero or more S separators which are defined to be spaces, carriage returns, line feeds, and tabs followed by the characters "

Book•

Automatic analysis and understanding of documents

Yuan Yan Tang, Chang D. Yan, Mohamed Cheriet, Ching Y. Suen

01 Dec 1993

Transformation of structured documents with the use of grammar

Eila Kuikka, Martti Penttonen

01 Jan 1993

TL;DR: A transformation method based on an extended syntax-directed translation schema is presented, together with its implementation for certain modifications in a syntax-directed document processing system created by the authors. The method uses grammars to define both the structure of documents and the transformation between structures.

Abstract: In structured text processing systems the need for transformation of document instances is obvious if the structure definition of the document type changes. This article presents a transformation method with the use of an extended syntax-directed translation schema and its implementation to certain modifications in a syntax-directed document processing system created by the authors. The method uses grammars to define both the structure of documents and transformation between structures.
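
To make the idea of a syntax-directed translation schema more concrete, here is a deliberately small Python sketch in which each rule pairs a source production with a target label and a permutation of its children. The tuple-based tree encoding, the `transform` function, and the rule format are assumptions invented for this example; the abstract does not describe the authors' extended schema in enough detail to reproduce it.

```python
def transform(node, rules):
    """Rewrite a document tree bottom-up using paired productions.

    node  : (label, text) for a leaf, or (label, [child, ...]) for an inner node
    rules : {(label, (child_label, ...)): (new_label, (child_index, ...))}
            The key is a source production; the value gives the target label
            and the order in which the source children appear in the target.
    """
    label, content = node
    if isinstance(content, str):  # leaf: plain text, nothing to reorder
        return node
    children = [transform(child, rules) for child in content]
    # Match on the original child labels, i.e. on the source grammar.
    key = (label, tuple(child[0] for child in content))
    if key in rules:
        new_label, order = rules[key]
        children = [children[i] for i in order]
        label = new_label
    return (label, children)


# Hypothetical structure change: the document type now lists the author first.
rules = {
    ("article", ("title", "author", "body")): ("article", (1, 0, 2)),
}

old_doc = (
    "article",
    [
        ("title", "On Grammars"),
        ("author", "A. Writer"),
        ("body", [("para", "First."), ("para", "Second.")]),
    ],
)

new_doc = transform(old_doc, rules)
```

Productions without a matching rule are copied unchanged, so only the parts of the structure definition that actually changed need a rule.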

Proceedings Article•DOI•

Block segmentation and text area extraction of vertically/horizontally written document

N. Amamoto, S. Torigoe, Y. Hirogaki

20 Oct 1993

TL;DR: The authors mainly discuss a structural extraction technique called "text area extraction" which locates all text areas in 255 of the 309 (83%) evaluation documents.

Abstract: The development of a computational method for extracting character portions from a complicated document is described. The goal of the proposed method is to realize a practical document structure analysis for advanced optical character recognition systems with document databases. The authors mainly discuss a structural extraction technique called "text area extraction" which locates all text areas in 255 of the 309 (83%) evaluation documents.

Proceedings Article•DOI•

Document image retrieval system using character candidates generated by character recognition process

S. Senda¹, M. Minoh, Katsuo Ikeda•Institutions (1)

Kyoto University¹

20 Oct 1993

TL;DR: The authors have implemented a document image retrieval system that automatically stores document images in the form of character candidates obtained by character recognition process, and developed a fast and efficient searching algorithm that is able to locate all occurrences of any finite number of keywords in the character candidates.

Abstract: The authors have implemented a document image retrieval system; it automatically stores document images in the form of character candidates obtained by character recognition process. When we give some keywords to the system, it can retrieve the images containing at least one of the keywords. The strategy of using character candidates remarkably lowers the rate of missing the keywords on retrieval, because it includes several hypotheses in character segmentation and in character recognition. For finding the keywords from the storage of the character candidates, the authors have developed a fast and efficient searching algorithm. It is able to locate all occurrences of any finite number of keywords in the character candidates. The improvement of using character candidates and the efficiency of the searching algorithm are also described.
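
The benefit of keeping several character candidates per position can be shown with a naive Python sketch. It is not the authors' searching algorithm, which the abstract describes only as fast and efficient: this version simply scans every start position, and it models recognition hypotheses only (one set of candidate characters per position), not alternative segmentations. The function names and the data layout are assumptions for this example.

```python
def find_keyword(candidates, keyword):
    """Return every start index where keyword is consistent with the candidates.

    candidates : list of sets, one set of possible characters per position
    keyword    : the string to look for
    """
    n, m = len(candidates), len(keyword)
    hits = []
    for start in range(n - m + 1):
        # The keyword matches if each of its characters is among the
        # hypotheses stored for the corresponding position.
        if all(keyword[j] in candidates[start + j] for j in range(m)):
            hits.append(start)
    return hits


def find_any(candidates, keywords):
    """Map each keyword to its match positions; an image is retrieved if any list is non-empty."""
    return {kw: find_keyword(candidates, kw) for kw in keywords}


# Hypothetical recognizer output for the printed word "cats":
# the first and third characters are ambiguous.
page = [{"c", "e"}, {"a"}, {"t", "l"}, {"s"}]
result = find_any(page, ["cat", "dog"])
```

If only the top-ranked character had been stored and the recognizer had preferred "e" over "c", a search for "cat" would have missed this page; with candidate sets it is still found.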

Patent•

Document processing apparatus for correcting address and format information of document information up to a designated page

Takise Kikuo¹, Hiroshi Takakura¹, Takahiro Kato¹, Yukari Shibuya¹, Masaki Hamada¹•Institutions (1)

Canon Inc.¹

02 Mar 1993

TL;DR: A document processing system for formatting document information in accordance with a format includes a disk apparatus which independently has a document file, an address table file, and a format information file.

Abstract: A document processing system for formatting document information in accordance with a format includes a disk apparatus which independently has a document file, an address table file, and a format information file. The document file stores the document information corresponding to predetermined regions. The address table file stores an address table corresponding to the document information. The format information file stores the format information corresponding to the document information. A control circuit, in response to a page designation, controls the three files to correct format information and address information for document information preceding the designated page.

Proceedings Article•DOI•

The implementation methodology for a CD-ROM English document database

Ihsin T. Phillips¹, Jaekyu Ha, Robert M. Haralick, Dov Dori•Institutions (1)

Seattle University¹

20 Oct 1993

TL;DR: The authors first briefly describe the makeup of a database of scanned document images of scientific and technical documents written in English which are being produced in a CD-ROM format and concentrate on the implementation methodology used to prepare the database.

Abstract: Producing a database of scanned document images for development or evaluation of OCR and document image understanding algorithms is neither easy nor inexpensive. The authors first briefly describe the makeup of a database of scanned document images of scientific and technical documents written in English which are being produced in a CD-ROM format. Then, the authors concentrate on the implementation methodology used to prepare the database. The methodology gives the protocols for each step of the database preparation, and the error model used for the estimation of the ground-truth errors that may exist in the database is discussed.

Source Book on Digital Libraries

Edward A. Fox

01 Jan 1993

TL;DR: This extensive report outlines the steps necessary to create a national, electronic Science, Engineering and Technology Library and indicates that it will only be successful if supported by top-quality research on information storage and retrieval, hypertext, document processing, human-computer interaction, scaling up of information systems, networking, multimedia systems, visualization, education, and training.

Abstract: This extensive report outlines the steps necessary to create a national, electronic Science, Engineering and Technology Library. Step one is for NSF to play a lead role in launching a concerted R&D ... it will only be successful if supported by top-quality research on information storage and retrieval, hypertext, document processing, human-computer interaction, scaling up of information systems, networking, multimedia systems, visualization, education, and training. NOTE: Because of its large size, this report is not available in hard copy from the department. It can be obtained electronically through anonymous FTP to fox.cs.vt.edu (in directory /pub/DigitalLibrary). To obtain a hard copy, write to Mark Roope at University Printing Services; "Documents on Demand"; Virginia Tech; Blacksburg VA 24061-0243; or call (703) 231-6701.

Document image understanding: integrating recovery and interpretation

David Doermann

01 Jan 1993

TL;DR: This dissertation introduces the concept of recovery into the document domain and provides a "stroke platform" representation which establishes a verifiable "link to the pixels" and is shown to be useful for recovery tasks.

Abstract: Many document image understanding problems require a more comprehensive examination of document features than is typically deemed necessary for recognition tasks. We believe that these problems require a detailed analysis of stroke and sub-stroke features in the document image with the goal of obtaining information about the environment or process which created the document and establishing a context for understanding. We introduce the concept of recovery into the document domain. We provide a "stroke platform" representation which establishes a verifiable "link to the pixels" and demonstrate its usefulness for recovery tasks. This representation allows us to overcome many of the problems associated with the rapid, irreversible abstraction associated with traditional document processing methods and provides the basic framework for our analysis of handwritten documents. By obtaining a detailed description of the document and its properties, we are able to establish a context for analysis and validate assumptions about the domain. This dissertation presents our work on several document image understanding problems: (1) demonstrating the successful use of the stroke platform for the problem of interpreting and reconstructing junctions and endpoints, (2) exploring the effects of the handwriting process on the document by the development of a model for instrument grasp and a study of its effects on pressure features, (3) posing and providing an approach to the problem of recovering temporal information from static images of handwriting, (4) addressing various sub-tasks of the problem of processing form documents, and (5) extending the detailed analysis philosophy to demonstrate its feasibility in related document domains.

Proceedings Article•DOI•

An OCR system for business cards

H. Saiga¹, Y. Nakamura¹, Y. Kitamura¹, T. Morita¹•Institutions (1)

National Archives and Records Administration¹

20 Oct 1993

TL;DR: An experimental business card recognition system is described that recognizes Japanese business cards of a wide variety both in formats and fonts and classifies recognition results into several predefined categories so that a business card database is built automatically.

Abstract: An experimental business card recognition system is described that recognizes Japanese business cards of a wide variety both in formats and fonts. It also classifies recognition results into several predefined categories so that a business card database is built automatically. The overall procedure divides into a character recognition phase and a linguistic processing phase. In the first phase, the system gives a strictly local approach toward line segmentation to cope with slanted images. In the linguistic processing phase, recognition result strings generated in the former phase are analysed, classified and verified with the help of built databases. Some experimental results are also discussed.

Proceedings Article•DOI•

Document structures: A survey

Yuan Yan Tang¹, Ching Y. Suen¹•Institutions (1)

Concordia University¹

20 Oct 1993

TL;DR: A survey which contains a collection of many methods of describing document structures is presented and several novel concepts and theoretical analyses are also presented.

Abstract: Knowing the structure of a document is the key to successful processing of that document. From different points of view, there exist different definitions for document structures. A survey which contains a collection of many methods of describing document structures is presented. Several novel concepts and theoretical analyses are also presented in this survey.

Patent•

Document processing apparatus for simultaneously displaying graphic data, image data, and character data for a frame

Hiroshi Takakura¹, Toshihiko Komatsu¹•Institutions (1)

Canon Inc.¹

24 Mar 1993

TL;DR: A document processing apparatus includes a microprocessing unit, a main memory, an external memory, a cathode ray tube display, and input units (e.g., a pointing device and a keyboard).

Abstract: A document processing apparatus includes a microprocessing unit, a main memory, an external memory, a cathode ray tube display, and input units (e.g., a pointing device and a keyboard). In this apparatus, graphic (e.g., an illustration) data, image (e.g., a photographic image) data, and character (e.g., a sentence) data for a frame formed on a sheet are controlled. Frame size data is also stored in the above memories. The microprocessing unit designates the size data or one of the graphic, image, and character data to be obtained.
