Showing papers on "Document processing published in 2002"

Patent•

Computer-implemented system and method for text-based document processing

James Cox¹, Oliver M. Dain¹

31 May 2002

TL;DR: A computer-implemented system and method for processing text-based documents is described, in which singular value decomposition of a term-frequency data set projects terms and documents into a reduced-dimensional subspace, and the normalized projections are used to analyze the documents.

Abstract: A computer-implemented system and method for processing text-based documents. A frequency of terms data set is generated for the terms appearing in the documents. Singular value decomposition is performed upon the frequency of terms data set in order to form projections of the terms and documents into a reduced dimensional subspace. The projections are normalized, and the normalized projections are used to analyze the documents.

233 citations
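
The pipeline in the abstract above (term frequencies, singular value decomposition, projection into a reduced subspace, normalization) can be sketched as follows. This is a minimal illustration rather than the patented method: the toy corpus, the function names, the choice of k = 2, and the use of NumPy are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy corpus; the patent's own data is not given in the listing.
docs = [
    "bank loan interest rate",
    "loan interest bank credit",
    "pixel image scanner ocr",
    "ocr image pixel character",
]

def term_frequency_matrix(documents):
    """Build a terms-by-documents matrix of raw term counts."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    tf = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            tf[index[word], j] += 1.0
    return vocab, tf

def normalized_projections(tf, k):
    """Project terms and documents into a k-dimensional subspace via SVD,
    then scale every projection to unit length."""
    u, s, vt = np.linalg.svd(tf, full_matrices=False)
    term_proj = u[:, :k] * s[:k]      # one row per term
    doc_proj = vt[:k, :].T * s[:k]    # one row per document

    def unit_rows(m):
        norms = np.linalg.norm(m, axis=1, keepdims=True)
        return m / np.where(norms == 0.0, 1.0, norms)

    return unit_rows(term_proj), unit_rows(doc_proj)

vocab, tf = term_frequency_matrix(docs)
term_vecs, doc_vecs = normalized_projections(tf, k=2)

# With unit-length rows, the dot product is the cosine similarity.
similarity = doc_vecs @ doc_vecs.T
print(np.round(similarity, 3))
```

Because the projections are normalized, documents can be compared with a plain dot product; in this toy corpus the two finance documents end up close to each other and far from the two imaging documents.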

Patent•

System and method for processing currency bills and documents bearing barcodes in a document processing device

William J. Jones, Robert J. Klein, Curtis W. Hallowell, Charles P. Jenrick

23 Jul 2002

TL;DR: A document processing device has an evaluation region disposed along a transport path between an input and an output receptacle and can process both currency bills and barcoded media bearing at least two barcodes.

Abstract: A document processing device having an evaluation region disposed along a transport path between an input and output receptacle capable of processing both currency bills and barcoded media having at least two barcodes. One of the barcodes encodes a ticket number and another barcode encodes a payout amount associated with that ticket number. The evaluation region includes detectors for detecting predetermined characteristics of currency bills and a barcode reader for scanning the barcodes printed on the barcoded media. A controller coupled to the evaluation region controls the operation of the document processing device and receives input from and provides information to a user via a control unit. In some embodiments, the document processing device may have any number of output receptacles, and the control unit allows the user to specify which output receptacle receives which type of document. An optional coin sorter may be coupled to the document processing device to allow document and coin processing. The document processing device may be coupled to a network to communicate information to devices linked to the network.

181 citations

Patent•

Systems and methods for providing hardcopy secure documents and for validation of such documents

Grace T. Brewington¹

Xerox¹

16 Dec 2002

TL;DR: A secure document processing system receives an original document and prints a secure hardcopy version that includes a machine-readable encoded image signature representing an image segment of the original document.

Abstract: A secure document processing system for receiving an original document and for printing a secure hardcopy version of the original document, wherein the secure hardcopy version includes a machine-readable encoded image signature which represents an image segment of the original document. Such hardcopy secure documents can be validated by inputting them to a secure document validation system operable to identify and process the machine-readable encoded representation and, in response, to determine whether the recovered image signature indicates that the document is counterfeit or has been altered.

157 citations

Journal Article•DOI•

Document image analysis: A primer

Rangachar Kasturi¹, Lawrence O'Gorman², Venu Govindaraju³

Pennsylvania State University¹, Avaya², University at Buffalo³

01 Feb 2002-Sadhana-academy Proceedings in Engineering Sciences

TL;DR: This paper briefly describes various components of a document analysis system and provides the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.

Abstract: Document image analysis refers to algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data. A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document. OCR makes it possible for the user to edit or search the document’s contents. In this paper we briefly describe various components of a document analysis system. Many of these basic building blocks are found in most document analysis systems, irrespective of the particular domain or language to which they are applied. We hope that this paper will help the reader by providing the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.

143 citations

Patent•

Document processing system and method

Garrett O'Carroll

15 Nov 2002

TL;DR: A target document (25) is generated by merging four source documents (2-5): each document (6, 8, 11, 20) is parsed into a hierarchical tree if it is not already in that form, and the trees are merged.

Abstract: A target document (25) is generated by merging four source documents (2-5). There are three merge operations, an operation (10) for two source documents (2, 13), an operation (13) for an intermediate target document and a source document (4), and an operation (21) for a second intermediate target document and a final source document (5). In each merge operation one source document inherits from the other. An inheriting instruction is embedded within the inheriting document. Merging is performed by parsing a document (6, 8, 11, 20) into a hierarchical tree if it is not already in that form, and merging the trees. Matching nodes are identified and are combined or replaced according to a policy.

140 citations
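
The tree merge described in the abstract above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented procedure: the `Node` class, matching nodes by name, and the two policy values `"combine"` and `"replace"` are hypothetical, and the inheriting instruction embedded in the document is not modeled.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import List, Tuple

@dataclass
class Node:
    name: str
    text: str = ""
    policy: str = "combine"  # hypothetical policy flag: "combine" or "replace"
    children: List["Node"] = field(default_factory=list)

def merge(base: Node, inheriting: Node) -> Node:
    """Merge two trees; the inheriting tree extends or overrides the base."""
    if inheriting.policy == "replace":
        # The inheriting node replaces the matching base subtree entirely.
        return inheriting
    merged = Node(base.name, inheriting.text or base.text)
    overrides = {child.name: child for child in inheriting.children}
    for child in base.children:
        match = overrides.pop(child.name, None)  # assumption: match by name
        merged.children.append(child if match is None else merge(child, match))
    # Nodes that exist only in the inheriting tree are appended.
    merged.children.extend(overrides.values())
    return merged

def outline(node: Node, prefix: str = "") -> List[Tuple[str, str]]:
    """Flatten a tree into (path, text) pairs for inspection."""
    path = prefix + node.name
    rows = [(path, node.text)]
    for child in node.children:
        rows.extend(outline(child, path + "/"))
    return rows

# Four hypothetical source documents, already parsed into trees.
sources = [
    Node("doc", children=[Node("title", "Draft"), Node("body", "Base text")]),
    Node("doc", children=[Node("title", "Report")]),
    Node("doc", children=[Node("body", children=[Node("note", "See appendix")])]),
    Node("doc", children=[Node("body", "Final text", "replace"), Node("footer", "v1")]),
]

# Three pairwise merge operations produce the target document.
target = reduce(merge, sources)
for path, text in outline(target):
    print(path, "->", text)
```

Folding the sources pairwise mirrors the three merge operations in the abstract: each intermediate result becomes the base for the next inheriting document.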

Patent•

System for authenticating and processing of checks and other bearer documents

Lewis J. Moore

18 Mar 2002

TL;DR: An encrypted symbol is imprinted on a bearer document using ink that is not visible in invisible light; the symbol includes information used to authenticate the document and to identify its bearer.

Abstract: A bearer document processing system includes preparation, verification, redeeming and depositing of the document. An encrypted symbol is imprinted on the document using ink that is not visible in invisible light. The symbol includes information used to authenticate the document and to identify the bearer of the document. The document is scanned at a transaction point. The symbol can be decoded at transaction points or at a remote central processing station. Accounts involved in transactions are credited and debited using information contained in the encoded symbol and other information provided by the bearer and the acceptor of the document. Transactions are performed in essentially real time, and the bearer is provided with evidence of a successful transaction. Although applicable to any type of bearer document, such as stock certificates or money orders, the system is particularly applicable to processing bank checks in real time while minimizing the possibility of fraudulent transactions.

134 citations

Journal Article•DOI•

Recognition of Handwritten Bengali Characters: A Novel Multistage Approach

Ahmad Fuad Rezaur Rahman¹, Russly Abdul Rahman¹, Michael Fairhurst¹

University of Kent¹

01 May 2002-Pattern Recognition

TL;DR: This analysis demonstrates how detection of various high-level features of the Bengali character set might help formulate successful multistage OCR design.

94 citations

Journal Article•DOI•

Writer adaptation for online handwriting recognition

S.D. Connell¹, Anil K. Jain²

Agilent Technologies¹, Michigan State University²

01 Mar 2002-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work uses writer-independent writing style models (lexemes) to identify the styles present in a particular writer's training data and updates these models using the writer's data, demonstrating the feasibility of this approach on both isolated handwritten character recognition and unconstrained word recognition tasks.

Abstract: Writer-adaptation is the process of converting a writer-independent handwriting recognition system into a writer-dependent system. It can greatly increase recognition accuracy, given adequate writer models. The limited amount of data a writer provides during training constrains the models' complexity. We show how appropriate use of writer-independent models is important for the adaptation. Our approach uses writer-independent writing style models (lexemes) to identify the styles present in a particular writer's training data. These models are then updated using the writer's data. Lexemes in the writer's data for which an inadequate number of training examples is available are replaced with the writer-independent models. We demonstrate the feasibility of this approach on both isolated handwritten character recognition and unconstrained word recognition tasks. Our results show an average reduction in error rate of 16.3 percent for lowercase characters as compared against representing each of the writer's character classes with a single model. In addition, an average error rate reduction of 9.2 percent is shown on handwritten words using only a small amount of data for adaptation.

94 citations

Patent•

System and method for processing currency bills and substitute currency media in a single device

William J. Jones, Frank M. Csulits, Robert J. Klein, Curtis W. Hallowell

13 Aug 2002

TL;DR: A document processing device (100) has an evaluation region (104) disposed along a transport path between an input (102) and an output receptacle (108) and can process both currency bills and barcoded media.

Abstract: A document processing device (100) having an evaluation region (104) disposed along a transport path between an input (102) and output receptacle (108) capable of processing both currency bills and barcoded media. The evaluation region includes detectors (110) for detecting predetermined characteristics of currency bills and a barcode reader (112) for scanning the barcoded media. A controller (114) coupled to the evaluation region controls the operation of the document processing device and receives input from and provides information to a user via a control unit (216). In some embodiments, the document processing device may have any number of output receptacles (708a and 708b), and the control unit allows the user to specify which output receptacle receives which type of document. In some embodiments, an optional coin sorter (1048) may be coupled to the document processing device to allow document and coin processing. The document processing device may be coupled to a network (1192) to communicate information to devices (1100a-1100n) linked to the network.

94 citations

Journal Article•DOI•

Evaluating the performance of table processing algorithms

Jianying Hu¹, Ramanujan S. Kashi¹, Daniel P. Lopresti², Gordon Wilfong²

Avaya¹, Alcatel-Lucent²

01 Mar 2002-International Journal on Document Analysis and Recognition

TL;DR: Intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition are introduced, and a new paradigm, “graph probing,” is described for comparing the results returned by the recognition system and the representation created during ground-truthing.

Abstract: While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition have gained acceptance in the literature, attempts to formalize the problem for higher-level algorithms, while receiving a fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity. In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely (deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, “graph probing,” for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks as well.

92 citations
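
The four error classes for table detection named in the abstract above (insertion, deletion, splitting, merging) can be counted with a small overlap check. This is a simplified sketch, not the edit-distance formulation of the paper, which the abstract does not spell out: representing a table as an inclusive range of text lines and classifying by overlap counts are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int]  # assumption: inclusive (first_line, last_line)

def overlaps(a: Region, b: Region) -> bool:
    """True when two inclusive line ranges share at least one line."""
    return a[0] <= b[1] and b[0] <= a[1]

def detection_errors(truth: List[Region], detected: List[Region]) -> Dict[str, int]:
    """Count the four table detection error classes by region overlap."""
    errors = {"insertion": 0, "deletion": 0, "splitting": 0, "merging": 0}
    for d in detected:
        hits = sum(overlaps(d, t) for t in truth)
        if hits == 0:
            errors["insertion"] += 1   # non-table region labeled as a table
        elif hits > 1:
            errors["merging"] += 1     # several true tables combined into one
    for t in truth:
        hits = sum(overlaps(t, d) for d in detected)
        if hits == 0:
            errors["deletion"] += 1    # true table missed completely
        elif hits > 1:
            errors["splitting"] += 1   # one true table broken into pieces
    return errors

# Hypothetical ground truth and detector output for one page.
truth = [(1, 10), (20, 30), (40, 45), (50, 55), (70, 80)]
detected = [(1, 10), (20, 24), (26, 30), (40, 55), (60, 65)]

errors = detection_errors(truth, detected)
print(errors)
print("total errors:", sum(errors.values()))
```

Summing the counts gives a crude unit-cost score; a fuller edit distance would also weight how much of each region is affected.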

Proceedings Article•DOI•

Off-line handwritten Arabic character segmentation algorithm: ACSA

Toufik Sari, Labiba Souici, Mokhtar Sellami

06 Aug 2002

TL;DR: A new character segmentation algorithm (ACSA) for Arabic script is presented, which segments isolated handwritten words into perfectly separated characters using morphological rules constructed at the feature extraction phase.

Abstract: Character segmentation is a necessary preprocessing step for character recognition in many OCR systems. It is an important step because incorrectly segmented characters are unlikely to be recognized correctly. The most difficult case in character segmentation is cursive script. The cursive nature of written Arabic poses considerable challenges for automatic character segmentation and recognition. In this paper, a new character segmentation algorithm (ACSA) for Arabic script is presented. The algorithm segments isolated handwritten words into perfectly separated characters. It is based on morphological rules, which are constructed at the feature extraction phase. Finally, ACSA is combined with an existing handwritten Arabic character recognition system (RECAM).

Journal Article•DOI•

Neural network based system for script identification in Indian documents

S. Basavaraj Patil¹, N. V. Subbareddy¹

University B.D.T College of Engineering¹

01 Feb 2002-Sadhana-academy Proceedings in Engineering Sciences

TL;DR: A neural network-based script identification system is described that can be used in the machine reading of documents written in English, Hindi and Kannada language scripts; the authors report very encouraging results.

Abstract: The paper describes a neural network-based script identification system which can be used in the machine reading of documents written in English, Hindi and Kannada language scripts. Script identification is a basic requirement in the automation of document processing in multi-script, multi-lingual environments. The system developed includes a feature extractor and a modular neural network. The feature extractor consists of two stages. In the first stage, the document image is dilated using 3 × 3 masks in horizontal, vertical, right diagonal, and left diagonal directions. In the next stage, the average pixel distribution is found in the resulting images. The modular network is a combination of separately trained feedforward neural network classifiers, one for each script. The system recognizes 64 × 64 pixel document images. In the next level, the system is modified to operate on single-word document images in the same three scripts. The modified system includes a pre-processor, a modified feature extractor and a probabilistic neural network classifier. The pre-processor segments the multi-script, multi-lingual document into individual words. The feature extractor receives these word-document images of variable size and still produces the discriminative features employed by the probabilistic neural classifier. Experiments are conducted on a manually developed database of document images of size 64 × 64 pixels and on a database of individual words in the three scripts. The results are very encouraging and prove the effectiveness of the approach.
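
The two-stage feature extractor described in the abstract above (directional dilation with 3 × 3 masks, then an average over the resulting images) can be sketched in plain Python. This is a rough sketch under assumptions: the exact masks are not given in the abstract, so three-pixel line masks are assumed here, and "average pixel distribution" is interpreted as the fraction of foreground pixels in each dilated image.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background

# Assumed 3 x 3 masks, written as (row, column) offsets from the centre pixel.
MASKS: Dict[str, List[Tuple[int, int]]] = {
    "horizontal": [(0, -1), (0, 0), (0, 1)],
    "vertical": [(-1, 0), (0, 0), (1, 0)],
    "right_diagonal": [(1, -1), (0, 0), (-1, 1)],
    "left_diagonal": [(-1, -1), (0, 0), (1, 1)],
}

def dilate(image: Image, offsets: List[Tuple[int, int]]) -> Image:
    """Binary dilation: a pixel is set if any pixel under the mask is set."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and image[rr][cc]:
                    out[r][c] = 1
                    break
    return out

def directional_features(image: Image) -> Dict[str, float]:
    """Fraction of foreground pixels after dilation in each direction."""
    total = len(image) * len(image[0])
    return {
        name: sum(map(sum, dilate(image, offsets))) / total
        for name, offsets in MASKS.items()
    }

# A 5 x 5 toy image containing a single short horizontal stroke.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

features = directional_features(image)
print(features)
```

Dilating along a stroke's own direction adds few pixels, while dilating across it adds many, so the four averages give a coarse signature of stroke orientation that a classifier could consume.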

Patent•

Document processing apparatus, document processing method, document processing program, and recording medium

Kenichiro Kobayashi¹, Makoto Akabane¹, Tomoaki Nitta¹, Nobuhide Yamazaki¹, Erika Kobayashi¹, +1 more

Sony Broadcast & Professional Research Laboratories¹

10 May 2002

TL;DR: The text format of input data is checked and converted into a system-manipulated format; whether the data is in an HTML or e-mail format is determined using tags, heading information, and the like; and the converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns.

Abstract: The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.

Patent•

Document processing method and system

Yasuo Mori¹

Canon Inc.¹

11 Sep 2002

TL;DR: A document processing method and system implement a display that improves the efficiency and usability of edit operations when inserting, moving, or copying and pasting data, by taking full advantage of the feature of retaining data and set values hierarchically in the system.

Abstract: The present invention provides a document processing method and system which implement display that improves efficiency and usability of edit operations when inserting, moving, or copying & pasting data, by taking full advantage of the feature of retaining data and set values hierarchically in the system. In document processing for editing a document consisting of multiple sets of original data, when a user moves a graphic object which represents a desired original by dragging it on the document in order to move or copy the desired original data to a certain position on the document, the present invention detects the boundary between originals in the document, nearest to the position of the cursor dragging the graphic object which represents the desired original, and displays an identifiable mark on the boundary between originals in the document.

Patent•

Document classification and labeling using layout graph matching

Yue Ma, Jinhong Guo, David Doermann, Jian Liang

13 Nov 2002

TL;DR: A document processing system for identifying a segmented document includes a data store of layout graph models that are classified and/or labeled, and a matching module that determines a match between a layout graph sample for the segmented document and a particular layout graph model.

Abstract: A document processing system for use in identifying a segmented document includes a data store of layout graph models that are classified and/or labeled. A matching module makes a determination of a match between a layout graph sample for the segmented document and a particular layout graph model. The matching module uses a correlator to generate an identified, segmented document that is classified and/or labeled based on the segmented document, the layout graph model, and the determination of a match.

Patent•

Tracking document usage

Bruce L. Johnson, Leonard T. Schroath, Bradley J. Anderson, William I. Herrmann

06 Sep 2002

TL;DR: A system and methods enable the gathering and transferring of usage information for an electronic document, through the execution of a tracking module located within the electronic document, so that the document's usage history can be tracked.

Abstract: A system and methods enable the gathering and transferring of usage information for an electronic document so that the document's usage history can be tracked. A document history is recorded into an electronic document through the execution of a tracking module located within the electronic document. When the electronic document is accessed, the tracking module executes to record document history information into the electronic document. The disclosed system and methods provide a convenient way to track secured documents, maintain document databases, and offer feedback to authors on how documents are used so that document contents can be tailored to better suit the needs of an audience.

Journal Article•DOI•

Substitution deciphering based on HMMs with applications to compressed document processing

Dar-Shyang Lee¹

Ricoh¹

01 Dec 2002-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work proposes a new solution to substitution deciphering based on hidden Markov models that is more accurate than relaxation and much more robust in the presence of noise, making it useful for applications in compressed document processing.

Abstract: It has been shown that simple substitution ciphers can be solved using statistical methods such as probabilistic relaxation. However, the utility of such solutions has been limited by their inability to cope with noise encountered in practical applications. We propose a new solution to substitution deciphering based on hidden Markov models. We show that our algorithm is more accurate than relaxation and much more robust in the presence of noise, making it useful for applications in compressed document processing. Recovering character interpretations from the sequence of cluster identifiers in a symbolically compressed document can be treated as a cipher problem. Although a significant amount of noise is present in the cluster sequence, enough information can be recovered with a robust deciphering algorithm to accomplish certain document analysis tasks. The feasibility of this approach is demonstrated in a multilingual document duplicate detection system.

Patent•

Techniques for determining electronic document information for paper documents

Jonathan J. Hull¹, Jamey Graham¹, Dar-Shyang Lee¹, Peter E. Hart¹

Ricoh¹

03 Sep 2002

TL;DR: The electronic document information determined for a paper document may include information identifying an electronic document corresponding to the paper document, and information identifying a location where the electronic document is stored or a pointer or reference to the electronic document.

Abstract: Techniques for determining electronic document information for a paper document. The electronic document information determined for a paper document may include information identifying an electronic document corresponding to the paper document. The electronic document information may also include information identifying a location where the electronic document is stored or a pointer or reference to the electronic document. The electronic document information determined for a paper document may be stored along with identification code information read from an identification tag that is physically associated with the paper document. The electronic document information for a paper document may also be stored in an identification tag that is physically associated with the paper document or physically associated with another paper document generated based upon the paper document.

Book Chapter•DOI•

smartFIX: A Requirements-Driven System for Document Analysis and Understanding

Andreas Dengel¹, Bertin Klein¹

German Research Centre for Artificial Intelligence¹

19 Aug 2002

TL;DR: The system smartFIX, a document analysis and understanding system developed by the DFKI spin-off INSIDERS, permits the processing of documents ranging from fixed format forms to unstructured letters of any format.

Abstract: Although the internet offers a wide-spread platform for information interchange, day-to-day work in large companies still means the processing of tens of thousands of printed documents every day. This paper presents the system smartFIX which is a document analysis and understanding system developed by the DFKI spin-off INSIDERS. It permits the processing of documents ranging from fixed format forms to unstructured letters of any format. Apart from the architecture, the main components and system characteristics, we also show some results when applying smartFIX to medical bills and prescriptions.

Patent•

System and method for electronic document processing

Breen Gaughan, William Finger

11 Sep 2002

TL;DR: A system and method for providing electronic document processing via a network such as the Internet is described, in which electronic documents are generated, processed, and reviewed by different users fulfilling different roles within a loan documentation process.

Abstract: A system and method for providing electronic document processing via a network such as the Internet. A superuser defines access rules by which other users can access the system. Electronic documents are generated, processed, and reviewed by different users fulfilling different roles within a loan documentation process. An originator initiates electronic document processing by transmitting electronic documents to a document server. An electronic document processor evaluates the electronic documents and determines their applicability to defined documentation processes. An electronic document manager defines the documentation processes and balances processing activities for a plurality of electronic document processors. Ultimately, the electronic documents are made available to a plurality of electronic document recipients in specific formats specifiable by each of the electronic document recipients.

Patent•

System and method for processing digital documents utilizing secure communications over a network

David A. E. Wall

04 Jan 2002

TL;DR: A system and method for processing communications between a sender computing device (202) and at least one recipient computing device (204) are provided, in which a document processing server (206) processes the electronic document and can implement sender-specified recipient identity verification.

Abstract: A system and method for processing communications between a sender computing device (202) and at least one recipient computing device (204) are provided. A sender (202) establishes a secure communication with a document processing server (206) and requests the processing of an electronic document, which can include the appending of a digital signature. The document processing server (206) processes the electronic document and establishes secure communications with one or more designated recipients (204). The document processing server (206) can implement sender (202) specified recipient identity verification and provide further processing of the electronic document as designated by the recipients (204).

Patent•

Document security system

Jonathan J. Hull¹, Jamey Graham¹, Dar-Shyang Lee¹, Hideki Segawa¹

Ricoh¹

03 Sep 2002

TL;DR: A surface suitable for placement of documents is configured for monitoring RFID-tagged documents; such documents can be monitored in a document processing device to control access to the document processing functions.

Abstract: Document monitoring provides a measure of document security. Documents incorporating radio frequency identification (RFID) tags can be monitored by appropriate interrogation components for movement activity. A surface suitable for placement of documents is configured for monitoring RFID tagged documents. Such documents can be monitored in a document processing device to control access to the document processing functions.

Patent•

Document identification device, document definition method and document identification method

Kazuaki Yokota

27 Nov 2002

TL;DR: A plurality of document definition information for identifying documents, together with format control information for recognizing characters recorded on the document corresponding to each piece of document definition information, is held beforehand.

Abstract: A plurality of document definition information for identifying documents, and format control information for recognizing a character recorded on a document corresponding to each of the plurality of document definition information, are held beforehand. Documents targeted for character recognition are identified as specific documents based on document images of the entered documents and the document definition information. Based on a result of the identification, character recognition is executed by using the corresponding format control information. A document definition device adds a plane area of each of the documents to be identified to the document definition. An OCR device checks the plane area on the document by using the document definition before the check of a preprint accompanied by character recognition.

Book Chapter•DOI•

Homogeneous Ants for Web Document Similarity Modeling and Categorization

Kok Meng Hoe, Weng-Kin Lai, Tracy S. Y. Tai

12 Sep 2002-Lecture Notes in Computer Science

TL;DR: This paper presents a preliminary investigation of applying a homogeneous multi-agent clustering system based on the self-organization behavior of the ants to the high-dimensional problem of web document categorization.

Abstract: The self-organizing and autonomous behavior of social insects such as ants presents an interesting and powerful metaphor for applications in the retrieval and management of large and fast growing amount of online information. The explosive growth of web documents has increasingly made more difficult and costly the manual task of organizing the documents into meaningful categories by human experts. Hence, it is desirable that some degree of automation be incorporated into the classification process to enable better scalability and prevent human classifiers from being overwhelmed by the deluge of information. This paper presents a preliminary investigation of applying a homogeneous multi-agent clustering system based on the self-organization behavior of the ants to the high-dimensional problem of web document categorization. A description of the text processing needed to obtain significant document features is included. The system will be evaluated on multi-class online English documents obtained from a popularly used search engine.

Proceedings Article•DOI•

Progress in document reconstruction

A.L. Spitz

11 Aug 2002

TL;DR: This work combines information from a language model and character image pattern matching to iteratively reduce ambiguity in document images and at least partially resolve the character content without optical character recognition.

Abstract: We combine information from a language model and character image pattern matching to iteratively reduce ambiguity in document images. Combining word shape information and lists of similar bitmap patterns in a document at least partially resolves the character content without optical character recognition. We present the output in various ways, suitable for human readers or for differing downstream processes.

Proceedings Article•DOI•

Online handwritten Indian script recognition: a human motor function based framework

Utpal Garain¹, Bidyut B. Chaudhuri¹, Tamaltaru Pal¹•Institutions (1)

Indian Statistical Institute¹

11 Aug 2002

TL;DR: The primary concern of the approach is the modeling of human motor functionality while writing characters by looking at the whole pen trajectory where the time evaluation of the pen coordinates plays a crucial role.

Abstract: This paper presents an approach to online handwriting recognition for Indian scripts. The primary concern of the approach is the modeling of human motor functionality while writing characters. This is achieved by looking at the whole pen trajectory where the time evaluation of the pen coordinates plays a crucial role. A low complexity classifier was designed and the proposed similarity measure appears to be quite robust against wide variations in writing styles. Initially, the approach was applied for online recognition of handwritten characters in Devnagari and Bangla, the two major Indian scripts. A test on a dataset of considerable size shows promising recognition rates: 97.29% for Devnagari and 96.34% for Bangla.

Patent•

Document processing utilizing a version managing part

Toshikazu Ohwada¹, Katsumi Kanasaki¹•Institutions (1)

Ricoh¹

20 Nov 2002

TL;DR: In this article, content data that are identical across versions of a document are shared so as to reduce the storage area. Instead of accumulating content data separately for each version, each version is related to the content data accumulated in a storage area shared among the versions.

Abstract: When there are the same contents (content data) of a document among versions, the contents of the document are shared so as to reduce the storage area. Accordingly, instead of accumulating content data separately for each version, each version is related to the content data accumulated in the storage area shared among the versions. When a document α has versions 1 through 3 and each of versions 1 through 3 has sections 1 and 2, three sections 1 of the versions 1 through 3 share content data 1 and the section 2 of each of the versions 1 through 3 has different content data 2, 3, or 4. The content data 1, 2, 3, or 4 indicated by version information are searched for from a content data DB, to be edited. Only when the content data 1, 2, 3, or 4 are changed, new content data are registered.
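
Editor's sketch: the sharing scheme in this abstract can be illustrated with a small Python store in which each version holds references into a shared content table. The class and method names are hypothetical, and identifying identical content by a SHA-256 hash is an assumption made for this sketch; the abstract does not say how identical content data are detected.

```python
import hashlib


class VersionedStore:
    """Toy store in which document versions reference shared content data."""

    def __init__(self):
        self.content_db = {}  # content id -> content data
        self.versions = {}    # (document, version) -> {section: content id}

    def _register(self, data):
        # Identical content hashes to the same id, so it is stored only once.
        content_id = hashlib.sha256(data.encode("utf-8")).hexdigest()
        self.content_db.setdefault(content_id, data)
        return content_id

    def save_version(self, document, version, sections):
        # New content data are registered only for sections whose text changed.
        self.versions[(document, version)] = {
            name: self._register(data) for name, data in sections.items()
        }

    def read(self, document, version, section):
        return self.content_db[self.versions[(document, version)][section]]


# Mirrors the abstract's example: section 1 is shared, section 2 differs.
store = VersionedStore()
store.save_version("alpha", 1, {"section1": "intro", "section2": "body v1"})
store.save_version("alpha", 2, {"section1": "intro", "section2": "body v2"})
store.save_version("alpha", 3, {"section1": "intro", "section2": "body v3"})
```

After the three saves, `store.content_db` holds four entries rather than six: one shared entry for section 1 and one per version for section 2, matching content data 1 through 4 in the abstract.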

Patent•

Image processing device

Morita Toshiaki

11 Jan 2002

TL;DR: In this paper, the authors proposed a method to automatically detect plural documents and to output images onto sheets of paper in different modes such as for alignment of inclination when the plural documents are placed on a document table.

Abstract: PROBLEM TO BE SOLVED: To automatically detect plural documents and to output images onto sheets of paper in different modes such as for alignment of inclination when the plural documents are placed on a document table. SOLUTION: When read image data D1 are data of plural documents OR1, OR2, etc., which are placed on a document table 9 and are read at the same time, the document edges E1, E2, etc., of document image data d1, d2, etc., in the read image data D1 are detected by a document edge detecting section 5 according to difference in density. The plural document image data d1, d2, etc., are cut down by a plural document detecting section 6 according to the detected document edges E1, E2, etc., and, after processing such as individual alignment of inclination by a plural document processing section 7, images are output onto sheets 10 of paper by an image output device 3.
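
Editor's sketch: the step of separating several documents scanned together can be illustrated in Python as follows. The patent detects document edges from differences in density; this sketch swaps in a simpler stand-in, thresholding against the background followed by a flood fill, to find one bounding box per document. The function name, the `background` and `threshold` parameters, and the tiny grayscale grid are hypothetical, and the inclination alignment step is not covered.

```python
from collections import deque


def find_documents(image, background=0, threshold=50):
    """Return (top, left, bottom, right) boxes of regions whose density
    differs from the background by more than `threshold`."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def is_document(y, x):
        return abs(image[y][x] - background) > threshold

    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not is_document(r, c):
                continue
            # Flood-fill one connected region and track its extent.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and is_document(ny, nx)):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Two small "documents" lying on a dark document table.
platen = [
    [0,   0,   0, 0,   0,   0, 0],
    [0, 200, 200, 0,   0,   0, 0],
    [0, 200, 200, 0, 180, 180, 0],
    [0,   0,   0, 0, 180, 180, 0],
    [0,   0,   0, 0,   0,   0, 0],
]
boxes = find_documents(platen)  # [(1, 1, 2, 2), (2, 4, 3, 5)]
```

Each box could then be cropped and processed individually, which corresponds to the role of the plural document processing section in the abstract.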

Patent•

Network interconnected financial document processing devices

John E. Jones, Paul A. Jones, William J. Jones, Douglas U. Mennie

09 Jan 2002

Patent•

Digital watermarks as a communication channel in documents for controlling document processing devices

Geoffrey B. Rhoads, Philip R. Patterson, Ronald S. Miolla

12 Apr 2002

TL;DR: In this paper, digital watermarks are embedded in documents to create a communication channel between document handling devices such as copiers (24, 26), printers (64, 66), scanners (52, 54) and fax machines.

Abstract: Digital watermarks are embedded in documents (44, 60) to create a communication channel between document handling devices such as copiers (24, 26), printers (64, 66), scanners (52, 54) and fax machines (24, 26). The digital watermarks are used to control document reproduction and transmission operations. The digital watermarks are also used to embed transaction information in documents (44, 60), to link the document (44, 60) to an original, electronic version stored on a network (48, 50), or to trace the document handling history of a document (44, 60).
