
Showing papers on "Optical character recognition published in 1998"


Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, a new learning paradigm called graph transformer networks (GTN) is proposed, which allows multimodule document recognition systems to be trained globally with gradient-based methods; convolutional neural networks are shown to outperform all other techniques on a standard handwritten digit recognition task.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations


Proceedings ArticleDOI
16 Aug 1998
TL;DR: Compared with some traditional text location methods, this method has the following advantages: 1) low computational cost; 2) robust to font size; and 3) high accuracy.
Abstract: Automatic text location (without character recognition capabilities) deals with extracting image regions that contain text only. The images of these regions can then be fed to an optical character recognition module or highlighted for users. This is very useful in a number of applications such as database indexing and converting paper documents to their electronic versions. The performance of our automatic text location algorithm is shown in several applications. Compared with some traditional text location methods, our method has the following advantages: 1) low computational cost; 2) robust to font size; and 3) high accuracy.

560 citations


Patent
22 Oct 1998
TL;DR: In this paper, an optical-input print reading device with voice output for people with impaired or no vision is presented, in which the user provides input to the system through hand gestures.
Abstract: An optical-input print reading device with voice output for people with impaired or no vision in which the user provides input to the system from hand gestures. Images of the text to be read, on which the user performs finger- and hand-based gestural commands, are input to a computer, which decodes the text images into their symbolic meanings through optical character recognition, and further tracks the location and movement of the hand and fingers in order to interpret the gestural movements into their command meaning. In order to allow the user to select text and align printed material, feedback is provided to the user through audible and tactile means. Through a speech synthesizer, the text is spoken audibly. For users with residual vision, visual feedback of magnified and image enhanced text is provided. Multiple cameras of the same or different field of view can improve performance. In addition, alternative device configurations allow portable operation, including the use of cameras located on worn platforms, such as eyeglasses, or on a fingertip system. The use of gestural commands is natural, allowing for rapid training and ease of use. The device also has application as an aid in learning to read, and for data input and image capture for home and business uses.

425 citations


Journal ArticleDOI
TL;DR: A complete Optical Character Recognition (OCR) system for printed Bangla, the fourth most popular script in the world, is presented, and extension of the work to Devnagari, the third most popular script in the world, is discussed.

381 citations


Journal ArticleDOI
TL;DR: In this article, the authors present the state of Arabic character recognition research throughout the last two decades.

319 citations


Journal ArticleDOI
TL;DR: A new image compression technique called DjVu is presented that enables fast transmission of document images over low-speed connections, while faithfully reproducing the visual aspect of the document, including color, fonts, pictures, and paper texture.

312 citations


Journal ArticleDOI
TL;DR: Rotation invariant texture features are computed based on an extension of the popular multi-channel Gabor filtering technique, and their effectiveness is tested with 300 randomly rotated samples of 15 Brodatz textures to solve a practical but hitherto mostly overlooked problem in document image processing.
Abstract: Concerns the extraction of rotation invariant texture features and the use of such features in script identification from document images. Rotation invariant texture features are computed based on an extension of the popular multi-channel Gabor filtering technique, and their effectiveness is tested with 300 randomly rotated samples of 15 Brodatz textures. These features are then used in an attempt to solve a practical but hitherto mostly overlooked problem in document image processing - the identification of the script of a machine printed document. Automatic script and language recognition is an essential front-end process for the efficient and correct use of OCR and language translation products in a multilingual environment. Six languages (Chinese, English, Greek, Russian, Persian, and Malayalam) are chosen to demonstrate the potential of such a texture-based approach in script identification.
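The rotation-invariance idea behind multi-channel Gabor features can be sketched in a few lines. This is a minimal illustration, not the authors' extension: it filters the image with a small bank of real Gabor kernels at several orientations and pools the response energies across orientations, so the per-frequency features no longer depend on the texture's absolute rotation. Function names and parameter values are my own.

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=2.0, size=15):
    """Real-valued Gabor kernel at a given radial frequency and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates into the filter's orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return env * np.cos(2.0 * np.pi * freq * xr)

def rotation_invariant_features(image, freqs=(0.1, 0.2, 0.3), n_orient=8):
    """Mean filter-response energy per frequency, pooled over orientations.

    Pooling (here: averaging) across an orientation bank removes the
    dependence on the texture's absolute rotation.
    """
    feats = []
    for f in freqs:
        energies = []
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            kern = gabor_kernel(f, theta)
            # Circular filtering via FFT keeps the sketch short and fast.
            resp = np.abs(np.fft.ifft2(np.fft.fft2(image) *
                                       np.fft.fft2(kern, s=image.shape)))
            energies.append(resp.mean())
        feats.append(np.mean(energies))   # orientation-pooled energy
    return np.array(feats)
```

Because a 90-degree image rotation simply permutes the orientation bank, the pooled features of an image and its rotated copy agree.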

293 citations


Proceedings ArticleDOI
03 Jan 1998
TL;DR: This paper applies an interpolation filter, multi-frame integration and a combination of four filters to solve the problems of character recognition for videos: low resolution characters and extremely complex backgrounds.
Abstract: Video OCR is a technique that can greatly help to locate topics of interest in a large digital news video archive via the automatic extraction and reading of captions and annotations. News captions generally provide vital search information about the video being presented, the names of people and places or descriptions of objects. In this paper, two difficult problems of character recognition for videos are addressed: low resolution characters and extremely complex backgrounds. We apply an interpolation filter, multi-frame integration and a combination of four filters to solve these problems. Segmenting characters is done by a recognition-based segmentation method and intermediate character recognition results are used to improve the segmentation. The overall recognition results are good enough for use in news indexing. Performing video OCR on news video and combining its results with other video understanding techniques will improve the overall understanding of the news video content.
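Multi-frame integration exploits the fact that a caption stays fixed over many frames while the background moves. A minimal sketch (not the paper's exact filter chain; names and modes are illustrative) combines co-registered frames per pixel:

```python
import numpy as np

def integrate_frames(frames, mode="mean"):
    """Combine co-registered video frames containing a static caption.

    Averaging suppresses moving-background variation around the static
    text; taking the per-pixel minimum instead favours dark text on a
    changing, lighter background. Both are common integration choices.
    """
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    if mode == "mean":
        return stack.mean(axis=0)
    if mode == "min":
        return stack.min(axis=0)
    raise ValueError("mode must be 'mean' or 'min'")
```

On synthetic frames where one pixel holds a constant dark caption value and the rest fluctuates, the minimum keeps the caption intact while the mean smooths the background.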

284 citations


Journal ArticleDOI
TL;DR: Describes a complete system for the recognition of off-line handwriting, including segmentation and normalization of word images to give invariance to scale, slant, slope and stroke thickness.
Abstract: Describes a complete system for the recognition of off-line handwriting. Preprocessing techniques are described, including segmentation and normalization of word images to give invariance to scale, slant, slope and stroke thickness. Representation of the image is discussed and the skeleton and stroke features used are described. A recurrent neural network is used to estimate probabilities for the characters represented in the skeleton. The operation of the hidden Markov model that calculates the best word in the lexicon is also described. Issues of vocabulary choice, rejection, and out-of-vocabulary word recognition are discussed.

271 citations


Patent
15 Jun 1998
TL;DR: In this article, an optically scanned image (34, 208) of at least a portion of document containing visual data, in a particular format, representing information related to the financial transaction was generated.
Abstract: The present invention provides financial transaction processing systems and methods. One preferred embodiment of a method according to one aspect of the present invention includes generating an optically scanned image (34, 208) of at least a portion of document containing visual data, in a particular format, representing information related to the financial transaction. Recognition characteristics (32, 204) are generated from the scanned image and are compared (40, 220) to respective sets of reference recognition characteristics generated from respective other transaction documents having different respective formats to determine therefrom whether the particular format of the visual data matches one of the respective formats of the other documents. When such a match is found to exist, location is determined (40, 218) of a field in the scanned image to which optical character recognition may be applied to generate therefrom the information, based upon the respective format found to match the particular format of the visual data. Optical character recognition is then utilized to generate said visual data (60, 232) from said location.

266 citations


Journal ArticleDOI
TL;DR: A new statistical approach based on global typographical features is proposed to the widely neglected problem of font recognition that aims at the identification of the typeface, weight, slope and size of the text from an image block without any knowledge of the content of that text.
Abstract: A new statistical approach based on global typographical features is proposed to the widely neglected problem of font recognition. It aims at the identification of the typeface, weight, slope and size of the text from an image block without any knowledge of the content of that text. The recognition is based on a multivariate Bayesian classifier and operates on a given set of known fonts. The effectiveness of the adopted approach has been evaluated on a set of 280 fonts. Font recognition accuracies of about 97 percent were reached on high-quality images. In addition, rates higher than 99.9 percent were obtained for weight and slope detection. Experiments have also shown the system's robustness to document language and text content and its sensitivity to text length.
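A multivariate Bayesian classifier of the kind described can be sketched as a Gaussian class-conditional model: fit one multivariate normal per font class and assign a sample to the class with the highest posterior. This is a generic illustration under equal priors, not the authors' feature set or exact model.

```python
import numpy as np

class GaussianBayesClassifier:
    """Multivariate Bayesian classifier with one Gaussian per class.

    Class-conditional densities are multivariate normals fitted by
    maximum likelihood; prediction picks the class with the highest
    log-likelihood (equal priors assumed).
    """
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            mu = Xc.mean(axis=0)
            # Small ridge keeps the covariance invertible on few samples.
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            self.params_[c] = (mu, np.linalg.inv(cov),
                               np.linalg.slogdet(cov)[1])
        return self

    def predict(self, X):
        scores = []
        for c in self.classes_:
            mu, icov, logdet = self.params_[c]
            d = X - mu
            # Log-likelihood up to a constant: -0.5 (log|S| + d^T S^-1 d)
            ll = -0.5 * (logdet + np.einsum('ij,jk,ik->i', d, icov, d))
            scores.append(ll)
        return self.classes_[np.argmax(np.stack(scores), axis=0)]
```

On well-separated synthetic feature clusters the classifier recovers the class labels exactly, which is the regime the high reported accuracies suggest.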

Proceedings ArticleDOI
16 Aug 1998
TL;DR: This work has developed a scheme for automatically extracting text from digital images and videos for content annotation and retrieval that results in segmented characters that can be directly processed by an OCR system to produce ASCII text.
Abstract: Efficient content-based retrieval of image and video databases is an important application due to rapid proliferation of digital video data on the Internet and corporate intranets. Text either embedded or superimposed within video frames is very useful for describing the contents of the frames, as it enables both keyword and free-text based search, automatic video logging, and video cataloging. We have developed a scheme for automatically extracting text from digital images and videos for content annotation and retrieval. We present our approach to robust text extraction from video frames, which can handle complex image backgrounds, deal with different font sizes, font styles, and font appearances such as normal and inverse video. Our algorithm results in segmented characters that can be directly processed by an OCR system to produce ASCII text. Results from our experiments with over 5000 frames obtained from twelve MPEG video streams demonstrate the good performance of our system in terms of text identification accuracy and computational efficiency.

Patent
29 Apr 1998
TL;DR: In this paper, a processor-based fax routing method receives digital data representing a facsimile document and performs OCR on the image data extracting therefrom texts for the keyword, the name of the addressee, and other text present in the document.
Abstract: A processor-based fax routing method receives digital data representing a facsimile document. Without performing optical character recognition ("OCR"), the method identifies in the image data a keyword block of text, and an addressee-name block of text that is located near the keyword block of text. The fax routing method then performs OCR on the image data extracting therefrom texts for the keyword, the name of the addressee, and other text present in the facsimile. Using probabilities computed between the text of the name of the addressee and names in a list of possible addressees, and between the keyword and keywords in a list of keywords, the fax routing method determines an addressee for the document. The fax routing method then converts all text into email addressed to the fax's addressee, and stores the email onto an email server from which it may be retrieved.

BookDOI
01 Apr 1998
TL;DR: Contributions include evaluating the performance of techniques for the extraction of primitives from line drawings composed of horizontal and vertical lines, and the development of a general framework for intelligent document image retrieval.
Abstract: Evaluating the performance of techniques for the extraction of primitives from line drawings composed of horizontal and vertical lines, J.F. Arias et al; the development of a general framework for intelligent document image retrieval, D. Doermann et al; prediction of OCR accuracy using a neural network, J. Gonzalez et al; evaluating Japanese document recognition in the Internet/intranet environment, T. Hong et al; DocBrowse - a system for textual and graphical querying on degraded document image data, M.Y. Jaisimha et al; language identification in complex, unoriented and degraded document images, D. Lee et al; document analysis and the World Wide Web, D. Lopresti and J. Zhou; language-independent and segmentation-free optical character recognition, J. Makhoul et al; documents on the move - DA&IR-driven mail piece processing today and tomorrow, U. Miletzki; priming the recognizer, G. Nagy and Y. Xu; semiautomatic production of highly accurate word bounding box ground truth, R.P. Rogers et al; SPAM - a scientific paper access method, A.L. Spitz; automated CAD conversion with the machine drawing understanding system, L. Wenyin and D. Dori. (Part contents)

Journal ArticleDOI
TL;DR: This work proposes a new text location algorithm suitable for a number of applications, including conversion of newspaper advertisements from paper documents to their electronic versions, World Wide Web search, color image indexing and video indexing, with emphasis on extracting important text with large size and high contrast.

Journal ArticleDOI
Berrin Yanikoglu, Peter A. Sandon
TL;DR: This work introduces a new segmentation algorithm, guided in part by the global characteristics of the handwriting, which finds the successive segmentation points by evaluating a cost function at each point along the baseline.

Book
01 Jan 1998
TL;DR: Alpaydin and Gurgen, Comparison of Statistical and Neural Classifiers and their Applications to Optical Character Recognition and Speech Classification and Chen and Chang, Learning Algorithms and Applications of Principal Component Analysis.
Abstract: Lampinen, Pattern Recognition. Alpaydin and Gurgen, Comparison of Statistical and Neural Classifiers and their Applications to Optical Character Recognition and Speech Classification. Sun and Nekovei, Medical Imaging. Takeda and Omatu, Paper Currency Recognition. Cordella and Stefano, Neural Network Classification Reliability: Problems and Applications. Yagi, Kobayaski, and Matsumoto, Parallel Analog Image Processing: Solving Regularization Problems with Architecture Inspired by the Vertebrate Retinal Circuit. Setiono, Algorithmic Techniques and their Applications. Chen and Chang, Learning Algorithms and Applications of Principal Component Analysis. Merat and Villalobos, Learning Evaluation and Pruning Techniques.

Journal ArticleDOI
TL;DR: In this article, an off-line recognition system based on multifeature and multilevel classification is presented for handwritten Chinese characters, where 10 classes of multifeatures, such as peripheral shape features, stroke density features, and stroke direction features, are used in this system.
Abstract: In this paper, an off-line recognition system based on multifeature and multilevel classification is presented for handwritten Chinese characters. Ten classes of multifeatures, such as peripheral shape features, stroke density features, and stroke direction features, are used in this system. The multilevel classification scheme consists of a group classifier and a five-level character classifier, where two new technologies, overlap clustering and Gaussian distribution selector are developed. Experiments have been conducted to recognize 5,401 daily-used Chinese characters. The recognition rate is about 90 percent for a unique candidate, and 98 percent for multichoice with 10 candidates.

Book ChapterDOI
04 Nov 1998
TL;DR: A new approach to table structure recognition as well as to layout analysis that realizes a bottom-up clustering of given word segments, whereas conventional table structure recognizers all rely on the detection of some separators such as delineation or significant white space to analyze a page from the top-down.
Abstract: This paper presents a new approach to table structure recognition as well as to layout analysis. The discussed recognition process differs significantly from existing approaches as it realizes a bottom-up clustering of given word segments, whereas conventional table structure recognizers all rely on the detection of some separators such as delineation or significant white space to analyze a page from the top down. The following analysis of the recognized layout elements is based on the construction of a tile structure and detects row- and/or column-spanning cells as well as sparse tables with a high degree of confidence. The overall system is completely domain independent, optionally neglects textual contents and can thus be applied to arbitrary mixed-mode documents (with or without tables) of any language, and even operates on low quality OCR documents (e.g., facsimiles).

Journal ArticleDOI
TL;DR: A machine-printed and handwritten text classification method that automatically determines whether texts segmented from a document image are machine-printed or handwritten, to facilitate the later optical character recognition task.

Patent
Robert Cooperman
30 Apr 1998
TL;DR: In this paper, a method for detecting insets in the structure of a document page so as to further complement the document layout and textual information provided in an optical character recognition system is presented.
Abstract: The present invention is a method for detecting insets in the structure of a document page so as to further complement the document layout and textual information provided in an optical character recognition system. A system employing the present method preferably includes a document layout analysis system wherein the inset detection methodology is used to extend the capability of an associated character recognition package to more accurately recreate the document being processed.

Journal ArticleDOI
TL;DR: An NN classification scheme based on an enhanced multilayer perceptron (MLP) is presented and an end-to-end system for form-based handprint OCR applications designed by the National Institute of Standards and Technology (NIST) Visual Image Processing Group is described.
Abstract: Over the last five years or so, neural network (NN)-based approaches have been steadily gaining performance and popularity for a wide range of optical character recognition (OCR) problems, from isolated digit recognition to handprint recognition. We present an NN classification scheme based on an enhanced multilayer perceptron (MLP) and describe an end-to-end system for form-based handprint OCR applications designed by the National Institute of Standards and Technology (NIST) Visual Image Processing Group. The enhancements to the MLP are based on (i) neuron activation functions that reduce the occurrences of singular Jacobians; (ii) successive regularization to constrain the volume of the weight space; and (iii) Boltzmann pruning to constrain the dimension of the weight space. Performance characterization studies of NN systems evaluated at the first OCR systems conference and the NIST form-based handprint recognition system are also summarized.

Patent
Shmuel Ur
07 Apr 1998
TL;DR: In this article, a computer system is provided for transferring graphical textual information into an application program, which consists of an information transfer means, activated in response to a user action on an input device of the computer system, for identifying on the computer display screen a user selected source of textual information, and transferring said textual information as a bit image into a first predetermined location of computer memory.
Abstract: A computer system is provided for transferring graphical textual information into an application program. The arrangement comprises an information transfer means, activated in response to a user action on an input device of the computer system, for identifying on the computer display screen a user selected source of textual information, and transferring said textual information as a bit image into a first predetermined location of the computer memory. The arrangement using optical character recognition logic (OCR) coupled to the information transfer means, for generating a character code for each character image identified in the image stored in the first memory location. The generated character codes are stored by said information transfer means into a second predetermined location of the computer memory. The source information is available to this second location to be inserted into a destination application program by being pasted into a user defined screen location.

Journal ArticleDOI
John D. Hobby
TL;DR: Rather than depending on feature points, a more robust procedure refines the matching transformation with an optimization algorithm; the transformation matches a scanned image to the machine-readable document description that was used to print the original.
Abstract: Since optical character recognition systems often require very large amounts of training data for optimum performance, it is important to automate the process of finding ground truth character identities for document images. This is done by finding a transformation that matches a scanned image to the machine-readable document description that was used to print the original. Rather than depend on finding feature points, a more robust procedure is to follow up by using an optimization algorithm to refine the transformation. The function to optimize can be based on the character bounding boxes – it is not necessary to have access to the actual character shapes used when printing the original.
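The bounding-box objective can be illustrated with a toy refinement step: a brute-force search over small integer translations that minimizes the summed squared distance between corresponding box centres. This is a stand-in for the paper's optimization algorithm, with hypothetical names and a translation-only transformation for simplicity.

```python
import numpy as np

def refine_translation(scan_boxes, truth_boxes, search=5):
    """Find the integer (dx, dy) aligning scanned character boxes to
    ground-truth boxes by brute force over a small search window.

    Boxes are (x, y, w, h); the objective is the summed squared distance
    between corresponding box centres -- no character shapes are needed,
    only the bounding boxes.
    """
    sc = np.array([(x + w / 2, y + h / 2) for x, y, w, h in scan_boxes])
    tc = np.array([(x + w / 2, y + h / 2) for x, y, w, h in truth_boxes])
    best, best_cost = (0, 0), np.inf
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            cost = np.sum((sc + (dx, dy) - tc) ** 2)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

A real system would optimize a richer transformation (scale, shear) with a continuous optimizer, but the box-centre objective is the same idea.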

Proceedings ArticleDOI
07 Jul 1998
TL;DR: A PC based number plate recognition system is presented, using the Niblack algorithm, which was found to outperform all binarization techniques previously used in similar systems.
Abstract: A PC based number plate recognition system is presented. Digital gray-level images of cars are thresholded using the Niblack algorithm, which was found to outperform all binarization techniques previously used in similar systems. A simple yet highly effective rule-based algorithm detects the position and size of number plates. Characters are segmented from the thresholded plate using blob-colouring, and passed as 15×15 pixel bitmaps to a neural network based optical character recognition (OCR) system. A novel dimension reduction technique reduces the neural network inputs from 225 to 50 features. Six small networks in parallel are used, each recognising six characters. The system can recognize single and double line plates under varying lighting conditions and slight rotation. Successful recognition of complete registration plates is about 86.1%.
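The Niblack rule itself is standard: each pixel is thresholded at T = m + k·s, where m and s are the mean and standard deviation over a local window and k is a tuning constant, typically negative for dark text on a light plate. A sketch using integral images so the whole image is thresholded in linear time (window size and k here are illustrative, not the paper's values):

```python
import numpy as np

def niblack_threshold(image, window=15, k=-0.2):
    """Niblack local binarization: T = m + k * s over a sliding window,
    where m and s are the local mean and standard deviation.

    Implemented with summed-area tables so it stays O(N) in pixel count.
    Returns a boolean image: True = background, False = (dark) ink.
    """
    img = np.asarray(image, dtype=np.float64)
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    # Integral images of the values and their squares.
    s1 = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    s2 = np.cumsum(np.cumsum(padded ** 2, axis=0), axis=1)
    s1 = np.pad(s1, ((1, 0), (1, 0)))
    s2 = np.pad(s2, ((1, 0), (1, 0)))
    h, w = img.shape
    n = window * window
    def box(S):
        # Window sum for every pixel via four integral-image lookups.
        return (S[window:window + h, window:window + w]
                - S[window:window + h, :w]
                - S[:h, window:window + w]
                + S[:h, :w])
    m = box(s1) / n
    var = np.maximum(box(s2) / n - m ** 2, 0.0)
    thresh = m + k * np.sqrt(var)
    return img > thresh
```

Note the classic Niblack caveat: in perfectly flat regions s is zero and the threshold collapses to the local mean, which is why later variants add a correction term.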

Journal ArticleDOI
TL;DR: A methodology for OCR that exhibits the following properties: script-independent feature extraction, training, and recognition components; no separate segmentation at the character and word levels; and the training is performed automatically on data that is also not presegmented.

Proceedings Article
16 Aug 1998
TL;DR: A camera system which translates Japanese texts in a scene using a digital camera, which extracts character strings from a region which a user specifies, and translates them into English.
Abstract: We propose a camera system which translates Japanese texts in a scene. The system is portable and consists of four components: digital camera, character image extraction process, character recognition process, and translation process. The system extracts character strings from a region which the user specifies and translates them into English.

Journal ArticleDOI
TL;DR: A method for recognizing characters on graphical designs and a new projection feature that separates text-line regions from backgrounds, and adaptive thresholding in displacement matching are introduced are proposed.
Abstract: A method for recognizing characters on graphical designs is proposed. A new projection feature that separates text-line regions from backgrounds, and adaptive thresholding in displacement matching are introduced. Experimental results for newspaper headlines with graphical designs show a recognition rate of 97.7 percent.

Proceedings ArticleDOI
16 Aug 1998
TL;DR: An automatic mosaicing process for document images is described, using an image pyramid and sequential similarity to reduce computation time; results are presented for binarised document images captured with a digital camera.
Abstract: If it is impossible to capture all the image in one scan with the available equipment, a montage can be made from separately scanned pieces. We describe an automatic mosaicing process for document images. The image shifts are found by a correlation technique, using an image pyramid and sequential similarity to reduce computation time. Image placement and overlap is used to reject incorrect solutions. We present results for binarised document images with data captured using a digital camera.
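The pyramid-based shift search can be sketched coarse-to-fine: estimate the translation on a heavily downsampled pair, then double it and refine with a small local search at each finer level. This sketch uses a plain mean-squared difference over the overlap rather than the paper's sequential similarity test, and all names are illustrative.

```python
import numpy as np

def ssd_overlap(a, b, ty, tx):
    """Mean squared difference between b and a shifted by (ty, tx),
    computed over the overlapping region only."""
    h, w = a.shape
    ys = slice(max(0, ty), min(h, h + ty))
    xs = slice(max(0, tx), min(w, w + tx))
    ys_a = slice(max(0, -ty), min(h, h - ty))
    xs_a = slice(max(0, -tx), min(w, w - tx))
    ov_a, ov_b = a[ys_a, xs_a], b[ys, xs]
    if ov_a.size == 0:
        return np.inf
    return np.mean((ov_a - ov_b) ** 2)

def estimate_shift(ref, moved, levels=3, search=4):
    """Coarse-to-fine estimate of the integer (dy, dx) translating `ref`
    onto `moved`, using an image pyramid and a small correlation-style
    search at each level."""
    def downsample(im):
        h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
        im = im[:h, :w]
        return 0.25 * (im[0::2, 0::2] + im[1::2, 0::2]
                       + im[0::2, 1::2] + im[1::2, 1::2])

    pyr = [(np.asarray(ref, float), np.asarray(moved, float))]
    for _ in range(levels - 1):
        a, b = pyr[-1]
        pyr.append((downsample(a), downsample(b)))

    dy = dx = 0
    for a, b in reversed(pyr):          # coarsest level first
        dy, dx = 2 * dy, 2 * dx        # propagate estimate to finer grid
        best, best_err = (dy, dx), np.inf
        for ddy in range(-search, search + 1):
            for ddx in range(-search, search + 1):
                ty, tx = dy + ddy, dx + ddx
                err = ssd_overlap(a, b, ty, tx)
                if err < best_err:
                    best, best_err = (ty, tx), err
        dy, dx = best
    return dy, dx
```

The pyramid keeps the per-level search window small (here ±4 pixels), which is the computational saving the abstract refers to; a consistency check on the overlap, as in the paper, would reject incorrect solutions.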

Proceedings ArticleDOI
16 Sep 1998
TL;DR: A system for reliably establishing correspondences between printed words and their electronic counterparts, without performing optical character recognition, which might have interesting applications in document database retrieval, since it allows an electronic document to be indexed by a printed version of itself.
Abstract: A common authoring technique involves making annotations on a printed draft and then typing the corrections into a computer at a later date. In this paper, we describe a system that goes some way towards automating this process. The author simply passes the annotated documents through a sheetfeed scanner and then brings up the electronic document in a text editor. The system then works out where the annotated words are and allows the author to skip from one annotation to the next at the touch of a key. At the heart of the system lies a procedure for reliably establishing correspondences between printed words and their electronic counterparts, without performing optical character recognition. This procedure might have interesting applications in document database retrieval, since it allows an electronic document to be indexed by a printed version of itself.