Showing papers on "Optical character recognition published in 2007"

PDF

Open Access

Proceedings Article•DOI•

[...]

Ray Smith¹•Institutions (1)

23 Sep 2007

TL;DR: The Tesseract OCR engine, as was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy, is described in a comprehensive overview.

...read moreread less

Abstract: The Tesseract OCR engine, as was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy, is described in a comprehensive overview. Emphasis is placed on aspects that are novel or at least unusual in an OCR engine, including in particular the line finding, features/classification methods, and the adaptive classifier.

...read moreread less

1,530 citations

Proceedings Article•DOI•

Breaking Visual CAPTCHAs with Naive Pattern Recognition Algorithms

[...]

Jeff Yan¹, A.S. El Ahmad¹•Institutions (1)

Newcastle University¹

01 Dec 2007

TL;DR: In this paper, the authors used simple pattern recognition algorithms but exploited fatal design errors that were discovered in each CAPTCHA scheme and showed that their simple attacks can also break many other schemes deployed on the Internet at the time of writing.

...read moreread less

Abstract: Visual CAPTCHAs have been widely used across the Internet to defend against undesirable or malicious bot programs. In this paper, we document how we have broken most such visual schemes provided at Captchaservice.org, a publicly available web service for CAPTCHA generation. These schemes were effectively resistant to attacks conducted using a high-quality Optical Character Recognition program, but were broken with a near 100% success rate by our novel attacks. In contrast to early work that relied on sophisticated computer vision or machine learning algorithms, we used simple pattern recognition algorithms but exploited fatal design errors that we discovered in each scheme. Surprisingly, our simple attacks can also break many other schemes deployed on the Internet at the time of writing: their design had similar errors. We also discuss defence against our attacks and new insights on the design of visual CAPTCHA schemes.

...read moreread less

250 citations

Journal Article•DOI•

A survey of document image classification: problem statement, classifier architecture and performance evaluation

[...]

Nawei Chen¹, Dorothea Blostein¹•Institutions (1)

Queen's University¹

22 May 2007-International Journal on Document Analysis and Recognition

TL;DR: This work focuses on techniques that classify single-page typeset document images without using OCR results, and brings to light important issues in designing a document classifier, including the definition of document classes, the choices of document features and feature representation, and the choice of classification algorithm and learning mechanism.

...read moreread less

Abstract: Document image classification is an important step in Office Automation, Digital Libraries, and other document image analysis applications. There is great diversity in document image classifiers: they differ in the problems they solve, in the use of training data to construct class models, and in the choice of document features and classification algorithms. We survey this diverse literature using three components: the problem statement, the classifier architecture, and performance evaluation. This brings to light important issues in designing a document classifier, including the definition of document classes, the choice of document features and feature representation, and the choice of classification algorithm and learning mechanism. We emphasize techniques that classify single-page typeset document images without using OCR results. Developing a general, adaptable, high-performance classifier is challenging due to the great variety of documents, the diverse criteria used to define document classes, and the ambiguity that arises due to ill-defined or fuzzy document classes.

...read moreread less

181 citations

Patent•

Pure adversarial approach for identifying text content in images

[...]

Jonathan J. Oliver

16 Aug 2007

TL;DR: In this article, a pure adversarial optical character recognition (OCR) approach is used to identify text content in images, where an image and a search term are input to a pure-adversarial OCR module, which searches the image for presence of the search term.

...read moreread less

Abstract: A pure adversarial optical character recognition (OCR) approach in identifying text content in images. An image and a search term are input to a pure adversarial OCR module, which searches the image for presence of the search term. The image may be extracted from an email by an email processing engine. The OCR module may split the image into several character-blocks that each has a reasonable probability of containing a character (e.g., an ASCII character). The OCR module may form a sequence of blocks that represent a candidate match to the search term and calculate the similarity of the candidate sequence to the search term. The OCR module may be configured to output whether or not the search term is found in the image and, if applicable, the location of the search term in the image.

...read moreread less

163 citations

Journal Article•DOI•

OCR binarization and image pre-processing for searching historical documents

[...]

Maya R. Gupta¹, N.P. Jacobson¹, Eric Garcia¹•Institutions (1)

University of Washington¹

01 Feb 2007-Pattern Recognition

TL;DR: Results for 12 pages from six newspapers of differing quality show that performance varies widely by image, but that the classic Otsu method and Otsi-based methods perform best on average.

...read moreread less

152 citations

Journal Article•DOI•

Corpus-based HIT-MW database for offline recognition of general-purpose Chinese handwritten text

[...]

Tonghua Su¹, Tianwen Zhang¹, De-Jun Guan²•Institutions (2)

Harbin Institute of Technology¹, Harbin Engineering University²

22 May 2007-International Journal on Document Analysis and Recognition

TL;DR: The statistics show that the HIT-MW database has an excellent representation of the real handwriting and many new applications concerning real handwriting recognition can be supported by the database.

...read moreread less

Abstract: A Chinese handwriting database named HIT-MW is presented to facilitate the offline Chinese handwritten text recognition. Both the writers and the texts for handcopying are carefully sampled with a systematic scheme. To collect naturally written handwriting, forms are distributed by postal mail or middleman instead of face to face. The current version of HIT-MW includes 853 forms and 186,444 characters that are produced under an unconstrained condition without preprinted character boxes. The statistics show that the database has an excellent representation of the real handwriting. Many new applications concerning real handwriting recognition can be supported by the database.

...read moreread less

142 citations

Journal Article•DOI•

Text line extraction from multi-skewed handwritten documents

[...]

Subhadip Basu¹, Chitrita Chaudhuri¹, Mahantapas Kundu¹, Mita Nasipuri¹, Debabrota Basu¹ - Show less +1 more•Institutions (1)

Jadavpur University¹

01 Jun 2007-Pattern Recognition

TL;DR: A novel text line extraction technique is presented for multi-skewed document images of handwritten English or Bengali text that assumes that hypothetical water flows, from both left and right sides of the image frame, face obstruction from characters of text lines.

...read moreread less

114 citations

Proceedings Article•DOI•

A system for understanding imaged infographics and its applications

[...]

Weihua Huang¹, Chew Lim Tan¹•Institutions (1)

National University of Singapore¹

28 Aug 2007

TL;DR: Two practical applications of the system built for recognizing and understanding imaged infographics located in document pages are introduced, including supplement to traditional optical character recognition (OCR) system and providing enriched information for question answering (QA).

...read moreread less

Abstract: Information graphics, or infographics, are visual representations of information, data or knowledge. Understanding of infographics in documents is a relatively new research problem, which becomes more challenging when infographics appear as raster images. This paper describes technical details and practical applications of the system we built for recognizing and understanding imaged infographics located in document pages. To recognize infographics in raster form, both graphical symbol extraction and text recognition need to be performed. The two kinds of information are then auto-associated to capture and store the semantic information carried by the infographics. Two practical applications of the system are introduced in this paper, including supplement to traditional optical character recognition (OCR) system and providing enriched information for question answering (QA). To test the performance of our system, we conducted experiments using a collection of downloaded and scanned infographic images. Another set of scanned document pages from the University of Washington document image database were used to demonstrate how the system output can be used by other applications. The results obtained confirm the practical value of the system.

...read moreread less

105 citations

Journal Article•DOI•

Word matching using single closed contours for indexing handwritten historical documents

[...]

Tomasz Adamek¹, Noel E. O'Connor¹, Alan F. Smeaton¹•Institutions (1)

Dublin City University¹

01 Apr 2007-International Journal on Document Analysis and Recognition

TL;DR: This paper proposes a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles and demonstrates that multiscale contour-based descriptors can effectively capture intrinsic word features avoiding any segmentation of words into smaller subunits.

...read moreread less

Abstract: Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, recently a holistic word recognition approach has gained in popularity as an attractive and more straightforward solution (Lavrenko et al. in proc. document Image Analysis for Libraries (DIAL'04), pp. 278---287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour matching technique proposed originally for general shapes (Adamek and O'Connor in IEEE Trans Circuits Syst Video Technol 5:2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features avoiding any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.

...read moreread less

103 citations

Journal Article•

Tesseract: an open-source optical character recognition engine

[...]

Anthony W. Kay

01 Jul 2007-Linux Journal

94 citations

Book Chapter•DOI•

The State of the Art of Document Image Degradation Modelling

[...]

Henry S. Baird¹•Institutions (1)

Lehigh University¹

01 Jan 2007

TL;DR: The literature on models of document image degradation is reviewed, and open problems include the search for methods for comparing competing models and sound methodologies for the use of synthetic data in engineering.

...read moreread less

Abstract: The literature on models of document image degradation is reviewed, and open problems are listed. In response to the unpleasant fact that the accuracy of document recognition algorithms falls drastically when image quality degrades even slightly, researchers in the last decade have intensiied their study of explicit, quantitative, parameter-ized models of image defects that occur during printing and scanning. Several models have been proposed, some motivated by the physics of image formation and others by the surface statistics of image distributions. A wide range of techniques for estimating parameters of these models has been explored. These models, in the form of pseudo-random generators of synthetic images, permit, for the rst time, investigations into fundamental properties of concrete image recognition problems including the Bayes error of problems and the asymptotic accuracy and domain of competency of classiier technologies. The use of massive sets of synthetic images, in the construction and testing of high-performance classiiers, has accelerated in the last few years. Open problems include the search for methods for comparing competing models and sound methodologies for the use of synthetic data in engineering.

...read moreread less

Journal Article•DOI•

Offline recognition of omnifont Arabic text using the HMM ToolKit (HTK)

[...]

Mohammad S. Khorsheed¹•Institutions (1)

King Abdulaziz City for Science and Technology¹

01 Sep 2007-Pattern Recognition Letters

TL;DR: The system decomposes the document image into text line images and extracts a set of simple statistical features from a narrow window which is sliding a long that text line and injects the resulting feature vectors to the Hidden Markov Model Toolkit (HTK).

...read moreread less

Patent•

OCR input to search engine

[...]

Brant L. Candelore¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

08 Mar 2007

TL;DR: In this paper, a method of carrying out a search using a search engine consistent with certain embodiments involves extracting selected text from a video frame containing text by optical character recognition (OCR) processing of the selected text, loading the text extracted from the OCR processing as a search string into search engine; executing the search using the search engine operating on the search string; receiving search results from the search engines; and displaying the search results for viewing on a display.

...read moreread less

Abstract: A method of carrying out a search using a search engine consistent with certain embodiments involves extracting selected text from a video frame containing text by optical character recognition (OCR) processing of the selected text from the video frame; loading the text extracted from the OCR processing as a search string into a search engine; executing the search using the search engine operating on the search string; receiving search results from the search engine; and displaying the search results for viewing on a display. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract.

...read moreread less

Proceedings Article•DOI•

Image Preprocessing for Improving OCR Accuracy

[...]

W. Bieniecki¹, Szymon Grabowski¹, W. Rozenberg¹•Institutions (1)

University of Łódź¹

23 May 2007

TL;DR: This paper deals with the preprocessing step before text recognition, specifically with images from a digital camera, and confirms importance of image preprocessing in OCR applications.

...read moreread less

Abstract: Digital cameras are convenient image acquisition devices: they are fast, versatile, mobile, do not touch the object, and are relatively cheap. In OCR applications, however, digital cameras suffer from a number of limitations, like geometrical distortions. In this paper, we deal with the preprocessing step before text recognition, specifically with images from a digital camera. Experiments, performed with the FineReader 7.0 software as the back-end recognition tool, confirm importance of image preprocessing in OCR applications.

...read moreread less

Journal Article•DOI•

A simple and efficient optical character recognition system for basic symbols in printed Kannada text

[...]

R. Sanjeev Kunte, R. D. Sudhaker Samuel

01 Oct 2007-Sadhana-academy Proceedings in Engineering Sciences

TL;DR: An OCR system developed for the recognition of basic characters in printed Kannada text, which can handle different font sizes and font types and can be extended for the Recognition of other south Indian languages, especially for Telugu.

...read moreread less

Abstract: Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed characters of non-Indian languages. Efforts are on the way for the development of efficient OCR systems for Indian languages, especially for Kannada, a popular South Indian language. We present in this paper an OCR system developed for the recognition of basic characters (vowels and consonants) in printed Kannada text, which can handle different font sizes and font types. Hu’s invariant moments and Zernike moments that have been progressively used in pattern recognition are used in our system to extract the features of printed Kannada characters. Neural classifiers have been effectively used for the classification of characters based on moment features. An encouraging recognition rate of 96.8% has been obtained. The system methodology can be extended for the recognition of other south Indian languages, especially for Telugu.

...read moreread less

Patent•

Systems and Methods for Isolating On-Screen Textual Data

[...]

Robert Rodriguez¹, Eric Brueggemann¹•Institutions (1)

Citrix Systems¹

05 Oct 2007

TL;DR: In this paper, a system for obtaining, recognizing and taking an action on text displayed by an application that is performed in a non-intrusive and application agnostic manner is described.

...read moreread less

Abstract: The systems and methods of the client agent describe herein provides a solution to obtaining, recognizing and taking an action on text displayed by an application that is performed in a non-intrusive and application agnostic manner. In response to detecting idle activity of a cursor on the screen, the client agent captures a portion of the screen relative to the position of the cursor. The portion of the screen may include a textual element having text, such as a telephone number or other contact information. The client agent calculates a desired or predetermined scanning area based on the default fonts and screen resolution as well as the cursor position. The client agent performs optical character recognition on the captured image to determine any recognized text. By performing pattern matching on the recognized text, the client agent determines if the text has a format or content matching a desired pattern, such as phone number. In response to determining the recognized text corresponds to a desired pattern, the client agent displays a user interface element on the screen near the recognized text. The user interface element may be displayed as an overlay or superimposed to the textual element such that it seamlessly appears integrated with the application. The user interface element is selectable to take an action associated with the recognized text.

...read moreread less

Journal Article•DOI•

Restoring 2D Content from Distorted Documents

[...]

Michael S. Brown¹, Mingxuan Sun², Ruigang Yang³, Lin Yun³, W.B. Seales³ - Show less +1 more•Institutions (3)

Nanyang Technological University¹, Georgia Institute of Technology², University of Kentucky³

01 Nov 2007-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a framework to restore the 2D content printed on documents in the presence of geometric distortion and nonuniform illumination, and assumes no parametric model of the document's surface and is suitable for arbitrary distortions.

...read moreread less

Abstract: This paper presents a framework to restore the 2D content printed on documents in the presence of geometric distortion and nonuniform illumination. Compared with text-based document imaging approaches that correct distortion to a level necessary to obtain sufficiently readable text or to facilitate optical character recognition (OCR), our work targets nontextual documents where the original printed content is desired. To achieve this goal, our framework acquires a 3D scan of the document's surface together with a high-resolution image. Conformal mapping is used to rectify geometric distortion by mapping the 3D surface back to a plane while minimizing angular distortion. This conformal "deskewing" assumes no parametric model of the document's surface and is suitable for arbitrary distortions. Illumination correction is performed by using the 3D shape to distinguish content gradient edges from illumination gradient edges in the high-resolution image. Integration is performed using only the content edges to obtain a reflectance image with significantly less illumination artifacts. This approach makes no assumptions about light sources and their positions. The results from the geometric and photometric correction are combined to produce the final output.

...read moreread less

Patent•

Text detection on mobile communications devices

[...]

Frank Siegemund¹•Institutions (1)

Microsoft¹

29 Oct 2007

TL;DR: In this article, a mobile communications device with an integrated camera is directed towards text and a video stream is analyzed in real time to detect one or more words in a specified region of the video frames and to indicate the detected words on a display.

...read moreread less

Abstract: A mobile communications device with an integrated camera is directed towards text. A video stream is analyzed in real time to detect one or more words in a specified region of the video frames and to indicate the detected words on a display. Users can select a word in a video stream and subsequently move or extend the initial selection. It is thus possible to select multiple words. A subregion of the video frame comprising the detected word(s) is pre-processed and compressed before being sent to a remote optical character recognition (OCR) function which may be integrated in an online service such as an online search service.

...read moreread less

Journal Article•

Urdu Nastaleeq Optical Character Recognition

[...]

Zaheer Ahmad, Jehanzeb Khan Orakzai, Inam Shamsher, Awais Adnan

20 Aug 2007-World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering

TL;DR: The characters recognition technique presented here is using the inherited complexity of Urdu script to solve the problem.

...read moreread less

Abstract: This paper discusses the Urdu script characteristics, Urdu Nastaleeq and a simple but a novel and robust technique to recognize the printed Urdu script without a lexicon. Urdu being a family of Arabic script is cursive and complex script in its nature, the main complexity of Urdu compound/connected text is not its connections but the forms/shapes the characters change when it is placed at initial, middle or at the end of a word. The characters recognition technique presented here is using the inherited complexity of Urdu script to solve the problem. A word is scanned and analyzed for the level of its complexity, the point where the level of complexity changes is marked for a character, segmented and feeded to Neural Networks. A prototype of the system has been tested on Urdu text and currently achieves 93.4% accuracy on the average.

...read moreread less

Proceedings Article•DOI•

Computer Assisted Transcription of Handwritten Text Images

[...]

Alejandro Héctor Toselli, Verónica Romero, Enrique Vidal, Luis Rodríguez

23 Sep 2007

TL;DR: A new interactive, on-line framework which, rather than full automation, aims at assisting the human in the proper recognition- transcription process; that is, facilitate and speed up their transcription task of handwritten texts.

...read moreread less

Abstract: To date, automatic handwriting recognition systems are far from being perfect and often they need a post editing where a human intervention is required to check and correct the results of such systems. We propose to have a new interactive, on-line framework which, rather than full automation, aims at assisting the human in the proper recognition- transcription process; that is, facilitate and speed up their transcription task of handwritten texts. This framework combines the efficiency of automatic handwriting recognition systems with the accuracy of the human transcriptor. The best result is a cost-effective perfect transcription of the handwriting text images.

...read moreread less

Journal Article•DOI•

A generalised framework for script identification

[...]

Gopal Datt Joshi¹, Saurabh Garg¹, Jayanthi Sivaswamy¹•Institutions (1)

International Institute of Information Technology¹

12 Nov 2007-International Journal on Document Analysis and Recognition

TL;DR: A generalised, hierarchical framework for script identification is proposed and a set of energy and intensity space features for this task are presented to establish the utility of a global approach to the classification of scripts.

...read moreread less

Abstract: Automatic identification of a script in a given document image facilitates many important applications such as automatic archiving of multilingual documents, searching online archives of document images and for the selection of script-specific OCR in a multi-lingual environment. In this paper, we model script identification as a texture classification problem and examine a global approach inspired by human visual perception. A generalised, hierarchical framework is proposed for script identification. A set of energy and intensity space features for this task is also presented. The framework serves to establish the utility of a global approach to the classification of scripts. The framework has been tested on two datasets: 10 Indian and 13 world scripts. The obtained accuracy of identification across the two datasets is above 94%. The results demonstrate that the framework can be used to develop solutions for script identification from document images across a large set of script classes.

...read moreread less

Book Chapter•DOI•

Natural Scene Text Understanding

[...]

Céline Mancas-Thillou, Bernard Gosselin

01 Jun 2007

TL;DR: Research on document image analysis entered a new era where breakthroughs are required: traditional document analysis systems fail against this new and promising acquisition mode and main differences and reasons of failures will be detailed in this section.

...read moreread less

Abstract: In a society driven by visual information and with the drastic expansion of low-priced cameras, vision techniques are more and more considered and text recognition is nowadays a fast changing field, which is included in a large spectrum, named text understanding. Previously, text recognition was dealing with documents only; those which were acquired with flatbed, sheet-fed or mounted imaging devices. Recently, handheld scanners such as pen-scanners appeared to acquire small parts of text on a fairly planar surface such as that of a business card. Issues having an impact on image processing are limited to sensor noise, skewed documents and inherent degradations to the document itself. Based on this classical acquisition method, optical character recognition (OCR) systems have been designed for many years to reach a high level of recognition with constrained documents, meaning those falling into traditional layout, with relatively clean backgrounds such as regular letters, forms, faxes, checks and so on and with a sufficient resolution (at least 300 dots per inch (dpi)). With the recent explosion of handheld imaging devices (HIDs), i.e. digital cameras, standalone or embedded in cellular phones or personal digital assistants (PDAs), research on document image analysis entered a new era where breakthroughs are required: traditional document analysis systems fail against this new and promising acquisition mode and main differences and reasons of failures will be detailed in this section. Small, light, and handy, these devices enable the removal of all constraints and all objects, such as natural scenes (NS) in different situations in streets, at home or in planes may be now acquired! Moreover, recent studies [Kim, 2005] announced a decline in scanner sales while projecting that sales of HIDs will keep increasing over the next 10 years.

...read moreread less

Journal Article•DOI•

Assessing Optical Music Recognition Tools

[...]

Pierfrancesco Bellini¹, Ivan Bruno¹, Paolo Nesi¹•Institutions (1)

University of Florence¹

01 Mar 2007-Computer Music Journal

TL;DR: Optical Music Recognition is typically used today to accelerate the conversion from image music sheets into a symbolic music representation that can be manipulated, thus creating new and revised music editions.

...read moreread less

Abstract: 68 Computer Music Journal As digitization and information technologies advance, document analysis and optical-characterrecognition technologies have become more widely used. Optical Music Recognition (OMR), also commonly known as OCR (Optical Character Recognition) for Music, was first attempted in the 1960s (Pruslin 1966). Standard OCR techniques cannot be used in music-score recognition, because music notation has a two-dimensional structure. In a staff, the horizontal position denotes different durations of notes, and the vertical position defines the height of the note (Roth 1994). Models for nonmusical OCR assessment have been proposed and largely used (Kanai et al. 1995; Ventzislav 2003). An ideal system that could reliably read and “understand” music notation could be used in music production for educational and entertainment applications. OMR is typically used today to accelerate the conversion from image music sheets into a symbolic music representation that can be manipulated, thus creating new and revised music editions. Other applications use OMR systems for educational purposes (e.g., IMUTUS; see www.exodus.gr/imutus), generating customized versions of music exercises. A different use involves the extraction of symbolic music representations to be used as incipits or as descriptors in music databases and related retrieval systems (Byrd 2001). OMR systems can be classified on the basis of the granularity chosen to recognize the music score’s symbols. The architecture of an OMR system is tightly related to the methods used for symbol extraction, segmentation, and recognition. Generally, the music-notation recognition process can be divided into four main phases: (1) the segmentation of the score image to detect and extract symbols; (2) the recognition of symbols; (3) the reconstruction of music information; and (4) the construction of the symbolic music notation model to represent the information (Bellini, Bruno, and Nesi 2004). Music notation may present very complex constructs and several styles. This problem has been recently addressed by the MUSICNETWORK and Motion Picture Experts Group (MPEG) in their work on Symbolic Music Representation (www .interactivemusicnetwork.org/mpeg-ahg). Many music-notation symbols exist, and they can be combined in different ways to realize several complex configurations, often without using well-defined formatting rules (Ross 1970; Heussenstamm 1987). Despite various research systems for OMR (e.g., Prerau 1970; Tojo and Aoyama 1982; Rumelhart, Hinton, and McClelland 1986; Fujinaga 1988, 1996; Carter 1989, 1994; Kato and Inokuchi 1990; Kobayakawa 1993; Selfridge-Field 1993; Ng and Boyle 1994, 1996; Couasnon and Camillerapp 1995; Bainbridge and Bell 1996, 2003; Modayur 1996; Cooper, Ng, and Boyle 1997; Bellini and Nesi 2001; McPherson 2002; Bruno 2003; Byrd 2006) as well as commercially available products, optical music recognition—and more generally speaking, music recognition—is a research field affected by many open problems. The meaning of “music recognition” changes depending on the kind of applications and goals (Blostein and Carter 1992): audio generation from a musical score, music indexing and searching in a library database, music analysis, automatic transcription of a music score into parts, transcoding a score into interchange data formats, etc. For such applications, we must employ common tools to provide answers to questions such as “What does a particular percentagerecognition rate that is claimed by this particular algorithm really mean?” and “May I invoke a common methodology to compare different OMR tools on the basis of my music?” As mentioned in Blostein and Carter (1992) and Miyao and Haralick (2000), there is no standard for expressing the results of the OMR process. Assessing Optical Music Recognition Tools

...read moreread less

Patent•

Decision criteria for automated form population

[...]

Sebastien Dabet¹, Marco Bressan¹, Herve Poirier¹•Institutions (1)

Xerox¹

26 Apr 2007

TL;DR: In this article, a method for selecting fields of an electronic form for automatic population with candidate text segments is presented, which includes, for each of a plurality of fields of the form, computing a field exclusion function based on at least one parameter selected from a text length parameter, an optical character recognition error rate, a tagging error rate and a field relevance parameter.

...read moreread less

Abstract: A method is provided for selecting fields of an electronic form for automatic population with candidate text segments. The candidate text segments can be obtained by capturing an image of a document, applying optical character recognition to the captured image to identify textual content, and tagging candidate text segments in the textual content for fields of the form. The method includes, for each of a plurality of fields of the form, computing a field exclusion function based on at least one parameter selected from a text length parameter, an optical character recognition error rate, a tagging error rate, and a field relevance parameter; and determining whether to select the field for automatic population based on the computed field exclusion function.

...read moreread less

Journal Article•DOI•

Text segmentation in color images using tensor voting

[...]

Jaeguyn Lim¹, Jonghyun Park¹, Gerard Medioni¹•Institutions (1)

University of Southern California¹

01 May 2007-Image and Vision Computing

TL;DR: This work proposes a method to drastically improve segmentation using tensor voting as the main filtering step, and first decompose an image into chromatic and achromatic regions, then identifies text layers using Tensor voting, and removes noise using adaptive median filter iteratively.

...read moreread less

Patent•

One-screen reconciliation of business document image data, optical character recognition extracted data, and enterprise resource planning data

[...]

Jean-Jacques Berard, Nicolas Perotin

03 Oct 2007

TL;DR: In this article, a business document is scanned to create an imaged document and a set of extracted data is extracted from the business document image via optical character recognition (OCR) and compared with data in business information management or enterprise resource planning (ERP) system.

...read moreread less

Abstract: Systems and methods of reconciling data from an imaged document. In one embodiment, a business document is scanned to create a business document image. A set of extracted data is extracted from the business document image via optical character recognition (OCR). The set of OCR extracted data is then compared with data in business information management or enterprise resource planning (ERP) system. A set of ERP data is retrieved from the ERP system that relates to the set of OCR extracted data. The retrieved ERP data is than assigned to the set of OCR extracted data to create a set of assigned data. The business document image is then displayed in a business document image pane, the set of OCR extracted data is displayed in the OCR data pane, and the retrieved ERP data is displayed in the ERP data pane. The set of assigned data is validated, and the ERP system is updated with the set of validated, assigned data. In other embodiments, data is extracted from text files without using OCR.

...read moreread less

Book•

Digital document processing : major directions and recent advances

[...]

Bidyut B. Chaudhuri

01 Jan 2007

TL;DR: This book discusses OCR Technologies for Machine Printed and Hand Printed Japanese Text, Meta-Data Extraction from Bibliographic Documents for the Digital Library, and Biometric and Forensic Aspects of Digital Document Processing.

...read moreread less

Abstract: Reading Systems: An Introduction to Digital Document Processing.- Document Structure and Layout Analysis.- OCR Technologies for Machine Printed and Hand Printed Japanese Text.- Multi-Font Printed Tibetan OCR.- On OCR of a Printed Indian Script.- A Bayesian Network Approach for On-line Handwriting Recognition.- New Advances and New Challenges in On-Line Handwriting Recognition and Electronic Ink Management.- Off-Line Roman Cursive Handwriting Recognition.- Robustness Design of Industrial Strength Recognition Systems.- Arabic Cheque Processing System: Issues and Future Trends.- OCR of Printed Mathematical Expressions.- The State of the Art of Document Image Degradation Modelling.- Advances in Graphics Recognition.- An Introduction to Super-Resolution Text.- Meta-Data Extraction from Bibliographic Documents for the Digital Library.- Document Information Retrieval.- Biometric and Forensic Aspects of Digital Document Processing.- Web Document Analysis.- Semantic Structure Analysis of Web Documents.- Bank Cheque Data Mining: Integrated Cheque Recognition Technologies.

...read moreread less

Journal Article•

OCR For Printed Urdu Script Using Feed Forward Neural Network

[...]

Inam Shamsher, Zaheer Ahmad, Jehanzeb Khan Orakzai, Awais Adnan

22 Oct 2007-World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering

TL;DR: In this article, an optical character recognition system for printed Urdu, a popular Pakistani/Indian script and is the third largest understandable language in the world, especially in the subcontinent but fewer efforts are made to make it understandable to computers.

...read moreread less

Abstract: This paper deals with an Optical Character Recognition system for printed Urdu, a popular Pakistani/Indian script and is the third largest understandable language in the world, especially in the subcontinent but fewer efforts are made to make it understandable to computers. Lot of work has been done in the field of literature and Islamic studies in Urdu, which has to be computerized. In the proposed system individual characters are recognized using our own proposed method/ algorithms. The feature detection methods are simple and robust. Supervised learning is used to train the feed forward neural network. A prototype of the system has been tested on printed Urdu characters and currently achieves 98.3% character level accuracy on average .Although the system is script/ language independent but we have designed it for Urdu characters only.

...read moreread less

Proceedings Article•DOI•

Tamper-proofing of Electronic and Printed Text Documents via Robust Hashing and Data-Hiding

[...]

Renato Villán¹, Sviatoslav Voloshynovskiy¹, Oleksiy Koval¹, Frédéric Deguillaume¹, Thierry Pun¹ - Show less +1 more•Institutions (1)

University of Geneva¹

01 Mar 2007

TL;DR: The combination of robust text hashing and text data-hiding technologies as an efficient solution to the problem of authentication and tamper-proofing of text documents that can be distributed in electronic or printed forms is advocated.

...read moreread less

Abstract: In this paper, we deal with the problem of authentication and tamper-proofing of text documents that can be distributed in electronic or printed forms. We advocate the combination of robust text hashing and text data-hiding technologies as an efficient solution to this problem. First, we consider the problem of text data-hiding in the scope of the Gel'fand-Pinsker data-hiding framework. For illustration, two modern text data-hiding methods, namely color index modulation (CIM) and location index modulation (LIM), are explained. Second, we study two approaches to robust text hashing that are well suited for the considered problem. In particular, both approaches are compatible with CIM and LIM. The first approach makes use of optical character recognition (OCR) and a classical cryptographic message authentication code (MAC). The second approach is new and can be used in some scenarios where OCR does not produce consistent results. The experimental work compares both approaches and shows their robustness against typical intentional/unintentional document distortions including electronic format conversion, printing, scanning, photocopying, and faxing.

...read moreread less

Patent•

Methods and systems for recycling and re-use of manufactured items

[...]

Rutherford Peter Bruce Browne

04 Apr 2007

TL;DR: In this paper, the use of an information medium such as a Barcode, Radio Frequency Identification Device (RFID) tag or other machine readable medium, such as may be readable by Optical Character Recognition (OCR), to identify a manufactured item.

...read moreread less

Abstract: The described embodiments involve the use of an information medium, such as a Barcode, Radio Frequency Identification Device (RFID) tag or other machine readable medium, such as may be readable by Optical Character Recognition (OCR), to identify a manufactured item. Alternatively, a Machine Vision Identification System (MVIS) may be used to facilitate the identification of the item. Once an item is identified, the materials and subassemblies it is made of may also be identified and used to facilitate its repair, replacement, refurbishment, remarketing and recycling by either a human or robotic device or combination thereof.

...read moreread less

Collapse