Open Access Proceedings Article

A segmentation method for touching italic characters

TLDR
This work uses slant projection and contour analysis to find candidate segmentation points, then a shortest path approach to locate the cut path of each candidate accurately, improving the recognition rate on italic fonts.
Abstract
Segmentation is an essential part of a recognition system. Touching characters are difficult to handle, especially in italic fonts. We present a method for accurately segmenting touching italic characters. It requires no slant correction, so no extra noise is introduced. We use slant projection and contour analysis to find candidate segmentation points. The shortest path approach is then adopted to accurately locate the cut path of each candidate segmentation point. Using dynamic programming, we find the best segmentation result from those cut paths. With this method, we can improve the recognition rate on italic fonts.
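The two central steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the binary image layout, the fixed slant angle, the cost values, and the names `slant_projection` and `cut_path` are all assumptions made for illustration. The first function counts ink pixels along slanted lines instead of vertical ones, so the image itself is never sheared. The second finds a low-cost top-to-bottom path from a candidate point, where crossing ink is penalised.

```python
import math


def slant_projection(img, slant_deg):
    """Count ink pixels along lines leaning right by slant_deg from vertical.

    img is a list of rows (row 0 is the top); 1 = ink, 0 = background.
    Assumes slant_deg >= 0. Bin i collects the slanted line that meets the
    bottom row at column i - max_shift.
    """
    h, w = len(img), len(img[0])
    shear = math.tan(math.radians(slant_deg))
    max_shift = int(round(shear * (h - 1)))
    hist = [0] * (w + max_shift)
    for y, row in enumerate(img):
        # A pixel higher up lies further right on the same slanted line,
        # so shift it left by an amount proportional to its height.
        shift = int(round(shear * (h - 1 - y)))
        for x, value in enumerate(row):
            if value:
                hist[x - shift + max_shift] += 1
    return hist


def cut_path(img, start_x, ink_cost=10):
    """Cheapest top-to-bottom path starting at column start_x.

    Each step moves one row down and at most one column sideways.
    A step costs 1, plus ink_cost when it lands on an ink pixel
    (hypothetical costs). Returns (column per row, total cost).
    """
    h, w = len(img), len(img[0])
    inf = float("inf")
    cost = [[inf] * w for _ in range(h)]
    back = [[0] * w for _ in range(h)]
    cost[0][start_x] = ink_cost * img[0][start_x]
    for y in range(1, h):
        for x in range(w):
            # Try the straight move first so ties prefer a vertical cut.
            for px in (x, x - 1, x + 1):
                if 0 <= px < w and cost[y - 1][px] < inf:
                    c = cost[y - 1][px] + 1 + ink_cost * img[y][x]
                    if c < cost[y][x]:
                        cost[y][x] = c
                        back[y][x] = px
    end = min(range(w), key=lambda x: cost[h - 1][x])
    path = [end]
    for y in range(h - 1, 0, -1):
        path.append(back[y][path[-1]])
    path.reverse()
    return path, cost[h - 1][end]


# A 45-degree stroke: spread over three columns vertically,
# but concentrated in a single bin of the slant projection.
stroke = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
print(slant_projection(stroke, 0))
print(slant_projection(stroke, 45))

# Two blobs separated by a gap that drifts left towards the bottom.
touching = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 1],
    [1, 0, 1, 1, 1],
]
print(cut_path(touching, 2))
```

In a sketch like this, valleys in the slant projection would supply the candidate points, and each candidate's cut path and cost could then feed the dynamic programming step that selects the final segmentation.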

