Proceedings Article

Handwritten Syriac character recognition using order structure invariance

23 Aug 2004 - Vol. 2, pp. 562-565
TL;DR: Experiments indicate that the method tolerates the variations found in real handwriting while having high discrimination power, and is potentially applicable to cursive scripts similar to Syriac such as Aramaic and Arabic.
Abstract: This paper demonstrates how order structure invariance may be used for matching handwritten characters in which non-rigid deformation is tolerated and a training set is not required. We make two contributions. First we show how to define and use order structure invariants together with local object features. Second, we show how recognition by alignment can be accomplished on a sparse set of points by matching order structure invariants. We evaluate the method on the problem of recognising handwritten Syriac characters obtained from historical documents. Experiments indicate that the method tolerates the variations found in real handwriting while having high discrimination power. The method is potentially applicable to cursive scripts similar to Syriac such as Aramaic and Arabic.
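The abstract does not spell out how an order structure invariant is computed. As a rough illustration only, the sketch below assumes one common formalisation: the order type of a point set, meaning the clockwise or counter-clockwise orientation of every triple of points. The function names (`orientation`, `order_type`, `order_mismatch`) and the mismatch score are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def orientation(p, q, r):
    """Return +1 if p, q, r turn counter-clockwise, -1 if clockwise, 0 if collinear."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (cross > 0) - (cross < 0)


def order_type(points):
    """Orientation sign of every point triple, in a fixed index order (assumed invariant)."""
    indices = range(len(points))
    return tuple(
        orientation(points[i], points[j], points[k])
        for i, j, k in combinations(indices, 3)
    )


def order_mismatch(points_a, points_b):
    """Count triples whose orientation differs between two corresponding point sets."""
    type_a, type_b = order_type(points_a), order_type(points_b)
    if len(type_a) != len(type_b):
        raise ValueError("point sets must have the same number of points")
    return sum(a != b for a, b in zip(type_a, type_b))
```

Under this assumption, a smooth deformation that moves points without flipping any triple leaves the score at zero, whereas swapping two points changes some orientations. That is the sense in which such a signature can tolerate non-rigid variation while still discriminating between shapes.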
Citations
Proceedings Article
22 Aug 2015
TL;DR: Details on the collection of securely dated letter samples from Syriac documents dating between 500 and 1100 CE are given and automatic techniques used to process the initial human input so as to produce high-quality segmented character samples ready for analysis are described.
Abstract: Paleographers study ancient and historical handwriting in order to learn more about documents of significant interest and their creators. Computational tools and methods can aid this task in numerous ways, particularly for languages and scripts that are not widely known today. One project currently underway seeks to gather a collection of securely dated letter samples from Syriac documents dating between 500 and 1100 CE. The set comprises over 60,000 human-selected character samples. This paper gives details on the collection and describes the automatic techniques used to process the initial human input so as to produce high-quality segmented character samples ready for analysis.

12 citations

Proceedings Article
05 Jun 2012
TL;DR: This work investigates the special challenge of the graffiti image retrieval problem and proposes a series of novel techniques to overcome the challenges and shows that the proposed bounding box framework outperforms the traditional image retrieval framework with better retrieval results and improved computational efficiency.
Abstract: Research of graffiti character recognition and retrieval, as a branch of traditional optical character recognition (OCR), has started to gain attention in recent years. We have investigated the special challenge of the graffiti image retrieval problem and propose a series of novel techniques to overcome the challenges. The proposed bounding box framework locates the character components in the graffiti images to construct meaningful character strings and conduct image-wise and semantic-wise retrieval on the strings rather than the entire image. Using real world data provided by the law enforcement community to the Pacific Northwest National Laboratory, we show that the proposed framework outperforms the traditional image retrieval framework with better retrieval results and improved computational efficiency.

10 citations


Cites background from "Handwritten Syriac character recogn..."

  • ...and match characters based on their shape and structure information, such as skeleton feature [11, 13], shape context [1], and order structure invariance [3]....

Proceedings Article
16 Sep 2011
TL;DR: A novel algorithm for offline handwriting style identification and document retrieval is developed, using a feature vector based upon the estimated affine transformation of actual observed characters, character parts, and voids within characters as compared to a hypothetical average or ideal form.
Abstract: Thousands of documents written in Syriac script by early Christian theologians are of unknown provenance and uncertain date, partly due to a shortage of human expertise. This paper addresses the problem of attribution by developing a novel algorithm for offline handwriting style identification and document retrieval, demonstrated on a set of documents in the Estrangelo variant of Syriac writing. The method employs a feature vector based upon the estimated affine transformation of actual observed characters, character parts, and voids within characters as compared to a hypothetical average or ideal form. Experiments on seventy-six pages from nineteen Syriac manuscripts written by different scribes show that the method can identify pages written in the same hand with high precision, even with documents that exhibit various challenging forms of degradation.

9 citations


Cites background or methods from "Handwritten Syriac character recogn..."

  • ...Clocksin’s studies of handwriting recognition on Estrangelo texts [6, 5]....

  • ...Automatic identification should be feasible [2, 5], but is left as future work....

  • ...Most obviously, the selection of character samples should be automated, perhaps using Clocksin’s method [5] or some other....

Proceedings Article
07 Nov 2009
TL;DR: This paper presents a method to assist the indexation of digitized Syriac manuscripts based on a word spotting approach that should locate all the occurrences of a certain query word image.
Abstract: This paper presents a method to assist the indexation of digitized Syriac manuscripts. Syriac belongs to the Aramaic branch of Semitic languages; it is written from right to left and intentionally tilted by an angle of approximately 45°. The proposed method is based on a word spotting approach that should locate all the occurrences of a certain query word image. The method is based on a selective sliding window technique from which directional features are extracted. Matching between features is done using Euclidean distance correspondence. The proposed method does not require any prior information and is fully independent of a word-to-character segmentation algorithm, which would be extremely difficult to realize due to the tilted nature of the handwriting.
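The general idea of sliding-window word spotting with Euclidean matching can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain (not "selective") sliding window, takes a magnitude-weighted histogram of gradient orientations as a stand-in for the unspecified directional features, and assumes the text line image has the same height as the query. The names `directional_features` and `spot_word` are hypothetical.

```python
import numpy as np


def directional_features(window, bins=8):
    """Normalised histogram of gradient orientations (an assumed feature choice)."""
    gy, gx = np.gradient(window.astype(float))
    magnitude = np.hypot(gx, gy)
    # Fold angles into [0, pi] so opposite gradient directions share a bin.
    orientation = np.arctan2(gy, gx) % np.pi
    hist, _ = np.histogram(
        orientation, bins=bins, range=(0.0, np.pi), weights=magnitude
    )
    total = hist.sum()
    return hist / total if total > 0 else hist


def spot_word(line_image, query, step=1):
    """Slide a query-sized window along a text line; return (x, distance) by best match."""
    height, width = query.shape
    if line_image.shape[0] != height:
        raise ValueError("line image and query must have the same height")
    target = directional_features(query)
    hits = []
    for x in range(0, line_image.shape[1] - width + 1, step):
        features = directional_features(line_image[:, x:x + width])
        # Euclidean distance between the window's and the query's feature vectors.
        hits.append((x, float(np.linalg.norm(features - target))))
    return sorted(hits, key=lambda hit: hit[1])
```

Calling `spot_word(line_image, query)` returns candidate positions ordered from closest to farthest; a threshold on the distance, which the abstract does not specify, would decide which candidates count as occurrences.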

7 citations


Cites background from "Handwritten Syriac character recogn..."

  • ...Besides the works of William Clocksin [1], and [2], and some works of our own [3], no work has been published on Syriac manuscripts analysis and recognition, supporting the idea that the community of researchers working on this type of documents is very restricted....

  • ...In some cases, a false occurrence may be a derivative of the query word, since almost all grammatical functions in Syriac are written as prefixes and suffixes not as separate words [1], and [2]....

Proceedings Article
18 Jun 2008
TL;DR: A method is developed that should find all occurrences of a certain query word image, using a selective sliding window technique from which directional features are extracted and then matched by Euclidean distance correspondence.
Abstract: This paper presents a contribution to word spotting applied to digitized Syriac manuscripts. The Syriac language was wrongfully accused of being a dead language and has been set aside by the domain of handwriting recognition. Yet it is a very fascinating handwriting that combines the word structure and calligraphy of Arabic handwriting with the particularity of being intentionally written tilted by an angle of approximately 45°. For the spotting process, we developed a method that should find all occurrences of a certain query word image, based on a selective sliding window technique, from which we extract directional features and afterwards perform a matching using Euclidean distance correspondence between features. The proposed method does not require any prior information and does not depend on a word-to-character segmentation algorithm, which would be extremely complex to realize due to the tilted nature of the handwriting.

7 citations

References
Journal Article
TL;DR: An algorithm--the TPS-RPM algorithm--with the thin-plate spline (TPS) as the parameterization of the non-rigid spatial mapping and the softassign for the correspondence is developed.

1,633 citations


"Handwritten Syriac character recogn..." refers methods in this paper

  • ...Affine [10], projective [11], and non-rigid 2D cubic spline [6] transforms have been used....

Book Chapter
01 Jan 2000
TL;DR: A recent user survey about cognition aspects of image retrieval shows that users are more interested in retrieval by shape than by color and texture, and a system such as IBM’s Query By Image Content (QBIC) is relatively successful in retrieving by color and texture, but performs poorly when searching on shape.
Abstract: Large image databases are used in an extraordinary number of multimedia applications in fields such as entertainment, business, art, engineering, and science. Retrieving images by their content, as opposed to external features, has become an important operation. A fundamental ingredient for content-based image retrieval is the technique used for comparing images. There are two general methods for image comparison: intensity based (color and texture) and geometry based (shape). A recent user survey about cognition aspects of image retrieval shows that users are more interested in retrieval by shape than by color and texture [62]. However, retrieval by shape is still considered one of the most difficult aspects of content-based search. Indeed, a system such as IBM’s Query By Image Content, QBIC [57], perhaps one of the most advanced image retrieval systems to date, is relatively successful in retrieving by color and texture, but performs poorly when searching on shape. A similar behavior is exhibited in the new Alta Vista photo finder [10].

636 citations


Additional excerpts

  • ...Representing the natural deformations of shapes is an important problem in shape recognition [16], and many different approaches are found in the literature including point distribution models [8], the finite element method [13], elastic templates [9], nonlinear active models [15] and mixture models [1]....

Journal Article
TL;DR: A fundamental open problem in computer vision—determining pose and correspondence between two sets of points in space—is solved with a novel, fast, robust and easily implementable algorithm using a combination of optimization techniques.

532 citations


"Handwritten Syriac character recogn..." refers methods in this paper

  • ...Affine [10], projective [11], and non-rigid 2D cubic spline [6] transforms have been used....

Journal Article
TL;DR: Improved formulation of modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information, and is applicable to data of any dimensionality.
Abstract: Modal matching is a new method for establishing correspondences and computing canonical descriptions. The method is based on the idea of describing objects in terms of generalized symmetries, as defined by each object's eigenmodes. The resulting modal description is used for object recognition and categorization, where shape similarities are expressed as the amounts of modal deformation energy needed to align the two objects. In general, modes provide a global-to-local ordering of shape deformation and thus allow for selecting which types of deformations are used in object alignment and comparison. In contrast to previous techniques, which required correspondence to be computed with an initial or prototype shape, modal matching utilizes a new type of finite element formulation that allows for an object's eigenmodes to be computed directly from available image information. This improved formulation provides greater generality and accuracy, and is applicable to data of any dimensionality. Correspondence results with 2D contour and point feature data are shown, and recognition experiments with 2D images of hand tools and airplanes are described.

527 citations

Journal Article
TL;DR: This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.
Abstract: Off-line recognition requires transferring the text under consideration into an image file. This represents the only available solution to bring the printed materials to the electronic media. However, the transferring process causes the system to lose the temporal information of that text. Other complexities that an off-line recognition system has to deal with are the lower resolution of the document and the poor binarisation, which can contribute to readability when essential features of the characters are deleted or obscured. Recognising Arabic script presents two additional challenges: orthography is cursive and letter shape is context sensitive. Certain character combinations form new ligature shapes, which are often font-dependent. Some ligatures involve vertical stacking of characters. Since not all letters connect, word boundary location becomes an interesting problem, as spacing may separate not only words, but also certain characters within a word. Various techniques have been implemented to achieve high recognition rates. These techniques have tackled different aspects of the recognition system. This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.

207 citations


"Handwritten Syriac character recogn..." refers background in this paper

  • ...From a character recognition perspective, Syriac is similar to Arabic, and the existing research in Arabic character recognition has been comprehensively surveyed [12] recently....
