Showing papers on "Devanagari published in 2014"


Proceedings ArticleDOI
05 Dec 2014
TL;DR: This paper describes the submission for FIRE 2014 Shared Task on Transliterated Search, which features two sub-tasks: Query word labeling and Mixed-script Ad hoc retrieval for Hindi Song Lyrics.
Abstract: This paper describes our submission for FIRE 2014 Shared Task on Transliterated Search. The shared task features two sub-tasks: Query word labeling and Mixed-script Ad hoc retrieval for Hindi Song Lyrics. Query Word Labeling is on token level language identification of query words in code-mixed queries and back-transliteration of identified Indian language words into their native scripts. We have developed letter based language models for the token level language identification of query words and a structured perceptron model for back-transliteration of Indic words. The second subtask for Mixed-script Ad hoc retrieval for Hindi Song Lyrics is to retrieve a ranked list of songs from a corpus of Hindi song lyrics given an input query in Devanagari or transliterated Roman script. We have used edit distance based query expansion and language modeling followed by relevance based reranking for the retrieval of relevant Hindi Song lyrics for a given query.

71 citations
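
The edit distance based query expansion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy vocabulary, and the distance threshold of 1 are all assumptions made for illustration, and the language modeling and reranking stages are not shown.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def expand_query(term, vocabulary, max_distance=1):
    """Return vocabulary words within max_distance edits of term, closest first."""
    scored = sorted((edit_distance(term, word), word) for word in vocabulary)
    return [word for dist, word in scored if dist <= max_distance]


# Hypothetical Roman-script spellings as they might appear in a lyrics corpus.
vocabulary = ["pyar", "pyaar", "piyar", "dil", "dill", "zindagi"]
print(expand_query("pyar", vocabulary))
```

A small threshold keeps the expansion limited to close spelling variants; larger thresholds admit more variants at the cost of more unrelated words.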


Journal ArticleDOI
TL;DR: This paper evaluated the results by dividing the speech sample into segments and used zero crossing rate and energy calculations to separate the voiced and unvoiced parts of speech; the results suggest that zero crossing rates are low for voiced parts and high for unvoiced parts.
Abstract: In speech analysis, the voiced-unvoiced decision is usually performed when extracting information from speech signals. In this paper, we applied two methods to separate the voiced and unvoiced parts of a speech signal: zero crossing rate (ZCR) and energy. We evaluated the results by dividing the speech sample into segments and using the zero crossing rate and energy calculations to separate the voiced and unvoiced parts of speech. The results suggest that zero crossing rates are low for voiced parts and high for unvoiced parts, whereas the energy is high for voiced parts and low for unvoiced parts. These methods therefore proved effective in separating voiced and unvoiced speech.

41 citations
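
As a rough sketch of the two measures described in the abstract above, the following computes the zero crossing rate and mean energy per frame and labels a frame as voiced when the ZCR is low and the energy is high. The frame length, the two thresholds, and the synthetic test signal are assumptions chosen for illustration; the paper's actual settings are not given in the abstract.

```python
import math


def frame_features(samples, frame_len):
    """Return a list of (zero_crossing_rate, mean_energy), one per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # A zero crossing is a sign change between consecutive samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        energy = sum(x * x for x in frame) / frame_len
        feats.append((zcr, energy))
    return feats


def is_voiced(zcr, energy, zcr_max=0.1, energy_min=0.05):
    """Assumed decision rule: low ZCR and high energy indicate a voiced frame."""
    return zcr < zcr_max and energy > energy_min


# Synthetic stand-ins: a strong 100 Hz sine (voiced-like) followed by a weak,
# rapidly alternating signal (unvoiced-like), both at an assumed 8 kHz rate.
rate = 8000
voiced_like = [math.sin(2 * math.pi * 100 * n / rate) for n in range(400)]
unvoiced_like = [0.05 * (1 if n % 2 == 0 else -1) for n in range(400)]

features = frame_features(voiced_like + unvoiced_like, frame_len=400)
labels = [is_voiced(zcr, energy) for zcr, energy in features]
print(labels)
```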


Journal ArticleDOI
TL;DR: A statistical script independent line based word spotting framework for offline handwritten documents based on Hidden Markov Models and an exhaustive study of filler models and background models for better representation of background or non-keyword text.

38 citations


Book ChapterDOI
01 Jan 2014
TL;DR: From the experimental results, it has been found that the methodology provides higher recognition accuracies with lesser or equal numbers of features selected for each dataset.
Abstract: A new feature selection methodology on the basis of features’ combined class separability power, using the framework of Axiomatic Fuzzy Set (AFS) theory, has been proposed here. The AFS theory provides the rules for logic operations needed to interpret the combinations of features from the fuzzy feature set. Based on these combinational rules, class separability power of the combined features is determined and subsequently the most powerful subset of the feature set is selected. The performance of this methodology is evaluated for recognition of handwritten numerals of five popular Indic scripts, viz. Bangla, Devanagari, Roman, Telugu and Arabic, with SVM based classifier using gradient based directional feature set and quad-tree based longest-run feature set separately, and compared with six widely used feature selection techniques. From the experimental results, it has been found that the methodology provides higher recognition accuracies with lesser or equal numbers of features selected for each dataset.

32 citations


Journal ArticleDOI
TL;DR: The proposed classification system preprocesses and normalizes the 27000 handwritten character images into 30x30 pixel images, divides them into zones, and produces three classes depending on presence or absence of a vertical bar.
Abstract: Compound character recognition of Devanagari script is one of the challenging tasks since the characters are complex in structure and can be modified by writing a combination of two or more characters. These compound characters make up 12 to 15% of the Devanagari script. Moment based techniques are being successfully applied to several image processing problems and represent a fundamental tool to generate feature descriptors; the Zernike moment technique has a rotation invariance property which is found to be desirable for handwritten character recognition. This paper discusses extraction of features from handwritten compound characters using the Zernike moment feature descriptor and proposes an SVM and k-NN based classification system. The proposed classification system preprocesses and normalizes the 27000 handwritten character images into 30x30 pixel images and divides them into zones. The pre-classification produces three classes depending on presence or absence of a vertical bar. Further, Zernike moment feature extraction is performed on each zone. The overall recognition rate of the proposed system using the SVM and k-NN classifiers is up to 98.37% and 95.82% respectively.

26 citations


Proceedings ArticleDOI
01 Oct 2014
TL;DR: A new strategy for the segmentation of conjuncts, and overlapping characters in Devanagari script on Hindi language is shown, focused around Cluster Detection technique and gives 95% correctness for segmenting touching, conjunct characters and 88% effectiveness for overlapping characters.
Abstract: Optical Character Recognition alludes to the methodology of taking images or photos of letters or typewritten content and changing over them into information that a machine can easily interpret, e.g. organizations and libraries taking physical duplicates of books, magazines, or other old printed material and utilizing OCR to put them into computers. Segmentation is the indispensable and most difficult part of OCR process, and it gets to be additionally difficult with handwritten text due to varieties in writing styles and presence of abnormalities. This paper shows a new strategy for the segmentation of conjuncts, and overlapping characters in Devanagari script on Hindi language. The proposed algorithm is focused around Cluster Detection technique and gives 95% correctness for segmenting touching, conjunct characters and 88% effectiveness for overlapping characters.

22 citations


Proceedings ArticleDOI
15 Dec 2014
TL;DR: The recent study of a novel combination of two feature vectors for holistic recognition of offline handwritten word images shows sharp improvement in recognition accuracy over the use of any of the individual feature representation schemes.
Abstract: In this article, we describe our recent study of a novel combination of two feature vectors for holistic recognition of offline handwritten word images. In the literature, both contour and skeleton based feature representations have been studied for offline handwriting recognition purpose. However, to the best of our knowledge, there is no such study in which combination of the two feature representations have been considered for the purpose. In the proposed recognition scheme, we use multiclass SVM as the classifier. We have implemented the proposed approach for holistic recognition of Devanagari handwritten town names and tested its performance on a large handwritten word sample database of 100 Indian town names written in Devanagari. Experimental results show sharp improvement in recognition accuracy over the use of any of the individual feature representation schemes. The proposed approach is script independent and can be used for development of a holistic handwritten word image recognition of any script.

20 citations


Proceedings ArticleDOI
03 Apr 2014
TL;DR: This work has presented a method to recognize the handwritten Marathi numerals using multilayer feed-forward neural network, and the overall recognition rate is 97%.
Abstract: Marathi is one of the ancient Indian languages, mainly spoken in the state of Maharashtra. Marathi is written in the Devanagari script, and its literals and numerals are almost the same as those of Hindi. Recognition of handwritten Marathi numerals is quite a challenging task because people write these numerals in varying ways. In this work we have presented a method to recognize the handwritten Marathi numerals using multilayer feed-forward neural network. The scanned document image is pre-processed to eliminate the noise and care is taken to link the broken characters. Each numeral is segmented from the document and it is resized to 7 × 5 pixels using cubic interpolation. While resizing, a technique is used to provide better representation for every pixel in the segmented numeral. This resized numeral is converted into a vector with 35 values before inputting it to the neural network. We have used 100 sets containing 1000 numerals for this experimentation, of which 50 sets are used for training the network and 50 sets for the testing purpose. The overall recognition rate of the proposed method is 97%.

16 citations


Proceedings ArticleDOI
01 Dec 2014
TL;DR: A novel approach for Devanagari text extraction from natural scene images using mathematical morphological operations to extract the headlines and the effectiveness of the adaptive thresholding approach was observed.
Abstract: In scenic images, information in the form of text provides vital clues for most applications based on image processing. These include assisted navigation, content based image retrieval, automatic geocoding and understanding the scene. But in a multicolored complex background, it is quite a daunting task to locate the text. This task is daunting because of non-uniformity in illumination, complexity of the backdrop, and differences in the size, font and line orientation of the text. We propose a novel approach for Devanagari text extraction from natural scene images in this paper. We can use a text-to-speech engine or Optical Character Reader to recognize the extracted text. The basis of our scheme is to analyze the connected components (CCs). This is done to extract Devanagari text from scenic images captured by camera. The presence of the head line is unique to this script. Our scheme makes use of mathematical morphological operations to extract the headlines. The binarization of scenic images was also studied, and the effectiveness of the adaptive thresholding approach was observed. The algorithm was tested on Devanagari text contained within a collection of 100 scenic images.

15 citations


Journal ArticleDOI
TL;DR: In this paper, the authors follow the development of these scripts, demonstrating how they gave rise to the new scripts in South India, Indonesia and the Philippines, and demonstrate the basic relationships between these scripts with cursory descriptions of their structural correspondences.
Abstract: Several scripts in northern and southern India, Indonesia and the Philippines developed from informal varieties of Devanagari restricted to intimate, shorthand-like uses by members of mercantile occupations. The mercantile varieties took a characteristic quasi-abjad form with postconsonantal vowels unspelt. This paper follows the development of these scripts, demonstrating how they gave rise to the new scripts in South India, Indonesia and the Philippines. The basic relationships between these scripts are demonstrated with cursory descriptions of their structural correspondences, followed by a discussion for each of the ways the orthographic system changed back to a more classic abugida as a result of borrowing from prestige contact scripts or innovations in the use of existing resources. In addition to these more typical phenomena, we describe some quirky spelling conventions in Sumatran, Sulawesi and Philippine scripts, tracing them to practices used to teach combinations of vowel and coda signs on cons...

14 citations


Proceedings ArticleDOI
01 Dec 2014
TL;DR: A word and character segmentation method for machine printed Devanagari text is proposed, applying basic morphological operations to the scanned document images, with improved results reported.
Abstract: Finding Structural Layout, Text Line Segmentation, Word Level Segmentation and Character Level Segmentation are major steps in offline OCR systems for Devanagari Script in Document Image Processing. This paper proposes a Word and Character Segmentation method for machine printed Devanagari text. A complete word and character segmentation system for Devanagari printed text is presented here. Sometimes, interline space and fused characters make line segmentation and character segmentation, respectively, a difficult task. We have tested our method on documents in Marathi script. A novel technique of character segmentation for printed Devanagari text is presented here. After removing the Shirorekha (header line) of Devanagari text, bounding boxes are used to surround the segmented characters. Results obtained from this method are encouraging because of morphological operations. In this method we propose some basic morphological operations on the scanned document images and obtained much better results.
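
The idea of removing the Shirorekha before separating characters can be sketched on a toy binary image. This is a simplified illustration under stated assumptions, not the paper's method: the header line is taken to be the single row with the most ink, characters are assumed not to touch once it is removed, and the morphological operations mentioned in the abstract are not shown. The function names and the tiny image are hypothetical.

```python
def remove_headline(word):
    """Blank the row with the most ink, taken here as the Shirorekha."""
    row_ink = [sum(row) for row in word]
    headline = row_ink.index(max(row_ink))
    cleaned = [row[:] for row in word]
    cleaned[headline] = [0] * len(word[0])
    return cleaned, headline


def column_runs(word):
    """Return (start, end) column ranges, end exclusive, that contain ink."""
    col_ink = [sum(col) for col in zip(*word)]
    runs, start = [], None
    for x, ink in enumerate(col_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            runs.append((start, x))        # a blank column ends it
            start = None
    if start is not None:
        runs.append((start, len(col_ink)))
    return runs


# Toy word image (1 = ink): a header line joining two character bodies.
rows = [
    "111111111",
    "010000100",
    "011000110",
    "010000100",
]
word = [[int(c) for c in row] for row in rows]

print(column_runs(word))                  # header line joins everything
cleaned, headline = remove_headline(word)
print(headline, column_runs(cleaned))     # characters separate once it is gone
```

Each column range, combined with the top and bottom ink rows inside it, gives the bounding box of one segmented character. A real header line is usually several pixels thick, so a practical version would blank a band of rows rather than one.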

Journal ArticleDOI
TL;DR: This article proposes the use of constrained clustering using automatically derived domain constraints to find a minimal set of stroke clusters and results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time when compared to alternate modeling techniques.
Abstract: Writer-specific character writing variations such as those of stroke order and stroke number are an important source of variability in the input when handwriting is captured “online” via a stylus and a challenge for robust online recognition of handwritten characters and words. It has been shown by several studies that explicit modeling of character allographs is important for achieving high recognition accuracies in a writer-independent recognition system. While previous approaches have relied on unsupervised clustering at the character or stroke level to find the allographs of a character, in this article we propose the use of constrained clustering using automatically derived domain constraints to find a minimal set of stroke clusters. The allographs identified have been applied to Devanagari character recognition using Hidden Markov Models and Nearest Neighbor classifiers, and the results indicate substantial improvement in recognition accuracy and/or reduction in memory and computation time when compared to alternate modeling techniques.

Proceedings ArticleDOI
01 Dec 2014
TL;DR: This paper describes a new feature set, called the extended directional features (EDF) for use in the recognition of online handwritten strokes, specifically to recognize strokes that form a basis for producing Devanagari script.
Abstract: This paper describes a new feature set, called the extended directional features (EDF) for use in the recognition of online handwritten strokes. We use EDF specifically to recognize strokes that form a basis for producing Devanagari script, which is the most widely used Indian language script. It should be noted that stroke recognition in handwritten script is equivalent to phoneme recognition in speech signals and is generally very poor and of the order of 20% for singing voice. Experiments are conducted for the automatic recognition of isolated handwritten strokes. Initially we describe the proposed feature set, namely EDF and then show how this feature can be effectively utilized for writer independent script recognition through stroke recognition. Experimental results show that the extended directional feature set performs well with about 65+% stroke level recognition accuracy for writer independent data set.

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A text and script independent method is proposed for identification of writer for handwritten scripts/languages using correlation and homogeneity properties of Gray Level Co-occurrence Matrices of the handwritten document images.
Abstract: If a set of writers can write in more than one script/language, identifying such writers is a difficult and challenging research problem. One method is to design a script independent writer identification algorithm to identify the writer of the underlying handwritten document. Hence a text and script independent method is proposed for identification of the writer for handwritten scripts/languages using correlation and homogeneity properties of Gray Level Co-occurrence Matrices of the handwritten document images. A feature vector of size 40 is obtained from each input handwritten document image. Handwritten documents are collected from the same 100 writers in Roman, Kannada and Devanagari scripts. Using a nearest neighbor classifier with modified 4-fold cross validation, the results for writer identification are obtained. Identification accuracies are 82.75%, 82.75% and 85.25% when the handwritten documents are in only one script, Roman, Kannada and Devanagari respectively. The writer identification rates are 80.6250%, 83.75% and 84% respectively for Roman-Kannada, Roman-Devanagari and Kannada-Devanagari handwritten input documents. The writer identification rate is 82.1995% for the input documents of Roman-Kannada-Devanagari.
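
The two Gray Level Co-occurrence Matrix properties named in the abstract above can be computed as in the sketch below. Several things here are assumptions: the matrix is built for a single horizontal offset and is not symmetrized, homogeneity uses the common 1 / (1 + |i - j|) weighting, and the 4 x 4 test image is a toy example. How the 40-value feature vector is assembled is not stated in the abstract and is not reproduced.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def homogeneity(p):
    """Sum of p(i, j) / (1 + |i - j|); large when mass lies near the diagonal."""
    n = len(p)
    return sum(p[i][j] / (1 + abs(i - j)) for i in range(n) for j in range(n))


def correlation(p):
    """Correlation between the gray levels of the two pixels in a pair.

    Undefined (division by zero) for a constant image.
    """
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return cov / math.sqrt(var_i * var_j)


# Toy image with four gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
print(round(homogeneity(p), 4), round(correlation(p), 4))
```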

Journal ArticleDOI
TL;DR: An algorithm for segmentation of skewed lines, touching lines present in the text document and broken parts in upper modifiers or space present between the upper modifiers is developed.
Abstract: Nowadays, extensive research is going on in Optical Character Recognition (OCR) of handwritten documents in Indian scripts. A lot of handwritten data exists in Devanagari script which is still to be recognized. Segmentation is the key step of the OCR process. Segmentation is the process of extracting the valuable segments from the text document which are used in the process of recognition of characters. Line segmentation is the process of segmenting the text document into lines. Afterwards, word segmentation and character segmentation are carried out. This paper deals only with the line segmentation of handwritten documents in Hindi. Devanagari script is the basic script used to write the Hindi, Marathi, Sanskrit and Nepali languages. In this paper a brief introduction to various existing techniques for segmentation of handwritten text is given. The paper also develops an algorithm for segmentation of skewed lines, touching lines present in the text document and broken parts in upper modifiers or space present between the upper modifiers. This algorithm is implemented on a large database collected from various writers. The proposed algorithm integrates the projection based method, gap detection between text lines and the neighbor pixel analysis method.
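
Two of the ingredients listed in the abstract above, the projection based method and gap detection between text lines, can be sketched as follows. This is only an illustration under simplifying assumptions: the page is a binary grid with no skew and no touching lines, and the `min_gap` parameter (a hypothetical name) is one simple way to avoid splitting a line at a thin blank band, such as the space between an upper modifier and the body of the text. The neighbor pixel analysis step is not shown.

```python
def segment_lines(page, min_gap=1):
    """Split a binary page into text lines using blank-row gaps.

    Returns (top, bottom) row ranges, bottom exclusive. A run of blank rows
    shorter than min_gap does not end the current line.
    """
    lines, start, gap = [], None, 0
    for y, row in enumerate(page):
        if any(row):                      # row contains ink
            if start is None:
                start = y
            gap = 0
        elif start is not None:           # blank row inside or after a line
            gap += 1
            if gap >= min_gap:
                lines.append((start, y - gap + 1))
                start, gap = None, 0
    if start is not None:
        lines.append((start, len(page) - gap))
    return lines


# Toy page (1 = ink). Row 3 is a one-row gap, rows 5 and 6 a two-row gap.
rows = [
    "00000",
    "00100",
    "11111",
    "00000",
    "01110",
    "00000",
    "00000",
    "11011",
]
page = [[int(c) for c in row] for row in rows]

print(segment_lines(page, min_gap=1))
print(segment_lines(page, min_gap=2))
```

With `min_gap=2` the one-row gap at row 3 no longer splits the first line, so rows 1 to 4 are reported as a single line.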

Journal ArticleDOI
TL;DR: FMRI data analyses revealed that reading Devanagari words elicited robust activations in bilateral occipito-temporal, inferior frontal and precentral regions as well as both cerebellar hemispheres, and was attributed to increased visual processing demands arising from the complex visuospatial arrangement of symbols in this ancient script.
Abstract: Objectives: The current study used functional MRI (fMRI) to obtain a comprehensive understanding of the neural network underlying visual word recognition in Hindi/Devanagari, an alphasyllabic - partly alphabetic and partly syllabic Indian writing system on which little research has hitherto been carried out. Materials and Methods: Sixteen (5F, 11M) neurologically healthy, native Hindi/Devanagari readers aged 21 to 50 named aloud 240 Devanagari words which were either visually linear - had no diacritics or consonant ligatures above or below central plane of text, e.g. फल, वाहन, or nonlinear - had at least one diacritic and/or ligature, e.g. फल, किरण, and which further included 120 words each of high and low frequency. Words were presented in alternating high and low frequency blocks of 10 words each at 2s/word in a block design, with linear and nonlinear words in separate runs. Word reading accuracy was manually coded, while fMRI images were acquired on a 3T scanner with an 8-channel head-coil, using a T2*-weighted EPI sequence (TR/TE = 2s/35ms). Results: After ensuring high word naming accuracy (M = 97.6%, SD = 2.3), fMRI data analyses (at FDR P < 0.005) revealed that reading Devanagari words elicited robust activations in bilateral occipito-temporal, inferior frontal and precentral regions as well as both cerebellar hemispheres. Other common areas of activation included left inferior parietal and right superior temporal cortices. Primary differences seen between nonlinear and linear word reading networks were in the right temporal areas and cerebellum. Conclusion: Distinct from alphabetic scripts, which are linear in their spatial organization, and recruit a primarily left-lateralized network for word reading, our results revealed a bilateral reading network for Devanagari. We attribute the additional activations in Devanagari to increased visual processing demands arising from the complex visuospatial arrangement of symbols in this ancient script.

Proceedings ArticleDOI
01 Sep 2014
TL;DR: A Gabor filter based technique has been developed for offline script identification from handwritten document images on four popular Indic scripts namely Bangla, Devanagari, Roman and Urdu.
Abstract: Script identification from a document image is a complex real life problem in a multi-script country like India. The work becomes more challenging when handwritten documents are considered. In this paper, a Gabor filter based technique has been developed for offline script identification from handwritten document images. The work is carried out at document level on four popular Indic scripts namely Bangla, Devanagari, Roman and Urdu. A total of 157 handwritten document images are considered from these four scripts for experimentation. The data set is divided into training and test set in 2:1 ratio. A feature vector of 20 dimensions is constructed using Gabor filter and Morphological reconstruction. Finally, using MLP classifier, a recognition accuracy of 95.4% is obtained on test data without any rejection.

Proceedings ArticleDOI
07 Dec 2014
TL;DR: The results of the study shows that with minimal training, a user is able to achieve acceptable comfort and speed with the logical design based keyboard.
Abstract: Indian text input is an area which is now being studied not only by language experts but also by designers and developers. For a decade, work has been in progress on designs intended to provide easy and efficient keyboards for different Indian language scripts. Swarachakra (for Android) is one novel attempt to resolve this problem. In our studies we found that the alphabetical keyboard layout performed better than the Inscript layout for Devanagari script. This case study discusses the evaluation and evolution of Swarachakra as a virtual keyboard. It talks about the various degrees of usability testing which have been done to check its efficacy. The user group included students, adults and elderly people, with literacy levels varying from graduate to low literate. The results of the study show that with minimal training, a user is able to achieve acceptable comfort and speed with the logical design based keyboard.

01 Jan 2014
TL;DR: A detailed overview of feature extraction and classification techniques for the character recognition process of Indian scripts done in over past few decades is given.
Abstract: The Constitution of India, under its Eighth Schedule, has recognized Hindi (in Devanagari Script) and English as Official languages of Union Government, along with other 22 languages as Scheduled languages and given status and official encouragement to these Scheduled Languages. Most of the optical recognition research work has been done on Devanagari, Telugu, and Bangla scripts etc. Development of OCR systems for Indian scripts has many application areas, like preservation of ancient manuscripts and literature written in different Indian scripts and making these available through digital libraries. Feature extraction and classification are the essential steps in the process of character recognition that affect the overall accuracy of the recognition system. This paper gives a detailed overview of feature extraction and classification techniques for the character recognition process of Indian scripts developed over the past few decades.

Journal ArticleDOI
TL;DR: In this paper, an artificial neural network based classifier and statistical and structural method based feature extraction method has been employed for the recognition of the Devanagari script, which achieved 98% - 99% accuracy except special characters.
Abstract: Handwriting is the most effective way by which civilized people speak. Devanagari is the basic script widely used all over India. Many Indian languages like Hindi, Marathi and Rajasthani are based on Devanagari script. In the proposed work a multistage approach, i.e. an artificial neural network based classifier and a statistical and structural method based feature extraction method, has been employed for the recognition of the script. Optical isolated Marathi words are taken as an input image from the scanner. The input image is preprocessed and segmented. The key step is feature extraction: features are extracted in terms of various structural and statistical features like end points, middle bar, loop, end bar, aspect ratio etc. The feature vector is applied to a Self organizing map (SOM), which is one of the classifiers of an artificial neural network. The SOM is trained on 3000 different characters collected from 500 persons. The characters are classified into three different classes. The proposed classifier attains 98% - 99% accuracy except for special characters.

Proceedings ArticleDOI
05 Dec 2014
TL;DR: The authors used back transliteration to reduce spelling variations, and a set of hand-tailored rules for consonant mapping to take care of breaking and joining of transliterated words, and implemented query labeling of mixed script content using a supervised learning approach where an SVM classifier was trained using character n-grams as features for language identification.
Abstract: Much of the user generated content on the internet is written in transliterated form instead of in its indigenous script. Due to this, search engines receive a large number of transliterated search queries. This paper presents our approach to handle labelling of queries and ad hoc retrieval of documents based on these queries, as part of the FIRE2014 shared task on transliterated search. The content of each document is written in either the native Devanagari script or its transliterated form in Roman script or a combination of both. The queries to retrieve these documents can also be in mixed script. The task is challenging primarily due to the spelling variations that occur in the transliterated form of search queries. This particular problem is addressed by using back transliteration to reduce spelling variations, and a set of hand-tailored rules for consonant mapping. Sub-word indexing is done to take care of breaking and joining of transliterated words. Implementation of query labelling of the mixed script content was done using a supervised learning approach where an SVM classifier was trained using character n-grams as features for language identification. A Naive Bayes classifier was used for classifying transliterated words that can belong to both Hindi and English when looked at individually. The 2 runs submitted by our team (BITS-Lipyantaran) perform best across all metrics for Subtask 2 among all the teams that participated, with an MRR score of 0.8171 and a MAP score of 0.6421.
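
Character n-gram features of the kind mentioned in the abstract above can be extracted as in this minimal sketch. The boundary markers `^` and `$`, the n-gram range, and the function name are assumptions for illustration; the abstract does not give the feature settings, and the SVM and Naive Bayes classifiers themselves are not shown.

```python
from collections import Counter


def char_ngrams(word, n_min=1, n_max=3):
    """Count character n-grams of a lowercased word padded with boundary markers."""
    padded = f"^{word.lower()}$"
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams


# Bigrams of a transliterated Hindi word and of an English word.
print(char_ngrams("dil", 2, 2))
print(char_ngrams("the", 2, 2))
```

The resulting counts can be mapped to a sparse feature vector per token and passed to any supervised classifier. The boundary markers let the features distinguish word-initial and word-final letter patterns.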

01 Jan 2014
TL;DR: This work exploits the English corpus for coded words of Devanagari script using the technique of Romanization and shows that this approach has a direct application to the standardization of regional languages.
Abstract: Corpus based stemming has been devised to develop stemmers targeting language independent environment. These stemmers are applicable to all languages based on Latin script. In the present work, we exploit the English corpus for coded words of Devanagari script. We use the technique of Romanization and the stemmer is being tested over 100 randomly chosen Hindi words. We show that this approach has a direct application to the standardization of regional languages. For instance, we standardize the Kumauni language.

Journal ArticleDOI
TL;DR: This paper describes a language and font-detection system for Gurmukhi and Devanagari and explains a font conversion system for converting the ASCII based text into Unicode.
Abstract: Digital text written in an Indian script is difficult to use as such. This is because there are a number of font formats available for typing, and these font formats are not mutually compatible. Gurmukhi alone has more than 225 popular ASCII-based fonts, whereas this figure is 180 in the case of Devanagari. To read text written in a particular font, that font is required to be installed on the system. This paper describes a language and font-detection system for Gurmukhi and Devanagari. It also explains a font conversion system for converting the ASCII based text into Unicode. The proposed system works in two stages: the first stage suggests a statistical model for automatic language-detection (i.e., Gurmukhi or Devanagari) and font-detection; the second stage converts the detected text into Unicode as per font detection. Though we could not train our systems for some fonts due to non-availability of font converters, the system and its architecture are open to accept any number of languages/fonts in the future. The existing system supports around 150 popular Gurmukhi font encodings and more than 100 popular Devanagari fonts. We have demonstrated that the effectiveness of font detection is 99.6% and of Unicode conversion is 100% in all the cases.

Journal ArticleDOI
TL;DR: The algorithm is tested on a variety of NS images captured using a digital camera under variable resolutions and lighting conditions, having text of different fonts, styles and backgrounds, and the results are compared with other standard techniques.
Abstract: This paper presents a binarization method for camera based natural scene (NS) images based on edge analysis and morphological dilation. The image is converted to a grey scale image and edge detection is carried out using Canny edge detection. The edge image is dilated using morphological dilation and analyzed to remove edges corresponding to non-text regions. The image is binarized using the mean and standard deviation of edge pixels. Post processing of the resulting images is done to fill gaps and to smooth text strokes. The algorithm is tested on a variety of NS images captured using a digital camera under variable resolutions and lighting conditions, having text of different fonts, styles and backgrounds. The results are compared with other standard techniques. The method is fast and works well for camera based natural scene images.


Journal Article
TL;DR: Recognition of Devanagari characters consists of image correction, segmentation and character recognition; extraction and recognition use an Eigen space method with Gerschgorin's theorem for comparison.
Abstract: Recognition of Devanagari characters consists of image correction, segmentation and character recognition. Image correction digitizes the input characters, making them available for further processing. Principal component analysis is used to discover the hidden and unclear parts, and segmentation separates individual characters to identify each character. The most crucial part of any character recognition system is the process of segmentation, as characters are recognized individually. The result of recognition is dependent on the accuracy of segmentation. For extraction and recognition we used an Eigen space method which uses Gerschgorin's theorem for comparison. Handwritten Devanagari script is nowadays a popular topic for researchers as less work has been done on this topic. Handwritten Devanagari characters are difficult to recognize due to the presence of the header line and various modifiers. Recognition of fused characters is also a major concern for researchers, as a fused character is treated as a single character, resulting in an error.

Journal Article
TL;DR: In this article, the authors describe the different techniques of character recognition (usually referred to as OCR) for Gujarati and Devanagari script, covering the basics of character recognition, its types, the challenges associated with it and the special properties of Gujarati and Devanagari script.
Abstract: In this paper, we describe the different techniques of character recognition for Gujarati and Devanagari script. Character recognition is usually referred to as OCR. This review is intended to help researchers develop a tool for Gujarati and Devanagari script recognition. The paper describes the basics of character recognition, its types, the challenges associated with it and the special properties of Gujarati and Devanagari script.

Journal Article
TL;DR: A recognition model is described for recognizing handwritten Devanagari characters; it achieves a recognition accuracy ranging from 75% to 80%.
Abstract: In this paper, a recognition model is described for recognizing handwritten Devanagari characters. A database of scanned images of handwritten Devanagari characters from several different writers was used to train and test the classifier model. The model first preprocesses each image (normalization, binarization, cropping) and then extracts the feature set. It classifies the characters based on the extracted feature database. The model achieves a recognition accuracy ranging from 75% to 80%.
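
The preprocessing stage named above can be sketched as follows. The abstract gives no parameters, so the fixed threshold of 128, the order of the steps (binarize, crop, then size normalization) and the nearest-neighbour resampling are all assumptions, and the function names are hypothetical.

```python
def binarize(gray, threshold=128):
    """Map dark pixels (ink) to 1 and bright pixels (paper) to 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def crop_to_ink(binary):
    """Crop the image to the bounding box of its ink pixels."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not rows:
        return []  # blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def resize_nearest(binary, size):
    """Normalize to a size x size grid with nearest-neighbour sampling."""
    height, width = len(binary), len(binary[0])
    return [
        [binary[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


# A 4x4 grayscale patch with a small dark stroke on white paper.
gray = [
    [255, 255, 255, 255],
    [255, 0, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
]
cropped = crop_to_ink(binarize(gray))
normalized = resize_nearest(cropped, 4)
```

Cropping before resizing makes the normalized grid independent of where the character sat in the scanned cell, so features extracted later are comparable across writers.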

Journal ArticleDOI
TL;DR: In this work, an artificial neural network based classifier and a feature extraction approach based on statistical and structural methods are used to recognize Devanagari script.
Abstract: is the most effective way by which civilized people speak. Devanagari is the basic script widely used all over India. Many Indian languages, such as Hindi, Marathi and Rajasthani, are based on Devanagari script. Hindi, written in Devanagari script, is the third most common language used all over the world. In the proposed work, an artificial neural network based classifier and a feature extraction approach based on statistical and structural methods are used to recognize the script. Isolated Marathi characters are acquired optically from a scanner as input images. Each input image is preprocessed and segmented. Various structural and statistical features are extracted, such as end points, middle bar, loop, end bar and aspect ratio. The feature vector is applied to a self-organizing map (SOM), which is one type of artificial neural network classifier. The SOM is trained on 5000 such characters collected from 500 persons. The characters are classified into three different classes. The proposed classifier attains 93% accuracy.
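
Two of the listed features, aspect ratio and end points, can be sketched on a binary character image. The definitions are assumptions, since the abstract does not give them: aspect ratio is taken as bounding-box width divided by height, and an end point is a stroke pixel with exactly one 8-connected stroke neighbour (which presumes a one-pixel-wide skeleton). The function names are hypothetical.

```python
def aspect_ratio(binary):
    """Bounding-box width divided by height of the ink pixels."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    return (cols[-1] - cols[0] + 1) / (rows[-1] - rows[0] + 1)


def end_points(skeleton):
    """Return (row, col) of stroke pixels with exactly one 8-neighbour."""
    height, width = len(skeleton), len(skeleton[0])
    points = []
    for y in range(height):
        for x in range(width):
            if not skeleton[y][x]:
                continue
            neighbours = sum(
                skeleton[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 1:
                points.append((y, x))
    return points


# An L-shaped one-pixel-wide stroke.
stroke = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
ratio = aspect_ratio(stroke)
tips = end_points(stroke)
```

Values like these would be concatenated with the other features into the vector fed to the classifier.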

Journal ArticleDOI
TL;DR: This paper surveys the most important results reported so far on Devanagari text recognition and highlights promising directions in the research to date.
Abstract: In India, many people use Devanagari script for documentation. Research on the recognition of printed as well as handwritten Devanagari text has improved significantly in the past few years. Character recognition techniques associate a symbolic identity with the image of a character. Since creating an algorithm with a one hundred percent correct recognition rate is quite probably impossible in our world of noise and different font styles, it is important to design character recognition algorithms with these failures in mind, so that when mistakes are inevitably made, they will at least be understandable and predictable to the person working with the program. This paper attempts to address the most important results reported so far and to highlight promising directions in the research to date.