Topic

Devanagari

About: Devanagari is a research topic. Over its lifetime, 655 publications have been published on this topic, receiving 7,428 citations. The topic is also known as Deva nagari and Hindi Script.


Papers
Journal Article
TL;DR: This work focuses on the Devanagari script, whose 46 character categories make training difficult, especially when samples are few, and proposes deep structure learning of image quadrants, based on learning the hidden state activations derived from convolutional neural networks that are trained separately on five image quadrants.
Abstract: Ancient Indic languages were written in the Devanagari script, from which most of the modern-day Indic writing systems have evolved. The digitisation of ancient Devanagari manuscripts, now archived in national museums, is part of the language documentation and digital archiving initiative of the Government of India. The challenge in digitising these handwritten scripts is the lack of adequate datasets for training machine learning models. In our work, we focus on the Devanagari script, whose 46 character categories make training a difficult task, especially when the number of samples is small. We propose deep structure learning of image quadrants, based on learning the hidden state activations derived from convolutional neural networks that are trained separately on five image quadrants. The second phase of our learning module consists of a deep neural network that learns the hidden state activations of the five convolutional neural networks, fused by concatenation. The experiments show that the proposed deep structure learning outperforms the state of the art.
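The two-phase idea in this abstract (separate networks per image region, then fusion of their hidden activations by concatenation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 32x32 input size, the choice of four corner quadrants plus a centre crop as the five regions, the hidden size of 8, and the `hidden_activations` function, which is only a random linear-plus-ReLU stand-in for a trained convolutional network.

```python
import random

def split_into_regions(image):
    """Split a 2-D list into four corner quadrants plus a centre crop.

    The centre crop as the fifth region is an assumption; the abstract
    only says that five image quadrants are used.
    """
    h, w = len(image), len(image[0])
    mh, mw = h // 2, w // 2
    qh, qw = h // 4, w // 4

    def crop(r0, r1, c0, c1):
        return [row[c0:c1] for row in image[r0:r1]]

    return [
        crop(0, mh, 0, mw),              # top-left
        crop(0, mh, mw, w),              # top-right
        crop(mh, h, 0, mw),              # bottom-left
        crop(mh, h, mw, w),              # bottom-right
        crop(qh, qh + mh, qw, qw + mw),  # centre (assumed)
    ]

def hidden_activations(region, weights):
    """Hypothetical stand-in for one trained CNN's hidden layer:
    ReLU of a linear projection of the flattened pixels."""
    flat = [p for row in region for p in row]
    return [max(0.0, sum(w * x for w, x in zip(w_row, flat)))
            for w_row in weights]

def fuse(activation_lists):
    """Fusion by concatenation: join the per-region activation vectors."""
    fused = []
    for activations in activation_lists:
        fused.extend(activations)
    return fused

rng = random.Random(0)
image = [[rng.random() for _ in range(32)] for _ in range(32)]
regions = split_into_regions(image)

hidden_size = 8
# One independent (here: random) weight matrix per region, mimicking
# five separately trained networks.
per_region_weights = [
    [[rng.uniform(-1, 1) for _ in range(16 * 16)] for _ in range(hidden_size)]
    for _ in regions
]
fused = fuse([hidden_activations(r, w)
              for r, w in zip(regions, per_region_weights)])
```

In a full pipeline the fused vector would be the input to the second-phase deep network that predicts one of the 46 character classes; that classifier is omitted here.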

9 citations

Proceedings Article
01 Dec 2017
TL;DR: A dataset of 24,253 news articles was extracted, and the extractive summarisation results were evaluated on various parameters against manual gold summaries of exactly 60 words each.
Abstract: With an immense amount of Hindi data growing on the web, a text summariser would be helpful in summarising Government data, medical reports, news, and research articles. Hindi is the fourth most-spoken first language in the world. Hindi written in the Devanagari script is the official language of the Government of India. There is no public dataset for extractive summarisation available in Hindi; thus, a dataset of 24,253 news articles was extracted, and the extractive summarisation results were evaluated on various parameters against manual gold summaries of exactly 60 words each.

9 citations

Book Chapter
15 Dec 2005
TL;DR: Experimental results show the validity and efficiency of the developed scheme for recognition of characters of this script, and the major challenge in developing the proposed scheme lay in striking the right balance between definiteness and flexibility to derive optimal solutions for out-of-sample data.
Abstract: In this paper, a Devanagari script recognition scheme based on a novel algorithm is proposed. Devanagari script poses new challenges in the field of pattern recognition, primarily due to the highly cursive nature of the script seen across its diverse character set. In the proposed algorithm, the character is initially subjected to a simple noise removal filter. Based on a reference co-ordinate system, the significant contours of the character are extracted and characterized as a contour set. The recognition of the character involves comparing these contour sets with those in the enrolled database. The matching of these contour sets is achieved by characterizing each contour based on its length, its relative position in the reference co-ordinate system, and an interpolation scheme which eliminates displacement errors. In the Devanagari script, similar contour sets are observed among a few characters; hence this method helps to filter out disparate characters and narrow down the possibilities to a limited set. The next step involves focusing on the subtle yet vital differences between the similar contours in this limited set. This is done by a prioritization scheme which concentrates only on those portions of the character which reflect its uniqueness. The major challenge in developing the proposed scheme lay in striking the right balance between definiteness and flexibility to derive optimal solutions for out-of-sample data. Experimental results show the validity and efficiency of the developed scheme for recognition of characters of this script.

9 citations

Proceedings Article
25 Aug 2013
TL;DR: This paper develops a novel part-based model technique for Devanagari character recognition from scene images that can be trained on either a machine-printed or a handwritten dataset, and presents results on the publicly available dataset (DSIW2K) containing images of street scenes taken in New Delhi, India.
Abstract: Character recognition in scene images is an extremely challenging task. Although several techniques are reported to perform well, they pertain to English only. This paper focuses on Devanagari character recognition from scene images. Devanagari is a very popular script and has characteristics that differ from those of other scripts, particularly English. Combining the basic Devanagari consonants and vowels in various ways can yield hundreds of characters. Building a classifier to recognize all these classes would be a difficult task. To alleviate this problem, a novel part-based model technique is proposed. 40 basic classes were identified from the Devanagari script for this purpose. The technique was designed to classify an instance of one of these classes in any given test sample. Procuring a large dataset for training is not feasible in the case of scene images. To solve this problem as well, we developed our technique so that it can use either the machine-printed or the handwritten dataset for training. We present our results on the publicly available dataset (DSIW2K) containing images of street scenes taken in New Delhi, India.

9 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
77% related
Support vector machine
73.6K papers, 1.7M citations
75% related
Image segmentation
79.6K papers, 1.8M citations
74% related
Convolutional neural network
74.7K papers, 2M citations
74% related
Encryption
98.3K papers, 1.4M citations
73% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    42
2022    98
2021    48
2020    61
2019    38
2018    43