Proceedings Article
Managing multilingual OCR project using XML
TLDR
This paper describes how a new XML-based tagging scheme was exploited to achieve the objectives of a project aimed at developing OCR for 11 scripts of Indian origin for which mature OCR technology was not available.
Abstract:
This paper presents an XML-based scheme for managing a large multilingual OCR project. In particular, we describe how a new XML-based tagging scheme has been exploited to achieve the objectives of the project. Managing a large multilingual OCR project in which multiple research groups collaboratively develop script-specific and script-independent technologies is a challenging problem. We present some of the software and data management strategies designed for the project, which aimed to develop OCR for 11 scripts of Indian origin for which mature OCR technology was not available.
Citations
Proceedings Article
Experiences of integration and performance testing of multilingual OCR for printed Indian scripts
Deepak Arya, C. V. Jawahar, Chakravorty Bhagvati, Tushar Patnaik, Bidyut B. Chaudhuri, Gurpreet Singh Lehal, Santanu Chaudhury, A. G. Ramakrishna +7 more
TL;DR: The project is an attempt to implement an integrated platform for OCR of different Indian languages; it is currently being enhanced to handle space and time constraints, achieve higher recognition accuracy, and add new functionality.
Overview of Xml based Knowledge Representation using Scripts
Pranita P. Deshmukh, M. S. Ali +1 more
TL;DR: A symbol vocabulary and a system of logic are combined to enable inferences about elements in the knowledge representation to create new knowledge representation sentences by using various techniques.
Proceedings Article
Information retrieval system based on ontology
TL;DR: Over the years, the volume of information available through the World Wide Web has increased continuously; never has so much information been readily available and shared among so many people.
References
Proceedings Article
Multimedia ontology learning for automatic annotation and video browsing
TL;DR: This work uses MOWL, a multimedia extension of Web Ontology Language (OWL) which is capable of describing domain concepts in terms of their media properties and of capturing the inherent uncertainties involved.
Proceedings Article
Schema extraction for multimedia XML document retrieval
Jong P. Yoon, Sung-Rim Kim +1 more
TL;DR: This paper proposes a method of schema extraction for multimedia XML data in which the extracted schemas are leveled with respect to the frequency of topological document structures in a database.
Book Chapter
Building Data Sets for Indian Language OCR Research
TL;DR: This chapter presents activities directed at developing robust document understanding systems for Indian languages using a corpus of document images in Indian scripts, and describes the process followed to obtain word- and symbol-level annotated data sets.