Managing multilingual OCR project using XML
TL;DR: This paper describes how a new XML-based tagging scheme was used to meet the objectives of a project to develop OCR for 11 scripts of Indian origin for which mature OCR technology was not available.
Abstract: This paper presents an XML-based scheme for managing a large multilingual OCR project. In particular, we describe how a new XML-based tagging scheme has been exploited to achieve the objectives of the project. Managing a large multilingual OCR project in which multiple research groups collaboratively develop script-specific and script-independent technologies is a challenging problem. We present some of the software and data management strategies designed for the project, which aimed to develop OCR for 11 scripts of Indian origin for which mature OCR technology was not available.
Citations
25 citations
Cites methods from "Managing multilingual OCR project u..."
...It should follow the Input/Output XML scheme specified for the project [1]....
...XML has been used as architecture specification language and enables handling huge amount of data in such large projects [1]....