Institution
International Institute of Information Technology, Hyderabad
Education • Hyderabad, India
About: International Institute of Information Technology, Hyderabad is an educational organization based in Hyderabad, India. It is known for its research contributions in the topics of Authentication & Internet security. The organization has 2048 authors who have published 3677 publications receiving 45319 citations. The organization is also known as: IIIT Hyderabad & International Institute of Information Technology (IIIT).
Topics: Authentication, Internet security, Wireless sensor network, Machine translation, Deep learning
Papers published on a yearly basis
Papers
TL;DR: Two systems are proposed: one to compensate for the auditory processing difficulties in dyslexia, and one for autism that removes the need for traditional languages and instead uses pictures for communication.
Abstract: Autism and dyslexia are both developmental disorders of neural origin. As we still do not understand the neural basis of these disorders fully, technology can take two approaches in helping those affected. The first is to compensate externally for a known difficulty and the other is to achieve the same function using a completely different means. To demonstrate the first option, we are developing a system to compensate for the auditory processing difficulties in case of dyslexia and to demonstrate the second option we propose a system for autism where we remove the need for traditional languages and instead use pictures for communication.
24 citations
30 Sep 2010 · TL;DR: Anusaaraka is a Language Accessor cum Machine Translation system based on the fundamental premise of sharing the load between man and machine, producing good-enough results according to the needs of the reader; it is designed so that users can contribute to it and participate in improving its quality.
Abstract: Most research in machine translation is about having computers completely bear the load of translating one human language into another. This paper looks at the machine translation problem afresh and observes that there is a need to share the load between man and machine, distinguish reliable knowledge from heuristics, provide a spectrum of outputs to serve different strata of people, and finally make use of existing resources instead of reinventing the wheel. It describes a unique approach to developing a machine translation system based on insights into information dynamics from the Paninian Grammar Formalism. Anusaaraka is a Language Accessor cum Machine Translation system based on the fundamental premise of sharing the load, producing good-enough results according to the needs of the reader. The system promises a faithful representation of the translated text, no loss of information during translation, and graceful degradation (robustness) in case of failure. The layered output provides access to all stages of translation, making the whole process transparent. Thus, Anusaaraka differs from other Machine Translation systems in two respects: (1) its commitment to faithfulness, providing a layer of 100% faithful output so that a user with some training can "access the source text" faithfully; and (2) a design that lets users contribute to the system and participate in improving its quality. Further, Anusaaraka provides an eclectic combination of the Apertium architecture with a forward-chaining expert system, allowing use of both deep-parser and shallow-parser outputs to analyze the SL text. Existing language resources (parsers, taggers, chunkers) available under the GPL are used instead of being rewritten. Language data and linguistic rules are independent of the core program, making it easy for linguists to modify and experiment with different language phenomena to improve the system.
Users can become contributors by adding new word sense disambiguation (WSD) rules for ambiguous words through a web interface available over the Internet. The system uses the expert system's forward chaining to infer new language facts from existing language data. This helps tackle the complex behavior of language translation by applying specific knowledge rather than a specific technique, creating a vast language knowledge base in electronic form. In other words, the expert system facilitates the transformation of subject-matter experts' (SME) knowledge into a computer-processable knowledge base.
24 citations
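The forward-chaining idea described in the abstract can be illustrated with a minimal sketch. This is not Anusaaraka's actual rule language; the rule format and the "bank" WSD rules below are hypothetical, chosen only to show how rules fire on known facts until no new facts can be derived.

```python
# Minimal forward-chaining sketch: a rule fires when all of its premises
# are already in the fact base, adding its conclusion as a new fact.
def forward_chain(facts, rules):
    """Repeatedly apply rules until no new facts can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical WSD rules for the ambiguous English word "bank":
# a "river" context selects the shore sense, a "money" context the other.
rules = [
    (frozenset({"word:bank", "context:river"}), "sense:bank=shore"),
    (frozenset({"word:bank", "context:money"}), "sense:bank=institution"),
]
derived = forward_chain({"word:bank", "context:river"}, rules)
print("sense:bank=shore" in derived)  # True
```

A contributed WSD rule in this scheme is just another (premises, conclusion) pair, which is what makes the knowledge base extensible by users rather than by programmers.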
TL;DR: Simulations in noisy environments demonstrate that taking the heteroscedastic noise into account causes direction of arrival (DOA) estimation to fail only at lower SNRs, often 20 dB lower.
24 citations
07 Apr 2014 · TL;DR: A web-based OCR system is proposed which follows a unified architecture for seven Indian languages, is robust against common degradations, follows a segmentation-free approach, addresses the UNICODE re-ordering issues, and enables continuous learning from user inputs and feedback.
Abstract: Current Optical Character Recognition (OCR) systems for Indic scripts are not robust enough to recognize arbitrary collections of printed documents. Reasons for this limitation include the lack of resources (e.g., not enough examples with natural variations, little documentation of the possible font/style variations) and architectures that necessitate hard segmentation of word images followed by isolated symbol recognition. Variations among scripts, latent symbol-to-UNICODE conversion rules, non-standard fonts/styles, and heavy degradations are some of the major reasons for the unavailability of robust solutions. In this paper, we propose a web-based OCR system which (i) follows a unified architecture for seven Indian languages, (ii) is robust against common degradations, (iii) follows a segmentation-free approach, (iv) addresses the UNICODE re-ordering issues, and (v) enables continuous learning from user inputs and feedback. Our system is designed to aid continuous learning while remaining usable, i.e., we capture user inputs (say, example images) for further improving the OCRs. We use the popular BLSTM-based transcription scheme to achieve our target, which also enables incremental training and refinement in a seamless manner. We report superior accuracy in comparison with the available OCRs for the seven Indian languages.
24 citations
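The "UNICODE re-ordering issues" the abstract mentions can be made concrete with one well-known case: in Devanagari, the vowel sign I (U+093F) is rendered to the left of its consonant but stored after it in logical Unicode order, so an OCR that transcribes glyphs in visual order must swap the pair back. The sketch below handles only this one sign and is an illustration, not the paper's actual re-ordering module.

```python
# In Devanagari, vowel sign I (U+093F) appears *before* its consonant
# visually but *after* it in logical (stored) Unicode order.
VOWEL_SIGN_I = "\u093f"  # DEVANAGARI VOWEL SIGN I

def visual_to_logical(text):
    """Move each pre-base vowel sign I after the consonant that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] == VOWEL_SIGN_I:
            # Swap the matra with the consonant it precedes visually.
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the re-ordered pair
        else:
            i += 1
    return "".join(chars)

# Visual glyph order (matra, then KA) becomes logical order (KA, then matra),
# i.e. the valid Unicode spelling of the syllable "ki".
print(visual_to_logical("\u093f\u0915") == "\u0915\u093f")  # True
```

A full Indic OCR post-processor has to cover many more such rules (conjuncts, reph, two-part vowel signs), which is why the paper treats re-ordering as a distinct architectural concern.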
01 Jan 2008 · TL;DR: This paper reports experiments with various feature combinations for Telugu NER and observes that prefix and suffix information helps greatly in finding the class of a token.
Abstract: Named Entity Recognition (NER) is the task of identifying and classifying tokens in a text document into a predefined set of classes. In this paper we report our experiments with various feature combinations for Telugu NER. We observed that prefix and suffix information helps greatly in finding the class of a token. We also show the effect of the amount of training data on the performance of the system. The best-performing model gave an Fβ=1 measure of 44.91. The language-independent features alone gave an Fβ=1 measure of 44.89, which is close to the measure obtained even after including the language-dependent features.
24 citations
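The prefix/suffix features the NER paper highlights are straightforward to extract. The sketch below shows one common way to build them as a feature dictionary for a token; the feature names (`prefix1`, `suffix2`, ...) and the length cap are assumptions for illustration, not the paper's actual feature template.

```python
def affix_features(token, max_len=3):
    """Character prefix/suffix features of the kind the paper found helpful.

    Returns a dict mapping feature names to affix strings, up to max_len
    characters from each end of the token.
    """
    feats = {}
    for n in range(1, min(max_len, len(token)) + 1):
        feats[f"prefix{n}"] = token[:n]   # first n characters
        feats[f"suffix{n}"] = token[-n:]  # last n characters
    return feats

# Telugu place and person names often share derivational endings, so a
# classifier can generalize from affixes shared across training tokens.
print(affix_features("Hyderabad"))
# {'prefix1': 'H', 'suffix1': 'd', 'prefix2': 'Hy', 'suffix2': 'ad',
#  'prefix3': 'Hyd', 'suffix3': 'bad'}
```

Because these features never consult a gazetteer or morphological analyzer, they are language-independent, which matches the paper's finding that such features alone come within 0.02 Fβ=1 of the full feature set.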
Authors
Showing all 2066 results
Name | H-index | Papers | Citations |
---|---|---|---|
Ravi Shankar | 66 | 672 | 19326 |
Joakim Nivre | 61 | 295 | 17203 |
Aravind K. Joshi | 59 | 249 | 16417 |
Ashok Kumar Das | 56 | 278 | 9166 |
Malcolm F. White | 55 | 172 | 10762 |
B. Yegnanarayana | 54 | 340 | 12861 |
Ram Bilas Pachori | 48 | 182 | 8140 |
C. V. Jawahar | 45 | 479 | 9582 |
Saurabh Garg | 40 | 206 | 6738 |
Himanshu Thapliyal | 36 | 201 | 3992 |
Monika Sharma | 36 | 238 | 4412 |
Ponnurangam Kumaraguru | 33 | 269 | 6849 |
Abhijit Mitra | 33 | 240 | 7795 |
Ramanathan Sowdhamini | 33 | 256 | 4458 |
Helmut Schiessel | 32 | 117 | 3527 |