Institution

International Institute of Information Technology, Hyderabad

Education • Hyderabad, India

About: International Institute of Information Technology, Hyderabad is an education organization based in Hyderabad, India. It is known for research contributions in the topics Authentication and Internet security. The organization has 2048 authors who have published 3677 publications receiving 45319 citations. The organization is also known as IIIT Hyderabad and International Institute of Information Technology (IIIT).

Topics: Authentication, Internet security, Wireless sensor network, Machine translation, Deep learning

Papers published on a yearly basis

Papers

Proceedings Article

Multilingual OCR for Indic Scripts

Minesh Mathew¹, Ajeet Kumar Singh¹, C. V. Jawahar¹•Institutions (1)

International Institute of Information Technology, Hyderabad¹

11 Apr 2016

TL;DR: An end-to-end RNN based architecture which can detect the script and recognize the text in a segmentation-free manner is proposed for this purpose and demonstrated for 12 Indian languages and English.

Abstract: In the Indian scenario, a document analysis system has to support multiple languages at the same time. With emerging multilingualism in urban India, two, three, or even more languages often need to be supported. This demands the development of a multilingual OCR system that works seamlessly across Indic scripts. In our approach, the script is identified at the word level, prior to the recognition of the word. An end-to-end RNN-based architecture that can detect the script and recognize the text in a segmentation-free manner is proposed for this purpose. We demonstrate the approach for 12 Indian languages and English. It is observed that, even with a similar architecture, performance on Indian languages is poorer than on English. We investigate this further. Our approach is evaluated on a large corpus comprising thousands of pages. The Hindi OCR is compared with other popular OCRs for the language, as further evidence of the efficacy of our method.

40 citations

Proceedings Article

Robust Recognition of Degraded Documents Using Character N-Grams

Shrey Dutta¹, Naveen Sankaran¹, K. Pramod Sankar², C. V. Jawahar¹•Institutions (2)

International Institute of Information Technology, Hyderabad¹, Xerox²

27 Mar 2012

TL;DR: A novel recognition approach that results in a 15% decrease in word error rate on heavily degraded Indian language document images by exploiting the additional context present in the character n-gram images, which enables better disambiguation between confusing characters in the recognition phase.

Abstract: In this paper we present a novel recognition approach that results in a 15% decrease in word error rate on heavily degraded Indian language document images. OCRs perform well on good-quality documents, but fail easily in the presence of degradations. Also, classical OCR approaches perform poorly on complex scripts such as those for Indian languages. We address these issues by proposing to recognize character n-gram images, which are groupings of consecutive character/component segments. Our approach is unique, since we use the character n-grams as a primitive for recognition rather than for post-processing. By exploiting the additional context present in the character n-gram images, we enable better disambiguation between confusing characters in the recognition phase. The labels obtained from recognizing the constituent n-grams are then fused to obtain a label for the word that emitted them. Our method is inherently robust to degradations such as cuts and merges, which are common in digital libraries of scanned documents. We also present a reliable and scalable scheme for recognizing character n-gram images. Tests on English and Malayalam document images show considerable improvement in recognition in the case of heavily degraded documents.

40 citations

Proceedings Article

Weave&Rec: A Word Embedding based 3-D Convolutional Network for News Recommendation

Dhruv Khattar¹, Vaibhav Kumar¹, Vasudeva Varma¹, Manish Gupta¹•Institutions (1)

International Institute of Information Technology, Hyderabad¹

17 Oct 2018

TL;DR: A novel deep learning model for news recommendation which utilizes the content of the news articles as well as the sequence in which the articles were read by the user as its input.

Abstract: An effective news recommendation system should harness the historical information of the user based on her interactions as well as the content of the articles. In this paper we propose a novel deep learning model for news recommendation which utilizes the content of the news articles as well as the sequence in which the articles were read by the user. To model both kinds of information, which are essentially of different types, we propose a simple yet effective architecture built on a 3-dimensional Convolutional Neural Network that takes the word embeddings of the articles present in the user history as its input. Such a method endows the model with the capability to automatically learn spatial features (features of a particular article) as well as temporal features (features across articles read by a user), which signify the interest of the user. At test time, we use this in combination with a 2-dimensional Convolutional Neural Network for recommending articles to users. On a real-world dataset our method outperformed strong baselines which also model the news recommendation problem using neural networks.

40 citations

Proceedings Article

ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard

Rui Zhang, Mingkun Yang¹, Xiang Bai¹, Baoguang Shi², Dimosthenis Karatzas, Shijian Lu³, C. V. Jawahar⁴, Yongsheng Zhou, Qianyi Jiang, Qi Song, Nan Li, Kai Zhou, Lei Wang, Dong Wang, Minghui Liao¹•Institutions (4)

Huazhong University of Science and Technology¹, Microsoft², Nanyang Technological University³, International Institute of Information Technology, Hyderabad⁴

01 Sep 2019

TL;DR: This report describes the ICDAR2019-ReCTS competition, which mainly focuses on reading Chinese text on signboards, and presents the final results of the competition.

Abstract: Chinese scene text reading is one of the most challenging problems in computer vision and has attracted great interest. Unlike English text, Chinese has more than 6000 commonly used characters, and Chinese characters can be arranged in various layouts with numerous fonts. The Chinese signboards in street view are a good choice for Chinese scene text images since they have different backgrounds, fonts and layouts. We organized a competition called ICDAR2019-ReCTS, which mainly focuses on reading Chinese text on signboards. This report presents the final results of the competition. A large-scale dataset of 25,000 annotated signboard images, in which all the text lines and characters are annotated with locations and transcriptions, was released. Four tasks, namely character recognition, text line recognition, text line detection and end-to-end recognition, were set up. In addition, considering the Chinese text ambiguity issue, we proposed a multi ground truth (multi-GT) evaluation method to make evaluation fairer. The competition started on March 1, 2019 and ended on April 30, 2019. 262 submissions from 46 teams were received. Most of the participants came from universities, research institutes, and tech companies in China. There were also some participants from the United States, Australia, Singapore, and Korea. 21 teams submitted results for Task 1, 23 teams for Task 2, 24 teams for Task 3, and 13 teams for Task 4. The official website for the competition is http://rrc.cvc.uab.es/?ch=12.
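
The abstract names a multi ground truth (multi-GT) evaluation method but does not say how it is computed. The sketch below illustrates one plausible reading, under stated assumptions: each text line may carry several accepted transcriptions, and a prediction is scored against whichever one it matches best. The function names `edit_distance` and `multi_gt_score`, and the use of normalized edit distance as the similarity measure, are assumptions made for illustration and are not taken from the competition report.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def multi_gt_score(prediction: str, ground_truths: list[str]) -> float:
    """Best normalized similarity against any accepted transcription.

    Assumption: the prediction is credited with its best match, so an
    ambiguous line is not penalized for matching a different valid reading.
    """
    if not ground_truths:
        raise ValueError("at least one ground truth is required")
    best = 0.0
    for gt in ground_truths:
        longest = max(len(prediction), len(gt))
        if longest == 0:
            similarity = 1.0
        else:
            similarity = 1.0 - edit_distance(prediction, gt) / longest
        best = max(best, similarity)
    return best
```

With a single ground truth this reduces to ordinary normalized edit-distance scoring. Adding further accepted transcriptions can only keep the score the same or raise it.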

40 citations

Proceedings Article

Statistical Transliteration for Cross Language Information Retrieval using HMM alignment and CRF

Surya Ganesh¹, Sree Harsha, Prasad Pingali, Vasudeva Varma¹•Institutions (1)

International Institute of Information Technology, Hyderabad¹

01 Jan 2008

TL;DR: The results show that the technique performs better than the existing transliteration system which uses HMM alignment and conditional probabilities derived from counting the alignments.

Abstract: In this paper we present a statistical transliteration technique that is language independent. This technique uses Hidden Markov Model (HMM) alignment and Conditional Random Fields (CRF), a discriminative model. HMM alignment maximizes the probability of the observed (source, target) word pairs using the expectation maximization algorithm, and then the character-level alignments (n-gram) are set to the maximum posterior predictions of the model. CRF has efficient training and decoding processes, is conditioned on both source and target languages, and produces globally optimal solutions. We apply this technique to the Hindi-English transliteration task. The results show that our technique performs better than the existing transliteration system which uses HMM alignment and conditional probabilities derived from counting the alignments.

40 citations

Authors

Showing all 2066 results

Name	H-index	Papers	Citations
Ravi Shankar	66	672	19326
Joakim Nivre	61	295	17203
Aravind K. Joshi	59	249	16417
Ashok Kumar Das	56	278	9166
Malcolm F. White	55	172	10762
B. Yegnanarayana	54	340	12861
Ram Bilas Pachori	48	182	8140
C. V. Jawahar	45	479	9582
Saurabh Garg	40	206	6738
Himanshu Thapliyal	36	201	3992
Monika Sharma	36	238	4412
Ponnurangam Kumaraguru	33	269	6849
Abhijit Mitra	33	240	7795
Ramanathan Sowdhamini	33	256	4458
Helmut Schiessel	32	117	3527
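
For readers unfamiliar with the H-index column, the sketch below shows the standard definition: the largest h such that the author has h papers with at least h citations each. The function name `h_index` and the citation counts in the usage note are hypothetical. Per-paper citation counts are not listed on this page, so the table values cannot be recomputed from it.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # counts only decrease from here, so stop
    return h
```

For example, a hypothetical author whose five papers have 10, 8, 5, 4 and 3 citations has an h-index of 4: four papers have at least 4 citations each, but there are not five papers with at least 5.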

Network Information

Related Institutions (5)

Microsoft

86.9K papers, 4.1M citations

90% related

Facebook

10.9K papers, 570.1K citations

89% related

Google

39.8K papers, 2.1M citations

38.6K papers, 1.3M citations

87% related

Carnegie Mellon University

104.3K papers, 5.9M citations

87% related

Performance

Metrics

Papers: 3,712

Citations: 63,279

No. of papers from the Institution in previous years
Year	Papers
2023	10
2022	29
2021	373
2020	440
2019	367
2018	364