Institution

International Institute of Information Technology, Hyderabad

Education · Hyderabad, India
About: International Institute of Information Technology, Hyderabad is an education organization based in Hyderabad, India. It is known for research contributions in the topics Computer science and Authentication. The organization has 2048 authors who have published 3677 publications receiving 45319 citations. The organization is also known as IIIT Hyderabad and International Institute of Information Technology (IIIT).


Papers
Proceedings Article
25 Jul 2020
TL;DR: This work investigates a growing body of work that seeks to improve recommender systems through the use of review text and observes several cases where state-of-the-art methods fail to outperform existing baselines, especially as they deviate from a few narrowly-defined settings where reviews are useful.
Abstract: We investigate a growing body of work that seeks to improve recommender systems through the use of review text. Generally, these papers argue that since reviews 'explain' users' opinions, they ought to be useful to infer the underlying dimensions that predict ratings or purchases. Schemes to incorporate reviews range from simple regularizers to neural network approaches. Our initial findings reveal several discrepancies in reported results, partly due to (e.g.) copying results across papers despite changes in experimental settings or data pre-processing. First, we attempt a comprehensive analysis to resolve these ambiguities. Further investigation calls for discussion on a much larger problem about the "importance" of user reviews for recommendation. Through a wide range of experiments, we observe several cases where state-of-the-art methods fail to outperform existing baselines, especially as we deviate from a few narrowly-defined settings where reviews are useful. We conclude by providing hypotheses for our observations, that seek to characterize under what conditions reviews are likely to be helpful. Through this work, we aim to evaluate the direction in which the field is progressing and encourage robust empirical evaluation.

38 citations

Proceedings Article
03 Dec 2018
TL;DR: This work presents a new mutation strategy that maximizes the likelihood of triggering memory-corruption bugs by generating fewer, but better inputs, and implements this strategy in a fully functional fuzzer called TIFF (Type Inference-based Fuzzing Framework).
Abstract: Developers commonly use fuzzing techniques to hunt down all manner of memory corruption vulnerabilities during the testing phase. Irrespective of the fuzzer, input mutation plays a central role in providing adequate code coverage, as well as in triggering bugs. However, each class of memory corruption bugs requires a different trigger condition. While the goal of a fuzzer is to find bugs, most existing fuzzers merely approximate this goal by targeting their mutation strategies toward maximizing code coverage. In this work, we present a new mutation strategy that maximizes the likelihood of triggering memory-corruption bugs by generating fewer, but better inputs. In particular, our strategy achieves bug-directed mutation by inferring the type of the input bytes. To do so, it tags each offset of the input with a basic type (e.g., 32-bit integer, string, array, etc.), while deriving mutation rules for specific classes of bugs. We infer types by means of in-memory data-structure identification and dynamic taint analysis, and implement our novel mutation strategy in a fully functional fuzzer which we call TIFF (Type Inference-based Fuzzing Framework). Our evaluation on real-world applications shows that type-based fuzzing triggers bugs much earlier than existing solutions, while maintaining high code coverage. For example, on several real-world applications and libraries (e.g., poppler, mpg123, etc.), we find real bugs (with known CVEs) in almost half the time and with up to an order of magnitude fewer inputs than state-of-the-art fuzzers.

38 citations

Journal Article
TL;DR: This work proposes a multi-dictionary SBL algorithm that can simultaneously process observations generated by different underlying dictionaries sharing the same sparsity profile, and shows how spatial aliasing can be avoided while processing multi-frequency observations using SBL.

38 citations

Proceedings Article
09 Jun 2010
TL;DR: This work describes a Collection OCR that takes advantage of the fact that multiple examples of the same word may occur in a document or collection. It makes no language-specific assumptions and should be applicable to a large number of languages.
Abstract: Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Collection OCR which takes advantage of the fact that multiple examples of the same word (often in the same font) may occur in a document or collection. The idea here is that an OCR or a reCAPTCHA-like process generates a partial set of recognized words. In the second stage, a nearest neighbor algorithm compares the remaining word-images to those already recognized and propagates labels from the nearest neighbors. It is shown that by using an approximate fast nearest neighbor algorithm based on Hierarchical K-Means (HKM), we can do this accurately and efficiently. It is also shown that profile-based features perform much better than SIFT and Pyramid Histogram of Gradient (PHOG) features. We believe that this is because profile features are more robust to word degradations (common in our documents). This approach is applied to a collection of books in Telugu, a language for which no commercial OCR exists. We show from a selection of 33 Telugu books that starting with OCR labels for only 30% of the collection we can recognize the remaining 70% of the words in the collection with 70% accuracy using this approach. Since the approach makes no language-specific assumptions, it should be applicable to a large number of languages. In particular we are interested in its applicability to Indic languages and scripts.
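
The second stage described in this abstract, propagating labels from already-recognized word images to their nearest unrecognized neighbors, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each word image has already been reduced to a fixed-length feature vector, it swaps the approximate HKM search for an exhaustive nearest-neighbor search, and the function name `propagate_labels` and the toy feature values are hypothetical.

```python
import math


def propagate_labels(labelled, unlabelled):
    """Give each unlabelled feature vector the label of its nearest labelled neighbor.

    labelled:   list of (feature_vector, label) pairs from the first-stage recognizer
    unlabelled: list of feature vectors for word images still to be recognized
    Returns a list of (label, distance) pairs, one per unlabelled vector.
    """
    results = []
    for query in unlabelled:
        # Exhaustive search over all labelled vectors (assumption: small data).
        # The abstract instead uses an approximate index built with HKM.
        nearest_vec, nearest_label = min(
            labelled, key=lambda pair: math.dist(pair[0], query)
        )
        results.append((nearest_label, math.dist(nearest_vec, query)))
    return results


# Made-up 3-dimensional features standing in for profile features.
recognized = [
    ((0.0, 0.0, 1.0), "word_a"),
    ((5.0, 5.0, 5.0), "word_b"),
    ((9.0, 0.0, 2.0), "word_c"),
]
remaining = [(0.2, 0.1, 0.9), (8.5, 0.5, 2.0)]

predictions = propagate_labels(recognized, remaining)
labels = [label for label, _ in predictions]
```

The returned distance could be compared against a threshold to reject weak matches. That threshold is a design choice of this sketch and is not something the abstract specifies.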

38 citations

Journal Article
TL;DR: In this paper, the production characteristics of laughter are analysed at call and bout levels using EGG and speech signals. Parameters representing the degree of change and the temporal changes in the production features are derived to study the characteristics that discriminate laughter from normal speech.

38 citations


Authors

Showing all 2066 results

Name                     H-index  Papers  Citations
Ravi Shankar             66       672     19326
Joakim Nivre             61       295     17203
Aravind K. Joshi         59       249     16417
Ashok Kumar Das          56       278     9166
Malcolm F. White         55       172     10762
B. Yegnanarayana         54       340     12861
Ram Bilas Pachori        48       182     8140
C. V. Jawahar            45       479     9582
Saurabh Garg             40       206     6738
Himanshu Thapliyal       36       201     3992
Monika Sharma            36       238     4412
Ponnurangam Kumaraguru   33       269     6849
Abhijit Mitra            33       240     7795
Ramanathan Sowdhamini    33       256     4458
Helmut Schiessel         32       117     3527
Network Information
Related Institutions (5)
Microsoft: 86.9K papers, 4.1M citations, 90% related
Facebook: 10.9K papers, 570.1K citations, 89% related
Google: 39.8K papers, 2.1M citations, 89% related
Carnegie Mellon University: 104.3K papers, 5.9M citations, 87% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  10
2022  29
2021  373
2020  440
2019  367
2018  364