Gaurav Harit
Researcher at Indian Institute of Technology, Jodhpur
Publications - 74
Citations - 630
Gaurav Harit is an academic researcher at the Indian Institute of Technology, Jodhpur. The author has contributed to research in the topics Image segmentation and Character (mathematics). The author has an h-index of 13 and has co-authored 73 publications receiving 523 citations. Previous affiliations of Gaurav Harit include Indian Institutes of Technology and Indian Institute of Technology Delhi.
Papers
Posted Content
External Knowledge Augmented Text Visual Question Answering
TL;DR: This article proposed a framework to extract, filter, and encode knowledge atop a standard multimodal transformer for text-VQA, which can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness and detect multiword named entities.
Posted Content
Extraction of Layout Entities and Sub-layout Query-based Retrieval of Document Images
TL;DR: This paper presents an efficient graph-based matching algorithm, integrated with hash-based indexing to prune a possibly large search space. It handles segmentation pre-processing errors with a symmetry maximization-based strategy and by accounting for multiple domain-specific plausible segmentation hypotheses.
Journal Article
TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain
TL;DR: In this article, the authors proposed an end-to-end framework for offline processing of handwritten semi-structured documents and benchmarked it on the FIR dataset, which is more challenging than most existing document analysis datasets because it combines a wide variety of handwritten text with printed text.
Proceedings Article
Core Region Detection for Off-Line Unconstrained Handwritten Latin Words Using Word Envelops
Shilpa Pandey, Gaurav Harit +1 more
TL;DR: This paper presents a new method for separating ascenders and descenders from an unconstrained handwritten word and identifying its core region. The core-region detection method obtains promising results when compared with current state-of-the-art methods.
Posted Content
External Knowledge enabled Text Visual Question Answering
TL;DR: This paper proposed a framework to extract, validate, and reason with knowledge using a standard multimodal transformer for vision language understanding tasks, which can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness and detect multiword named entities.