Gaurav Harit

Researcher at Indian Institute of Technology, Jodhpur

Publications -  74
Citations -  630

Gaurav Harit is an academic researcher from Indian Institute of Technology, Jodhpur. The author has contributed to research in topics: Image segmentation & Character (mathematics). The author has an h-index of 13, co-authored 73 publications receiving 523 citations. Previous affiliations of Gaurav Harit include Indian Institutes of Technology & Indian Institute of Technology Delhi.

Papers
Posted Content

External Knowledge Augmented Text Visual Question Answering

TL;DR: This article proposed a framework to extract, filter, and encode knowledge atop a standard multimodal transformer for text-VQA, which can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness and detect multiword named entities.
Posted Content

Extraction of Layout Entities and Sub-layout Query-based Retrieval of Document Images.

TL;DR: The paper presents an efficient graph-based matching algorithm, integrated with hash-based indexing to prune a possibly large search space, which handles segmentation pre-processing errors through a symmetry maximization-based strategy and by accounting for multiple domain-specific plausible segmentation hypotheses.
Journal Article

TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain

TL;DR: In this article, the authors proposed an end-to-end framework for offline processing of handwritten semi-structured documents and benchmarked it on the FIR dataset, which is more challenging than most existing document analysis datasets because it combines a wide variety of handwritten text with printed text.
Proceedings Article

Core Region Detection for Off-Line Unconstrained Handwritten Latin Words Using Word Envelops

TL;DR: A new method for separating ascenders and descenders from an unconstrained handwritten word and identifying its core region; the core-region detection method obtains promising results when compared with current state-of-the-art methods.
Posted Content

External Knowledge enabled Text Visual Question Answering

TL;DR: This paper proposed a framework to extract, validate, and reason with knowledge using a standard multimodal transformer for vision language understanding tasks, which can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness and detect multiword named entities.