Dhruv Batra

Researcher at Georgia Institute of Technology

Publications - 272
Citations - 43803

Dhruv Batra is an academic researcher from Georgia Institute of Technology. The author has contributed to research in topics including question answering and dialog. The author has an h-index of 69 and has co-authored 272 publications receiving 29938 citations. Previous affiliations of Dhruv Batra include Facebook and Toyota Technological Institute at Chicago.

Papers
Proceedings Article

Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning

TL;DR: This paper proposes Seq-CVAE, which learns a latent space for every word to capture the "intention" about how to complete the sentence by mimicking a representation that summarizes the future.
Proceedings Article

Inference for order reduction in Markov random fields

TL;DR: A new algorithm called Order Reduction Inference (ORI) is introduced that searches over a space of order reduction methods to minimize the difficulty of the resultant pairwise inference problem.
Book Chapter

Spatially Aware Multimodal Transformers for TextVQA

TL;DR: The authors propose a spatially aware self-attention layer in which each visual entity attends only to neighboring entities defined by a spatial graph, and each head in the multi-head self-attention layer focuses on a different subset of relations.
Posted Content

Sim-to-Real Transfer for Vision-and-Language Navigation

TL;DR: To bridge the gap between the high-level discrete action space learned by the VLN agent and the robot's low-level continuous action space, a subgoal model is proposed to identify nearby waypoints, and domain randomization is used to mitigate visual domain differences.
Proceedings Article

Embodied Amodal Recognition: Learning to Move to Perceive Objects

TL;DR: Experimental results show that agents with embodiment (movement) achieve better visual recognition performance than passive ones, and that, to improve their visual recognition abilities, agents can learn strategic paths that differ from shortest paths.