Satwik Kottur

Researcher at Facebook

Publications -  36
Citations -  4269

Satwik Kottur is an academic researcher at Facebook. The author has contributed to research on the topics Dialog box and Computer science. The author has an h-index of 16 and has co-authored 30 publications receiving 3351 citations. Previous affiliations of Satwik Kottur include Carnegie Mellon University.

Papers
Posted Content

Deep Sets

TL;DR: The main theorem characterizes permutation invariant objective functions and provides a family of functions to which any permutation invariant objective function must belong. This enables the design of a deep network architecture that operates on sets and can be deployed in a variety of scenarios, including both unsupervised and supervised learning tasks.
Proceedings Article

Visual Dialog

TL;DR: In this article, the authors introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately.
Journal Article

Visual Dialog

TL;DR: The authors introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately.
Proceedings Article

Deep Sets

TL;DR: In this paper, the authors study the problem of designing models for machine learning tasks defined on sets and provide a family of functions to which any permutation invariant objective function must belong.
Proceedings Article

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

TL;DR: This work poses a cooperative ‘image guessing’ game between two agents who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images and shows the emergence of grounded language and communication among ‘visual’ dialog agents with no human supervision.