
Hao Tan

Researcher at University of North Carolina at Chapel Hill

Publications -  179
Citations -  3722

Hao Tan is an academic researcher at the University of North Carolina at Chapel Hill. The author has contributed to research in the topics of Medicine & Chemistry, has an h-index of 12, and has co-authored 33 publications receiving 1676 citations.

Papers
Proceedings ArticleDOI

LXMERT: Learning Cross-Modality Encoder Representations from Transformers

TL;DR: The LXMERT (Learning Cross-Modality Encoder Representations from Transformers) framework, a large-scale Transformer model consisting of three encoders, achieves state-of-the-art results on two visual question answering datasets and demonstrates the generalizability of the pre-trained cross-modality model.
Posted Content

LXMERT: Learning Cross-Modality Encoder Representations from Transformers

TL;DR: LXMERT as mentioned in this paper proposes a large-scale Transformer model that consists of three encoders: an object relationship encoder, a language encoder and a cross-modality encoder.
Proceedings ArticleDOI

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout

TL;DR: This paper presents a generalizable navigational agent, trained in two stages via mixed imitation and reinforcement learning, that outperforms the state-of-the-art approaches by a large margin on the private unseen test set of the Room-to-Room task and achieves the top rank on the leaderboard.
Proceedings ArticleDOI

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

TL;DR: This paper proposes a unified framework for the tasks of referring expression comprehension and generation, composed of three modules: speaker, listener, and reinforcer. It achieves state-of-the-art results on three referring expression datasets.
Posted Content

A Joint Speaker-Listener-Reinforcer Model for Referring Expressions

TL;DR: A unified framework for the tasks of referring expression comprehension and generation is proposed, composed of three modules: speaker, listener, and reinforcer, which achieves state-of-the-art results for both comprehension and generation on three referring expression datasets.