Peter Anderson

Researcher at Georgia Institute of Technology

Publications: 55
Citations: 11,219

Peter Anderson is an academic researcher from the Georgia Institute of Technology. The author has contributed to research on the topics of Closed captioning and Question answering. The author has an h-index of 22 and has co-authored 55 publications receiving 7,333 citations. Previous affiliations of Peter Anderson include the University of New South Wales and the Australian National University.

Papers
Proceedings Article (DOI)

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

TL;DR: In this paper, a bottom-up and top-down attention mechanism was proposed that enables attention to be calculated at the level of objects and other salient image regions; the approach achieved state-of-the-art results on the MSCOCO test server.
Posted Content

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.

TL;DR: A combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions is proposed, demonstrating the broad applicability of this approach to VQA.
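As a rough illustration of the mechanism described in these papers, the sketch below combines pre-computed bottom-up region features with a top-down query (for example, a caption decoder state or an encoded question) to produce attention weights over objects rather than over a uniform grid of CNN cells. This is a minimal sketch under assumed names and dimensions (TopDownAttention, 2048-d regions, 512-d query), not the authors' released implementation.

```python
# Minimal sketch (illustrative, not the authors' code) of top-down
# attention over bottom-up region features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownAttention(nn.Module):
    def __init__(self, region_dim=2048, query_dim=512, hidden_dim=512):
        super().__init__()
        self.proj_region = nn.Linear(region_dim, hidden_dim)
        self.proj_query = nn.Linear(query_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, regions, query):
        # regions: (batch, num_regions, region_dim) bottom-up features,
        #          e.g. object proposals from a pre-trained detector
        # query:   (batch, query_dim) top-down signal, e.g. the caption
        #          decoder state or an encoded question
        joint = torch.tanh(self.proj_region(regions)
                           + self.proj_query(query).unsqueeze(1))
        weights = F.softmax(self.score(joint).squeeze(-1), dim=1)
        # Weighted sum of region features: attention is placed on
        # objects and salient regions rather than uniform grid cells.
        attended = (weights.unsqueeze(-1) * regions).sum(dim=1)
        return attended, weights
```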
Book Chapter (DOI)

SPICE: Semantic Propositional Image Caption Evaluation

TL;DR: This paper proposes a new automated caption evaluation metric, defined over scene graphs and coined SPICE, which captures human judgments over model-generated captions better than other automatic metrics (e.g., a system-level correlation of 0.88 with human judgments on the MS COCO dataset, versus 0.43 for CIDEr and 0.53 for METEOR).
Posted Content

SPICE: Semantic Propositional Image Caption Evaluation

TL;DR: It is hypothesized that semantic propositional content is an important component of human caption evaluation, and a new automated caption evaluation metric defined over scene graphs, coined SPICE, is proposed; it can answer questions such as "which caption generator best understands colors?"
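As a rough sketch of the SPICE idea, the snippet below scores a candidate caption by the F1 overlap between its scene-graph tuples (objects, attributes, relations) and the tuples of the reference captions. The real metric parses captions into scene graphs and matches tuples using synonym sets; the function spice_like_f1 and the pre-extracted tuples here are assumptions for illustration.

```python
# Minimal sketch of a SPICE-like score: F1 over scene-graph tuples.
def spice_like_f1(candidate_tuples, reference_tuples):
    cand, ref = set(candidate_tuples), set(reference_tuples)
    matched = len(cand & ref)
    if matched == 0:
        return 0.0
    precision = matched / len(cand)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)

# Tuples are objects, (object, attribute) pairs and
# (subject, relation, object) triples extracted from captions.
candidate = [("dog",), ("dog", "brown"), ("dog", "on", "grass")]
reference = [("dog",), ("dog", "brown"), ("dog", "chasing", "ball")]
print(spice_like_f1(candidate, reference))  # ~0.67
```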
Proceedings Article (DOI)

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

TL;DR: The Room-to-Room (R2R) dataset is introduced, providing a large-scale reinforcement learning environment based on real imagery for visually-grounded natural language navigation in real buildings.
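Since R2R frames instruction following as an agent acting in an environment, the episode loop it implies can be sketched as below. The Env and Agent interfaces are assumptions for illustration only, not the actual Matterport3D Simulator API.

```python
# Minimal sketch of an R2R-style episode: the agent is given a
# natural-language instruction and panoramic observations, and emits
# navigation actions until it decides to stop.
def run_episode(env, agent, max_steps=30):
    instruction, observation = env.reset()   # instruction text + first view
    agent.reset(instruction)
    for _ in range(max_steps):
        action = agent.act(observation)      # e.g. turn, move forward, "stop"
        if action == "stop":
            break
        observation = env.step(action)       # next panoramic observation
    # Success is typically judged by how close the agent stops to the goal.
    return env.distance_to_goal()
```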