Open Access Proceedings Article (DOI)

RED: Reinforced Encoder-Decoder Networks for Action Anticipation

TLDR
Wang et al. propose a Reinforced Encoder-Decoder (RED) network for action anticipation; it takes multiple history representations as input and learns to anticipate a sequence of future representations.
Abstract
Action anticipation aims to detect an action before it happens. Many real-world applications in robotics and surveillance depend on this predictive capability. Current methods address this problem by first anticipating visual representations of future frames and then categorizing the anticipated representations into actions. However, anticipation in these methods is based on a single past frame's representation, which ignores the historical trend, and they can anticipate only one fixed future time. We propose a Reinforced Encoder-Decoder (RED) network for action anticipation. RED takes multiple history representations as input and learns to anticipate a sequence of future representations. One salient aspect of RED is that a reinforcement module provides sequence-level supervision; the reward function is designed to encourage the system to make correct predictions as early as possible. We test RED on the TVSeries, THUMOS-14, and TV-Human-Interaction datasets for action anticipation and achieve state-of-the-art performance on all of them.
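The abstract's key idea is a sequence-level reward that pays more for correct predictions made earlier in the anticipated sequence. The sketch below illustrates that idea in minimal form; the `1 / (t + 1)` decay and the function name `sequence_reward` are illustrative assumptions, not the paper's exact reward formula.

```python
from typing import List

def sequence_reward(predicted: List[int], target: int) -> float:
    """Sum per-step rewards over an anticipated label sequence.

    A step earns reward only if its predicted class matches the
    ground-truth class, and earlier correct steps earn more via a
    hypothetical 1 / (t + 1) decay (an assumption for illustration).
    """
    return sum(1.0 / (t + 1) for t, p in enumerate(predicted) if p == target)

# A sequence that is correct from the first step earns more than one
# that only becomes correct at the last step.
early = sequence_reward([2, 2, 2], target=2)  # 1 + 1/2 + 1/3
late = sequence_reward([0, 0, 2], target=2)   # 1/3 only
print(early > late)  # True
```

Under this shaping, a policy-gradient update (the "reinforcement module" in the abstract) would push the decoder toward representation sequences whose classifications become correct as early as possible.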


Citations
Book Chapter (DOI)

CTAP: Complementary Temporal Action Proposal Generation

TL;DR: This work applies a Proposal-level Actionness Trustworthiness Estimator (PATE) to sliding-window proposals to estimate the probability that each action can be correctly detected from actionness scores. Used as the proposal generation method in an existing action detector, CTAP yields consistent, significant improvements.
Proceedings Article (DOI)

Temporal Recurrent Networks for Online Action Detection

TL;DR: A novel framework, the Temporal Recurrent Network (TRN), models a greater temporal context for each frame by simultaneously performing online action detection and anticipation of the immediate future, integrating both tasks into a unified end-to-end architecture.
Posted Content

CTAP: Complementary Temporal Action Proposal Generation

TL;DR: Wang et al. propose the Complementary Temporal Action Proposal (CTAP) generator, which applies a Proposal-level Actionness Trustworthiness Estimator (PATE) to sliding-window proposals to estimate the probability that each action can be correctly detected from actionness scores; windows with high scores are then collected.
Book Chapter (DOI)

Action Anticipation with RBF Kernelized Feature Mapping RNN

TL;DR: A novel recurrent-neural-network-based algorithm for future video feature generation and action anticipation, called Feature Mapping RNN, uses only some of the earliest frames of a video to generate future features with a fraction of the parameters needed by a traditional RNN.
Proceedings Article (DOI)

Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition

TL;DR: This paper proposes a multi-agent reinforcement learning (MARL) framework for untrimmed video recognition, consisting of three parts: a novel RNN-based context-aware observation network that jointly models context information among nearby agents and the historical states of each agent; a policy network that generates a probability distribution over a predefined action space at each step; and a classification network used both for reward calculation and for final recognition.