Open Access Proceedings Article (DOI)

RED: Reinforced Encoder-Decoder Networks for Action Anticipation

TLDR
Wang et al. propose a Reinforced Encoder-Decoder (RED) network for action anticipation; it takes multiple history representations as input and learns to anticipate a sequence of future representations.
Abstract
Action anticipation aims to detect an action before it happens. Many real-world applications in robotics and surveillance depend on this predictive capability. Current methods address this problem by first anticipating visual representations of future frames and then categorizing the anticipated representations into actions. However, anticipation in these methods is based on a single past frame's representation, which ignores the historical trend, and they can anticipate only one fixed future time. We propose a Reinforced Encoder-Decoder (RED) network for action anticipation. RED takes multiple history representations as input and learns to anticipate a sequence of future representations. One salient aspect of RED is that a reinforcement module provides sequence-level supervision; the reward function is designed to encourage the system to make correct predictions as early as possible. We test RED on the TVSeries, THUMOS-14, and TV-Human-Interaction datasets for action anticipation and achieve state-of-the-art performance on all of them.
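The abstract's key idea is a sequence-level reward that pays more for correct predictions made earlier in the anticipated sequence. The sketch below illustrates that idea in minimal form; the `1 / (t + 1)` decay and the function name `sequence_reward` are illustrative assumptions, not the paper's exact reward formula.

```python
from typing import List

def sequence_reward(predicted: List[int], target: int) -> float:
    """Sum per-step rewards over an anticipated label sequence.

    A step earns reward only if its predicted class matches the
    ground-truth class, and earlier correct steps earn more via a
    hypothetical 1 / (t + 1) decay (an assumption for illustration).
    """
    return sum(1.0 / (t + 1) for t, p in enumerate(predicted) if p == target)

# A sequence that is correct from the first step earns more than one
# that only becomes correct at the last step.
early = sequence_reward([2, 2, 2], target=2)  # 1 + 1/2 + 1/3
late = sequence_reward([0, 0, 2], target=2)   # 1/3 only
print(early > late)  # True
```

Under this shaping, a policy-gradient update (the "reinforcement module" in the abstract) would push the decoder toward representation sequences whose classifications become correct as early as possible.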


Citations
Book Chapter (DOI)

CTAP: Complementary Temporal Action Proposal Generation

TL;DR: This work applies a Proposal-level Actionness Trustworthiness Estimator (PATE) to sliding-window proposals to estimate the probability that each action can be correctly detected from actionness scores. Used as the proposal generation method in an existing action detector, CTAP yields consistent, significant improvements.
Proceedings Article (DOI)

Temporal Recurrent Networks for Online Action Detection

TL;DR: A novel framework, the Temporal Recurrent Network (TRN), models a greater temporal context for each frame by simultaneously performing online action detection and anticipation of the immediate future, integrating both tasks into a unified end-to-end architecture.
Posted Content

CTAP: Complementary Temporal Action Proposal Generation

TL;DR: Wang et al. propose the Complementary Temporal Action Proposal (CTAP) generator, which applies a Proposal-level Actionness Trustworthiness Estimator (PATE) to sliding-window proposals to estimate the probability that each action can be correctly detected from actionness scores; windows with high scores are then collected.
Book Chapter (DOI)

Action Anticipation with RBF Kernelized Feature Mapping RNN

TL;DR: A novel recurrent-neural-network-based algorithm for future video feature generation and action anticipation, called Feature Mapping RNN, uses only some of the earliest frames of a video to generate future features with a fraction of the parameters needed by a traditional RNN.
Proceedings Article (DOI)

Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition

TL;DR: This paper proposes a multi-agent reinforcement learning (MARL) framework for untrimmed video recognition, consisting of three parts: a novel RNN-based context-aware observation network that jointly models context information among nearby agents and the historical states of each agent; a policy network that generates a probability distribution over a predefined action space at each step; and a classification network used both for reward calculation and for final recognition.