Proceedings ArticleDOI

Context Aware Active Learning of Activity Recognition Models

TLDR
This work proposes a novel active learning technique that exploits not only the informativeness of individual activity instances but also their contextual information during the query selection process, which leads to a significant reduction in expensive manual annotation effort.
Abstract
Activity recognition in video has recently benefited from the use of context, e.g., inter-relationships among activities and objects. However, these approaches require data to be labeled and entirely available at the outset. In contrast, we formulate a continuous learning framework for context-aware activity recognition from unlabeled video data, which has two distinct advantages over most existing methods. First, we propose a novel active learning technique that exploits not only the informativeness of individual activity instances but also their contextual information during the query selection process, which leads to a significant reduction in expensive manual annotation effort. Second, the learned models can be adapted online as more data becomes available. We formulate a conditional random field (CRF) model that encodes the context, and devise an information-theoretic approach that utilizes the entropy and mutual information of the nodes to compute the set of most informative query instances, which need to be labeled by a human. These labels are combined with graphical inference techniques to incrementally update the model as new videos come in. Experiments on four challenging datasets demonstrate that our framework achieves superior performance with significantly less manual labeling.
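The abstract's CRF inference and mutual-information terms are not spelled out here, but the entropy-based part of such a query-selection step can be illustrated with a minimal sketch (function names and the toy marginals are hypothetical, not from the paper): given each unlabeled instance's predicted label marginals, select the k instances whose distributions have the highest Shannon entropy for human labeling.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a discrete label distribution."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def select_queries(marginals, k):
    """Return indices of the k most uncertain instances
    (highest-entropy predicted label marginals)."""
    scores = np.array([entropy(p) for p in marginals])
    return np.argsort(scores)[::-1][:k].tolist()

# Toy marginals over 3 activity labels for 4 unlabeled instances.
marginals = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
print(select_queries(marginals, 2))  # -> [1, 2]
```

In the paper's full formulation the score would also incorporate mutual information between CRF nodes, so that labeling one instance can reduce uncertainty about its context-related neighbors; the sketch above covers only the per-instance entropy term.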



Citations
Proceedings ArticleDOI

Learning Loss for Active Learning

TL;DR: In this article, the authors propose a novel active learning method that is simple but task-agnostic and works efficiently with deep networks: they attach a small parametric module, named "loss prediction module," to a target network and learn it to predict the target losses of unlabeled inputs.
Proceedings ArticleDOI

Temporal Context Network for Activity Localization in Videos

TL;DR: The Temporal Context Network (TCN) proposes a novel representation that explicitly captures the context around an activity proposal in order to rank it, and then applies non-maximum suppression to obtain final detections.
Book ChapterDOI

Temporal Model Adaptation for Person Re-identification

TL;DR: Zhang et al. propose a temporal model adaptation scheme with a human in the loop that can be trained incrementally by means of a stochastic alternating direction method of multipliers optimization procedure, and exploit a graph-based approach to present the most informative probe-gallery matches that should be used to update the model.
Proceedings ArticleDOI

Joint Prediction of Activity Labels and Starting Times in Untrimmed Videos

TL;DR: Experiments demonstrate that the framework for jointly predicting activity labels and starting times improves the performance of both tasks and outperforms the state of the art.
Posted Content

Learning Loss for Active Learning

TL;DR: A novel active learning method that is simple but task-agnostic and works efficiently with deep networks, by attaching a small parametric module, named "loss prediction module," to a target network and learning it to predict the target losses of unlabeled inputs.
References
Journal ArticleDOI

Incremental Learning for Robust Visual Tracking

TL;DR: A tracking method that incrementally learns a low-dimensional subspace representation, efficiently adapting online to changes in the appearance of the target, and includes a method for correctly updating the sample mean and a forgetting factor to ensure less modeling power is expended fitting older observations.
Proceedings ArticleDOI

Linear spatial pyramid matching using sparse coding for image classification

TL;DR: An extension of the SPM method is developed by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and a linear SPM kernel based on SIFT sparse codes is proposed, leading to state-of-the-art performance on several benchmarks using a single type of descriptor.
Journal ArticleDOI

On Space-Time Interest Points

TL;DR: This paper builds on the idea of the Harris and Förstner interest point operators and detects local structures in space-time where the image values have significant local variations in both space and time, and illustrates how a video representation in terms of local space-time features allows for detection of walking people in scenes with occlusions and dynamic cluttered backgrounds.
Journal ArticleDOI

A survey on vision-based human action recognition

TL;DR: A detailed overview of current advances in vision-based human action recognition is provided, including a discussion of the limitations of the state of the art, and promising directions of research are outlined.
Proceedings ArticleDOI

Space-time interest points

Laptev, +1 more
TL;DR: This work builds on the idea of the Harris and Förstner interest point operators and detects local structures in space-time where the image values have significant local variations in both space and time, in order to detect spatio-temporal events.