Open Access · Proceedings ArticleDOI

Deep Self-Taught Learning for Weakly Supervised Object Localization

TL;DR: A deep self-taught learning approach that makes the detector learn object-level features reliable for acquiring tight positive samples and then re-train itself on them.
Abstract
Most existing weakly supervised localization (WSL) approaches learn detectors by finding positive bounding boxes based on features learned with image-level supervision. However, those features contain little spatial-location information and usually provide poor-quality positive samples for training a detector. To overcome this issue, we propose a deep self-taught learning approach, which makes the detector learn object-level features that are reliable for acquiring tight positive samples and then re-train itself on them. Consequently, the detector progressively improves its detection ability and localizes more informative positive samples. To implement such self-taught learning, we propose a seed-sample acquisition method via image-to-object transferring and dense subgraph discovery to find reliable positive samples for initializing the detector. An online supportive sample harvesting scheme is further proposed to dynamically select the most confident tight positive samples and train the detector in a mutually boosting way. To prevent the detector from being trapped in poor optima due to overfitting, we propose a new relative improvement of predicted CNN scores to guide the self-taught learning process. Extensive experiments on PASCAL VOC 2007 and 2012 show that our approach outperforms the state of the art, strongly validating its effectiveness.
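The supportive-sample harvesting step driven by relative improvement of predicted CNN scores can be sketched as follows. This is a hedged illustration of the selection criterion only, not the authors' code: the threshold `tau`, the function names, and the toy scores are all hypothetical stand-ins.

```python
def relative_improvement(prev_scores, curr_scores):
    """Relative improvement of predicted CNN scores between two
    self-taught rounds: (s_t - s_{t-1}) / s_{t-1}, per candidate box."""
    eps = 1e-8  # guard against division by zero
    return [(c - p) / max(p, eps) for p, c in zip(prev_scores, curr_scores)]

def harvest_supportive_samples(prev_scores, curr_scores, tau=0.1):
    """Keep the indices of candidate boxes whose score improved by more
    than tau relative to the previous round (tau is a hypothetical
    threshold, not a value from the paper)."""
    ri = relative_improvement(prev_scores, curr_scores)
    return [i for i, r in enumerate(ri) if r > tau]

# Toy run: detector scores for four candidate boxes over two rounds.
prev = [0.50, 0.40, 0.90, 0.20]
curr = [0.60, 0.38, 0.91, 0.35]
print(harvest_supportive_samples(prev, curr))  # [0, 3]
```

Selecting by relative rather than absolute improvement favors samples the detector is genuinely getting better at, which is what guards the loop against overfitting to already-high-scoring boxes.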

Citations
Proceedings ArticleDOI

Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation

TL;DR: A generic classification network equipped with convolutional blocks of different dilation rates, designed to produce dense, reliable object localization maps that benefit both weakly- and semi-supervised semantic segmentation.
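A rough intuition for why varying dilation rates helps: the effective kernel size of a dilated convolution grows linearly with the dilation rate, so parallel blocks with different rates see object evidence at different spatial extents. The formula below is the standard one; the example rates are illustrative, not values from the paper.

```python
def effective_kernel_size(k, d):
    """Effective receptive field of a single k x k convolution
    with dilation rate d: k + (k - 1) * (d - 1)."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel at increasing dilation rates covers an ever-larger
# region, letting one classification head see more of the object.
for d in (1, 3, 6, 9):
    print(d, effective_kernel_size(3, d))
```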
Proceedings ArticleDOI

Adversarial Complementary Learning for Weakly Supervised Object Localization

TL;DR: Adversarial complementary learning (ACoL) uses one classification branch to dynamically localize discriminative object regions during the forward pass and erases those regions from the feature maps, forcing a counterpart classifier to discover new, complementary object regions.
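The erasing step at the heart of this idea can be sketched in a few lines. Thresholding raw activations and the threshold value itself are simplifications of ACoL's actual mechanism, used here only to show the shape of the operation.

```python
def erase_discriminative(activation_map, threshold):
    """Zero out the most discriminative cells (activation above
    threshold) so a second classifier must rely on the remaining,
    complementary regions."""
    return [[0.0 if v > threshold else v for v in row]
            for row in activation_map]

# Toy 2x3 activation map from the first branch.
amap = [[0.9, 0.2, 0.8],
        [0.1, 0.7, 0.3]]
print(erase_discriminative(amap, 0.6))
# [[0.0, 0.2, 0.0], [0.1, 0.0, 0.3]]
```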
Proceedings ArticleDOI

Attention-Based Dropout Layer for Weakly Supervised Object Localization

TL;DR: An Attention-based Dropout Layer (ADL) that uses the self-attention mechanism to process the model's feature maps, built from two key components: hiding the most discriminative part from the model to capture the integral extent of the object, and highlighting the informative region to improve recognition power.
Book ChapterDOI

AutoLoc: Weakly-Supervised Temporal Action Localization in Untrimmed Videos

TL;DR: AutoLoc, a weakly-supervised temporal action localization (TAL) framework that directly predicts the temporal boundary of each action instance, trained with a novel Outer-Inner-Contrastive (OIC) loss that automatically discovers the segment-level supervision needed to train such a boundary predictor.
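The outer-inner contrast idea can be sketched as a score over per-snippet class activations: a good boundary encloses high activations while its surrounding context stays low. This is an illustrative simplification of the OIC loss, and the `inflate` context width is a hypothetical parameter.

```python
def oic_score(activations, start, end, inflate=2):
    """Outer-Inner-Contrastive score for a predicted segment
    [start, end): mean activation inside the segment minus mean
    activation in the surrounding outer context. Higher suggests
    a tighter, more confident boundary."""
    inner = activations[start:end]
    lo = max(0, start - inflate)
    hi = min(len(activations), end + inflate)
    outer = activations[lo:start] + activations[end:hi]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(inner) - mean(outer)

# Class activations over 8 video snippets; the action spans 3..6.
acts = [0.1, 0.1, 0.2, 0.9, 0.8, 0.9, 0.2, 0.1]
print(round(oic_score(acts, 3, 6), 3))  # 0.717
```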
Book ChapterDOI

W-TALC: Weakly-Supervised Temporal Activity Localization and Classification

TL;DR: W-TALC is presented, a Weakly-supervised Temporal Activity Localization and Classification framework using only video-level labels that is able to detect activities at a fine granularity and achieve better performance than current state-of-the-art methods.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
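The core residual idea, learning a residual F(x) and adding the identity shortcut back (y = F(x) + x), can be sketched minimally; the toy `double` function below stands in for the learned convolutional layers.

```python
def residual_block(x, f):
    """Residual connection: the block computes a residual f(x) and
    adds the identity shortcut x back, so the layers only need to
    model the difference from the input."""
    return [xi + fi for xi, fi in zip(x, f(x))]

# Toy residual function standing in for the block's conv layers.
double = lambda x: [2 * xi for xi in x]
print(residual_block([1.0, 2.0], double))  # [3.0, 6.0]
```

Because the shortcut carries the input through unchanged, very deep stacks of such blocks remain easy to optimize, which is what enabled the much deeper networks the paper reports.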
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network that achieved state-of-the-art performance on ImageNet classification, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.