Large-Scale Long-Tailed Recognition in an Open World
Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
pp. 2537–2546
TLDR
An integrated OLTR algorithm is developed that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world.
Abstract
Real-world data often have a long-tailed and open-ended distribution. A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty upon a never-seen instance. We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set which includes head, tail, and open classes. OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus only on one aspect and deliver poorly over the entire class spectrum. The key challenges are how to share visual knowledge between head and tail classes and how to reduce confusion between tail and open classes. We develop an integrated OLTR algorithm that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world. Our so-called dynamic meta-embedding combines a direct image feature and an associated memory feature, with the feature norm indicating the familiarity to known classes. On three large-scale OLTR datasets we curate from object-centric ImageNet, scene-centric Places, and face-centric MS1M data, our method consistently outperforms the state-of-the-art. Our code, datasets, and models enable future OLTR research and are publicly available at \url{https://liuziwei7.github.io/projects/LongTail.html}.
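The abstract's key idea, dynamic meta-embedding, can be illustrated with a minimal numpy sketch: a direct feature is combined with a memory feature read from per-class centroids, and the result is scaled by a reachability term so that the feature norm encodes familiarity with the known classes. Function names, the softmax-attention read-out, and the inverse-distance reachability are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def dynamic_meta_embedding(v_direct, centroids):
    """Sketch of dynamic meta-embedding (illustrative, not the paper's code).

    v_direct:  (d,) direct feature from the backbone
    centroids: (k, d) one memory centroid per known class
    """
    # Memory feature: attention-weighted combination of class centroids
    # (assumed softmax attention over centroid similarities).
    attn = np.exp(centroids @ v_direct)
    attn /= attn.sum()
    v_memory = attn @ centroids                      # (d,)

    # Reachability: inverse distance to the nearest centroid. It is small
    # for open-set inputs far from all known classes, large for familiar ones.
    dists = np.linalg.norm(centroids - v_direct, axis=1)
    reachability = 1.0 / (dists.min() + 1e-8)

    # Combine direct and memory features, scaled so the feature norm
    # reflects familiarity with the known classes.
    v_meta = reachability * (v_direct + v_memory)
    return v_meta, reachability
```

A feature near a known-class centroid comes out with a much larger reachability (and hence norm) than one far from every centroid, which is how the embedding "acknowledges the novelty of the open world."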
Citations
Proceedings Article
Decoupling Representation and Classifier for Long-Tailed Recognition
Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis
TL;DR: Shows that a straightforward approach that decouples representation learning from classifier learning can outperform carefully designed losses, sampling strategies, and even complex modules with memory.
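The decoupling recipe this TL;DR describes can be sketched in a few lines: keep the learned representation fixed and re-train only the classifier on class-balanced data. The helper below shows one common way to balance the data, upsampling every class to the head-class size; the function name and this particular resampling scheme are illustrative assumptions, not the authors' code.

```python
import numpy as np

def class_balanced_resample(features, labels, rng=None):
    """Resample pre-extracted features so every class appears equally often.

    With the backbone frozen, a classifier re-trained on this balanced
    sample is the second stage of the decoupled recipe (a sketch only).
    """
    rng = np.random.default_rng(rng)
    classes, counts = np.unique(labels, return_counts=True)
    n_per_class = counts.max()  # upsample each class to the largest class size
    idx = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n_per_class, replace=True)
        for c in classes
    ])
    return features[idx], labels[idx]
```

The design point is that imbalance is handled entirely in the second (classifier) stage, leaving the representation trained on the natural long-tailed distribution untouched.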
Proceedings ArticleDOI
BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition
TL;DR: Proposes a unified Bilateral-Branch Network (BBN) that takes care of both representation learning and classifier learning simultaneously, with each branch performing its own duty separately.
Proceedings ArticleDOI
Towards Open World Object Detection
TL;DR: Proposes a novel computer vision problem, "Open World Object Detection", in which a model must identify objects that have not been introduced to it as "unknown" and incrementally learn these identified unknown categories, without forgetting previously learned classes, as the corresponding labels are progressively received.
Posted Content
Long-tail learning via logit adjustment
Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, Sanjiv Kumar
TL;DR: Revisits the classic idea of logit adjustment based on label frequencies, either applied post hoc to a trained model or enforced in the loss during training, to encourage a large relative margin between the logits of rare versus dominant labels.
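The post-hoc variant described in this TL;DR is simple enough to sketch directly: subtract a scaled log of each class's training frequency from its logit, which raises the effective score of rare classes relative to dominant ones at prediction time. The function name is illustrative; the adjustment itself follows the standard logit-adjustment formula.

```python
import numpy as np

def adjust_logits(logits, class_priors, tau=1.0):
    """Post-hoc logit adjustment (sketch).

    logits:       (n, k) raw scores from an already-trained model
    class_priors: (k,) empirical label frequencies from the training set
    tau:          scaling temperature; tau=1 is the standard adjustment
    """
    # Rare classes have small priors, so -tau*log(prior) adds a large
    # positive offset to their logits, enlarging their relative margin.
    return logits - tau * np.log(class_priors)
```

For example, a tie between a head class (prior 0.9) and a tail class (prior 0.1) breaks toward the tail class after adjustment, since the tail class receives the larger offset.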
Proceedings ArticleDOI
Meta-Learning to Detect Rare Objects
TL;DR: A conceptually simple but powerful meta-learning-based framework that simultaneously tackles few-shot classification and few-shot localization in a unified, coherent way, introducing a weight-prediction meta-model that predicts the parameters of category-specific components from few examples.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art classification performance.
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: Proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, which achieves state-of-the-art performance on English-to-French translation.
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Book ChapterDOI
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C. Lawrence Zitnick
TL;DR: A new dataset that aims to advance the state-of-the-art in object recognition by placing it in the context of the broader question of scene understanding, gathering images of complex everyday scenes containing common objects in their natural context.