Open Access · Posted Content

Zero-Shot Learning by Convex Combination of Semantic Embeddings

TLDR
In this paper, images are mapped into the semantic embedding space via a convex combination of the class label embedding vectors, which requires no additional training.
Abstract
Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image transformation. In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage. Proponents of these image embedding systems have stressed their advantages over the traditional $n$-way classification framing of image understanding, particularly in terms of the promise for zero-shot learning -- the ability to correctly annotate images of previously unseen object categories. In this paper, we propose a simple method for constructing an image embedding system from any existing $n$-way image classifier and a semantic word embedding model, which contains the $n$ class labels in its vocabulary. Our method maps images into the semantic embedding space via convex combination of the class label embedding vectors, and requires no additional training. We show that this simple and direct method confers many of the advantages associated with more complex image embedding schemes, and indeed outperforms state of the art methods on the ImageNet zero-shot learning task.
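The method described in the abstract can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration (function names, the top-k truncation parameter, and the cosine-similarity prediction rule are assumptions for the sketch, not code from the paper): the classifier's top predicted probabilities are renormalized and used as convex weights over the corresponding label embeddings, and a zero-shot label is then chosen as the nearest candidate embedding.

```python
import numpy as np

def conse_embed(probs, label_embeddings, top_k=10):
    """Map an image into the semantic space as a convex combination of
    the embeddings of its top-k predicted training labels.

    probs            : (n,) classifier probabilities over the n seen classes
    label_embeddings : (n, d) word embeddings of the n class labels
    """
    top = np.argsort(probs)[::-1][:top_k]      # indices of top-k classes
    weights = probs[top]
    weights = weights / weights.sum()          # renormalize -> convex weights
    return weights @ label_embeddings[top]     # (d,) semantic vector

def zero_shot_predict(f, candidate_embeddings):
    """Predict the unseen label whose embedding is most cosine-similar to f."""
    sims = candidate_embeddings @ f / (
        np.linalg.norm(candidate_embeddings, axis=1) * np.linalg.norm(f) + 1e-12)
    return int(np.argmax(sims))
```

Because the embedding step only reweights existing vectors, no parameters are learned: any trained $n$-way classifier plus any word embedding model covering the class labels suffices.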


Citations
Posted Content

Learning to Compare: Relation Network for Few-Shot Learning

TL;DR: The Relation Network (RN) learns a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting.
Proceedings ArticleDOI

Evaluation of Output Embeddings for Fine-Grained Image Classification

TL;DR: Given image and class embeddings, this work learns a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score.
Posted Content

Semantic Autoencoder for Zero-Shot Learning

TL;DR: This work presents a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE), which significantly outperforms existing ZSL models with the additional benefit of lower computational cost, and beats the state of the art when the SAE is applied to the supervised clustering problem.
Posted Content

Synthesized Classifiers for Zero-Shot Learning

TL;DR: In this article, a set of "phantom" object classes whose coordinates live in both the semantic space and the model space is introduced; these coordinates can be optimized from labeled data such that the synthesized real object classifiers achieve optimal discriminative performance.
Posted Content

Zero-Shot Learning via Semantic Similarity Embedding

TL;DR: In this paper, a max-margin framework is developed to learn source/target embedding functions that map an arbitrary source and target domain data into a same semantic space where similarity can be readily measured.
References
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured on a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Journal ArticleDOI

One-shot learning of object categories

TL;DR: It is found that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
Proceedings Article

DeViSE: A Deep Visual-Semantic Embedding Model

TL;DR: This paper presents a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text and shows that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training.
Proceedings ArticleDOI

Learning to detect unseen object classes by between-class attribute transfer

TL;DR: The experiments show that by using an attribute layer it is indeed possible to build an object detection system that requires no training images of the target classes. The authors also assembled a new large-scale dataset, "Animals with Attributes", of over 30,000 animal images matching the 50 classes in Osherson's classic table of how strongly humans associate 85 semantic attributes with animal classes.