Open Access · Posted Content

Zero-Shot Learning by Convex Combination of Semantic Embeddings

TLDR
In this paper, images are mapped into the semantic embedding space via a convex combination of the class label embedding vectors, which requires no additional training.
Abstract
Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the embedding space is trained jointly with the image transformation. In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage. Proponents of these image embedding systems have stressed their advantages over the traditional $n$-way classification framing of image understanding, particularly in terms of the promise for zero-shot learning -- the ability to correctly annotate images of previously unseen object categories. In this paper, we propose a simple method for constructing an image embedding system from any existing $n$-way image classifier and a semantic word embedding model, which contains the $n$ class labels in its vocabulary. Our method maps images into the semantic embedding space via convex combination of the class label embedding vectors, and requires no additional training. We show that this simple and direct method confers many of the advantages associated with more complex image embedding schemes, and indeed outperforms state of the art methods on the ImageNet zero-shot learning task.
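The method described in the abstract can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration (function names, the top-k truncation parameter, and the cosine-similarity prediction rule are assumptions for the sketch, not code from the paper): the classifier's top predicted probabilities are renormalized and used as convex weights over the corresponding label embeddings, and a zero-shot label is then chosen as the nearest candidate embedding.

```python
import numpy as np

def conse_embed(probs, label_embeddings, top_k=10):
    """Map an image into the semantic space as a convex combination of
    the embeddings of its top-k predicted training labels.

    probs            : (n,) classifier probabilities over the n seen classes
    label_embeddings : (n, d) word embeddings of the n class labels
    """
    top = np.argsort(probs)[::-1][:top_k]      # indices of top-k classes
    weights = probs[top]
    weights = weights / weights.sum()          # renormalize -> convex weights
    return weights @ label_embeddings[top]     # (d,) semantic vector

def zero_shot_predict(f, candidate_embeddings):
    """Predict the unseen label whose embedding is most cosine-similar to f."""
    sims = candidate_embeddings @ f / (
        np.linalg.norm(candidate_embeddings, axis=1) * np.linalg.norm(f) + 1e-12)
    return int(np.argmax(sims))
```

Because the embedding step only reweights existing vectors, no parameters are learned: any trained $n$-way classifier plus any word embedding model covering the class labels suffices.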


Citations
Posted Content

Learning to Compare: Relation Network for Few-Shot Learning

TL;DR: The Relation Network (RN) learns a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting.
Proceedings ArticleDOI

Evaluation of Output Embeddings for Fine-Grained Image Classification

TL;DR: Given image and class embeddings, this work learns a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score.
Posted Content

Semantic Autoencoder for Zero-Shot Learning

TL;DR: This work presents a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE), which significantly outperforms existing ZSL models with the additional benefit of lower computational cost, and beats the state of the art when the SAE is applied to the supervised clustering problem.
Posted Content

Synthesized Classifiers for Zero-Shot Learning

TL;DR: In this article, a set of "phantom" object classes whose coordinates live in both the semantic space and the model space is introduced; these coordinates can be optimized from labeled data such that the synthesized real object classifiers achieve optimal discriminative performance.
Posted Content

Zero-Shot Learning via Semantic Similarity Embedding

TL;DR: In this paper, a max-margin framework is developed to learn source/target embedding functions that map an arbitrary source and target domain data into a same semantic space where similarity can be readily measured.
References
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured on a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Journal ArticleDOI

One-shot learning of object categories

TL;DR: It is found that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
Proceedings Article

DeViSE: A Deep Visual-Semantic Embedding Model

TL;DR: This paper presents a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text and shows that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training.
Proceedings ArticleDOI

Learning to detect unseen object classes by between-class attribute transfer

TL;DR: The experiments show that by using an attribute layer it is indeed possible to build an object detection system that requires no training images of the target classes. The authors also assembled a new large-scale dataset, "Animals with Attributes", of over 30,000 animal images matching the 50 classes in Osherson's classic table of how strongly humans associate 85 semantic attributes with animal classes.