Open Access · Proceedings Article

Semantic Autoencoder for Zero-Shot Learning

TLDR
In this paper, the encoder projects a visual feature vector into the semantic space, as in existing ZSL models, while the decoder imposes an additional constraint: the projection/code must be able to reconstruct the original visual feature.
Abstract
Existing zero-shot learning (ZSL) models typically learn a projection function from a feature space to a semantic embedding space (e.g. attribute space). However, such a projection function is only concerned with predicting the training seen-class semantic representation (e.g. attribute prediction) or classification. When applied to test data, which in the context of ZSL contains different (unseen) classes without training data, a ZSL model typically suffers from the projection domain shift problem. In this work, we present a novel solution to ZSL based on learning a Semantic AutoEncoder (SAE). Following the encoder-decoder paradigm, an encoder aims to project a visual feature vector into the semantic space, as in existing ZSL models. However, the decoder exerts an additional constraint: the projection/code must be able to reconstruct the original visual feature. We show that with this additional reconstruction constraint, the projection function learned from the seen classes generalises better to new unseen classes. Importantly, the encoder and decoder are linear and symmetric, which enables us to develop an extremely efficient learning algorithm. Extensive experiments on six benchmark datasets demonstrate that the proposed SAE significantly outperforms existing ZSL models, with the additional benefit of lower computational cost. Furthermore, when applied to the supervised clustering problem, SAE also beats the state-of-the-art.
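The efficiency claim above follows from the encoder and decoder being linear and tied (the decoder is the encoder's transpose), which gives the training objective a closed-form solution. Below is a minimal sketch of this idea in Python, assuming the objective min_W ||X - W^T S||^2 + lam * ||W X - S||^2 over a visual feature matrix X (d features x n samples) and a semantic matrix S (k attributes x n samples); the function name train_sae, the variable names, and the toy data are illustrative, not taken from the authors' released code.

import numpy as np
from scipy.linalg import solve_sylvester

def train_sae(X, S, lam=0.2):
    # Setting the gradient of ||X - W.T @ S||^2 + lam * ||W @ X - S||^2 to zero
    # yields the Sylvester equation (S S^T) W + W (lam X X^T) = (1 + lam) S X^T,
    # which solve_sylvester handles in closed form (Bartels-Stewart).
    A = S @ S.T                    # k x k
    B = lam * (X @ X.T)            # d x d
    C = (1.0 + lam) * (S @ X.T)    # k x d
    return solve_sylvester(A, B, C)  # encoder W: k x d

# Illustrative usage on random data: encode test features into the semantic
# space, then label each sample by its nearest unseen-class attribute
# prototype under cosine similarity.
rng = np.random.default_rng(0)
X_tr = rng.standard_normal((64, 200))   # 200 training samples, 64-dim features
S_tr = rng.standard_normal((10, 200))   # 10-dim semantic codes (e.g. attributes)
W = train_sae(X_tr, S_tr)

X_te = rng.standard_normal((64, 5))     # 5 test samples from unseen classes
P = rng.standard_normal((10, 3))        # attribute prototypes of 3 unseen classes
Z = W @ X_te                            # encode: k x 5
Z /= np.linalg.norm(Z, axis=0, keepdims=True)
P /= np.linalg.norm(P, axis=0, keepdims=True)
pred = np.argmax(Z.T @ P, axis=1)       # predicted unseen-class index per sample

At test time one can also use the decoder direction, projecting unseen-class prototypes into the visual feature space with W.T and matching there; the reconstruction constraint is what makes both directions usable.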



Citations
Proceedings Article

Learning to Compare: Relation Network for Few-Shot Learning

TL;DR: A conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only a few examples from each; the framework is easily extended to zero-shot learning.
Posted Content

Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly

TL;DR: A new zero-shot learning dataset, Animals with Attributes 2 (AWA2), is proposed and made publicly available both as image features and as the images themselves; a significant number of state-of-the-art methods are compared and analyzed in depth.
Journal Article

Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly

TL;DR: The Animals with Attributes 2 (AWA2) dataset is a new dataset for zero-shot learning, publicly available both in terms of image features and the images themselves.
Proceedings Article

Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs

TL;DR: A graph convolutional network (GCN) is used to predict the visual classifiers of unseen categories from a semantic embedding of each node (each node representing a visual category), and the approach is robust to noise in the learned knowledge graph (KG).
Proceedings Article

Generalized Zero-Shot Learning via Synthesized Examples

TL;DR: This work presents a generative framework for generalized zero-shot learning where the training and test classes are not necessarily disjoint, and can generate novel exemplars from seen/unseen classes, given their respective class attributes.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieves state-of-the-art performance on ImageNet classification.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article

Going deeper with convolutions

TL;DR: Inception is a deep convolutional neural network architecture that set the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Journal Article

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection spanning hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.