Proceedings ArticleDOI

Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks

TLDR
An unsupervised protocol for learning a neural embedding of the visual style of images: a proxy measure forms triplets of anchor, similar, and dissimilar images, from which a compact style embedding useful for style-based search and retrieval is learned.
Abstract
We propose an unsupervised protocol for learning a neural embedding of the visual style of images. Style similarity is an important measure for many applications, such as style transfer, fashion search, and art exploration. However, computational modeling of style is difficult owing to its vague and subjective nature. Most methods for style-based retrieval use supervised training with a pre-defined categorization of images according to style. While this paradigm suits applications where style categories are well-defined and curating large datasets according to such a categorization is feasible, in many other cases such a categorization is either ill-defined or does not exist. Our protocol for learning style-based representations does not rely on categorical labels; instead, it uses a proxy measure to form triplets of anchor, similar, and dissimilar images. Using these triplets, we learn a compact style embedding that is useful for style-based search and retrieval. The learned embeddings outperform other unsupervised representations on the style-based image retrieval task across six datasets that capture different meanings of style. We also show that by fine-tuning the learned features with dataset-specific style labels, we obtain the best results for the image style recognition task on five of the six datasets.
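
The triplet-based setup described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical PyTorch example of learning a compact embedding with a triplet margin loss; the backbone, embedding size, and the way triplets are mined are assumptions for illustration, not the authors' exact protocol (the paper's proxy measure is not reproduced here).

```python
# Minimal sketch (assumptions, not the authors' exact protocol): a small
# embedding head on a frozen backbone, trained with a triplet margin loss
# on (anchor, similar, dissimilar) image triplets mined by some proxy measure.
import torch
import torch.nn as nn
import torchvision.models as models

class StyleEmbedder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features            # frozen convolutional backbone
        for p in self.features.parameters():
            p.requires_grad = False
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(512, dim)          # compact style embedding

    def forward(self, x):
        h = self.pool(self.features(x)).flatten(1)
        return nn.functional.normalize(self.head(h), dim=1)

model = StyleEmbedder()
criterion = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-4)

# Placeholder batches: anchor, style-similar, and style-dissimilar images
# would come from a triplet-mining step based on the chosen proxy measure.
anchor = torch.randn(8, 3, 224, 224)
positive = torch.randn(8, 3, 224, 224)
negative = torch.randn(8, 3, 224, 224)

optimizer.zero_grad()
loss = criterion(model(anchor), model(positive), model(negative))
loss.backward()
optimizer.step()
```

Once trained, style-based retrieval amounts to a nearest-neighbour search over the normalized embedding vectors.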



Citations
Posted Content

Discovering beautiful attributes for aesthetic image analysis

TL;DR: This work proposes to discover and learn the visual appearance of attributes automatically using the recently introduced AVA database, which contains more than 250,000 images together with aesthetic scores and textual comments given by photography enthusiasts, and describes how these three key components of AVA (images, scores, and comments) can be effectively leveraged to learn visual attributes.
Posted Content

Content-based Image Retrieval and the Semantic Gap in the Deep Learning Era

TL;DR: It is concluded that the key problem for the further advancement of semantic image retrieval lies in the lack of a standardized task definition and an appropriate benchmark dataset.
Book ChapterDOI

Content-Based Image Retrieval and the Semantic Gap in the Deep Learning Era

TL;DR: In this chapter, the authors show that recent advances in what is called instance or object retrieval, which requires matching fine-grained visual patterns between images, transfer to more generic image retrieval scenarios.
Journal ArticleDOI

Considering three elements of aesthetics: Multi-task self-supervised feature learning for image style classification

TL;DR: In this article, Zhang et al. propose a multi-task self-supervised style feature learning algorithm that considers three elements of aesthetics: compositional rules, luminance, and color.
Journal ArticleDOI

Transfer and Unsupervised Learning: An Integrated Approach to Concrete Crack Image Analysis

L.J. Gradisar et al. - 16 Feb 2023
TL;DR: In this paper, an integrated methodology for analyzing concrete crack images is proposed using transfer and unsupervised learning: image features are extracted with pre-trained networks and grouped by similarity using hierarchical clustering.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
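
For context, Adam is available in standard deep-learning libraries. A minimal PyTorch usage sketch, with a hypothetical model and random placeholder data, illustrates the adaptive first-order updates described above:

```python
# Minimal usage sketch of Adam (hypothetical model and placeholder data).
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

x, y = torch.randn(32, 10), torch.randn(32, 1)
for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()   # parameters updated using adaptive moment estimates
```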
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
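
A common way to reuse such a deep network, and one relevant to style representations, is as a fixed feature extractor. The sketch below assumes torchvision's pretrained VGG-16 and an arbitrary choice of intermediate layers; it is an illustration, not the configuration used in the main paper.

```python
# Sketch: extracting intermediate VGG-16 activations (assumed layer choices)
# from a pretrained network, a common basis for style representations.
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

x = torch.randn(1, 3, 224, 224)            # placeholder image batch
feats = []
with torch.no_grad():
    h = x
    for i, layer in enumerate(vgg):
        h = layer(h)
        if i in {3, 8, 15, 22}:             # relu1_2, relu2_2, relu3_3, relu4_3
            feats.append(h)
print([f.shape for f in feats])
```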
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), described in this paper, is a benchmark for object category classification and detection spanning hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Journal Article

Visualizing Data using t-SNE

TL;DR: t-SNE is a new technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map; it is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
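
t-SNE is commonly used to visualize learned embeddings, including style embeddings like those discussed above. A minimal scikit-learn sketch, with random features standing in for real embeddings, is:

```python
# Minimal t-SNE sketch using scikit-learn; the embeddings are random
# placeholders standing in for learned style features.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

embeddings = np.random.randn(500, 128)   # e.g., 500 images x 128-d style embedding
coords = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], s=5)
plt.title("t-SNE of style embeddings (placeholder data)")
plt.show()
```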