Proceedings ArticleDOI

Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks

TLDR
An unsupervised protocol for learning a neural embedding of the visual style of images: a proxy measure forms triplets of anchor, similar, and dissimilar images, from which a compact style embedding useful for style-based search and retrieval is learned.
Abstract
We propose an unsupervised protocol for learning a neural embedding of the visual style of images. Style similarity is an important measure for many applications, such as style transfer, fashion search, and art exploration. However, computational modeling of style is difficult owing to its vague and subjective nature. Most methods for style-based retrieval use supervised training with a pre-defined categorization of images according to style. While this paradigm suits applications where style categories are well-defined and curating large datasets according to such a categorization is feasible, in many other cases such a categorization is either ill-defined or does not exist. Our protocol for learning style-based representations does not rely on categorical labels; instead, it uses a proxy measure to form triplets of anchor, similar, and dissimilar images. Using these triplets, we learn a compact style embedding that is useful for style-based search and retrieval. The learned embeddings outperform other unsupervised representations on the style-based image retrieval task across six datasets that capture different meanings of style. We also show that by fine-tuning the learned features with dataset-specific style labels, we obtain the best results for the image style recognition task on five of the six datasets.
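
The triplet-based setup described in the abstract can be illustrated with a short sketch. This is a minimal, hypothetical PyTorch example of learning a compact embedding with a triplet margin loss; the backbone, embedding size, and the way triplets are mined are assumptions for illustration, not the authors' exact protocol (the paper's proxy measure is not reproduced here).

```python
# Minimal sketch (assumptions, not the authors' exact protocol): a small
# embedding head on a frozen backbone, trained with a triplet margin loss
# on (anchor, similar, dissimilar) image triplets mined by some proxy measure.
import torch
import torch.nn as nn
import torchvision.models as models

class StyleEmbedder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features            # frozen convolutional backbone
        for p in self.features.parameters():
            p.requires_grad = False
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(512, dim)          # compact style embedding

    def forward(self, x):
        h = self.pool(self.features(x)).flatten(1)
        return nn.functional.normalize(self.head(h), dim=1)

model = StyleEmbedder()
criterion = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-4)

# Placeholder batches: anchor, style-similar, and style-dissimilar images
# would come from a triplet-mining step based on the chosen proxy measure.
anchor = torch.randn(8, 3, 224, 224)
positive = torch.randn(8, 3, 224, 224)
negative = torch.randn(8, 3, 224, 224)

optimizer.zero_grad()
loss = criterion(model(anchor), model(positive), model(negative))
loss.backward()
optimizer.step()
```

Once trained, style-based retrieval amounts to a nearest-neighbour search over the normalized embedding vectors.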



Citations
Posted Content

Discovering beautiful attributes for aesthetic image analysis

TL;DR: This work proposes to discover and learn the visual appearance of attributes automatically using the recently introduced AVA database, which contains more than 250,000 images together with aesthetic scores and textual comments given by photography enthusiasts, and describes how these three key components of AVA (images, scores, and comments) can be effectively leveraged to learn visual attributes.
Posted Content

Content-based Image Retrieval and the Semantic Gap in the Deep Learning Era

TL;DR: It is concluded that the key problem for the further advancement of semantic image retrieval lies in the lack of a standardized task definition and an appropriate benchmark dataset.
Book ChapterDOI

Content-Based Image Retrieval and the Semantic Gap in the Deep Learning Era

TL;DR: In this chapter, the authors show that recent advances in what is called instance or object retrieval, which requires matching fine-grained visual patterns between images, transfer to more generic image retrieval scenarios.
Journal ArticleDOI

Considering three elements of aesthetics: Multi-task self-supervised feature learning for image style classification

TL;DR: In this article, Zhang et al. propose a multi-task self-supervised style feature learning algorithm that considers three elements of aesthetics: compositional rules, luminance, and color.
Journal ArticleDOI

Transfer and Unsupervised Learning: An Integrated Approach to Concrete Crack Image Analysis

L.J. Gradisar et al. - 16 Feb 2023
TL;DR: In this paper, an integrated methodology for analyzing concrete crack images is proposed using transfer and unsupervised learning: image features are extracted with pre-trained networks and grouped by similarity using hierarchical clustering.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
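
For context, Adam is available in standard deep-learning libraries. A minimal PyTorch usage sketch, with a hypothetical model and random placeholder data, illustrates the adaptive first-order updates described above:

```python
# Minimal usage sketch of Adam (hypothetical model and placeholder data).
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

x, y = torch.randn(32, 10), torch.randn(32, 1)
for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()   # parameters updated using adaptive moment estimates
```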
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
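
A common way to reuse such a deep network, and one relevant to style representations, is as a fixed feature extractor. The sketch below assumes torchvision's pretrained VGG-16 and an arbitrary choice of intermediate layers; it is an illustration, not the configuration used in the main paper.

```python
# Sketch: extracting intermediate VGG-16 activations (assumed layer choices)
# from a pretrained network, a common basis for style representations.
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

x = torch.randn(1, 3, 224, 224)            # placeholder image batch
feats = []
with torch.no_grad():
    h = x
    for i, layer in enumerate(vgg):
        h = layer(h)
        if i in {3, 8, 15, 22}:             # relu1_2, relu2_2, relu3_3, relu4_3
            feats.append(h)
print([f.shape for f in feats])
```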
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), described in this paper, is a benchmark for object category classification and detection spanning hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Journal Article

Visualizing Data using t-SNE

TL;DR: t-SNE is a new technique that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map; it is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
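
t-SNE is commonly used to visualize learned embeddings, including style embeddings like those discussed above. A minimal scikit-learn sketch, with random features standing in for real embeddings, is:

```python
# Minimal t-SNE sketch using scikit-learn; the embeddings are random
# placeholders standing in for learned style features.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

embeddings = np.random.randn(500, 128)   # e.g., 500 images x 128-d style embedding
coords = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], s=5)
plt.title("t-SNE of style embeddings (placeholder data)")
plt.show()
```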