Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval

doi:10.1016/J.IMAVIS.2017.12.005

Home
/
Papers
/
Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval

Journal Article•DOI•

Learning Deep Similarity Models with Focus Ranking for Fabric Image Retrieval

Daiguo Deng¹, Ruomei Wang¹, Hefeng Wu², Huayong He¹, Qi Li³, Xiaonan Luo⁴ - Show less +2 more•Institutions (4)

Sun Yat-sen University¹, Guangdong University of Foreign Studies², Western Kentucky University³, Guilin University of Electronic Technology⁴

01 Feb 2018-Image and Vision Computing (Elsevier)-Vol. 70, pp 11-20

TL;DR: This paper proposes a novel embedding method termed focus ranking that can be easily unified into a CNN for jointly learning image representations and metrics in the context of fine-grained fabric image retrieval and shows the superiority of the proposed model over existing metric embedding models.

read less

About: This article is published in Image and Vision Computing.The article was published on 2018-02-01 and is currently open access. It has received 30 citations till now. The article focuses on the topics: Image retrieval & Feature learning.

...read moreread less

Citations

PDF

Open Access

More filters

Journal Article•DOI•

Fabric Image Retrieval System Using Hierarchical Search Based on Deep Convolutional Neural Network

[...]

Jun Xiang¹, Ning Zhang¹, Ruru Pan¹, Weidong Gao¹•Institutions (1)

Jiangnan University¹

21 Feb 2019-IEEE Access

TL;DR: The idea of the proposed framework is that the binary code and feature for representing the image can be learning by a deep CNN when the data labels are available.

...read moreread less

Abstract: Fabric image retrieval is a meaningful issue, due to its potential values in many areas such as textile product design, e-commerce, and inventory management. Meanwhile, it is challenging because of the diversity of fabric appearance. Encourage by the recent breakthrough in the deep convolutional neural network (CNN), a deep learning framework is applied for fabric image retrieval. The idea of the proposed framework is that the binary code and feature for representing the image can be learning by a deep CNN when the data labels are available. The proposed framework employs a hierarchical search strategy that includes coarse-level retrieval and fine-level retrieval. Otherwise, a large-scale wool fabric image retrieval dataset named WFID with about 20 000 images are built to validate the proposed framework. The longitudinal comparison experiments for self-parameter optimization and horizontal comparison experiments for verifying the superiority of the algorithm are performed on this data set. The comparison experimental results indicate the superiority of the proposed framework.

...read moreread less

41 citations

Journal Article•DOI•

Learning non-metric visual similarity for image retrieval

[...]

Noa Garcia¹, George Vogiatzis¹•Institutions (1)

Aston University¹

01 Feb 2019-Image and Vision Computing

TL;DR: It is argued that non-metric similarity functions based on neural networks can build a better model of human visual perception than standard metric distances.

...read moreread less

26 citations

Additional excerpts

...More recently, deep similarity learning has also been applied to fabric image retrieval [9] by using triplets of samples to ensure that similar features are mapped closer than non-similar features....
[...]

Journal Article•DOI•

Image retrieval of wool fabric. Part I: Based on low-level texture features:

[...]

Ning Zhang¹, Jun Xiang¹, Lei Wang¹, Weidong Gao¹, Ruru Pan¹ - Show less +1 more•Institutions (1)

Jiangnan University¹

14 Feb 2019-Textile Research Journal

TL;DR: Experimental results indicate that the framework is effective and superior for image retrieval of wool fabric, providing referential assistance for the worker in the factory and improving retrieval efficiency.

...read moreread less

Abstract: Color is difficult to distinguish by human vision and is described by keywords, resulting in low efficiency of wool fabric retrieval in factories at present. To obtain the process sheets of existin...

...read moreread less

14 citations

Posted Content•

Deep image retrieval: a survey

[...]

Wei Chen, Yu Liu, Weiping Wang, Erwin M. Bakker, Theodoros Georgiou, Paul Fieguth, Li Liu, Michael S. Lew - Show less +4 more

03 Feb 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors organize and review recent content-based image retrieval (CBIR) works that are developed based on deep learning algorithms and techniques, including insights and techniques from recent papers.

...read moreread less

Abstract: In recent years a vast amount of visual content has been generated and shared from various fields, such as social media platforms, medical images, and robotics. This abundance of content creation and sharing has introduced new challenges. In particular, searching databases for similar content, i.e.content based image retrieval (CBIR), is a long-established research area, and more efficient and accurate methods are needed for real time retrieval. Artificial intelligence has made progress in CBIR and has significantly facilitated the process of intelligent search. In this survey we organize and review recent CBIR works that are developed based on deep learning algorithms and techniques, including insights and techniques from recent papers. We identify and present the commonly-used benchmarks and evaluation methods used in the field. We collect common challenges and propose promising future directions. More specifically, we focus on image retrieval with deep learning and organize the state of the art methods according to the types of deep network structure, deep features, feature enhancement methods, and network fine-tuning strategies. Our survey considers a wide variety of recent methods, aiming to promote a global view of the field of instance-based CBIR.

...read moreread less

10 citations

Journal Article•DOI•

AutoRet: A Self-Supervised Spatial Recurrent Network for Content-Based Image Retrieval

[...]

Muhammad Mostafa Monowar, Md. Abdul Hamid, Abu Quwsar Ohi, Madini O. Alassafi, Muhammad F. Mridha - Show less +1 more

01 Mar 2022-Sensors

TL;DR: In this article , a self-supervised image retrieval system is proposed based on deep convolutional neural network (DCNN) for image retrieval, which can work in self-vision and can also be trained on a partially labeled dataset.

...read moreread less

Abstract: Image retrieval techniques are becoming famous due to the vast availability of multimedia data. The present image retrieval system performs excellently on labeled data. However, often, data labeling becomes costly and sometimes impossible. Therefore, self-supervised and unsupervised learning strategies are currently becoming illustrious. Most of the self/unsupervised strategies are sensitive to the number of classes and can not mix labeled data on availability. In this paper, we introduce AutoRet, a deep convolutional neural network (DCNN) based self-supervised image retrieval system. The system is trained on pairwise constraints. Therefore, it can work in self-supervision and can also be trained on a partially labeled dataset. The overall strategy includes a DCNN that extracts embeddings from multiple patches of images. Further, the embeddings are fused for quality information used for the image retrieval process. The method is benchmarked with three different datasets. From the overall benchmark, it is evident that the proposed method works better in a self-supervised manner. In addition, the evaluation exhibits the proposed method’s performance to be highly convincing while a small portion of labeled data are mixed on availability.

...read moreread less

10 citations

1
2
3
4
…
5
6

Collapse

References

PDF

Open Access

More filters

Proceedings Article•

ImageNet Classification with Deep Convolutional Neural Networks

[...]

Alex Krizhevsky¹, Ilya Sutskever¹, Geoffrey E. Hinton¹•Institutions (1)

University of Toronto¹

03 Dec 2012

TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

...read moreread less

Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overriding in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

...read moreread less

73,978 citations

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

46,906 citations

Journal Article•DOI•

ImageNet Large Scale Visual Recognition Challenge

[...]

Olga Russakovsky¹, Jia Deng², Hao Su¹, Jonathan Krause¹, Sanjeev Satheesh¹, Sean Ma¹, Zhiheng Huang¹, Andrej Karpathy¹, Aditya Khosla³, Michael S. Bernstein¹, Alexander C. Berg⁴, Li Fei-Fei¹ - Show less +8 more•Institutions (4)

Stanford University¹, University of Michigan², Massachusetts Institute of Technology³, University of North Carolina at Chapel Hill⁴

01 Dec 2015-International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.

...read moreread less

Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

...read moreread less

30,811 citations

Proceedings Article•DOI•

Fully convolutional networks for semantic segmentation

[...]

Jonathan Long¹, Evan Shelhamer¹, Trevor Darrell¹•Institutions (1)

University of California, Berkeley¹

07 Jun 2015

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

...read moreread less

Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet [20], the VGG net [31], and GoogLeNet [32]) into fully convolutional networks and transfer their learned representations by fine-tuning [3] to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes less than one fifth of a second for a typical image.

...read moreread less

28,225 citations

Additional excerpts

...Recently, deep representation learning has been successfully applied to various computer vision areas, such as image classification [10, 22, 23], object detection [24, 25, 26], pixelwise image labeling [27, 28] and human centric analysis [29, 30]....
[...]

Proceedings Article•DOI•

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

[...]

Ross Girshick¹, Jeff Donahue¹, Trevor Darrell¹, Jitendra Malik¹•Institutions (1)

University of California, Berkeley¹

23 Jun 2014

TL;DR: RCNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects, and when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

...read moreread less

Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.

...read moreread less

21,729 citations