Journal Article

Click Prediction for Web Image Reranking Using Multimodal Sparse Coding

TLDR
A multimodal hypergraph learning-based sparse coding method is proposed for image click prediction, and the obtained click data is applied to the reranking of images. Experiments show that click prediction improves the performance of prominent graph-based image reranking algorithms.
Abstract
Image reranking is effective for improving the performance of text-based image search. However, existing reranking algorithms are limited for two main reasons: 1) the textual metadata associated with images often does not match their actual visual content, and 2) the extracted visual features do not accurately describe the semantic similarities between images. Recently, user click information has been used in image reranking, because clicks have been shown to describe the relevance of retrieved images to search queries more accurately. However, a critical problem for click-based methods is the lack of click data, since only a small number of web images have actually been clicked on by users. We therefore aim to solve this problem by predicting image clicks. We propose a multimodal hypergraph learning-based sparse coding method for image click prediction, and apply the obtained click data to the reranking of images. We adopt a hypergraph to build a group of manifolds, which explore the complementarity of different features through a group of weights. Unlike an edge in an ordinary graph, which connects two vertices, a hyperedge in a hypergraph connects a set of vertices, and this helps preserve the local smoothness of the constructed sparse codes. An alternating optimization procedure is then performed, and the weights of the different modalities and the sparse codes are obtained simultaneously. Finally, a voting strategy is used to describe the predicted click as a binary event (click or no click), based on the images' corresponding sparse codes. Thorough empirical studies on a large-scale database of nearly 330K images demonstrate the effectiveness of our approach for click prediction when compared with several other methods. Additional image reranking experiments on real-world data show that click prediction improves the performance of prominent graph-based image reranking algorithms.
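
The abstract names the building blocks but gives no formulas, so the following is only a rough sketch of how they might fit together, not the authors' implementation. It assumes a k-nearest-neighbor hyperedge construction, the commonly used normalized hypergraph Laplacian, a convex combination of per-modality Laplacians, and a simple mass-based vote over sparse codes. All function names, the parameter `k`, and the voting rule are hypothetical choices made for illustration.

```python
import numpy as np

def knn_hypergraph_incidence(X, k=3):
    """Assumed construction: each sample spawns one hyperedge that
    contains itself and its k nearest neighbors (Euclidean)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))
    for e in range(n):
        neighbors = [j for j in np.argsort(d2[e]) if j != e][:k]
        H[e, e] = 1.0
        H[neighbors, e] = 1.0
    return H  # rows: vertices, columns: hyperedges

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian
    I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}."""
    n_v, n_e = H.shape
    w = np.ones(n_e) if w is None else np.asarray(w, dtype=float)
    dv = H @ w          # vertex degrees
    de = H.sum(axis=0)  # hyperedge degrees
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n_v) - theta

def combine_laplacians(laplacians, alpha):
    """Weight one Laplacian per modality; alpha is rescaled to sum to 1."""
    alpha = np.asarray(alpha, dtype=float)
    alpha = alpha / alpha.sum()
    return sum(a * L for a, L in zip(alpha, laplacians))

def smoothness(codes, L):
    """Regularizer tr(C^T L C): small when images that share
    hyperedges have similar sparse codes."""
    return float(np.trace(codes.T @ L @ codes))

def vote_click(codes, base_clicked):
    """Hypothetical vote: predict 'click' when the absolute code mass
    on clicked bases exceeds the mass on non-clicked bases."""
    mass = np.abs(codes)
    clicked = mass[:, base_clicked].sum(axis=1)
    unclicked = mass[:, ~base_clicked].sum(axis=1)
    return clicked > unclicked
```

In an alternating scheme of the kind the abstract describes, one would fix the modality weights `alpha` and update the sparse codes, then fix the codes and update `alpha`, repeating until the objective stops changing. The sketch leaves out that loop because the abstract does not specify the objective.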

