Click Prediction for Web Image Reranking Using Multimodal Sparse Coding

doi:10.1109/TIP.2014.2311377

Journal Article

Jun Yu, +2 more

11 Mar 2014

IEEE Transactions on Image Processing

Vol. 23, Iss. 5, pp. 2019-2032

TLDR

A multimodal hypergraph learning-based sparse coding method is proposed for image click prediction, and the obtained click data is applied to the reranking of images; experiments show that click prediction improves the performance of prominent graph-based image reranking algorithms.

Abstract:

Image reranking is effective for improving the performance of text-based image search. However, existing reranking algorithms are limited for two main reasons: 1) the textual metadata associated with images often does not match their actual visual content, and 2) the extracted visual features do not accurately describe the semantic similarities between images. Recently, user click information has been used in image reranking, because clicks have been shown to describe the relevance of retrieved images to search queries more accurately. However, a critical problem for click-based methods is the lack of click data, since only a small number of web images have actually been clicked on by users. We therefore aim to solve this problem by predicting image clicks. We propose a multimodal hypergraph learning-based sparse coding method for image click prediction, and apply the obtained click data to the reranking of images. We adopt a hypergraph to build a group of manifolds, which explore the complementarity of different features through a group of weights. Unlike an edge in a graph, which connects two vertices, a hyperedge in a hypergraph connects a set of vertices and helps preserve the local smoothness of the constructed sparse codes. An alternating optimization procedure is then performed, yielding the weights of the different modalities and the sparse codes simultaneously. Finally, a voting strategy is used to predict the click as a binary event (click or no click) from the images' corresponding sparse codes. Thorough empirical studies on a large-scale database of nearly 330K images demonstrate the effectiveness of our approach for click prediction compared with several other methods. Additional image reranking experiments on real-world data show that click prediction improves the performance of prominent graph-based image reranking algorithms.
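
Two ideas from the abstract can be illustrated with a small sketch: a hyperedge that connects a set of vertices rather than a pair, and a vote that turns a sparse code into a binary click prediction. The abstract does not specify how hyperedges are built or how the vote is tallied, so both rules below are assumptions chosen for illustration: each hyperedge groups an image with its k nearest neighbours in one feature space, and the vote compares the total absolute coefficient mass on clicked versus unclicked basis images. The names `knn_hyperedges`, `vote_click`, and `basis_clicked` are hypothetical and do not come from the paper.

```python
from math import dist


def knn_hyperedges(features, k):
    """Build one hyperedge per image: the image plus its k nearest neighbours.

    Assumption: neighbours are ranked by Euclidean distance in a single
    feature space; the paper's actual construction may differ.
    """
    edges = []
    for i, fi in enumerate(features):
        # Rank every other image by its distance to image i.
        others = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: dist(fi, features[j]),
        )
        # A hyperedge is a set of vertices, not just a pair.
        edges.append(frozenset([i, *others[:k]]))
    return edges


def vote_click(sparse_code, basis_clicked):
    """Predict click (True) or no click (False) from a sparse code.

    Assumption: each basis image votes with the absolute value of its
    coefficient, and the class with the larger total wins.
    """
    click_mass = sum(abs(c) for c, y in zip(sparse_code, basis_clicked) if y)
    no_click_mass = sum(abs(c) for c, y in zip(sparse_code, basis_clicked) if not y)
    return click_mass > no_click_mass


# Toy data: three images close together and two far away.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
edges = knn_hyperedges(features, k=2)

# A sparse code over five basis images with known click labels.
code = [0.0, 0.7, 0.0, 0.2, 0.0]
basis_clicked = [True, True, False, False, True]
predicted = vote_click(code, basis_clicked)
```

In this toy setup the hyperedge for image 0 groups the three nearby images into a single set, which is the property the abstract credits with preserving local smoothness of the sparse codes. A full implementation would build one hypergraph per feature modality and learn the modality weights jointly with the codes, which this sketch does not attempt.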
