Learning Coupled Feature Spaces for Cross-Modal Matching

This paper proposes a novel coupled linear regression framework that addresses both the measure of relevance and coupled feature selection in cross-modal data matching, together with an iterative algorithm based on half-quadratic minimization that solves the resulting regularized linear regression problem.

Abstract:

Cross-modal matching has recently drawn much attention due to the widespread existence of multimodal data. It aims to match data from different modalities and generally involves two basic problems: the measure of relevance and coupled feature selection. Most previous work focuses on the first problem. In this paper, we propose a novel coupled linear regression framework that deals with both problems. Our method learns two projection matrices that map multimodal data into a common feature space, in which cross-modal data matching can be performed. During learning, ℓ2,1-norm penalties are imposed on the two projection matrices separately, which leads to the simultaneous selection of relevant and discriminative features from the coupled feature spaces. A trace norm is further imposed on the projected data as a low-rank constraint, which enhances the relevance between connected data from different modalities. We also present an iterative algorithm based on half-quadratic minimization to solve the proposed regularized linear regression problem. Experimental results on two challenging cross-modal datasets demonstrate that the proposed method outperforms state-of-the-art approaches.
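
The abstract does not state the objective function, so the following is only a sketch under an assumed form: minimize 0.5*||X1 U1 - Y||_F^2 + 0.5*||X2 U2 - Y||_F^2 + lam1*(||U1||_2,1 + ||U2||_2,1) + lam2*||[X1 U1, X2 U2]||_*, where X1 and X2 hold one sample per row for each modality, Y is a shared label matrix, and U1 and U2 are the two projection matrices. The function names, the weights `lam1` and `lam2`, the smoothing constant `eps`, and the ridge initialization are all illustrative assumptions, not the authors' implementation. Each iteration fixes two reweighting terms (one for the row norms of each projection, one for the trace norm of the projected data) and then solves a linear system per modality, which is the usual half-quadratic, iteratively reweighted pattern.

```python
import numpy as np


def l21_norm(U):
    """Sum of the Euclidean norms of the rows of U."""
    return float(np.sum(np.linalg.norm(U, axis=1)))


def trace_norm(Z):
    """Sum of the singular values of Z (also called the nuclear norm)."""
    return float(np.sum(np.linalg.svd(Z, compute_uv=False)))


def fit_coupled(X1, X2, Y, lam1=0.1, lam2=0.01, n_iter=20, eps=1e-6):
    """Hypothetical reweighted solver for the assumed objective above.

    X1: (n, d1), X2: (n, d2), Y: (n, c). Returns U1 (d1, c) and U2 (d2, c).
    """
    n = X1.shape[0]

    def ridge_init(X):
        # Assumed starting point: lightly regularized least squares.
        return np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ Y)

    U1, U2 = ridge_init(X1), ridge_init(X2)
    for _ in range(n_iter):
        # Trace-norm reweighting: S = 0.5 * (Z Z^T + eps I)^(-1/2).
        Z = np.hstack([X1 @ U1, X2 @ U2])
        w, V = np.linalg.eigh(Z @ Z.T + eps * np.eye(n))
        w = np.maximum(w, eps)  # guard against round-off below eps
        S = 0.5 * (V / np.sqrt(w)) @ V.T

        updated = []
        for X, U in ((X1, U1), (X2, U2)):
            # Row-norm reweighting: r_i = 1 / (2 * sqrt(||u_i||^2 + eps)).
            r = 1.0 / (2.0 * np.sqrt(np.sum(U ** 2, axis=1) + eps))
            A = X.T @ X + 2.0 * lam1 * np.diag(r) + 2.0 * lam2 * (X.T @ S @ X)
            updated.append(np.linalg.solve(A, X.T @ Y))
        U1, U2 = updated
    return U1, U2


def match(query_proj, gallery_proj):
    """For each projected query, index of the most cosine-similar gallery row."""
    q = query_proj / (np.linalg.norm(query_proj, axis=1, keepdims=True) + 1e-12)
    g = gallery_proj / (np.linalg.norm(gallery_proj, axis=1, keepdims=True) + 1e-12)
    return np.argmax(q @ g.T, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 3, size=60)
    Y = np.eye(3)[labels]                 # one-hot labels shared by both modalities
    X1 = rng.normal(size=(60, 20))        # synthetic stand-in for modality 1
    X2 = rng.normal(size=(60, 10))        # synthetic stand-in for modality 2
    U1, U2 = fit_coupled(X1, X2, Y)
    print(U1.shape, U2.shape, match(X1 @ U1, X2 @ U2)[:5])
```

Rows of `U1` or `U2` with small norms correspond to input features that the row-sparsity penalty has down-weighted, and cross-modal matching is done by comparing `X1 @ U1` against `X2 @ U2` in the shared space. The cosine nearest-neighbour rule in `match` is likewise an assumption made for illustration.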
