
Showing papers by "Ivor W. Tsang" published in 2014


Journal ArticleDOI
TL;DR: This paper proposes Heterogeneous Feature Augmentation (HFA), an SVM-based method for heterogeneous domain adaptation, together with a semi-supervised extension (SHFA) that simultaneously learns the target classifier and infers the labels of unlabeled target samples; both are shown to outperform existing HDA methods.
Abstract: In this paper, we study the heterogeneous domain adaptation (HDA) problem, in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. By introducing two different projection matrices, we first transform the data from two domains into a common subspace such that the similarity between samples across different domains can be measured. We then propose a new feature mapping function for each domain, which augments the transformed samples with their original features and zeros. Existing supervised learning methods (e.g., SVM and SVR) can be readily employed by incorporating our newly proposed augmented feature representations for supervised HDA. As a showcase, we propose a novel method called Heterogeneous Feature Augmentation (HFA) based on SVM. We show that the proposed formulation can be equivalently derived as a standard Multiple Kernel Learning (MKL) problem, which is convex and thus the global solution can be guaranteed. To additionally utilize the unlabeled data in the target domain, we further propose the semi-supervised HFA (SHFA) which can simultaneously learn the target classifier as well as infer the labels of unlabeled target samples. Comprehensive experiments on three different applications clearly demonstrate that our SHFA and HFA outperform the existing HDA methods.
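
The augmented feature mapping described in this abstract can be illustrated with a minimal sketch. It assumes the layout [projected sample, source block, target block], with zeros filling the block of the other domain. The names `augment_source` and `augment_target` are hypothetical, and the projection matrices `P` and `Q` are random placeholders here, whereas the paper learns them.

```python
import numpy as np

def augment_source(x_s, P, d_t):
    """Map a source sample to [P x_s, x_s, zeros(d_t)]."""
    return np.concatenate([P @ x_s, x_s, np.zeros(d_t)])

def augment_target(x_t, Q, d_s):
    """Map a target sample to [Q x_t, zeros(d_s), x_t]."""
    return np.concatenate([Q @ x_t, np.zeros(d_s), x_t])

rng = np.random.default_rng(0)
d_s, d_t, d_c = 5, 3, 2               # source, target, common-subspace dimensions
P = rng.standard_normal((d_c, d_s))   # placeholder projection (assumption, not learned)
Q = rng.standard_normal((d_c, d_t))   # placeholder projection (assumption, not learned)

x_s = rng.standard_normal(d_s)
x_t = rng.standard_normal(d_t)
phi_s = augment_source(x_s, P, d_t)
phi_t = augment_target(x_t, Q, d_s)

# Both augmented vectors live in the same (d_c + d_s + d_t)-dimensional space,
# so a standard SVM or SVR can be trained on the union of the two domains.
print(phi_s.shape, phi_t.shape)
```

Under this layout the original-feature blocks of the two domains never overlap, so the inner product between a source and a target sample reduces to the inner product of their projections in the common subspace.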

435 citations


Journal ArticleDOI
TL;DR: The objective of this paper is to introduce an effective region-based solution for saliency detection, and to better encode the image features for solving object recognition task by incorporating a saliency map into sparse coding-based spatial pyramid matching (ScSPM) image representation.
Abstract: The objective of this paper is twofold. First, we introduce an effective region-based solution for saliency detection. Then, we apply the achieved saliency map to better encode the image features for solving object recognition task. To find the perceptually and semantically meaningful salient regions, we extract superpixels based on an adaptive mean shift algorithm as the basic elements for saliency detection. The saliency of each superpixel is measured by using its spatial compactness, which is calculated according to the results of Gaussian mixture model (GMM) clustering. To propagate saliency between similar clusters, we adopt a modified PageRank algorithm to refine the saliency map. Our method not only improves saliency detection through large salient region detection and noise tolerance in messy background, but also generates saliency maps with a well-defined object shape. Experimental results demonstrate the effectiveness of our method. Since the objects usually correspond to salient regions, and these regions usually play more important roles for object recognition than background, we apply our achieved saliency map for object recognition by incorporating a saliency map into sparse coding-based spatial pyramid matching (ScSPM) image representation. To learn a more discriminative codebook and better encode the features corresponding to the patches of the objects, we propose a weighted sparse coding for feature coding. Moreover, we also propose a saliency weighted max pooling to further emphasize the importance of those salient regions in feature pooling module. Experimental results on several datasets illustrate that our weighted ScSPM framework greatly outperforms ScSPM framework, and achieves excellent performance for object recognition.
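
The abstract does not spell out the pooling rule, so the following is only one plausible reading of saliency-weighted max pooling: each patch's sparse code is scaled by the saliency of the patch before the per-atom maximum is taken. The function name `saliency_weighted_max_pool` and the toy values are assumptions for illustration.

```python
import numpy as np

def saliency_weighted_max_pool(codes, saliency):
    """Pool patch codes (n_patches x n_atoms) into one vector.

    Each patch's absolute code is multiplied by that patch's saliency
    weight, then the maximum over patches is taken for every atom.
    """
    codes = np.abs(np.asarray(codes, dtype=float))
    weights = np.asarray(saliency, dtype=float)[:, None]
    return (weights * codes).max(axis=0)

# Three patches, three dictionary atoms (toy values).
codes = np.array([[0.9, 0.0, 0.1],
                  [0.2, 0.8, 0.0],
                  [0.0, 0.3, 0.7]])
saliency = np.array([0.1, 1.0, 0.5])   # first patch is mostly background

pooled = saliency_weighted_max_pool(codes, saliency)
print(pooled)
```

In this toy case the strong response of the first patch on the first atom is suppressed because the patch has low saliency, which is the effect the abstract attributes to the weighting.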

296 citations


Journal ArticleDOI
TL;DR: The state-of-the-art feature selection schemes reported in the field of computational intelligence are reviewed to reveal the inadequacies of existing approaches in keeping pace with the emerging phenomenon of Big Dimensionality.
Abstract: The world continues to generate quintillion bytes of data daily, leading to the pressing needs for new efforts in dealing with the grand challenges brought by Big Data. Today, there is a growing consensus among the computational intelligence communities that data volume presents an immediate challenge pertaining to the scalability issue. However, when addressing volume in Big Data analytics, researchers in the data analytics community have largely taken a one-sided study of volume, which is the "Big Instance Size" factor of the data. The flip side of volume which is the dimensionality factor of Big Data, on the other hand, has received much lesser attention. This article thus represents an attempt to fill in this gap and places special focus on this relatively under-explored topic of "Big Dimensionality", wherein the explosion of features (variables) brings about new challenges to computational intelligence. We begin with an analysis on the origins of Big Dimensionality. The evolution of feature dimensionality in the last two decades is then studied using popular data repositories considered in the data analytics and computational intelligence research communities. Subsequently, the state-of-the-art feature selection schemes reported in the field of computational intelligence are reviewed to reveal the inadequacies of existing approaches in keeping pace with the emerging phenomenon of Big Dimensionality. Last but not least, the "curse and blessing of Big Dimensionality" are delineated and deliberated.

226 citations


Journal ArticleDOI
TL;DR: This paper targets fine-grained image categorization by learning a category-specific dictionary for each category and a shared dictionary for all the categories, and imposes incoherence constraints among the different dictionaries in the objective of feature coding.
Abstract: This paper targets fine-grained image categorization by learning a category-specific dictionary for each category and a shared dictionary for all the categories. Such category-specific dictionaries encode subtle visual differences among different categories, while the shared dictionary encodes common visual patterns among all the categories. To this end, we impose incoherence constraints among the different dictionaries in the objective of feature coding. In addition, to make the learnt dictionary stable, we also impose the constraint that each dictionary should be self-incoherent. Our proposed dictionary learning formulation not only applies to fine-grained classification, but also improves conventional basic-level object categorization and other tasks such as event recognition. Experimental results on five data sets show that our method can outperform the state-of-the-art fine-grained image categorization frameworks as well as sparse coding based dictionary learning frameworks. All these results demonstrate the effectiveness of our method.

172 citations


Proceedings Article
27 Jul 2014
TL;DR: A deep learning approach that learns a feature mapping between cross-domain heterogeneous features, as well as a better feature representation for the mapped data, in order to reduce the bias caused by the cross-domain correspondences.
Abstract: Most previous heterogeneous transfer learning methods learn a cross-domain feature mapping between heterogeneous feature spaces based on a few cross-domain instance-correspondences, and these corresponding instances are assumed to be representative in the source and target domains respectively. However, in many real-world scenarios, this assumption may not hold. As a result, the constructed feature mapping may not be precise due to the bias issue of the correspondences in the target or (and) source domain(s). In this case, a classifier trained on the labeled transformed-source-domain data may not be useful for the target domain. In this paper, we present a new transfer learning framework called Hybrid Heterogeneous Transfer Learning (HHTL), which allows the corresponding instances across domains to be biased in either the source or target domain. Specifically, we propose a deep learning approach to learn a feature mapping between cross-domain heterogeneous features as well as a better feature representation for mapped data to reduce the bias issue caused by the cross-domain correspondences. Extensive experiments on several multilingual sentiment classification tasks verify the effectiveness of our proposed approach compared with some baseline methods.

168 citations


Journal ArticleDOI
TL;DR: A new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data is presented and reformulated as a convex semi-infinite programming (SIP) problem, which is solved with an efficient feature generating paradigm.
Abstract: In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Different from traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features, and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Due to such optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of million data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency.

150 citations


Proceedings Article
02 Apr 2014
TL;DR: An efficient multi-class heterogeneous domain adaptation method, for data represented by heterogeneous features of different dimensions in the source and target domains, that reconstructs a sparse feature transformation matrix to map the weight vectors of classifiers learned in the source domain to the target domain.
Abstract: In this paper, we present an efficient multi-class heterogeneous domain adaptation method, where data from source and target domains are represented by heterogeneous features of different dimensions. Specifically, we propose to reconstruct a sparse feature transformation matrix to map the weight vector of classifiers learned from the source domain to the target domain. We cast this learning task as a compressed sensing problem, where each binary classifier induced from multiple classes can be deemed as a measurement sensor. Based on the compressive sensing theory, the estimation error of the transformation matrix decreases with the increasing number of classifiers. Therefore, to guarantee reconstruction performance, we construct sufficiently many binary classifiers based on the error correcting output coding. Extensive experiments are conducted on both a toy dataset and three real-world datasets to verify the superiority of our proposed method over existing state-of-the-art HDA methods in terms of prediction accuracy.
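
To make the error correcting output coding step concrete, here is a minimal sketch of how many binary problems can be induced from one multi-class problem. The abstract does not say how the code matrix is designed, so the random {-1, +1} coding below is an assumption, and `random_ecoc_matrix` and `binary_labels` are hypothetical helper names.

```python
import numpy as np

def random_ecoc_matrix(n_classes, n_classifiers, seed=0):
    """Draw a random code matrix with entries in {-1, +1}.

    Row c is the codeword of class c; column j defines one binary
    problem. Columns that put every class on the same side are repaired
    by flipping one entry, so each binary problem has two groups.
    """
    rng = np.random.default_rng(seed)
    codes = rng.choice([-1, 1], size=(n_classes, n_classifiers))
    for j in range(n_classifiers):
        if abs(codes[:, j].sum()) == n_classes:
            codes[0, j] *= -1
    return codes

def binary_labels(y, codes):
    """Relabel multi-class labels y as one {-1, +1} target per binary problem."""
    return codes[np.asarray(y)]

codes = random_ecoc_matrix(n_classes=4, n_classifiers=12)
y = [0, 3, 1, 3]
targets = binary_labels(y, codes)   # shape: (len(y), 12)
print(targets.shape)
```

Each column of `targets` would train one binary classifier; in the abstract's framing, every such classifier acts as one measurement, so more columns mean more measurements for recovering the transformation matrix.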

138 citations


Proceedings Article
21 Jun 2014
TL;DR: An efficient low rank matrix recovery method, called Riemannian Pursuit, that solves a sequence of fixed-rank optimization problems and substantially outperforms existing methods when applied to large-scale and ill-conditioned matrices.
Abstract: Low rank matrix recovery is a fundamental task in many real-world applications. The performance of existing methods, however, deteriorates significantly when applied to ill-conditioned or large-scale matrices. In this paper, we therefore propose an efficient method, called Riemannian Pursuit (RP), that aims to address these two problems simultaneously. Our method consists of a sequence of fixed-rank optimization problems. Each subproblem, solved by a nonlinear Riemannian conjugate gradient method, aims to correct the solution in the most important subspace of increasing size. Theoretically, RP converges linearly under mild conditions and experimental results show that it substantially outperforms existing methods when applied to large-scale and ill-conditioned matrices.

74 citations


Book ChapterDOI
06 Sep 2014
TL;DR: The proposed Feature Disentangling Machine (FDM) integrates sparse support vector machine and multi-task learning in a unified framework, where a novel loss function and a set of constraints are formulated to precisely control the sparsity and naturally disentangle active features.
Abstract: Studies in psychology show that not all facial regions are of importance in recognizing facial expressions and different facial regions make different contributions in various facial expressions. Motivated by this, a novel framework, named Feature Disentangling Machine (FDM), is proposed to effectively select active features characterizing facial expressions. More importantly, the FDM aims to disentangle these selected features into non-overlapped groups, in particular, common features that are shared across different expressions and expression-specific features that are discriminative only for a target expression. Specifically, the FDM integrates sparse support vector machine and multi-task learning in a unified framework, where a novel loss function and a set of constraints are formulated to precisely control the sparsity and naturally disentangle active features. Extensive experiments on two well-known facial expression databases have demonstrated that the FDM outperforms the state-of-the-art methods for facial expression analysis. More importantly, the FDM achieves an impressive performance in a cross-database validation, which demonstrates the generalization capability of the selected features.

70 citations


Journal ArticleDOI
TL;DR: A new regularizer is introduced into the objective function of the recent work spectral hashing to control the mismatch between the resultant hamming embedding and the low-dimensional data representation, which is obtained by using a linear regression function.
Abstract: We propose a new graph based hashing method called spectral embedded hashing (SEH) for large-scale image retrieval. We first introduce a new regularizer into the objective function of the recent work spectral hashing to control the mismatch between the resultant hamming embedding and the low-dimensional data representation, which is obtained by using a linear regression function. This linear regression function can be employed to effectively handle the out-of-sample data, and the introduction of the new regularizer makes SEH better cope with the data sampled from a nonlinear manifold. Considering that SEH cannot efficiently cope with the high dimensional data, we further extend SEH to kernel SEH (KSEH) to improve the efficiency and effectiveness, in which a nonlinear regression function can also be employed to obtain the low dimensional data representation. We also develop a new method to efficiently solve the approximate solution for the eigenvalue decomposition problem in SEH and KSEH. Moreover, we show that some existing hashing methods are special cases of our KSEH. Our comprehensive experiments on CIFAR, Tiny-580K, NUS-WIDE, and Caltech-256 datasets clearly demonstrate the effectiveness of our methods.
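
The out-of-sample step mentioned in this abstract, where a linear regression function produces the embedding for unseen data, can be sketched as follows. The weights `W` and offset `b` are random placeholders here, whereas SEH learns them; `hash_codes` and `hamming_distances` are hypothetical names.

```python
import numpy as np

def hash_codes(X, W, b):
    """Binarize the linear embedding X @ W + b into 0/1 codes."""
    return (X @ W + b > 0).astype(np.uint8)

def hamming_distances(query_code, db_codes):
    """Count differing bits between one code and every database code."""
    return np.count_nonzero(db_codes != query_code, axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 8))    # 6 database items with 8-dimensional features
W = rng.standard_normal((8, 16))   # placeholder regression weights (assumption)
b = np.zeros(16)                   # placeholder offset (assumption)

db = hash_codes(X, W, b)                  # 16-bit code per item
dists = hamming_distances(db[0], db)      # query with the first item
ranking = np.argsort(dists, kind="stable")
print(ranking)
```

Retrieval then amounts to ranking database items by Hamming distance to the query code, which is why a cheap out-of-sample mapping matters for large-scale image search.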

63 citations


Journal ArticleDOI
TL;DR: A multi-layer group based tag propagation method, which combines the class label and subgroups of instances with similar tag distribution to annotate test images, and achieves excellent performances in both image classification and annotation tasks.
Abstract: We present a multi-layer group sparse coding framework for concurrent single-label image classification and annotation. By leveraging the dependency between image class label and tags, we introduce a multi-layer group sparse structure of the reconstruction coefficients. Such structure fully encodes the mutual dependency between the class label, which describes image content as a whole, and tags, which describe the components of the image content. Therefore we propose a multi-layer group based tag propagation method, which combines the class label and subgroups of instances with similar tag distribution to annotate test images. To make our model more suitable for nonlinear separable features, we also extend our multi-layer group sparse coding in the Reproducing Kernel Hilbert Space (RKHS), which further improves performances of image classification and annotation. Moreover, we also integrate our multi-layer group sparse coding with kNN strategy, which greatly improves the computational efficiency. Experimental results on the LabelMe, UIUC-Sports and NUS-WIDE-Object databases show that our method outperforms the baseline methods, and achieves excellent performances in both image classification and annotation tasks.

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper proposes an algorithm that adaptively utilizes related exemplars through cross-feature learning for complex event detection, and reports promising performance on the large-scale TRECVID 2011 dataset.
Abstract: We address the challenging problem of utilizing related exemplars for complex event detection while multiple features are available. Related exemplars share certain positive elements of the event, but have no uniform pattern due to the huge variance of relevance levels among different related exemplars. None of the existing multiple feature fusion methods can deal with the related exemplars. In this paper, we propose an algorithm which adaptively utilizes the related exemplars by cross-feature learning. Ordinal labels are used to represent the multiple relevance levels of the related videos. Label candidates of related exemplars are generated by exploring the possible relevance levels of each related exemplar via a cross-feature voting strategy. Maximum margin criterion is then applied in our framework to discriminate the positive and negative exemplars, as well as the related exemplars from different relevance levels. We test our algorithm using the large scale TRECVID 2011 dataset and it gains promising performance.

Proceedings ArticleDOI
01 Jan 2014
TL;DR: A two-phase framework that adapts existing relation extraction classifiers to new target domains. It addresses two challenges: negative transfer, which arises when knowledge from source domains is used without considering differences in relation distributions, and the lack of adequate labeled samples for rarer relations in the new domain, caused by a small labeled data set and imbalanced relation distributions.
Abstract: We propose a two-phase framework to adapt existing relation extraction classifiers to extract relations for new target domains. We address two challenges: negative transfer when knowledge in source domains is used without considering the differences in relation distributions; and lack of adequate labeled samples for rarer relations in the new domain, due to a small labeled data set and imbalanced relation distributions. Our framework leverages both labeled and unlabeled data in the target domain. First, we determine the relevance of each source domain to the target domain for each relation type, using the consistency between the clustering given by the target domain labels and the clustering given by the predictors trained for the source domain. To overcome the lack of labeled samples for rarer relations, these clusterings operate on both the labeled and unlabeled data in the target domain. Second, we trade off between using relevance-weighted source-domain predictors and the labeled target data. Again, to overcome the imbalanced distribution, the source-domain predictors operate on the unlabeled target data. Our method outperforms numerous baselines and a weakly-supervised relation extraction method on ACE 2004 and YAGO.

Book ChapterDOI
01 Nov 2014
TL;DR: This work presents a deep bi-modal knowledge representation of images based on their visual content and associated tags (text) and a mapping step between the different levels of visual and textual representations allows for the transfer of semantic knowledge between the two modalities.
Abstract: Automatically understanding and modeling a user’s liking for an image is a challenging problem. This is because the relationship between the image features (even semantic ones extracted by existing tools, viz. faces, objects etc.) and users’ ‘likes’ is non-linear, influenced by several subtle factors. This work presents a deep bi-modal knowledge representation of images based on their visual content and associated tags (text). A mapping step between the different levels of visual and textual representations allows for the transfer of semantic knowledge between the two modalities. It also includes feature selection before learning the deep representation to identify the features that are important for a user to like an image. The proposed representation is then shown to be effective in learning a model of a user’s image ‘likes’ based on a collection of images ‘liked’ by that user. On a collection of images ‘liked’ by users (from Flickr), the proposed deep representation is shown to outperform the state-of-the-art low-level features used for modeling user ‘likes’ by around 15–20%.

01 Jan 2014
TL;DR: This supplementary file presents a parameter setting and the proofs of the lemmas and theorems that appear in the main paper.
Abstract: In this supplementary file, we first present the parameter setting for , and then present the proofs of the lemmas and theorems that appear in the main paper.