
Showing papers on "Feature (computer vision)" published in 2012


Posted Content
TL;DR: The authors randomly omit half of the feature detectors on each training case to prevent complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.
Abstract: When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
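For illustration, a minimal NumPy sketch of the dropout idea described above, using the now-common "inverted" variant that rescales activations at training time (the paper instead halves the outgoing weights at test time); the function and variable names are illustrative only:

```python
import numpy as np

def dropout_forward(activations, drop_prob=0.5, train=True, rng=None):
    """Inverted dropout: randomly zero units during training and rescale so the
    expected activation matches the no-dropout forward pass used at test time."""
    if not train or drop_prob == 0.0:
        return activations
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(activations.shape) >= drop_prob   # keep each unit with prob 1 - drop_prob
    return activations * mask / (1.0 - drop_prob)

# Toy usage: drop roughly half of the hidden units of a batch of activations.
h = np.random.randn(4, 8)
h_train = dropout_forward(h, drop_prob=0.5, train=True)
h_test = dropout_forward(h, drop_prob=0.5, train=False)   # unchanged at test time
```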

6,899 citations


Proceedings ArticleDOI
01 Sep 2012
TL;DR: The neighbor embedding SR algorithm so designed is shown to give good visual results, comparable to other state-of-the-art methods, while presenting an appreciable reduction of the computational time.
Abstract: This paper describes a single-image super-resolution (SR) algorithm based on nonnegative neighbor embedding. It belongs to the family of single-image example-based SR algorithms, since it uses a dictionary of low resolution (LR) and high resolution (HR) trained patch pairs to infer the unknown HR details. Each LR feature vector in the input image is expressed as the weighted combination of its K nearest neighbors in the dictionary; the corresponding HR feature vector is reconstructed under the assumption that the local LR embedding is preserved. Three key aspects are introduced in order to build a low-complexity competitive algorithm: (i) a compact but efficient representation of the patches (feature representation) (ii) an accurate estimation of the patches by their nearest neighbors (weight computation) (iii) a compact and already built (therefore external) dictionary, which allows a one-step upscaling. The neighbor embedding SR algorithm so designed is shown to give good visual results, comparable to other state-of-the-art methods, while presenting an appreciable reduction of the computational time.
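A toy sketch of the neighbor-embedding step described above: find the K nearest LR dictionary atoms, compute nonnegative combination weights (here via SciPy's NNLS solver), and transfer the weights to the corresponding HR atoms. The random arrays stand in for a trained LR/HR patch dictionary and are purely illustrative:

```python
import numpy as np
from scipy.optimize import nnls

def reconstruct_hr_patch(lr_feature, lr_dict, hr_dict, k=12):
    """Express an LR feature as a nonnegative combination of its K nearest
    dictionary neighbors and reuse the weights to assemble the HR patch."""
    dists = np.linalg.norm(lr_dict - lr_feature, axis=1)
    idx = np.argsort(dists)[:k]
    # Nonnegative least squares: lr_feature ~= lr_dict[idx].T @ w with w >= 0.
    w, _ = nnls(lr_dict[idx].T, lr_feature)
    # Assume the local LR embedding is preserved in the HR space.
    return hr_dict[idx].T @ w

# Toy usage with random data standing in for trained LR/HR patch pairs.
rng = np.random.default_rng(0)
lr_dict = rng.random((1000, 16))    # 1000 LR patch features
hr_dict = rng.random((1000, 81))    # corresponding HR patches (flattened 9x9)
hr_patch = reconstruct_hr_patch(lr_dict[0] + 0.01 * rng.random(16), lr_dict, hr_dict)
```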

2,059 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: A conceptually clear and intuitive algorithm for contrast-based saliency estimation that outperforms all state-of-the-art approaches and can be formulated in a unified way using high-dimensional Gaussian filters.
Abstract: Saliency estimation has become a valuable tool in image processing. Yet, existing approaches exhibit considerable variation in methodology, and it is often difficult to attribute improvements in result quality to specific algorithm properties. In this paper we reconsider some of the design choices of previous methods and propose a conceptually clear and intuitive algorithm for contrast-based saliency estimation. Our algorithm consists of four basic steps. First, our method decomposes a given image into compact, perceptually homogeneous elements that abstract unnecessary detail. Based on this abstraction we compute two measures of contrast that rate the uniqueness and the spatial distribution of these elements. From the element contrast we then derive a saliency measure that produces a pixel-accurate saliency map which uniformly covers the objects of interest and consistently separates fore- and background. We show that the complete contrast and saliency estimation can be formulated in a unified way using high-dimensional Gaussian filters. This contributes to the conceptual simplicity of our method and lends itself to a highly efficient implementation with linear complexity. In a detailed experimental evaluation we analyze the contribution of each individual feature and show that our method outperforms all state-of-the-art approaches.
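A rough scikit-image sketch of the element-based contrast idea: SLIC superpixels stand in for the paper's abstraction step, and only a color-uniqueness measure (spatially weighted contrast between elements) is computed; the paper additionally uses a spatial-distribution measure and an efficient high-dimensional Gaussian-filtering formulation:

```python
import numpy as np
from skimage import color, data, segmentation

def element_uniqueness_saliency(rgb, n_segments=300):
    """Per-pixel saliency from the color uniqueness of perceptually
    homogeneous elements (superpixels), weighted by spatial proximity."""
    lab = color.rgb2lab(rgb)
    labels = segmentation.slic(rgb, n_segments=n_segments, compactness=10, start_label=0)
    n = labels.max() + 1
    # Mean Lab color and mean (normalized) position of every element.
    cols = np.array([lab[labels == i].mean(axis=0) for i in range(n)])
    ys, xs = np.mgrid[:rgb.shape[0], :rgb.shape[1]]
    pos = np.array([[ys[labels == i].mean(), xs[labels == i].mean()] for i in range(n)])
    pos = pos / max(rgb.shape[:2])
    # Uniqueness: color contrast to all other elements, Gaussian-weighted by distance.
    col_d = np.linalg.norm(cols[:, None] - cols[None], axis=-1) ** 2
    pos_d = np.linalg.norm(pos[:, None] - pos[None], axis=-1) ** 2
    w = np.exp(-pos_d / (2 * 0.25 ** 2))
    uniqueness = (w * col_d).sum(axis=1) / w.sum(axis=1)
    uniqueness = (uniqueness - uniqueness.min()) / (uniqueness.max() - uniqueness.min() + 1e-12)
    return uniqueness[labels]    # per-pixel saliency map

saliency = element_uniqueness_saliency(data.astronaut())
```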

1,711 citations


Journal ArticleDOI
TL;DR: "Radiomics" refers to the extraction and analysis of large amounts of advanced quantitative imaging features with high throughput from medical images obtained with computed tomography, positron emission tomography or magnetic resonance imaging, leading to a very large potential subject pool.

1,608 citations


01 Jan 2012
TL;DR: In this paper, the authors examine two factors that differentiate successful from unsuccessful brand extensions, product feature similarity and brand concept consistency, and find that consumers take into account not only the product-level feature similarity between the new product and the products already associated with the brand, but also the consistency between the brand's concept and that of the extension.
Abstract: This article examines two factors that differentiate between successful and unsuccessful brand extensions: product feature similarity and brand concept consistency. The results reveal that, in identifying brand extensions, consumers take into account not only information about the product-level feature similarity between the new product and the products already associated with the brand, but also the concept consistency between the brand concept and the extension. For both function-oriented and prestige-oriented brand names, the most favorable reactions occur when brand extensions are made with high brand concept consistency and high product feature similarity. In addition, the relative impact of these two factors differs to some extent, depending on the nature of the brand-name concept. When a brand's concept is consistent with those of its extension products, the prestige brand seems to have greater extendibility to products with low feature similarity than the functional brand does. Copyright 1991 by the University of Chicago.

1,173 citations


Journal Article
TL;DR: Overall it is concluded that the JMI criterion provides the best tradeoff in terms of accuracy, stability, and flexibility with small data samples.
Abstract: We present a unifying framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This is in response to the question: "what are the implicit statistical assumptions of feature selection criteria based on mutual information?". To answer this, we adopt a different strategy than is usual in the feature selection literature--instead of trying to define a criterion, we derive one, directly from a clearly specified objective function: the conditional likelihood of the training labels. While many hand-designed heuristic criteria try to optimize a definition of feature 'relevancy' and 'redundancy', our approach leads to a probabilistic framework which naturally incorporates these concepts. As a result we can unify the numerous criteria published over the last two decades, and show them to be low-order approximations to the exact (but intractable) optimisation problem. The primary contribution is to show that common heuristics for information based feature selection (including Markov Blanket algorithms as a special case) are approximate iterative maximisers of the conditional likelihood. A large empirical study provides strong evidence to favour certain classes of criteria, in particular those that balance the relative size of the relevancy/redundancy terms. Overall we conclude that the JMI criterion (Yang and Moody, 1999; Meyer et al., 2008) provides the best tradeoff in terms of accuracy, stability, and flexibility with small data samples.
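A toy sketch of greedy selection with the JMI criterion favored by the study, assuming discrete (or pre-binned) feature values so that mutual information can be estimated by simple counting; scikit-learn's mutual_info_score is used as the estimator:

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def jmi_select(X, y, n_features=10):
    """Greedy forward selection with the JMI score: each candidate feature j is
    scored by the sum over already-selected k of I((X_j, X_k); y)."""
    n = X.shape[1]
    relevance = [mutual_info_score(X[:, j], y) for j in range(n)]
    selected = [int(np.argmax(relevance))]          # start from the most relevant feature
    while len(selected) < n_features:
        best_j, best_score = None, -np.inf
        for j in range(n):
            if j in selected:
                continue
            score = sum(
                mutual_info_score(X[:, j] * (X[:, k].max() + 1) + X[:, k], y)  # encode the pair (X_j, X_k)
                for k in selected)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage with pre-binned features.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(500, 20))
y = (X[:, 0] + X[:, 3] > 3).astype(int)
print(jmi_select(X, y, n_features=5))
```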

1,058 citations


Book ChapterDOI
07 Oct 2012
TL;DR: An improvement to the Active Shape Model is proposed that allows for greater independence among the facial components and improves on the appearance fitting step by introducing a Viterbi optimization process that operates along the facial contours.
Abstract: We address the problem of interactive facial feature localization from a single image. Our goal is to obtain an accurate segmentation of facial features on high-resolution images under a variety of pose, expression, and lighting conditions. Although there has been significant work in facial feature localization, we are addressing a new application area, namely to facilitate intelligent high-quality editing of portraits, that brings requirements not met by existing methods. We propose an improvement to the Active Shape Model that allows for greater independence among the facial components and improves on the appearance fitting step by introducing a Viterbi optimization process that operates along the facial contours. Despite the improvements, we do not expect perfect results in all cases. We therefore introduce an interaction model whereby a user can efficiently guide the algorithm towards a precise solution. We introduce the Helen Facial Feature Dataset consisting of annotated portrait images gathered from Flickr that are more diverse and challenging than currently existing datasets. We present experiments that compare our automatic method to published results, and also a quantitative evaluation of the effectiveness of our interactive method.

973 citations


Journal ArticleDOI
TL;DR: This work introduces explicit feature maps for the additive class of kernels, such as the intersection, Hellinger's, and χ2 kernels, commonly used in computer vision, and enables their use in large scale problems.
Abstract: Large scale nonlinear support vector machines (SVMs) can be approximated by linear ones using a suitable feature map. The linear SVMs are in general much faster to learn and evaluate (test) than the original nonlinear SVMs. This work introduces explicit feature maps for the additive class of kernels, such as the intersection, Hellinger's, and χ2 kernels, commonly used in computer vision, and enables their use in large scale problems. In particular, we: 1) provide explicit feature maps for all additive homogeneous kernels along with closed form expression for all common kernels; 2) derive corresponding approximate finite-dimensional feature maps based on a spectral analysis; and 3) quantify the error of the approximation, showing that the error is independent of the data dimension and decays exponentially fast with the approximation order for selected kernels such as χ2. We demonstrate that the approximations have indistinguishable performance from the full kernels yet greatly reduce the train/test times of SVMs. We also compare with two other approximation methods: Nystrom's approximation of Perronnin et al. [1], which is data dependent, and the explicit map of Maji and Berg [2] for the intersection kernel, which, as in the case of our approximations, is data independent. The approximations are evaluated on a number of standard data sets, including Caltech-101 [3], Daimler-Chrysler pedestrians [4], and INRIA pedestrians [5].
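scikit-learn's AdditiveChi2Sampler implements a sampled explicit feature map of this kind for the additive chi-squared kernel. A short usage sketch, with the digits dataset standing in for a vision benchmark and a linear SVM on the mapped features replacing the exact kernel SVM:

```python
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X = X / 16.0                      # chi^2 feature maps expect nonnegative inputs

# Approximate chi^2 feature map, followed by a fast linear SVM in the mapped space.
clf = make_pipeline(AdditiveChi2Sampler(sample_steps=2),
                    LinearSVC(C=1.0, max_iter=10000))
print(cross_val_score(clf, X, y, cv=5).mean())
```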

804 citations


Posted Content
TL;DR: This paper proposes to accelerate the computation of the l2,1-norm regularized regression model by reformulating it as two equivalent smooth convex optimization problems, which are then solved via Nesterov's method, an optimal first-order black-box method for smooth convex optimization.
Abstract: The problem of joint feature selection across a group of related tasks has applications in many areas including biomedical informatics and computer vision. We consider the l2,1-norm regularized regression model for joint feature selection from multiple tasks, which can be derived in the probabilistic framework by assuming a suitable prior from the exponential family. One appealing feature of the l2,1-norm regularization is that it encourages multiple predictors to share similar sparsity patterns. However, the resulting optimization problem is challenging to solve due to the non-smoothness of the l2,1-norm regularization. In this paper, we propose to accelerate the computation by reformulating it as two equivalent smooth convex optimization problems which are then solved via Nesterov's method, an optimal first-order black-box method for smooth convex optimization. A key building block in solving the reformulations is the Euclidean projection. We show that the Euclidean projection for the first reformulation can be analytically computed, while the Euclidean projection for the second one can be computed in linear time. Empirical evaluations on several data sets verify the efficiency of the proposed algorithms.
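For intuition about the row sparsity induced by the l2,1 penalty, here is a plain proximal-gradient sketch (not the paper's Nesterov-accelerated reformulation): the proximal step is a row-wise Euclidean shrinkage, so whole feature rows are zeroed jointly across tasks. Names and the toy data are illustrative only:

```python
import numpy as np

def prox_l21(W, tau):
    """Proximal operator of tau * ||W||_{2,1}: shrink each row of W toward zero
    in Euclidean norm; rows with norm below tau become exactly zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))

def l21_multitask_regression(X, Y, lam=0.1, n_iter=500):
    """Proximal gradient for min_W 0.5 * ||X W - Y||_F^2 + lam * ||W||_{2,1}."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)        # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)
        W = prox_l21(W - step * grad, step * lam)
    return W

# Toy multi-task data: only the first 5 features are relevant, shared across 3 tasks.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
W_true = np.zeros((50, 3))
W_true[:5] = rng.standard_normal((5, 3))
Y = X @ W_true + 0.01 * rng.standard_normal((200, 3))
W_hat = l21_multitask_regression(X, Y, lam=5.0)
print(np.nonzero(np.linalg.norm(W_hat, axis=1) > 1e-6)[0])   # indices of selected features
```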

630 citations


Journal ArticleDOI
TL;DR: This paper created a challenging real-world copy-move dataset and a software framework for systematic image manipulation, and examined the 15 most prominent feature sets, finding that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA, and Zernike features, perform very well.
Abstract: A copy-move forgery is created by copying and pasting content within the same image, and potentially postprocessing it. In recent years, the detection of copy-move forgeries has become one of the most actively researched topics in blind image forensics. A considerable number of different algorithms have been proposed focusing on different types of postprocessed copies. In this paper, we aim to answer which copy-move forgery detection algorithms and processing steps (e.g., matching, filtering, outlier detection, affine transformation estimation) perform best in various postprocessing scenarios. The focus of our analysis is to evaluate the performance of previously proposed feature sets. We achieve this by casting existing algorithms in a common pipeline. In this paper, we examined the 15 most prominent feature sets. We analyzed the detection performance on a per-image basis and on a per-pixel basis. We created a challenging real-world copy-move dataset, and a software framework for systematic image manipulation. Experiments show that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA, and Zernike features, perform very well. These feature sets exhibit the best robustness against various noise sources and downsampling, while reliably identifying the copied regions.
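As a toy illustration of one of the evaluated families (a block-based DCT feature set), not of the paper's full pipeline: overlapping blocks are described by their low-frequency DCT coefficients, sorted lexicographically, and near-identical blocks that are far enough apart are flagged. All parameters below are ad hoc:

```python
import numpy as np
from scipy.fftpack import dctn

def block_dct_copy_move(gray, block=16, n_coeffs=9, min_offset=24, thresh=1e-3):
    """Flag pairs of distant image blocks whose low-frequency DCT features match."""
    h, w = gray.shape
    feats, coords = [], []
    for y in range(0, h - block + 1, 2):            # stride 2 keeps the toy example fast
        for x in range(0, w - block + 1, 2):
            d = dctn(gray[y:y + block, x:x + block], norm="ortho")
            feats.append(d[:3, :3].ravel()[:n_coeffs])
            coords.append((y, x))
    feats, coords = np.array(feats), np.array(coords)
    order = np.lexsort(feats.T[::-1])               # lexicographic sort of the feature rows
    pairs = []
    for a, b in zip(order[:-1], order[1:]):
        if np.sum((feats[a] - feats[b]) ** 2) < thresh and \
           np.linalg.norm(coords[a] - coords[b]) > min_offset:
            pairs.append((tuple(coords[a]), tuple(coords[b])))
    return pairs

# Toy usage: paste a copied region into a random image and detect it.
rng = np.random.default_rng(0)
img = rng.random((128, 128))
img[64:96, 64:96] = img[16:48, 16:48]
print(len(block_dct_copy_move(img)))                # > 0: duplicated blocks were found
```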

623 citations


Book ChapterDOI
07 Oct 2012
TL;DR: This paper proposes to learn a metric from pairs of samples from different cameras, so that even less sophisticated features describing color and texture information are sufficient for finally getting state-of-the-art classification results.
Abstract: Matching persons across non-overlapping cameras is a rather challenging task. Thus, successful methods often build on complex feature representations or sophisticated learners. A recent trend to tackle this problem is to use metric learning to find a suitable space for matching samples from different cameras. However, most of these approaches ignore the transition from one camera to the other. In this paper, we propose to learn a metric from pairs of samples from different cameras. In this way, even less sophisticated features describing color and texture information are sufficient for finally getting state-of-the-art classification results. Moreover, once the metric has been learned, only linear projections are necessary at search time, where a simple nearest neighbor classification is performed. The approach is demonstrated on three publicly available datasets of different complexity, where it can be seen that state-of-the-art results can be obtained at much lower computational costs.
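A hedged sketch of the general idea of learning a Mahalanobis metric from matching and non-matching cross-camera pairs: it uses a KISSME-style statistic (difference of inverse pair covariances) rather than the paper's exact formulation, and all names and toy data are hypothetical:

```python
import numpy as np

def metric_from_pairs(X_a, X_b, matches, seed=0):
    """Learn a Mahalanobis matrix M from difference vectors of matching and
    (randomly sampled) non-matching pairs taken from cameras a and b."""
    rng = np.random.default_rng(seed)
    d_pos = np.array([X_a[i] - X_b[j] for i, j in matches])
    d_neg = np.array([X_a[i] - X_b[rng.integers(len(X_b))] for i, _ in matches])  # may rarely hit a true match
    M = np.linalg.pinv(d_pos.T @ d_pos / len(d_pos)) - np.linalg.pinv(d_neg.T @ d_neg / len(d_neg))
    w, V = np.linalg.eigh(M)                         # project back onto the PSD cone
    return V @ np.diag(np.clip(w, 0.0, None)) @ V.T

def metric_distance(M, x, y):
    d = x - y
    return float(d @ M @ d)

# Toy usage: camera b sees the same people as camera a, with a small perturbation.
rng = np.random.default_rng(1)
X_a = rng.standard_normal((100, 20))
X_b = X_a + 0.1 * rng.standard_normal((100, 20))
M = metric_from_pairs(X_a, X_b, matches=[(i, i) for i in range(100)])
print(metric_distance(M, X_a[0], X_b[0]) < metric_distance(M, X_a[0], X_b[1]))  # typically True
```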

Journal ArticleDOI
TL;DR: This paper analyzes key aspects of the various AIA methods, including both feature extraction and semantic learning methods, and provides a comprehensive survey of automatic image annotation.

Proceedings Article
22 Jul 2012
TL;DR: A new unsupervised learning algorithm, namely Nonnegative Discriminative Feature Selection (NDFS), which exploits the discriminative information and feature correlation simultaneously to select a better feature subset.
Abstract: In this paper, a new unsupervised learning algorithm, namely Nonnegative Discriminative Feature Selection (NDFS), is proposed. To exploit the discriminative information in unsupervised scenarios, we perform spectral clustering to learn the cluster labels of the input samples, during which the feature selection is performed simultaneously. The joint learning of the cluster labels and feature selection matrix enables NDFS to select the most discriminative features. To learn more accurate cluster labels, a nonnegative constraint is explicitly imposed on the class indicators. To reduce redundant or even noisy features, an l2,1-norm minimization constraint is added to the objective function, which guarantees that the feature selection matrix is sparse in rows. Our algorithm exploits the discriminative information and feature correlation simultaneously to select a better feature subset. A simple yet efficient iterative algorithm is designed to optimize the proposed objective function. Experimental results on different real-world datasets demonstrate the encouraging performance of our algorithm over state-of-the-art methods.

Proceedings ArticleDOI
16 Jun 2012
TL;DR: In the authors' experiments, it is demonstrated that conditional regression forests outperform regression forests for facial feature detection, and close-to-human accuracy is achieved while processing images in real time.
Abstract: Although facial feature detection from 2D images is a well-studied field, there is a lack of real-time methods that estimate feature points even on low quality images. Here we propose conditional regression forests for this task. While regression forests learn the relations between facial image patches and the location of feature points from the entire set of faces, conditional regression forests learn the relations conditioned on global face properties. In our experiments, we use the head pose as a global property and demonstrate that conditional regression forests outperform regression forests for facial feature detection. We have evaluated the method on the challenging Labeled Faces in the Wild [20] database, where close-to-human accuracy is achieved while processing images in real-time.

Journal ArticleDOI
16 Jun 2012
TL;DR: This paper proposes a new neighborhood repulsed metric learning (NRML) method for kinship verification, as well as a multiview NRML (MNRML) method that seeks a common distance metric to make better use of multiple feature descriptors and further improve the verification performance.
Abstract: Kinship verification from facial images is an interesting and challenging problem in computer vision, and there have been very limited attempts to tackle this problem in the literature. In this paper, we propose a new neighborhood repulsed metric learning (NRML) method for kinship verification. Motivated by the fact that interclass samples (without a kinship relation) with higher similarity usually lie in a neighborhood and are more easily misclassified than those with lower similarity, we aim to learn a distance metric under which the intraclass samples (with a kinship relation) are pulled as close as possible and interclass samples lying in a neighborhood are repulsed and pushed away as far as possible, simultaneously, such that more discriminative information can be exploited for verification. To make better use of multiple feature descriptors to extract complementary information, we further propose a multiview NRML (MNRML) method to seek a common distance metric to perform multiple feature fusion and improve the kinship verification performance. Experimental results are presented to demonstrate the efficacy of our proposed methods. Finally, we also test human ability in kinship verification from facial images, and our experimental results show that the performance of our methods is comparable to that of human observers.

Journal ArticleDOI
TL;DR: The results show that the proposed method can effectively extract the signal features needed to diagnose the occurrence of early faults in rotating machinery.

Book
09 Oct 2012
TL;DR: This book is an essential guide to the implementation of image processing and computer vision techniques, with tutorial introductions and sample code in Matlab, and contains extensive new material on Haar wavelets, Viola-Jones, bilateral filtering, SURF, PCA-SIFT, moving object detection and tracking.
Abstract: This book is an essential guide to the implementation of image processing and computer vision techniques, with tutorial introductions and sample code in Matlab. Algorithms are presented and fully explained to enable complete understanding of the methods and techniques demonstrated. As one reviewer noted, "The main strength of the proposed book is the exemplar code of the algorithms." Fully updated with the latest developments in feature extraction, including expanded tutorials and new techniques, this new edition contains extensive new material on Haar wavelets, Viola-Jones, bilateral filtering, SURF, PCA-SIFT, moving object detection and tracking, development of symmetry operators, LBP texture analysis, Adaboost, and a new appendix on color models. Coverage of distance measures, feature detectors, wavelets, level sets and texture tutorials has been extended.
* Named a 2012 Notable Computer Book for Computing Methodologies by Computing Reviews
* Essential reading for engineers and students working in this cutting-edge field
* Ideal module text and background reference for courses in image processing and computer vision
* The only currently available text to concentrate on feature extraction with working implementation and worked through derivation

Journal ArticleDOI
TL;DR: The patch alignment framework is introduced to linearly combine multiple features in the optimal way and obtain a unified low-dimensional representation of these multiple features for subsequent classification in hyperspectral remote sensing image classification.
Abstract: In hyperspectral remote sensing image classification, multiple features, e.g., spectral, texture, and shape features, are employed to represent pixels from different perspectives. It has been widely acknowledged that properly combining multiple features always results in good classification performance. In this paper, we introduce the patch alignment framework to linearly combine multiple features in the optimal way and obtain a unified low-dimensional representation of these multiple features for subsequent classification. Each feature has its particular contribution to the unified representation determined by simultaneously optimizing the weights in the objective function. This scheme considers the specific statistical properties of each feature to achieve a physically meaningful unified low-dimensional representation of multiple features. Experiments on the classification of the hyperspectral digital imagery collection experiment and reflective optics system imaging spectrometer hyperspectral data sets suggest that this scheme is effective.

Journal ArticleDOI
TL;DR: A class of linear transformation techniques based on blockwise transformation of MFLE is studied; these transformations effectively decorrelate the filter bank log energies and also capture speech information in an efficient manner.

Journal Article
TL;DR: This work introduces a framework for feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion, and shows that a number of existing feature selectors are special cases of this framework.
Abstract: We introduce a framework for feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion. The key idea is that good features should be highly dependent on the labels. Our approach leads to a greedy procedure for feature selection. We show that a number of existing feature selectors are special cases of this framework. Experiments on both artificial and real-world data show that our feature selector works well in practice.
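A small self-contained sketch of the idea: greedily add the feature whose inclusion maximizes a (biased) empirical HSIC estimate between the selected features and the labels. The RBF/label kernels and the forward-selection strategy are simplifications of the framework, chosen only to keep the example short:

```python
import numpy as np

def _center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def hsic(K, L):
    """Biased empirical HSIC estimate between two kernel matrices."""
    n = K.shape[0]
    return np.trace(_center(K) @ _center(L)) / (n - 1) ** 2

def rbf_kernel(X, gamma=1.0):
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

def hsic_forward_select(X, y, n_features=5, gamma=1.0):
    """Greedy forward selection: good features should be highly dependent on the labels."""
    L = (y[:, None] == y[None, :]).astype(float)     # simple label kernel
    selected = []
    while len(selected) < n_features:
        best_j, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            score = hsic(rbf_kernel(X[:, selected + [j]], gamma), L)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage: only features 0 and 7 carry label information.
rng = np.random.default_rng(0)
X = rng.standard_normal((150, 12))
y = (X[:, 0] + X[:, 7] > 0).astype(int)
print(hsic_forward_select(X, y, n_features=3))
```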

Journal ArticleDOI
TL;DR: It is shown how to generalize locality-sensitive hashing to accommodate arbitrary kernel functions, making it possible to preserve the algorithm's sublinear time similarity search guarantees for a wide class of useful similarity functions.
Abstract: Fast retrieval methods are critical for many large-scale and data-driven vision applications. Recent work has explored ways to embed high-dimensional features or complex distance functions into a low-dimensional Hamming space where items can be efficiently searched. However, existing methods do not apply for high-dimensional kernelized data when the underlying feature embedding for the kernel is unknown. We show how to generalize locality-sensitive hashing to accommodate arbitrary kernel functions, making it possible to preserve the algorithm's sublinear time similarity search guarantees for a wide class of useful similarity functions. Since a number of successful image-based kernels have unknown or incomputable embeddings, this is especially valuable for image retrieval tasks. We validate our technique on several data sets, and show that it enables accurate and fast performance for several vision problems, including example-based object classification, local feature matching, and content-based retrieval.

Journal ArticleDOI
TL;DR: A forensic tool able to discriminate between original and forged regions in an image captured by a digital camera is presented, based on a new feature measuring the presence of demosaicking artifacts at a local level and a new statistical model that allows deriving the tampering probability of each 2 × 2 image block without requiring a priori knowledge of the position of the forged region.
Abstract: In this paper, a forensic tool able to discriminate between original and forged regions in an image captured by a digital camera is presented. We make the assumption that the image is acquired using a Color Filter Array, and that tampering removes the artifacts due to the demosaicking algorithm. The proposed method is based on a new feature measuring the presence of demosaicking artifacts at a local level, and on a new statistical model that allows deriving the tampering probability of each 2 × 2 image block without requiring a priori knowledge of the position of the forged region. Experimental results on different cameras equipped with different demosaicking algorithms demonstrate both the validity of the theoretical model and the effectiveness of our scheme.

Proceedings ArticleDOI
01 Jan 2012
TL;DR: This work proposes a novel method for re-identification that learns a selection and weighting of mid-level semantic attributes to describe people, an attribute-centric, parts-based feature representation that differs from and complements existing low-level features that rely purely on bottom-up statistics for feature selection.
Abstract: Visually identifying a target individual reliably in a crowded environment observed by a distributed camera network is critical to a variety of tasks in managing business information, border control, and crime prevention. Automatic re-identification of a human candidate from public space CCTV video is challenging due to spatiotemporal visual feature variations and strong visual similarity between different people, compounded by low-resolution and poor quality video data. In this work, we propose a novel method for re-identification that learns a selection and weighting of mid-level semantic attributes to describe people. Specifically, the model learns an attribute-centric, parts-based feature representation. This differs from and complements existing low-level features for re-identification that rely purely on bottom-up statistics for feature selection, which are limited in reliably discriminating and identifying the visual appearances of target people appearing in different camera views under certain degrees of occlusion due to crowdedness. Our experiments demonstrate the effectiveness of our approach compared to existing feature representations when applied to benchmarking datasets.

Proceedings Article
03 Dec 2012
TL;DR: This paper incorporates deep learning into the congealing alignment framework, and modify the learning algorithm for the restricted Boltzmann machine by incorporating a group sparsity penalty, leading to a topographic organization of the learned filters and improving subsequent alignment results.
Abstract: Unsupervised joint alignment of images has been demonstrated to improve performance on recognition tasks such as face verification. Such alignment reduces undesired variability due to factors such as pose, while only requiring weak supervision in the form of poorly aligned examples. However, prior work on unsupervised alignment of complex, real-world images has required the careful selection of feature representation based on hand-crafted image descriptors, in order to achieve an appropriate, smooth optimization landscape. In this paper, we instead propose a novel combination of unsupervised joint alignment with unsupervised feature learning. Specifically, we incorporate deep learning into the congealing alignment framework. Through deep learning, we obtain features that can represent the image at differing resolutions based on network depth, and that are tuned to the statistics of the specific data being aligned. In addition, we modify the learning algorithm for the restricted Boltzmann machine by incorporating a group sparsity penalty, leading to a topographic organization of the learned filters and improving subsequent alignment results. We apply our method to the Labeled Faces in the Wild database (LFW). Using the aligned images produced by our proposed unsupervised algorithm, we achieve higher accuracy in face verification compared to prior work in both unsupervised and supervised alignment. We also match the accuracy for the best available commercial method.

Journal ArticleDOI
01 Jul 2012
TL;DR: A targeted feature transform based on Gabor filters is developed for this system for 3D object retrieval from sketched feature lines, and it is shown objectively that this transform is better suited than other approaches from the literature developed for similar tasks.
Abstract: We develop a system for 3D object retrieval based on sketched feature lines as input. For objective evaluation, we collect a large number of query sketches from human users that are related to an existing database of objects. The sketches turn out to be generally quite abstract, with large local and global deviations from the original shape. Based on this observation, we decide to use a bag-of-features approach over computer-generated line drawings of the objects. We develop a targeted feature transform based on Gabor filters for this system. We can show objectively that this transform is better suited than other approaches from the literature developed for similar tasks. Moreover, we demonstrate how to optimize the parameters of our approach, as well as of other approaches, based on the gathered sketches. In the resulting comparison, our approach is significantly better than any other system described so far.
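To illustrate only the Gabor filter-bank ingredient (the paper's targeted transform for line drawings is more involved), a simple scikit-image sketch that builds a descriptor from mean filter-response energies over a few frequencies and orientations:

```python
import numpy as np
from skimage import color, data, filters

def gabor_energy_descriptor(gray, frequencies=(0.1, 0.2, 0.3), n_orientations=6):
    """Mean Gabor response energy per (frequency, orientation) pair."""
    feats = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = filters.gabor(gray, frequency=f, theta=theta)
            feats.append(np.mean(real ** 2 + imag ** 2))
    return np.array(feats)

gray = color.rgb2gray(data.astronaut())              # stand-in for a rendered line drawing
print(gabor_energy_descriptor(gray).shape)           # (18,)
```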

Journal ArticleDOI
TL;DR: A survey and a comparative evaluation of recent techniques for moving cast shadow detection indicate that all shadow detection approaches make different contributions and all have individual strengths and weaknesses.

Book ChapterDOI
07 Oct 2012
TL;DR: This study shows that certain features play a more important role than others under different circumstances, and proposes a novel unsupervised approach for learning a bottom-up feature importance, so that features extracted from different individuals are weighted adaptively, driven by their unique and inherent appearance attributes.
Abstract: State-of-the-art person re-identification methods seek robust person matching through combining various feature types. Often, these features are implicitly assigned with a single vector of global weights, which are assumed to be universally good for all individuals, independent to their different appearances. In this study, we show that certain features play more important role than others under different circumstances. Consequently, we propose a novel unsupervised approach for learning a bottom-up feature importance, so features extracted from different individuals are weighted adaptively driven by their unique and inherent appearance attributes. Extensive experiments on two public datasets demonstrate that attribute-sensitive feature importance facilitates more accurate person matching when it is fused together with global weights obtained using existing methods.

Journal ArticleDOI
TL;DR: The proposed image retargeting algorithm effectively preserves the visually important regions for images, efficiently removes the less crucial regions, and therefore significantly outperforms the relevant state-of-the-art algorithms, as demonstrated with the in-depth analysis in the extensive experiments.
Abstract: Saliency detection plays important roles in many image processing applications, such as regions of interest extraction and image resizing. Existing saliency detection models are built in the uncompressed domain. Since most images over the Internet are typically stored in the compressed domain, such as joint photographic experts group (JPEG), we propose a novel saliency detection model in the compressed domain in this paper. The intensity, color, and texture features of the image are extracted from discrete cosine transform (DCT) coefficients in the JPEG bit-stream. The saliency value of each DCT block is obtained based on the Hausdorff distance calculation and feature map fusion. Based on the proposed saliency detection model, we further design an adaptive image retargeting algorithm in the compressed domain. The proposed image retargeting algorithm utilizes a multioperator operation comprising block-based seam carving and image scaling to resize images. A new definition of texture homogeneity is given to determine the number of block-based seams to remove. Thanks to the accurate saliency information derived directly in the compressed domain, the proposed image retargeting algorithm effectively preserves the visually important regions for images, efficiently removes the less crucial regions, and therefore significantly outperforms the relevant state-of-the-art algorithms, as demonstrated with the in-depth analysis in the extensive experiments.

Posted Content
TL;DR: A new learning method for heterogeneous domain adaptation (HDA), in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions, and it is demonstrated that HFA outperforms the existing HDA methods.
Abstract: We propose a new learning method for heterogeneous domain adaptation (HDA), in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. Using two different projection matrices, we first transform the data from two domains into a common subspace in order to measure the similarity between the data from two domains. We then propose two new feature mapping functions to augment the transformed data with their original features and zeros. The existing learning methods (e.g., SVM and SVR) can be readily incorporated with our newly proposed augmented feature representations to effectively utilize the data from both domains for HDA. Using the hinge loss function in SVM as an example, we introduce the detailed objective function in our method called Heterogeneous Feature Augmentation (HFA) for a linear case and also describe its kernelization in order to efficiently cope with the data with very high dimensions. Moreover, we also develop an alternating optimization algorithm to effectively solve the nontrivial optimization problem in our HFA method. Comprehensive experiments on two benchmark datasets clearly demonstrate that HFA outperforms the existing HDA methods.
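A hypothetical sketch of the augmented feature maps described above: source samples become [P x_s, x_s, 0] and target samples [Q x_t, 0, x_t], so both domains live in a common (d_c + d_s + d_t)-dimensional space where a standard SVM can be trained. In HFA the projections P and Q are learned jointly with the classifier; here they are simply given random values to show the shapes:

```python
import numpy as np

def hfa_augment(Xs, Xt, P, Q):
    """Augment source/target features with a common projected part, the original
    features, and zero padding, following the feature maps sketched above."""
    ns, d_s = Xs.shape
    nt, d_t = Xt.shape
    Xs_aug = np.hstack([Xs @ P.T, Xs, np.zeros((ns, d_t))])   # [P x_s, x_s, 0]
    Xt_aug = np.hstack([Xt @ Q.T, np.zeros((nt, d_s)), Xt])   # [Q x_t, 0, x_t]
    return Xs_aug, Xt_aug

# Toy shapes: 300-dim source features, 100-dim target features, 50-dim common subspace.
rng = np.random.default_rng(0)
Xs, Xt = rng.standard_normal((20, 300)), rng.standard_normal((8, 100))
P, Q = rng.standard_normal((50, 300)), rng.standard_normal((50, 100))
Xs_aug, Xt_aug = hfa_augment(Xs, Xt, P, Q)
print(Xs_aug.shape, Xt_aug.shape)                    # (20, 450) (8, 450)
```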