Author

Chang Tang

Bio: Chang Tang is an academic researcher from China University of Geosciences (Wuhan). The author has contributed to research in topics: Cluster analysis & Graph (abstract data type). The author has an h-index of 25 and has co-authored 67 publications receiving 2034 citations. Previous affiliations of Chang Tang include Information Technology University & Tianjin University.

Papers
Proceedings Article
01 Dec 2015
TL;DR: This paper proposes an open framework that uses the kernel matrix over feature dimensions as a generic representation, discusses its properties and advantages, and shows that it consistently outperforms its covariance counterpart on various visual recognition tasks.
Abstract: The covariance matrix has recently received increasing attention in computer vision by leveraging the Riemannian geometry of symmetric positive-definite (SPD) matrices. Originally proposed as a region descriptor, it has now been used as a generic representation in various recognition tasks. However, the covariance matrix has shortcomings: it is prone to singularity, has limited capability in modeling complicated feature relationships, and has a fixed form of representation. This paper argues that more appropriate SPD-matrix-based representations should be explored to achieve better recognition. It proposes an open framework to use the kernel matrix over feature dimensions as a generic representation and discusses its properties and advantages. The proposed framework extends covariance representation to the much wider range of options offered by this new representation. An experimental study shows that this representation consistently outperforms its covariance counterpart on various visual recognition tasks. In particular, it achieves significant improvement on skeleton-based human action recognition, demonstrating state-of-the-art performance over both the covariance and the existing non-covariance representations.

102 citations
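
As a rough illustration of the representation described in the abstract above, the sketch below builds a d x d kernel matrix over feature dimensions next to the usual covariance matrix. The RBF kernel, the `gamma` value, and the function names are assumptions made for illustration; the abstract does not fix a particular kernel.

```python
import numpy as np

def covariance_representation(X):
    """d x d sample covariance of an (n, d) set of feature vectors."""
    return np.cov(X, rowvar=False)

def kernel_representation(X, gamma=0.5):
    """d x d RBF kernel matrix computed over feature dimensions.

    Each column of X (one feature dimension observed across the n samples)
    is treated as a single point in R^n. The RBF kernel is an assumed choice.
    """
    F = X.T  # shape (d, n): one row per feature dimension
    sq_norms = np.sum(F * F, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (F @ F.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))  # n = 5 samples, d = 20 feature dimensions
C = covariance_representation(X)
K = kernel_representation(X)
```

With fewer samples than feature dimensions (n = 5, d = 20 here), the sample covariance is rank-deficient, whereas the RBF kernel matrix remains symmetric with a unit diagonal.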

Journal Article
TL;DR: This paper proposes an Efficient and Effective Incomplete Multi-view Clustering (EE-IMVC) algorithm, which imputes each incomplete base matrix generated by incomplete views with a learned consensus clustering matrix, addressing the intensive computational and storage complexity, over-complicated optimization, and limited clustering improvement of kernel matrix imputation.
Abstract: Incomplete multi-view clustering (IMVC) optimally combines multiple pre-specified incomplete views to improve clustering performance. Among various excellent solutions, the recently proposed multiple kernel $k$-means with incomplete kernels (MKKM-IK) forms a benchmark, which redefines IMVC as a joint optimization problem where the clustering and kernel matrix imputation tasks are alternately performed until convergence. Though it demonstrates promising performance in various applications, we observe that the manner of kernel matrix imputation in MKKM-IK incurs intensive computational and storage complexities, over-complicated optimization, and only limited improvement in clustering performance. In this paper, we first propose an Efficient and Effective Incomplete Multi-view Clustering (EE-IMVC) algorithm to address these issues. Instead of completing the incomplete kernel matrices, EE-IMVC proposes to impute each incomplete base matrix generated by incomplete views with a learned consensus clustering matrix. Moreover, we further improve this algorithm by incorporating prior knowledge to regularize the learned consensus clustering matrix. Two three-step iterative algorithms are carefully developed to solve the resultant optimization problems with linear computational complexity, and their convergence is theoretically proven. After that, we theoretically study the generalization bound of the proposed algorithms. Furthermore, we conduct comprehensive experiments to study the proposed algorithms in terms of clustering accuracy, evolution of the learned consensus clustering matrix, and convergence. As indicated, our algorithms demonstrate their effectiveness by significantly and consistently outperforming some state-of-the-art ones.

90 citations

Journal Article
TL;DR: This work presents an MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to as CvLP-DCL, which projects each view into a label space consisting of a consensus part and a view-specific part.
Abstract: Although demonstrating great success, previous multi-view unsupervised feature selection (MV-UFS) methods often construct a view-specific similarity graph and characterize the local structure of data within each single view. In such a way, the cross-view information could be ignored. In addition, they usually assume that different feature views are projected from a latent feature space, while the diversity of different views cannot be fully captured. In this work, we present an MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to as CvLP-DCL for short. In order to exploit both the shared and distinguishing information across different views, we project each view into a label space, which consists of a consensus part and a view-specific part. In this way, we regularize the fact that different views represent the same samples. Meanwhile, a cross-view similarity graph learning term with matrix-induced regularization is embedded to preserve the local structure of data in the label space. By imposing the $l_{2,1}$-norm on the feature projection matrices to constrain row sparsity, discriminative features can be selected from different views. An efficient algorithm is designed to solve the resultant optimization problem, and extensive experiments on six public datasets are conducted to validate the effectiveness of the proposed CvLP-DCL.

89 citations
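
The role of the $l_{2,1}$-norm mentioned in the abstract above can be made concrete with a small sketch. It computes the norm as the sum of the Euclidean norms of the rows of a projection matrix and ranks features by row norm. The matrix `W`, the function names, and the ranking rule are illustrative assumptions, not the CvLP-DCL algorithm itself.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def top_features(W, k):
    """Indices of the k rows (features) with the largest row norms.

    Ranking by row norm is an assumed selection rule for illustration.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(-row_norms)[:k]

# Hypothetical projection matrix: 3 features projected to 2 dimensions.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

norm_value = l21_norm(W)       # 5 + 0 + 1 = 6
selected = top_features(W, 2)  # rows 0 and 2 carry all the weight
```

Because the penalty sums whole-row norms, minimizing it tends to drive entire rows to zero, which is why a row-sparse projection matrix can be read as a feature selector.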

Journal Article
TL;DR: The proposed method takes into consideration not only the effect of light refraction but also the blur texture of an image, and is more reliable in defocus map estimation than various state-of-the-art methods.
Abstract: We present an effective method for defocus map estimation from a single natural image. It is inspired by the observation that defocusing can significantly affect the spectrum amplitude at the object edge locations in an image. By establishing the relationship between the amount of spatially varying defocus blur and spectrum contrast at edge locations, we first estimate the blur amount at these edge locations; a full defocus map can then be obtained by propagating the blur amount at edge locations over the entire image with a nonhomogeneous optimization procedure. The proposed method takes into consideration not only the effect of light refraction but also the blur texture of an image. Experimental results demonstrate that our proposed method is more reliable in defocus map estimation than various state-of-the-art methods.

88 citations

Journal Article
03 Apr 2020
TL;DR: The proposed CGD takes the traditional predefined graph matrices of different views as input, and learns an improved graph for each single view via an iterative cross diffusion process by capturing the underlying manifold geometry structure of original data points, and leveraging the complementary information among multiple graphs.
Abstract: Graph-based multi-view clustering has received great attention for exploring the neighborhood relationship among data points from multiple views. Though these methods achieve great success in various applications, we observe that most previous methods learn a consensus graph by building certain data representation models, which has at least the following drawbacks. First, their clustering performance depends heavily on the data representation capability of the model. Second, solving the resultant optimization models usually results in high computational complexity. Third, these models often have hyper-parameters that need to be tuned to obtain the optimal results. In this work, we propose a general, effective and parameter-free method with a convergence guarantee to learn a unified graph for multi-view data clustering via cross-view graph diffusion (CGD), which is the first attempt to employ a diffusion process for multi-view clustering. The proposed CGD takes the traditional predefined graph matrices of different views as input, and learns an improved graph for each single view via an iterative cross diffusion process by 1) capturing the underlying manifold geometry structure of original data points, and 2) leveraging the complementary information among multiple graphs. The final unified graph used for clustering is obtained by averaging the improved view-associated graphs. Extensive experiments on several benchmark datasets are conducted to demonstrate the effectiveness of the proposed method in terms of seven clustering evaluation metrics.

80 citations


Cited by
Journal Article
TL;DR: A broad survey of the recent advances in convolutional neural networks can be found in this article, where the authors discuss the improvements of CNN on different aspects, namely, layer design, activation function, loss function, regularization, optimization and fast computation.

3,125 citations

Posted Content
TL;DR: In this paper, a large-scale dataset for RGB+D human action recognition was introduced with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects.
Abstract: Recent approaches in depth-based human activity analysis achieved outstanding performance and proved the effectiveness of 3D representation for classification of action classes. Currently available depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of training samples, distinct class labels, camera views and variety of subjects. In this paper we introduce a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects. Our dataset contains 60 different action classes including daily, mutual, and health-related actions. In addition, we propose a new recurrent neural network structure to model the long-term temporal correlation of the features for each body part, and utilize them for better action classification. Experimental results show the advantages of applying deep learning methods over state-of-the-art hand-crafted features on the suggested cross-subject and cross-view evaluation criteria for our dataset. The introduction of this large scale dataset will enable the community to apply, develop and adapt various data-hungry learning techniques for the task of depth-based and RGB+D-based human activity analysis.

1,448 citations

Proceedings Article
01 Jun 2016
TL;DR: A large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects is introduced and a new recurrent neural network structure is proposed to model the long-term temporal correlation of the features for each body part, and utilize them for better action classification.
Abstract: Recent approaches in depth-based human activity analysis achieved outstanding performance and proved the effectiveness of 3D representation for classification of action classes. Currently available depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of training samples, distinct class labels, camera views and variety of subjects. In this paper we introduce a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects. Our dataset contains 60 different action classes including daily, mutual, and health-related actions. In addition, we propose a new recurrent neural network structure to model the long-term temporal correlation of the features for each body part, and utilize them for better action classification. Experimental results show the advantages of applying deep learning methods over state-of-the-art hand-crafted features on the suggested cross-subject and cross-view evaluation criteria for our dataset. The introduction of this large scale dataset will enable the community to apply, develop and adapt various data-hungry learning techniques for the task of depth-based and RGB+D-based human activity analysis.

1,391 citations

Posted Content
TL;DR: This paper details the improvements of CNN on different aspects, including layer design, activation function, loss function, regularization, optimization and fast computation, and introduces various applications of convolutional neural networks in computer vision, speech and natural language processing.
Abstract: In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing. Among different types of deep neural networks, convolutional neural networks have been most extensively studied. Leveraging the rapid growth in the amount of annotated data and the great improvements in the strength of graphics processing units, research on convolutional neural networks has emerged swiftly and achieved state-of-the-art results on various tasks. In this paper, we provide a broad survey of the recent advances in convolutional neural networks. We detail the improvements of CNN on different aspects, including layer design, activation function, loss function, regularization, optimization and fast computation. Besides, we also introduce various applications of convolutional neural networks in computer vision, speech and natural language processing.

1,302 citations

Journal Article
TL;DR: An independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator, or HSIC, is proposed.
Abstract: We propose an independence criterion based on the eigen-spectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). This approach has several advantages, compared with previous kernel-based independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on HSIC do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernel-based criteria, and of other recently published ICA methods.

1,134 citations
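
The empirical HSIC estimate described in the abstract above is commonly written as tr(KHLH) / (m - 1)^2, where K and L are kernel matrices on the two samples and H = I - (1/m) 11^T is the centering matrix. The sketch below computes this biased estimate. The RBF kernel, the bandwidth `sigma`, the function names, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    """RBF kernel matrix for a sample given as a 1-D or 2-D array."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x * x, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (m - 1)^2."""
    m = len(x)
    K = rbf_kernel(x, sigma)
    L = rbf_kernel(y, sigma)
    H = np.eye(m) - np.ones((m, m)) / m  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (m - 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=200)
hsic_dependent = hsic(x, x ** 2)                  # y is a function of x
hsic_independent = hsic(x, rng.normal(size=200))  # y drawn independently
```

In this toy setup the dependent pair is uncorrelated yet clearly dependent, so the estimate for it should come out larger than for the independent pair. Turning the statistic into a formal test would additionally require a threshold, which is not sketched here.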