Author

Zinan Zeng

Agency for Science, Technology and Research

Other affiliations: Nanyang Technological University

Bio: Zinan Zeng is an academic researcher at the Agency for Science, Technology and Research. The author has contributed to research on contextual image classification and partial permutation, has an h-index of 9, and has co-authored 9 publications receiving 2,270 citations. Previous affiliations of Zinan Zeng include Nanyang Technological University.

Papers

Journal Article•DOI•

PCANet: A Simple Deep Learning Baseline for Image Classification?

Tsung-Han Chan¹, Kui Jia², Shenghua Gao³, Jiwen Lu⁴, Zinan Zeng⁵, Yi Ma³

MediaTek¹, University of Macau², ShanghaiTech University³, Tsinghua University⁴, Agency for Science, Technology and Research⁵

14 Apr 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: PCANet as discussed by the authors is a simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.

Abstract: In this work, we propose a very simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms. In the proposed architecture, PCA is employed to learn multistage filter banks. It is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus named as a PCA network (PCANet) and can be designed and learned extremely easily and efficiently. For comparison and better understanding, we also introduce and study two simple variations to the PCANet, namely the RandNet and LDANet. They share the same topology of PCANet but their cascaded filters are either selected randomly or learned from LDA. We have tested these basic networks extensively on many benchmark visual datasets for different tasks, such as LFW for face verification, MultiPIE, Extended Yale B, AR, FERET datasets for face recognition, as well as MNIST for hand-written digits recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state of the art features, either prefixed, highly hand-crafted or carefully learned (by DNNs). Even more surprisingly, it sets new records for many classification tasks in Extended Yale B, AR, FERET datasets, and MNIST variations. Additional experiments on other public datasets also demonstrate the potential of the PCANet serving as a simple but highly competitive baseline for texture classification and object recognition.
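
The filter-learning step named in the abstract (PCA used to learn a filter bank) can be sketched as follows. This is a minimal illustration of a single stage, not the authors' implementation: the function name `learn_pca_filters`, the patch size `k`, and the number of filters are assumptions chosen for the example.

```python
import numpy as np

def learn_pca_filters(images, k=7, num_filters=8):
    """Learn one stage of PCA filters from k x k patches (illustrative sketch)."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())  # remove the mean of each patch
    X = np.stack(patches, axis=1)             # shape (k*k, num_patches)
    # The leading eigenvectors of X X^T serve as the filter bank.
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)
    order = np.argsort(eigvals)[::-1][:num_filters]
    return eigvecs[:, order].T.reshape(num_filters, k, k)
```

A second stage would repeat the same procedure on the outputs of convolving the images with these filters, which is what "cascaded" refers to.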

1,157 citations

Journal Article•DOI•

PCANet: A Simple Deep Learning Baseline for Image Classification?

Tsung-Han Chan¹, Kui Jia², Shenghua Gao³, Jiwen Lu⁴, Zinan Zeng⁵, Yi Ma³

MediaTek¹, University of Macau², ShanghaiTech University³, Tsinghua University⁴, Agency for Science, Technology and Research⁵

01 Sep 2015-IEEE Transactions on Image Processing

TL;DR: Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)].

Abstract: In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
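
The indexing and pooling steps (binary hashing followed by block-wise histograms) might look like the sketch below. It assumes the filter responses for one image are already available as an array of shape (L, H, W); the function name, the non-overlapping block layout, and the block size are assumptions for illustration only.

```python
import numpy as np

def hash_and_histogram(responses, block=4):
    """Binary-hash L filter responses and pool them with block histograms."""
    L, H, W = responses.shape
    bits = (responses > 0).astype(np.int64)        # binarize each response map
    weights = 2 ** np.arange(L - 1, -1, -1)        # one bit weight per map
    codes = np.tensordot(weights, bits, axes=1)    # (H, W), values in [0, 2**L)
    feats = []
    for i in range(0, H - block + 1, block):       # non-overlapping blocks
        for j in range(0, W - block + 1, block):
            blk = codes[i:i + block, j:j + block].ravel()
            feats.append(np.bincount(blk, minlength=2 ** L))
    return np.concatenate(feats)                   # concatenated histograms
```

Each pixel receives an integer code built from the signs of the L responses, and each block contributes a histogram with 2^L bins to the final feature vector.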

1,034 citations

Journal Article•DOI•

Spectral Embedded Clustering: A Framework for In-Sample and Out-of-Sample Spectral Clustering

Feiping Nie¹, Zinan Zeng², Ivor W. Tsang², Dong Xu², Changshui Zhang¹

Tsinghua University¹, Nanyang Technological University²

01 Nov 2011-IEEE Transactions on Neural Networks

TL;DR: This paper proposes the spectral embedded clustering (SEC) framework, in which a linearity regularization is explicitly added into the objective function of SC methods, and presents a new Laplacian matrix constructed from a local regression of each pattern to capture both local and global discriminative information for clustering.

Abstract: Spectral clustering (SC) methods have been successfully applied to many real-world applications. The success of these SC methods is largely based on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, such an assumption might not always hold on high-dimensional data. When the data do not exhibit a clear low-dimensional manifold structure (e.g., high-dimensional and sparse data), the clustering performance of SC will be degraded and become even worse than K-means clustering. In this paper, motivated by the observation that the true cluster assignment matrix for high-dimensional data can be always embedded in a linear space spanned by the data, we propose the spectral embedded clustering (SEC) framework, in which a linearity regularization is explicitly added into the objective function of SC methods. More importantly, the proposed SEC framework can naturally deal with out-of-sample data. We also present a new Laplacian matrix constructed from a local regression of each pattern and incorporate it into our SEC framework to capture both local and global discriminative information for clustering. Comprehensive experiments on eight real-world high-dimensional datasets demonstrate the effectiveness and advantages of our SEC framework over existing SC methods and K-means-based clustering methods. Our SEC framework significantly outperforms SC using the Nystrom algorithm on unseen data.
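
For context, the plain SC baseline that the abstract contrasts with can be sketched as below. This is not the SEC method: it omits the linearity regularization and the local-regression Laplacian, and the Gaussian affinity, the `sigma` parameter, and the row normalization are standard choices assumed here for illustration.

```python
import numpy as np

def spectral_embedding(X, n_clusters, sigma=1.0):
    """Baseline spectral embedding with a normalized graph Laplacian."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))           # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    emb = eigvecs[:, :n_clusters]                  # smallest-eigenvalue eigenvectors
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)
```

Running K-means on the rows of the returned embedding gives the cluster labels. Because the embedding is defined only for the points used to build the graph, handling unseen data requires an extension, which is the out-of-sample issue the abstract raises.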

292 citations

Journal Article•DOI•

Human Gait Recognition Using Patch Distribution Feature and Locality-Constrained Group Sparse Representation

Dong Xu¹, Yi Huang¹, Zinan Zeng¹, Xinxing Xu¹

Nanyang Technological University¹

01 Jan 2012-IEEE Transactions on Image Processing

TL;DR: Detailed experiments on the benchmark USF HumanID database demonstrate the effectiveness of the newly proposed feature Gabor-PDF and the new classification method LGSR for human gait recognition, which achieves the best average Rank-1 and Rank-5 recognition rates on this database among all gait recognition algorithms proposed to date.

Abstract: In this paper, we propose a new patch distribution feature (PDF) (i.e., referred to as Gabor-PDF) for human gait recognition. We represent each gait energy image (GEI) as a set of local augmented Gabor features, which concatenate the Gabor features extracted from different scales and different orientations together with the X-Y coordinates. We learn a global Gaussian mixture model (GMM) (i.e., referred to as the universal background model) with the local augmented Gabor features from all the gallery GEIs; then, each gallery or probe GEI is further expressed as the normalized parameters of an image-specific GMM adapted from the global GMM. Observing that one video is naturally represented as a group of GEIs, we also propose a new classification method called locality-constrained group sparse representation (LGSR) to classify each probe video by minimizing the weighted ℓ1,2 mixed-norm-regularized reconstruction error with respect to the gallery videos. In contrast to the standard group sparse representation method that is a special case of LGSR, the group sparsity and local smooth sparsity constraints are both enforced in LGSR. Our comprehensive experiments on the benchmark USF HumanID database demonstrate the effectiveness of the newly proposed feature Gabor-PDF and the new classification method LGSR for human gait recognition. Moreover, LGSR using the new feature Gabor-PDF achieves the best average Rank-1 and Rank-5 recognition rates on this database among all gait recognition algorithms proposed to date.
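
The weighted mixed norm used as the regularizer can be illustrated with a short sketch. The abstract does not state how the groups or weights are defined, so the grouping by index lists and the per-group weights below are hypothetical; only the general form (a weighted sum of per-group Euclidean norms) is shown.

```python
import numpy as np

def weighted_l12_norm(coeffs, groups, weights):
    """Weighted sum over groups of the Euclidean norm of each group's coefficients."""
    coeffs = np.asarray(coeffs, dtype=float)
    return sum(w * np.linalg.norm(coeffs[idx]) for idx, w in zip(groups, weights))

# Hypothetical example: three groups of coefficients with weights 1, 1, and 2.
value = weighted_l12_norm([3, 4, 0, 0, 1], [[0, 1], [2, 3], [4]], [1.0, 1.0, 2.0])
```

Penalizing this quantity pushes whole groups of coefficients to zero together, which is why it suits a setting where each gallery video forms one group.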

137 citations

Proceedings Article•DOI•

Learning by Associating Ambiguously Labeled Images

Zinan Zeng, Shijie Xiao¹, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu¹, Yi Ma²

Nanyang Technological University¹, Microsoft²

23 Jun 2013

TL;DR: A novel framework is proposed to address the problem of learning classifiers from ambiguously labeled images by optimizing a partial permutation matrix (PPM) for each image, which is formulated in order to exploit all information between samples and labels in a principled way.

Abstract: We study in this paper the problem of learning classifiers from ambiguously labeled images. For instance, in the collection of new images, each image contains some samples of interest (e.g., human faces), and its associated caption has labels with the true ones included, while the sample-label association is unknown. The task is to learn classifiers from these ambiguously labeled images and generalize to new images. An essential consideration here is how to make use of the information embedded in the relations between samples and labels, both within each image and across the image set. To this end, we propose a novel framework to address this problem. Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image. If we can identify samples of the same class from each image and associate them across the image set, the matrix formed by the samples from the same class would be ideally low-rank. By leveraging such a low-rank assumption, we can simultaneously optimize a partial permutation matrix (PPM) for each image, which is formulated in order to exploit all information between samples and labels in a principled way. The obtained PPMs can be readily used to assign labels to samples in training images, and then a standard SVM classifier can be trained and used for unseen data. Experiments on benchmark datasets show the effectiveness of our proposed method.
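
To make the notion of a partial permutation matrix concrete, the sketch below checks the defining property under the usual definition: binary entries with at most one 1 in every row and every column. The function name is hypothetical, and the low-rank optimization described in the abstract is not reproduced here.

```python
import numpy as np

def is_partial_permutation(P):
    """Return True if P is binary with at most one 1 per row and per column."""
    P = np.asarray(P)
    binary = np.isin(P, [0, 1]).all()
    return bool(binary
                and (P.sum(axis=0) <= 1).all()
                and (P.sum(axis=1) <= 1).all())
```

Read as a sample-to-label assignment, such a matrix gives each sample at most one label and uses each label at most once within an image.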

122 citations

Cited by

The PASCAL Visual Object Classes Challenge

Jianguo Zhang

01 Jan 2006

3,012 citations

Journal Article•DOI•

DehazeNet: An End-to-End System for Single Image Haze Removal

Bolun Cai¹, Xiangmin Xu¹, Kui Jia¹, Chunmei Qing¹, Dacheng Tao²

South China University of Technology¹, University of Technology, Sydney²

01 Nov 2016-IEEE Transactions on Image Processing

TL;DR: DehazeNet as discussed by the authors adopts convolutional neural network-based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing.

Abstract: Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to get plausible dehazing solutions. The key to achieve haze removal is to estimate a medium transmission map for an input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet, for medium transmission estimation. DehazeNet takes a hazy image as input, and outputs its medium transmission map that is subsequently used to recover a haze-free image via atmospheric scattering model. DehazeNet adopts convolutional neural network-based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing. Specifically, the layers of Maxout units are used for feature extraction, which can generate almost all haze-relevant features. We also propose a novel nonlinear activation function in DehazeNet, called bilateral rectified linear unit, which is able to improve the quality of recovered haze-free image. We establish connections between the components of the proposed DehazeNet and those used in existing methods. Experiments on benchmark images show that DehazeNet achieves superior performance over existing methods, yet keeps efficient and easy to use.
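
The recovery step mentioned in the abstract can be sketched with the commonly used form of the atmospheric scattering model, I = J·t + A·(1 − t), where I is the hazy image, J the haze-free scene, t the medium transmission, and A the atmospheric light. The function name and the lower bound `t_min` are assumptions for this example; estimating t (the part DehazeNet learns) is not shown.

```python
import numpy as np

def recover_scene(hazy, transmission, airlight, t_min=0.1):
    """Invert I = J*t + A*(1 - t) for J, given an estimated transmission map."""
    # Bound t away from zero so the division stays numerically stable.
    t = np.clip(transmission, t_min, 1.0)[..., None]
    return (hazy - airlight) / t + airlight
```

Here `hazy` has shape (H, W, 3), `transmission` has shape (H, W), and `airlight` is a scalar or a length-3 vector.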

1,880 citations

Proceedings Article•

Unsupervised deep embedding for clustering analysis

Junyuan Xie¹, Ross Girshick², Ali Farhadi¹

University of Washington¹, Facebook²

19 Jun 2016

TL;DR: Deep Embedded Clustering (DEC) as discussed by the authors learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective.

Abstract: Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this paper, we propose Deep Embedded Clustering (DEC), a method that simultaneously learns feature representations and cluster assignments using deep neural networks. DEC learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective. Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.
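
The abstract does not spell out the clustering objective. One common way to describe it, assumed here rather than taken from the text above, is a soft assignment of embedded points to centroids with a Student's t kernel, plus a sharpened target distribution that the assignments are pulled toward. The function names and the `alpha` parameter are illustrative.

```python
import numpy as np

def soft_assign(z, centroids, alpha=1.0):
    """Soft cluster assignments from a Student's t kernel on embedded points."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)        # each row sums to 1

def target_distribution(q):
    """Sharpen assignments: square them and normalize by soft cluster size."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)
```

Training would alternate between computing the target from the current assignments and updating the encoder and centroids to reduce the divergence between the two.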

1,776 citations

Journal Article•DOI•

Remote Sensing Image Scene Classification: Benchmark and State of the Art

Gong Cheng¹, Junwei Han¹, Xiaoqiang Lu²

Northwestern Polytechnical University¹, Chinese Academy of Sciences²

01 Mar 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: A large-scale data set, termed “NWPU-RESISC45,” is proposed, which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU).

Abstract: Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various datasets or present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small scale of scene classes and the image numbers, the lack of image variations and diversity, and the saturation of accuracy. These limitations severely limit the development of new approaches especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale on the scene classes and the total image number, (ii) holds big variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset and the results are reported as a useful baseline for future research.

1,424 citations

Journal Article•DOI•

DehazeNet: An End-to-End System for Single Image Haze Removal

Bolun Cai¹, Xiangmin Xu¹, Kui Jia¹, Chunmei Qing¹, Dacheng Tao²

South China University of Technology¹, University of Technology, Sydney²

28 Jan 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a trainable end-to-end system called DehazeNet, for medium transmission estimation, which takes a hazy image as input, and outputs its medium transmission map that is subsequently used to recover a haze-free image via atmospheric scattering model.

Abstract: Single image haze removal is a challenging ill-posed problem. Existing methods use various constraints/priors to get plausible dehazing solutions. The key to achieve haze removal is to estimate a medium transmission map for an input hazy image. In this paper, we propose a trainable end-to-end system called DehazeNet, for medium transmission estimation. DehazeNet takes a hazy image as input, and outputs its medium transmission map that is subsequently used to recover a haze-free image via atmospheric scattering model. DehazeNet adopts Convolutional Neural Networks (CNN) based deep architecture, whose layers are specially designed to embody the established assumptions/priors in image dehazing. Specifically, layers of Maxout units are used for feature extraction, which can generate almost all haze-relevant features. We also propose a novel nonlinear activation function in DehazeNet, called Bilateral Rectified Linear Unit (BReLU), which is able to improve the quality of recovered haze-free image. We establish connections between components of the proposed DehazeNet and those used in existing methods. Experiments on benchmark images show that DehazeNet achieves superior performance over existing methods, yet keeps efficient and easy to use.
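
The Bilateral Rectified Linear Unit can be read as a ReLU that is bounded on both sides. A minimal sketch under that reading follows; the default bounds of 0 and 1 are assumptions, chosen because a transmission value lies in that range.

```python
import numpy as np

def brelu(x, t_min=0.0, t_max=1.0):
    """Bilateral ReLU sketch: clamp activations to [t_min, t_max]."""
    return np.clip(x, t_min, t_max)

out = brelu(np.array([-0.5, 0.3, 1.7]))
```

Unlike a plain ReLU, the upper bound keeps the network output inside the valid range of the quantity being regressed.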

837 citations
