Author

Stan Z. Li

Other affiliations: Microsoft, Macau University of Science and Technology, Beihang University ...

Bio: Stan Z. Li is an academic researcher at Westlake University. The author has contributed to research on the topics of facial recognition systems and face detection. The author has an h-index of 97 and has co-authored 532 publications receiving 41,793 citations. Previous affiliations of Stan Z. Li include Microsoft and Macau University of Science and Technology.

Papers published on a yearly basis (chart; years covered: 1990, 1992 to 2023)

Papers

Posted Content

Constrained Deep Metric Learning for Person Re-identification

Hailin Shi, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Yang Yang, Stan Z. Li

24 Nov 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: A novel CNN-based method to learn a discriminative metric with good robustness to the over-fitting problem in person re-identification is proposed and it is found that the selection of intra-class sample pairs is crucial for learning but has received little attention.

Abstract: Person re-identification aims to re-identify the probe image from a given set of images under different camera views. It is challenging due to large variations of pose, illumination, occlusion and camera view. Since the convolutional neural networks (CNN) have excellent capability of feature extraction, certain deep learning methods have been recently applied in person re-identification. However, in person re-identification, the deep networks often suffer from the over-fitting problem. In this paper, we propose a novel CNN-based method to learn a discriminative metric with good robustness to the over-fitting problem in person re-identification. Firstly, a novel deep architecture is built where the Mahalanobis metric is learned with a weight constraint. This weight constraint is used to regularize the learning, so that the learned metric has a better generalization ability. Secondly, we find that the selection of intra-class sample pairs is crucial for learning but has received little attention. To cope with the large intra-class variations in pedestrian images, we propose a novel training strategy named moderate positive mining to prevent the training process from over-fitting to the extreme samples in intra-class pairs. Experiments show that our approach significantly outperforms state-of-the-art methods on several benchmarks of person re-identification.

42 citations

Proceedings Article

Illumination modeling and normalization for face recognition

Haitao Wang¹, Stan Z. Li, Yangsheng Wang, Weiwei Zhang

Chinese Academy of Sciences¹

17 Oct 2003

TL;DR: This work shows that a face lighting subspace can be constructed based on three or more training face images illuminated by noncoplanar lights, and presents a face normalization algorithm, illumination alignment, i.e. changing the lighting of one face image to that of another face image.

Abstract: We present a general framework for face modeling under varying lighting conditions. First, we show that a face lighting subspace can be constructed based on three or more training face images illuminated by noncoplanar lights. The lighting of any face image can be represented as a point in this subspace. Second, we show that the extreme rays, i.e. the boundary of an illumination cone, cover the entire light sphere. Therefore, relatively sparsely sampled face images can be used to build a face model instead of calculating each extremely illuminated face image. Third, we present a face normalization algorithm, illumination alignment, i.e. changing the lighting of one face image to that of another face image. Experiments are presented.

42 citations

Journal Article

Learning Stacked Image Descriptor for Face Recognition

Zhen Lei¹, Dong Yi¹, Stan Z. Li¹

Chinese Academy of Sciences¹

01 Sep 2016-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: This paper extends the original shallow face descriptors to deep discriminant face features by introducing a stacked image descriptor (SID). With its deep structure, more complex facial information can be extracted, and the discriminability and compactness of the feature representation can be improved.

Abstract: Learning-based face descriptors have steadily improved face recognition performance. Compared with hand-crafted features, learning-based features are considered to be able to exploit information with better discriminative ability for specific tasks. Motivated by the recent success of deep learning, in this paper, we extend the original shallow face descriptors to deep discriminant face features by introducing a stacked image descriptor (SID). With deep structure, more complex facial information can be extracted and the discriminability and compactness of the feature representation can be improved. The SID is learned in a forward optimization way, which is computationally efficient compared with deep learning. Extensive experiments on various face databases are conducted to show that SID is able to achieve high face recognition performance with compact face representation, compared with other state-of-the-art descriptors.

42 citations

Proceedings Article

Low-resolution face recognition via Simultaneous Discriminant Analysis

Changtao Zhou¹, Zhiwei Zhang¹, Dong Yi¹, Zhen Lei¹, Stan Z. Li¹

Chinese Academy of Sciences¹

11 Oct 2011-International Joint Conference on Biometrics

TL;DR: Simultaneous Discriminant Analysis learns two mappings from LR and HR images respectively to a common subspace where discrimination property is maximized and the conventional classification method is applied in the common space for final decision.

Abstract: Low resolution (LR) is an important issue when handling real world face recognition problems. The performance of traditional recognition algorithms will drop drastically due to the loss of facial texture information in original high resolution (HR) images. To address this problem, in this paper we propose an effective approach named Simultaneous Discriminant Analysis (SDA). SDA learns two mappings from LR and HR images respectively to a common subspace where discrimination property is maximized. In SDA, (1) the data gap between LR and HR is reduced by mapping into a common space; and (2) the mapping is designed for preserving most discriminative information. After that, the conventional classification method is applied in the common space for final decision. Extensive experiments are conducted on both FERET and Multi-PIE, and the results clearly show the superiority of the proposed SDA over state-of-the-art methods.

42 citations

Posted Content

When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition

Guosheng Hu¹, Yongxin Yang², Dong Yi, Josef Kittler³, William J. Christmas³, Stan Z. Li, Timothy M. Hospedales²

French Institute for Research in Computer Science and Automation¹, Queen Mary University of London², University of Surrey³

09 Apr 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make their work easily reproducible, and propose three CNN architectures which are the first reported architectures trained using LFW data.

Abstract: Deep learning, in particular the Convolutional Neural Network (CNN), has achieved promising results in face recognition recently. However, it remains an open question why CNNs work well and how to design a 'good' architecture. The existing works tend to focus on reporting CNN architectures that work well for face recognition rather than investigating the reason. In this work, we conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make our work easily reproducible. Specifically, we use the public database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing CNNs trained on private databases. We propose three CNN architectures which are the first reported architectures trained using LFW data. This paper quantitatively compares the architectures of CNNs and evaluates the effect of different implementation choices. We identify several useful properties of CNN-FRS. For instance, the dimensionality of the learned features can be significantly reduced without adverse effect on face recognition accuracy. In addition, a traditional metric learning method exploiting CNN-learned features is evaluated. Experiments show that two crucial factors for good CNN-FRS performance are the fusion of multiple CNNs and metric learning. To make our work reproducible, source code and models will be made publicly available.

41 citations

Cited by

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon¹, Santosh K. Divvala², Ross Girshick³, Ali Farhadi²

University of Washington¹, Allen Institute for Artificial Intelligence², Facebook³

27 Jun 2016

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

27,256 citations

Journal Article

Machine learning

Thomas G. Dietterich¹

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Pattern Recognition and Machine Learning

Christopher M. Bishop¹

Microsoft¹

01 Jan 2006

TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, approximate inference, and combining models.

Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal Article

Robust Face Recognition via Sparse Representation

John Wright¹, Allen Y. Yang², Arvind Ganesh¹, S. Shankar Sastry², Yi Ma¹

University of Illinois at Urbana–Champaign¹, University of California, Berkeley²

01 Feb 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.

Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold, predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.

9,658 citations

Journal Article

Integrating single-cell transcriptomic data across different conditions, technologies, and species.

Andrew Butler, Paul J. Hoffman, Peter Smibert, Efthymia Papalexi¹, Rahul Satija¹

New York University¹

02 Apr 2018-Nature Biotechnology

TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.

Abstract: Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

7,741 citations
