Showing papers by "Stan Z. Li" published in 2002

Open Access

Book Chapter•DOI•

Statistical Learning of Multi-view Face Detection

Stan Z. Li¹, Long Zhu¹, ZhenQiu Zhang, Andrew Blake¹, Hong-Jiang Zhang¹, Heung-Yeung Shum¹•Institutions (1)

Microsoft¹

28 May 2002

TL;DR: FloatBoost incorporates the idea of Floating Search into AdaBoost to solve the non-monotonicity problem encountered in the sequential search of AdaBoost and leads to the first real-time multi-view face detection system in the world.

Abstract: A new boosting algorithm, called FloatBoost, is proposed to overcome the monotonicity problem of the sequential AdaBoost learning. AdaBoost [1, 2] is a sequential forward search procedure using the greedy selection strategy. The premise offered by the sequential procedure can break down when the monotonicity assumption, i.e. that when adding a new feature to the current set, the value of the performance criterion does not decrease, is violated. FloatBoost incorporates the idea of Floating Search [3] into AdaBoost to solve the non-monotonicity problem encountered in the sequential search of AdaBoost. We then present a system which learns to detect multi-view faces using FloatBoost. The system uses a coarse-to-fine, simple-to-complex architecture called detector-pyramid. FloatBoost learns the component detectors in the pyramid and yields similar or higher classification accuracy than AdaBoost with a smaller number of weak classifiers. This work leads to the first real-time multi-view face detection system in the world. It runs at 200 ms per image of size 320×240 pixels on a Pentium-III CPU of 700 MHz. A live demo will be shown at the conference.

489 citations

Proceedings Article•DOI•

Local non-negative matrix factorization as a visual representation

Tao Feng¹, Stan Z. Li¹, Heung-Yeung Shum¹, Hong-Jiang Zhang¹•Institutions (1)

Microsoft¹

07 Aug 2002

TL;DR: A set of orthogonal, binary, localized basis components is learned from a well-aligned face image database, leading to a Walsh function-based representation of the face images that can be used to resolve the occlusion problem, improve the computing efficiency and compress the storage requirements of a face detection and recognition system.

Abstract: Proposes a novel method, called local non-negative matrix factorization (LNMF), for learning a spatially localized, parts-based subspace representation of visual patterns. An objective function is defined to impose the localization constraint, in addition to the non-negativity constraint in the standard non-negative matrix factorization (NMF). This gives a set of bases which not only allows a non-subtractive (part-based) representation of images but also manifests localized features. An algorithm is presented for the learning of such basis components. Experimental results are presented to compare LNMF with the NMF and principal component analysis (PCA) methods for face representation and recognition, which demonstrates the advantages of LNMF. Based on our LNMF approach, a set of orthogonal, binary, localized basis components are learned from a well-aligned face image database. It leads to a Walsh function-based representation of the face images. These properties can be used to resolve the occlusion problem, improve the computing efficiency and compress the storage requirements of a face detection and recognition system.
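
For orientation, the standard NMF baseline that LNMF extends can be sketched as below. This is a minimal sketch of the widely used multiplicative-update rule for the squared-error objective, not the LNMF algorithm itself: the abstract does not give the localization terms, so they are omitted. The function name `nmf` and its parameters (`rank`, `iters`, `eps`, `seed`) are illustrative assumptions.

```python
import numpy as np

def nmf(X, rank, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (pixels x images) as X ~ B @ H.

    B holds the basis images (one per column) and H the encodings.
    Plain NMF only; the LNMF localization constraint is NOT included.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random start so the updates never divide 0 by 0.
    B = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative,
        # because each factor is multiplied by a non-negative ratio.
        H *= (B.T @ X) / (B.T @ B @ H + eps)
        B *= (X @ H.T) / (B @ H @ H.T + eps)
    return B, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((12, 2)) @ rng.random((2, 8))  # toy non-negative data
    B, H = nmf(X, rank=2)
    print("reconstruction error:", np.linalg.norm(X - B @ H))
```

Because the factors never go negative, an image is rebuilt purely by adding basis images, which is the non-subtractive, parts-based property the abstract refers to.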

108 citations

Proceedings Article•DOI•

Real-time multi-view face detection

ZhenQiu Zhang, Long Zhu¹, Stan Z. Li¹, Hong-Jiang Zhang¹•Institutions (1)

Microsoft¹

20 May 2002

TL;DR: This work presents the first real-time multi-view face detection system, which runs at 5 frames per second on 320×240 image sequences and is trained using a new meta-boosting learning algorithm.

Abstract: We present a detector-pyramid architecture for real-time multi-view face detection. Using a coarse-to-fine strategy, the full view is partitioned into finer and finer views. Each face detector in the pyramid detects faces of its respective view range. Its training is performed using a new meta-boosting learning algorithm. This results in the first real-time multi-view face detection system, which runs at 5 frames per second on 320×240 image sequences.
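
The coarse-to-fine partitioning described in this abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the view range of -90 to 90 degrees, the branching factors `[3, 3]`, the helper names, and the `accepts` callback, which stands in for a trained detector that the abstract does not specify.

```python
from typing import Callable, List, Tuple

Range = Tuple[float, float]

def split(r: Range, parts: int) -> List[Range]:
    """Divide one view range into equal sub-ranges."""
    lo, hi = r
    step = (hi - lo) / parts
    return [(lo + i * step, lo + (i + 1) * step) for i in range(parts)]

def build_pyramid(full: Range, branching: List[int]) -> List[List[Range]]:
    """Level 0 covers the full view; each later level refines the previous one."""
    levels = [[full]]
    for parts in branching:
        levels.append([sub for r in levels[-1] for sub in split(r, parts)])
    return levels

def detect(window, levels, branching, accepts: Callable[[object, Range], bool]):
    """Return the finest view ranges that accept the window ([] if rejected).

    A window is passed to a finer detector only if the coarser detector
    covering the same views accepted it, so most windows stop at level 0.
    """
    alive = [0] if accepts(window, levels[0][0]) else []
    for depth, parts in enumerate(branching, start=1):
        # Children of range i at the previous level are i*parts .. i*parts+parts-1.
        children = [i * parts + j for i in alive for j in range(parts)]
        alive = [c for c in children if accepts(window, levels[depth][c])]
        if not alive:
            return []
    return [levels[len(branching)][i] for i in alive]

if __name__ == "__main__":
    branching = [3, 3]  # hypothetical: 1, then 3, then 9 view ranges
    levels = build_pyramid((-90.0, 90.0), branching)
    # Toy stand-in detector: the "window" is just its true view angle.
    toy = lambda angle, r: r[0] <= angle < r[1]
    print(detect(25.0, levels, branching, toy))
```

In this toy setup a rejected window costs a single detector evaluation, which is the source of the speed-up a pyramid of this kind aims for.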

88 citations

Proceedings Article•

FloatBoost Learning for Classification

Stan Z. Li¹, ZhenQiu Zhang, Heung-Yeung Shum¹, Hong-Jiang Zhang¹•Institutions (1)

Microsoft¹

01 Jan 2002

TL;DR: FloatBoost uses a backtrack mechanism after each iteration of AdaBoost to remove weak classifiers which cause higher error rates, and proposes a statistical model for learning weak classifiers, based on a stagewise approximation of the posterior using an overcomplete set of scalar features.

Abstract: AdaBoost [3] minimizes an upper error bound which is an exponential function of the margin on the training set [14]. However, the ultimate goal in applications of pattern classification is always minimum error rate. On the other hand, AdaBoost needs an effective procedure for learning weak classifiers, which by itself is difficult especially for high dimensional data. In this paper, we present a novel procedure, called FloatBoost, for learning a better boosted classifier. FloatBoost uses a backtrack mechanism after each iteration of AdaBoost to remove weak classifiers which cause higher error rates. The resulting float-boosted classifier consists of fewer weak classifiers yet achieves lower error rates than AdaBoost in both training and test. We also propose a statistical model for learning weak classifiers, based on a stagewise approximation of the posterior using an overcomplete set of scalar features. Experimental comparisons of FloatBoost and AdaBoost are provided through a difficult classification problem, face detection, where the goal is to learn from training examples a highly nonlinear classifier to differentiate between face and nonface patterns in a high dimensional space. The results clearly demonstrate the promises made by FloatBoost over AdaBoost.
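
The backtrack idea in this abstract can be sketched in a simplified setting. The code below is a floating-search selection over a pool of precomputed weak-classifier outputs combined by an unweighted vote. It is not the authors' FloatBoost: AdaBoost's sample reweighting and classifier weights are deliberately left out, and the names `ensemble_error` and `floating_select` are hypothetical.

```python
import numpy as np

def ensemble_error(outputs, members, y):
    """Error rate of an unweighted vote over the chosen weak classifiers."""
    if not members:
        return 1.0
    votes = outputs[members].sum(axis=0)
    pred = np.where(votes >= 0, 1, -1)
    return float(np.mean(pred != y))

def floating_select(outputs, y, max_size):
    """Greedy forward selection with conditional backtracking.

    outputs[j] holds the +1/-1 predictions of weak classifier j.
    """
    members = []
    best_err = {}  # lowest error recorded so far for each ensemble size
    while len(members) < max_size:
        candidates = [j for j in range(len(outputs)) if j not in members]
        if not candidates:
            break
        # Forward step: add the candidate giving the lowest ensemble error.
        j_add = min(candidates,
                    key=lambda j: ensemble_error(outputs, members + [j], y))
        members = members + [j_add]
        err = ensemble_error(outputs, members, y)
        size = len(members)
        best_err[size] = min(best_err.get(size, 1.0), err)
        # Backward step: drop a member while that beats the best error
        # previously recorded for the smaller ensemble size.
        while len(members) > 2:
            trials = [(ensemble_error(outputs, members[:i] + members[i + 1:], y), i)
                      for i in range(len(members))]
            e, i = min(trials)
            if e < best_err.get(len(members) - 1, 1.0):
                members = members[:i] + members[i + 1:]
                best_err[len(members)] = e
            else:
                break
    return members, ensemble_error(outputs, members, y)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.choice([-1, 1], size=40)
    outputs = rng.choice([-1, 1], size=(6, 40))
    print(floating_select(outputs, y, max_size=4))
```

A purely forward search can never undo an early choice that later proves harmful. The backward step removes such a member only when doing so strictly improves on the best result seen at that ensemble size, which also guarantees the loop terminates.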

75 citations

Proceedings Article•DOI•

Multi-view face alignment using direct appearance models

Stan Z. Li¹, Shuicheng Yan¹, Hong-Jiang Zhang¹, Qiansheng Cheng²•Institutions (2)

Microsoft¹, Peking University²

20 May 2002

TL;DR: The way that DAM models shapes and textures has the following advantages as compared to AAM: (1) DAM subspaces include admissible appearances previously unseen in AAM, (2) the convergence and accuracy are improved, and (3) the memory requirement is cut down to a large extent.

Abstract: Alignment makes face distribution statistically more compact than un-aligned faces and provides a good basis for face modeling, recognition and synthesis. In this paper we present a method for multi-view face alignment using a new model called direct appearance model (DAM). Like active appearance model (AAM), DAM also makes use of both shape and texture constraints; however it does this without combining shape and texture as in AAM. The way that DAM models shapes and textures has the following advantages as compared to AAM: (1) DAM subspaces include admissible appearances previously unseen in AAM, (2) the convergence and accuracy are improved, and (3) the memory requirement is cut down to a large extent. Extensive experiments are presented to evaluate the DAM alignment in comparison with AAM.

65 citations

Journal Article•DOI•

AI-EigenSnake: an affine-invariant deformable contour model for object matching

Zhong Xue¹, Stan Z. Li², Eam Khwang Teoh¹•Institutions (2)

Nanyang Technological University¹, Microsoft²

01 Feb 2002-Image and Vision Computing

TL;DR: Experiments based on real object matching demonstrate that the proposed AI-ES model is more robust and insensitive to the positions, viewpoints, and large deformations of object shapes, as compared to the Active Shape Model and the AI-Snake Model.

31 citations

Proceedings Article•DOI•

Learning to detect multi-view faces in real-time

Stan Z. Li¹, Long Zhu¹, ZhenQiu Zhang¹, Hong-Jiang Zhang¹•Institutions (1)

Microsoft¹

07 Aug 2002

TL;DR: A new boosting algorithm, called FloatBoost, is proposed to construct a strong face-nonface classifier from weak classifiers for the component detectors in the pyramid, which leads to the first real-time multi-view face detection system in the world.

Abstract: In this paper, we present a system which learns to detect multi-view faces. The system uses a coarse-to-fine, simple-to-complex architecture called detector-pyramid. A new boosting algorithm, called FloatBoost, is proposed to construct a strong face-nonface classifier from weak classifiers for the component detectors in the pyramid. FloatBoost incorporates the idea of Floating Search into AdaBoost, and yields similar or higher classification accuracy than AdaBoost with a smaller number of weak classifiers. This work leads to the first real-time multi-view face detection system in the world. It runs at 200 ms per image of size 320×240 pixels on a Pentium-III CPU of 700 MHz.

30 citations

Texture-Constrained Active Shape Models

Shuicheng Yan, Ce Liu, Stan Z. Li¹, Hong-Jiang Zhang, Heung-Yeung Shum, Qiansheng Cheng•Institutions (1)

Microsoft¹

01 Jan 2002

TL;DR: A texture-constrained active shape model (TC-ASM) is proposed to localize a face in an image; it is stable to initialization, accurate in shape localization, and robust to illumination variation, with low computational cost.

Abstract: In this paper, we propose a texture-constrained active shape model (TC-ASM) to localize a face in an image. TC-ASM effectively incorporates not only the shape prior and local appearance around each landmark, but also the global texture constraint over the shape. Therefore, it is stable to initialization, accurate in shape localization, and robust to illumination variation, with low computational cost. Extensive experiments are provided to demonstrate

28 citations

Proceedings Article•DOI•

Multi-view face detection with FloatBoost

ZhenQiu Zhang¹, MingJing Li², Stan Z. Li², Hong-Jiang Zhang²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Microsoft²

03 Dec 2002

TL;DR: A new boosting algorithm, called FloatBoost, is proposed to construct a strong face-nonface classifier that incorporates the idea of Floating Search into AdaBoost, and yields similar or higher classification accuracy than AdaBoost with a smaller number of weak classifiers.

Abstract: In this paper, a new boosting algorithm, called FloatBoost, is proposed to construct a strong face-nonface classifier. FloatBoost incorporates the idea of Floating Search into AdaBoost, and yields similar or higher classification accuracy than AdaBoost with a smaller number of weak classifiers. We also present a novel framework for fast multi-view face detection. A detector-pyramid architecture is designed to quickly discard a vast number of non-face sub-windows and hence perform multi-view face detection efficiently. This results in the first real-time multi-view face detection system which runs at 5 frames per second for 320x240 image sequence.

26 citations

Proceedings Article•DOI•

Multi-view face pose estimation based on supervised ISA learning

Stan Z. Li¹, XianHuan Peng², Xinwen Hou², Hong-Jiang Zhang¹, Qiansheng Cheng²•Institutions (2)

Microsoft¹, Peking University²

20 May 2002

TL;DR: A supervised method is presented for more effective learning of view-subspace, assuming that view-labeled face examples are available and the models thus learned give more accurate pose estimation than those obtained with the unsupervised ISA.

Abstract: Independent subspace analysis (ISA) is able to learn view-subspaces unsupervisedly from (view-unlabeled) multi-view face examples (S.Z. Li et al., 2001). We explain underlying reasons for the emergent formation of ISA view-subspaces. Based on the analysis, we present a supervised method for more effective learning of view-subspace, assuming that view-labeled face examples are available. The models thus learned give more accurate pose estimation than those obtained with the unsupervised ISA.

22 citations

Proceedings Article•

Audio textures

Lie Lu¹, Stan Z. Li¹, Liu Wenyin¹, Hong-Jiang Zhang¹, Yi Mao²•Institutions (2)

Microsoft¹, Zhejiang University²

01 May 2002

TL;DR: This paper introduces a new audio medium, called audio texture, as a means of synthesizing long audio stream according to a given short example audio clip, and proposes a method for implementing audio textures.

Abstract: In this paper, we introduce a new audio medium, called audio texture, as a means of synthesizing long audio stream according to a given short example audio clip. The example clip is analyzed, and basic building patterns are extracted. Then an audio stream of arbitrary length is synthesized using a sequence of extracted building patterns. The patterns can be varied in the synthesis process to add variations to the generated sound. Audio textures are useful in applications such as background music, lullabies, game music, and screen saver sounds. A method is proposed for implementing audio textures. Preliminary results of audio textures are provided at our website for evaluation.

Proceedings Article•DOI•

A novel Bayesian shape model for facial feature extraction

Zhong Xue¹, Stan Z. Li², Dinggang Shen¹, Eam Khwang Teoh³•Institutions (3)

University of Pennsylvania¹, Microsoft², Nanyang Technological University³

01 Dec 2002

TL;DR: This paper presents a novel application of the Bayesian shape model (BSM) for facial feature extraction: a full-face model is designed to describe the shape of a face, and PCA is used to estimate the shape variance of the face model.

Abstract: This paper presents a novel application of the Bayesian shape model (BSM) for facial feature extraction. First, a full-face model is designed to describe the shape of a face, and the PCA is used to estimate the shape variance of the face model. Then, the BSM is applied to match and extract the face patch from input face images. Finally, using the face model, the extracted face patches are easily warped or normalized to a standard view. Applications of this facial feature extraction algorithm include face recognition, face video coding and retrieval, face animation and multimedia.

Journal Article•DOI•

Transverse micro- and mesotexture distribution characteristics on the core surface of (Bi,Pb)2Sr2Ca2Cu3O10/Ag superconductor tape

Thiam Teck Tan¹, Stan Z. Li¹, Wei Gao², H.K. Liu³, Shi Xue Dou³•Institutions (3)

Nanyang Technological University¹, University of Auckland², University of Wollongong³

11 Jan 2002-Superconductor Science and Technology

TL;DR: In this article, both the microtexture and mesotexture of powder-in-tube (PIT) processed (Bi,Pb)2Sr2Ca2Cu3O10 (Bi2223) superconductor tapes were analyzed in the transverse direction of the tapes.

Abstract: Powder-in-tube (PIT) processing of BiSrCaCuO superconductor is widely used to introduce textured microstructure to high temperature superconductor tapes, thus effectively minimizing the weak-link effects caused by grain boundary misorientations. Although it was reported that PIT tapes have parabolic critical current density (Jc) distribution across the tape width, the role played by texture in this is not clearly understood. In this study, both the micro- and the mesotexture of PIT processed (Bi,Pb)2Sr2Ca2Cu3O10 (Bi2223) superconductor tapes were analysed in the transverse direction of Bi2223 tapes. Microtexture and mesotexture of PIT processed Bi2223 tapes were characterized using angle-axis pairs and Rodrigues–Frank vectors. The results of microtexture evaluation indicate that a/b axes texture did exist in PIT processed tapes, while the mesotexture RF plot shows that the majority of the grain boundaries were formed by grains with non-parallel c-axes. These grain boundaries generally had low mismatch angles of up to ~10°. High-angle misorientation boundaries ranging up to 45° were generally associated with c-axis twist boundaries. The dominating misorientation angle for both sample sides and centre was found to be 4°. It is believed that the micro- and mesotexture distribution characteristics have an influence on the Jc distribution in the transverse direction of Bi2223 tapes.
