Showing papers by "Stan Z. Li" published in 2010


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work proposes a scale invariant local ternary pattern operator and a pattern kernel density estimation technique that models the probability distribution of local patterns in the pixel process using a single LBP-like pattern, rather than a histogram, as the feature.
Abstract: Background modeling plays an important role in video surveillance, yet in complex scenes it is still a challenging problem. Among many difficulties, problems caused by illumination variations and dynamic backgrounds are the key aspects. In this work, we develop an efficient background subtraction framework to tackle these problems. First, we propose a scale invariant local ternary pattern operator, and show that it is effective for handling illumination variations, especially for moving soft shadows. Second, we propose a pattern kernel density estimation technique to effectively model the probability distribution of local patterns in the pixel process, which utilizes a single LBP-like pattern instead of a histogram as the feature. Third, we develop multimodal background models with the above techniques and a multiscale fusion scheme for handling complex dynamic backgrounds. Exhaustive experimental evaluations on complex scenes show that the proposed method is fast and effective, achieving more than 10% improvement in accuracy over existing state-of-the-art algorithms.

416 citations
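
The scale invariance claimed for the ternary pattern operator can be illustrated with a small sketch. The abstract does not give the operator's definition, so the encoding below (2 bits per neighbour, a tolerance band of `(1 ± tau)` times the centre intensity, and the name `siltp_code`) is an assumption based on the commonly cited form of SILTP, not the authors' implementation.

```python
def siltp_code(center, neighbors, tau=0.05):
    """Encode one pixel's neighbourhood as a string of 2-bit symbols.

    Assumed encoding: "01" if the neighbour is brighter than the centre
    by more than a factor (1 + tau), "10" if it is darker than (1 - tau)
    times the centre, and "00" otherwise.
    """
    upper = (1 + tau) * center
    lower = (1 - tau) * center
    symbols = []
    for value in neighbors:
        if value > upper:
            symbols.append("01")
        elif value < lower:
            symbols.append("10")
        else:
            symbols.append("00")
    return "".join(symbols)


# Because the tolerance band scales with the centre intensity, multiplying
# every intensity by the same factor (for example under a soft shadow)
# leaves the code unchanged.
lit = siltp_code(100, [120, 80, 102, 99])
shadowed = siltp_code(50, [60, 40, 51, 49.5])
print(lit, shadowed, lit == shadowed)
```

The value of `tau` here is a placeholder; in practice it would be tuned to the noise level of the camera.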


01 Jan 2010
TL;DR: Several computer vision approaches have been developed for skin detection; a skin detector typically transforms a given pixel into an appropriate color space and then uses a skin classifier to label the pixel as skin or non-skin.
Abstract: Skin detection is the process of finding skin-colored pixels and regions in an image or a video. This process is typically used as a preprocessing step to find regions that potentially have human faces and limbs in images. Several computer vision approaches have been developed for skin detection. A skin detector typically transforms a given pixel into an appropriate color space and then uses a skin classifier to label the pixel as a skin or a non-skin pixel. A skin classifier defines a decision boundary of the skin color class in the color space based on a training database of skin-colored pixels.

92 citations
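
The two-stage pipeline described above (color space transform, then a classifier with a decision boundary) can be sketched as follows. The conversion uses the standard full-range BT.601 (JFIF) YCbCr coefficients. The rectangular Cb/Cr box is a hand-picked illustrative boundary and an assumption of this sketch; the abstract describes boundaries learned from a training database, which this example does not do.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# Illustrative chrominance box (assumed, not learned from data).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(r, g, b, cb_range=CB_RANGE, cr_range=CR_RANGE):
    """Label a pixel as skin if its chrominance falls inside the box.

    Luminance (Y) is ignored, which makes the rule less sensitive
    to brightness changes.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


print(is_skin(224, 172, 138))  # a light skin-like tone
print(is_skin(0, 0, 255))      # pure blue
```

A learned classifier would replace the fixed box with a boundary fitted to labeled skin pixels, for example a histogram or Gaussian model over the same chrominance plane.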


Proceedings ArticleDOI
23 Aug 2010
TL;DR: A novel algorithm for detection of moving cast shadows, based on a local texture descriptor called Scale Invariant Local Ternary Pattern (SILTP), is presented; experimental results on various scenes demonstrate its robustness.
Abstract: Moving cast shadow removal is an important yet difficult problem in video analysis and applications. This paper presents a novel algorithm for detection of moving cast shadows that is based on a local texture descriptor called Scale Invariant Local Ternary Pattern (SILTP). An assumption is made that the texture properties of cast shadows bear similar patterns to those of the background beneath them. The likelihood of cast shadows is derived using information in both color and texture. An online learning scheme is employed to update the shadow model adaptively. Finally, the posterior probability of the cast shadow region is formulated by further incorporating prior contextual constraints using a Markov Random Field (MRF) model. The optimal solution is found using graph cuts. Experimental results on various scenes demonstrate the robustness of the algorithm.

46 citations


Proceedings ArticleDOI
23 Aug 2010
TL;DR: The OPBS selects n principal backgrounds from N backgrounds in an online fashion with a low memory cost, making it possible to build an efficient online video synopsis system.
Abstract: Video synopsis provides a means for fast browsing of activities in video. Principal background selection (PBS) is an important step in video synopsis. Existing methods perform PBS offline and at a high memory cost. In this paper we propose a novel background selection method, "online principal background selection" (OPBS). The OPBS selects n principal backgrounds from N backgrounds in an online fashion with a low memory cost, making it possible to build an efficient online video synopsis system. Another advantage is that, with OPBS, the selected backgrounds are related to not only background changes over time but also video activities. Experimental results demonstrate the advantages of the proposed OPBS.

17 citations


Journal ArticleDOI
TL;DR: This paper uses a specially designed camera system with active NIR illumination to capture NIR images of faces and uses a PCA based or kernel based scheme to learn the mapping between the high-dimensional NIR and depth face spaces.
Abstract: This paper proposes a statistical learning based method for 3D modeling of faces directly from Near Infrared (NIR) images. We use a specially designed camera system with active NIR illumination to capture the NIR images of faces. The NIR images captured in such a way are invariant to environmental lighting changes. This property provides more reasonable data sources for statistical learning. By using the NIR images and the depth images of some known faces, we can observe a mapping relation between the two image modalities. The mapping relation can then be used to recover depth data of an unknown face from its NIR image. To perform the learning, the images of different modalities taken from different persons are elaborately aligned to make pixel-to-pixel correspondences between images. Based on these aligned images, two face spaces corresponding to NIR and depth face images can be constructed, respectively. We then use a PCA based or kernel based scheme to perform the learning between spaces of large dimensions. Several regression algorithms with linear and nonlinear kernels are employed and evaluated to find the mapping that best describes the relation between the two face spaces. The experimental results show that the method presented in this paper is effective. It can reconstruct a 3D face model directly from an NIR image of a face with high accuracy and low computational cost.

6 citations


Proceedings ArticleDOI
06 Dec 2010
TL;DR: Zhang et al. propose a robust face alignment method that combines local feature matching with the Probabilistic Hough Transform (PHT) for partial face alignment in near infrared (NIR) images.
Abstract: Face alignment and recognition in less controlled environments is one of the most essential bottlenecks for practical face recognition systems. Recently several studies have focused on the partial face recognition problem, but few works have addressed the problem of face alignment under partial occlusion. In this paper, we present a robust face alignment method by combining local feature matching and Probabilistic Hough Transform (PHT) for partial face alignment in near infrared (NIR) images. Given a set of well aligned faces as targets, correspondences for face images with occlusions are established by local feature matching. For faces with missing components, many false matches of local features will be built due to the lack of holistic information. The PHT approach aims to find correct correspondences and resist the inevitable false ones by taking each parameter candidate generated by a correspondence pair as a vote in the 4-D in-plane transform parameter space. We also employ geometric constraints and appearance consistency and combine them with PHT in a probabilistic Hough optimization function, so that each vote is weighted by a probabilistic score. Alignment experiments on both MBGC portal face video and facial images with Glass-face occlusions show that our approach can reliably and accurately deal with missing data of facial components caused by partial occlusion.

2 citations
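
The voting step described above, where each correspondence pair proposes one candidate in the 4-D in-plane transform space (scale, rotation, two translations), can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: points are represented as complex numbers, the bin sizes are arbitrary, and the per-match `weight` is a hypothetical stand-in for the probabilistic score the abstract derives from geometric constraints and appearance consistency.

```python
import cmath
import math
from collections import defaultdict
from itertools import combinations


def hough_align(matches, scale_step=0.1, angle_step=math.radians(10),
                shift_step=4.0):
    """Estimate a similarity transform q = a * p + b by weighted voting.

    matches: list of (p, q, weight), with p and q as complex points.
    Returns (scale, angle, tx, ty) at the centre of the winning bin,
    or None if no pair could vote. Angle wrap-around at +/- pi is
    not handled in this sketch.
    """
    votes = defaultdict(float)
    for (p1, q1, w1), (p2, q2, w2) in combinations(matches, 2):
        if p1 == p2 or q1 == q2:
            continue  # degenerate pair: transform is undefined
        a = (q2 - q1) / (p2 - p1)   # complex: scale and rotation
        b = q1 - a * p1             # complex: translation
        key = (round(abs(a) / scale_step),
               round(cmath.phase(a) / angle_step),
               round(b.real / shift_step),
               round(b.imag / shift_step))
        votes[key] += w1 * w2       # assumed weighting of each vote
    if not votes:
        return None
    s, r, tx, ty = max(votes, key=votes.get)
    return s * scale_step, r * angle_step, tx * shift_step, ty * shift_step


# Four correct matches under q = 2 * p + (8 + 4j), plus two false matches.
matches = [(0 + 0j, 8 + 4j, 1.0), (10 + 0j, 28 + 4j, 1.0),
           (0 + 10j, 8 + 24j, 1.0), (10 + 10j, 28 + 24j, 1.0),
           (5 + 5j, 50 + 3j, 1.0), (3 + 7j, 1 + 40j, 1.0)]
print(hough_align(matches))
```

Pairs that include a false match scatter their votes across unrelated bins, while pairs of correct matches agree on one bin, which is why the accumulator peak tolerates outliers.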