Showing papers by "Ching Y. Suen published in 2011"


Proceedings ArticleDOI
TL;DR: This paper uses unsupervised discriminant projection (UDP) to build subspaces on WLBP-featured periocular images, achieving a 100% rank-1 identification rate and a 98% verification rate at a 0.1% false accept rate on the entire FG-NET database.
Abstract: In this paper, we present a novel framework that utilizes the periocular region for age invariant face recognition. To obtain age invariant features, we first perform preprocessing schemes such as pose correction, illumination normalization, and periocular region normalization. We then apply robust Walsh-Hadamard transform encoded local binary patterns (WLBP) on the preprocessed periocular region only. We find that the WLBP feature of the periocular region maintains consistency of the same individual across ages. Finally, we use unsupervised discriminant projection (UDP) to build subspaces on WLBP-featured periocular images and achieve a 100% rank-1 identification rate and a 98% verification rate at a 0.1% false accept rate on the entire FG-NET database. Compared to published results, our proposed approach yields the best recognition and identification results.
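To make the feature pipeline concrete, here is a minimal sketch of a WLBP-style descriptor: a block-wise Walsh-Hadamard transform followed by a uniform LBP histogram. It only illustrates the idea; the block size, LBP parameters, and exact encoding are assumptions rather than the paper's settings, and the resulting histograms would then be passed to UDP for subspace learning.

```python
# WLBP-style descriptor sketch: block-wise 2D Walsh-Hadamard transform
# followed by a uniform LBP histogram on the transformed image.
import numpy as np
from scipy.linalg import hadamard
from skimage.feature import local_binary_pattern

def wlbp_histogram(periocular, block=8, lbp_points=8, lbp_radius=1):
    """periocular: 2D float array (preprocessed periocular region)."""
    h_mat = hadamard(block).astype(float)          # Walsh-Hadamard basis (block is a power of 2)
    H, W = periocular.shape
    H, W = H - H % block, W - W % block            # crop to whole blocks
    img = periocular[:H, :W].astype(float)
    coded = np.zeros((H, W))
    for r in range(0, H, block):
        for c in range(0, W, block):
            patch = img[r:r + block, c:c + block]
            coded[r:r + block, c:c + block] = h_mat @ patch @ h_mat.T   # 2D WHT of the block
    lbp = local_binary_pattern(coded, lbp_points, lbp_radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=lbp_points + 2, range=(0, lbp_points + 2),
                           density=True)
    return hist
```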

164 citations


Journal ArticleDOI
TL;DR: It is argued that both large- and small-scale features of a face image are important for face restoration and recognition, and it is suggested that illumination normalization should be performed mainly on the large-scale features rather than on the original face image.
Abstract: A face image can be represented by a combination of large- and small-scale features. It is well known that variations of illumination mainly affect the large-scale features (low-frequency components), and not so much the small-scale features. Therefore, in relevant existing methods only the small-scale features are extracted as illumination-invariant features for face recognition, while the large-scale intrinsic features are always ignored. In this paper, we argue that both large- and small-scale features of a face image are important for face restoration and recognition. Moreover, we suggest that illumination normalization should be performed mainly on the large-scale features of a face image rather than on the original face image. A novel method of normalizing both the Small- and Large-scale (S&L) features of a face image is proposed. In this method, a single face image is first decomposed into large- and small-scale features. After that, illumination normalization is mainly performed on the large-scale features, and only a minor correction is made on the small-scale features. Finally, a normalized face image is generated by combining the processed large- and small-scale features. In addition, an optional visual compensation step is suggested for improving the visual quality of the normalized image. Experiments on the CMU-PIE, Extended Yale B, and FRGC 2.0 face databases show that significantly better recognition performance and visual results can be obtained with the proposed method compared to related state-of-the-art methods.
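A minimal sketch of the S&L idea follows, using a Gaussian low-pass filter in the log domain as a stand-in for the large-/small-scale decomposition; the paper's actual decomposition, normalization operators, and visual compensation step differ.

```python
# S&L-style normalization sketch: a Gaussian low-pass in the log domain
# stands in for the large-/small-scale decomposition; illumination is
# normalized mainly on the large-scale layer, with only a minor correction
# to the small-scale layer, before recombining.
import numpy as np
from scipy.ndimage import gaussian_filter

def sl_normalize(face, sigma=8.0, eps=1e-3):
    """face: 2D float array scaled to [0, 1]."""
    log_face = np.log(face + eps)
    large = gaussian_filter(log_face, sigma)       # large-scale (low-frequency) layer
    small = log_face - large                       # small-scale (detail) layer
    large_norm = (large - large.mean()) / (large.std() + eps)   # main normalization
    small_norm = small / (small.std() + eps)                    # minor correction
    restored = 0.5 * large_norm + small_norm       # recombine the two layers
    restored -= restored.min()
    return restored / (restored.max() + eps)
```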

143 citations


Proceedings ArticleDOI
TL;DR: A novel Contourlet Appearance Model (CAM) is proposed that is more accurate and faster at localizing facial landmarks than Active Appearance Models (AAMs), and that can extract not only holistic texture information, as AAMs do, but also local texture information using the Nonsubsampled Contourlet Transform (NSCT).
Abstract: In this paper we propose a novel Contourlet Appearance Model (CAM) that is more accurate and faster at localizing facial landmarks than Active Appearance Models (AAMs). Our CAM also has the ability to not only extract holistic texture information, as AAMs do, but can also extract local texture information using the Nonsubsampled Contourlet Transform (NSCT). We demonstrate the efficiency of our method by applying it to the problem of facial age estimation. Compared to previously published age estimation techniques, our approach yields more accurate results when tested on various face aging databases.

90 citations


Journal ArticleDOI
TL;DR: To speed up the matching process and to control the misclassification error, a combined approach called adaptive asymmetrical support vector machines (AASVMs) is applied in order to improve the overall generalization performance.

70 citations


Journal ArticleDOI
TL;DR: A variational level set-based curve evolution scheme is presented that uses a significantly larger time step to numerically solve the evolution partial differential equation (PDE), segmenting an unideal iris image accurately while drastically speeding up the curve evolution process.

51 citations


Proceedings ArticleDOI
18 Sep 2011
TL;DR: A novel adaptive binarization algorithm using a ternary entropy-based approach is proposed, and experimental results show that it outperforms other state-of-the-art methods.
Abstract: A vast number of historical and badly degraded document images can be found in libraries, public, and national archives. Due to the complex nature of different artifacts, such poor quality documents are hard to read and to process. In this paper, a novel adaptive binarization algorithm using a ternary entropy-based approach is proposed. Given an input image, the contrast of intensity is first estimated by a grayscale morphological closing operator. A double threshold is generated by our Shannon entropy-based ternarizing method to classify pixels into text, near-text, and non-text regions. The pixels in the second region are relabeled by the local mean and the standard deviation. Our proposed method classifies noise into two categories, which are processed by binary morphological operators, shrink and swell filters, and a graph searching strategy. The method is tested with three databases that have been used in the Document Image Binarization Contest 2009 (DIBCO 2009), the Handwritten Document Image Binarization Contest 2010 (H-DIBCO 2010), and the International Conference on Frontiers in Handwriting Recognition 2010 (ICFHR 2010). The evaluation is based upon nine distinct measures. Experimental results show that our proposed algorithm outperforms other state-of-the-art methods.
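A rough sketch of the ternary idea is given below: stroke contrast is estimated with a grayscale closing, two entropy-based thresholds split pixels into text, near-text, and non-text, and the near-text band is resolved with local statistics. The threshold search and relabeling rule are simplified stand-ins, not the paper's exact procedure, and the noise post-processing is omitted.

```python
# Sketch of the ternary approach (simplified): estimate stroke contrast with
# a grayscale closing, find two entropy-based thresholds, and resolve the
# "near-text" band with local mean/std statistics.
import numpy as np
from scipy.ndimage import grey_closing, uniform_filter

def ternary_binarize(gray, win=15):
    """gray: 2D grayscale document image; returns a boolean text mask."""
    g = gray.astype(float)
    contrast = grey_closing(g, size=(win, win)) - g        # closing fills in dark strokes
    counts, edges = np.histogram(contrast, bins=256)
    p = counts / (counts.sum() + 1e-12)

    def entropy(q):
        q = q[q > 0]
        if q.size == 0:
            return 0.0
        q = q / q.sum()
        return float(-(q * np.log(q)).sum())

    # brute-force double threshold maximizing the sum of the three class entropies
    best, t1, t2 = -np.inf, edges[1], edges[2]
    for i in range(1, 254):
        for j in range(i + 1, 255):
            h = entropy(p[:i]) + entropy(p[i:j]) + entropy(p[j:])
            if h > best:
                best, t1, t2 = h, edges[i], edges[j]

    text = contrast >= t2                                  # confident text pixels
    near = (contrast >= t1) & (contrast < t2)              # ambiguous "near-text" band
    mean = uniform_filter(g, win)
    std = np.sqrt(np.maximum(uniform_filter(g ** 2, win) - mean ** 2, 0.0))
    text |= near & (g < mean - 0.5 * std)                  # relabel using local statistics
    return text
```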

29 citations


Journal ArticleDOI
TL;DR: A variational model is proposed to localize the iris region belonging to a given shape space using an active contour method, a geometric shape prior, and the Mumford–Shah functional; the model is robust against noise, poor localization, and weak iris/sclera boundaries.
Abstract: Most state-of-the-art iris recognition algorithms claim to perform with a very high recognition accuracy in a strictly controlled environment. However, their recognition accuracies significantly decrease when the acquired images are affected by different noise factors including motion blur, camera diffusion, head movement, gaze direction, camera angle, reflections, contrast, luminosity, eyelid and eyelash occlusions, and problems due to contraction and dilation. The novelty of this research effort is that we propose to apply a variational model to localize the iris region belonging to a given shape space using an active contour method, a geometric shape prior, and the Mumford–Shah functional. This variational model is robust against noise, poor localization, and weak iris/sclera boundaries. Furthermore, we apply the Modified Contribution-Selection Algorithm (MCSA) for iris feature ranking based on the Multi-Perturbation Shapley Analysis (MSA), a framework which relies on cooperative game theory to estimate the effectiveness of the features iteratively and select them accordingly, using either forward selection or backward elimination approaches. The verification and identification performance of the proposed scheme is validated using the ICE 2005, the UBIRIS Version 1, the CASIA Version 3 Interval, and the WVU Nonideal datasets.
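The following is a hedged sketch of Shapley-style feature ranking in the spirit of MCSA: marginal contributions are estimated from random feature permutations, with a simple nearest-neighbour classifier standing in for the "game" payoff. The paper's contribution-selection algorithm and its forward/backward variants are more elaborate than this illustration.

```python
# Shapley-style feature ranking sketch: marginal contributions estimated
# from random permutations, with nearest-neighbour cross-validation accuracy
# as the payoff. Assumes a small feature set and at least 3 samples per class.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def shapley_feature_ranking(X, y, n_perm=50, seed=0):
    """X: (n_samples, n_features) iris features; y: subject labels."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    contrib = np.zeros(n_feat)

    def payoff(subset):
        clf = KNeighborsClassifier(n_neighbors=1)
        return cross_val_score(clf, X[:, subset], y, cv=3).mean()

    for _ in range(n_perm):
        order = rng.permutation(n_feat)
        prev, chosen = 0.0, []
        for f in order:
            chosen.append(f)
            cur = payoff(chosen)
            contrib[f] += cur - prev           # marginal contribution of feature f
            prev = cur
    return np.argsort(-contrib / n_perm)       # indices ranked by estimated contribution
```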

29 citations


Proceedings ArticleDOI
21 Mar 2011
TL;DR: This paper introduces an advanced age-determination technique using hybrid facial features and Kernel Spectral Regression, a nonlinear dimensionality reduction method, which yields promising results in overall mean absolute error (MAE), mean absolute error per decade of life (MAE/D), and cumulative match score in various face aging corpuses.
Abstract: This paper introduces an advanced age-determination technique using hybrid facial features and Kernel Spectral Regression, a nonlinear dimensionality reduction method. In the preprocessing stage, the logarithmic nonsubsampled contourlet transform (NSCT) is conducted to denoise and amplify facial wrinkles that help to distinguish young faces from older ones. Then the hybrid facial features that combine both local and holistic features are extracted from the preprocessed images. Our novel Uniform Local Ternary Patterns (ULTP) are used as the local features, while the holistic features are extracted by using the Active Appearance Model (AAM) to encode each face. Kernel Spectral Regression is used to minimize intra-class distances while maximizing inter-class distances of the feature sets. These reduced features are used to classify faces into two age groups (age-classification). An age-determination function is then constructed for each age group in accordance with physiological growth periods for humans: pre-adult (youth) and adult. Compared to published results, this method yields promising results in overall mean absolute error (MAE), mean absolute error per decade of life (MAE/D), and cumulative match score in various face aging corpuses.
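As an illustration of the local descriptor, here is a sketch of a basic local ternary pattern split into its upper and lower binary halves; the uniform-pattern mapping that gives ULTP its name is omitted, and the threshold value is an arbitrary choice.

```python
# Basic local ternary pattern split into upper (+1) and lower (-1) binary
# halves; the uniform-pattern mapping of ULTP is omitted and t is arbitrary.
import numpy as np

def ltp_histograms(img, t=5):
    """img: 2D uint8 face region; returns concatenated upper/lower histograms."""
    img = img.astype(np.int16)
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]          # 8-neighbourhood
    upper = np.zeros_like(center)
    lower = np.zeros_like(center)
    H, W = img.shape
    for bit, (dr, dc) in enumerate(offsets):
        nb = img[1 + dr:H - 1 + dr, 1 + dc:W - 1 + dc]
        upper |= (nb >= center + t).astype(np.int16) << bit   # +1 codes
        lower |= (nb <= center - t).astype(np.int16) << bit   # -1 codes
    h_up, _ = np.histogram(upper, bins=256, range=(0, 256), density=True)
    h_lo, _ = np.histogram(lower, bins=256, range=(0, 256), density=True)
    return np.concatenate([h_up, h_lo])
```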

19 citations


Proceedings ArticleDOI
18 Sep 2011
TL;DR: A graph matching approach is proposed to retrieve envelope images from a large image database, using a minimum weighted bipartite graph matching method to compute the distance between two graphs.
Abstract: A graph matching approach is proposed to retrieve envelope images from a large image database. First, the graph representation of an envelope image is generated based on the image segmentation results, in which each node corresponds to one segmented region. The attributes of nodes and edges in the graph are described by characteristics of the envelope image. Second, a minimum weighted bipartite graph matching method is employed to compute the distance between two graphs. Finally, the whole retrieval system including two principal stages is presented, namely, rough matching and fine matching. The experiments on a database of envelope images captured from real-life mail pieces demonstrate that the proposed method achieves promising results.
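A minimal sketch of the graph-distance step is shown below: nodes are segmented regions described by attribute vectors, and the distance between two envelope graphs is the cost of a minimum-weight bipartite matching of their nodes. Edge attributes are ignored and the unmatched-node penalty is an assumption; the rough/fine matching stages of the full system are not reproduced.

```python
# Graph-distance sketch: regions are nodes with attribute vectors; the
# distance between two envelope graphs is the cost of a minimum-weight
# bipartite matching of their nodes plus a penalty for unmatched nodes.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def graph_distance(nodes_a, nodes_b, unmatched_penalty=1.0):
    """nodes_a, nodes_b: (n, d) arrays of region attribute vectors."""
    cost = cdist(nodes_a, nodes_b)                 # pairwise node dissimilarity
    rows, cols = linear_sum_assignment(cost)       # minimum-weight bipartite matching
    matched = cost[rows, cols].sum()
    unmatched = abs(len(nodes_a) - len(nodes_b)) * unmatched_penalty
    return matched + unmatched

def retrieve(query_nodes, database, top_k=10):
    """Rank database envelope graphs by distance to the query graph."""
    dists = [graph_distance(query_nodes, g) for g in database]
    return np.argsort(dists)[:top_k]
```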

11 citations


Journal ArticleDOI
TL;DR: A new approach based on Linear Discriminant Analysis is presented to reject less reliable classifier outputs; its measurement is more comprehensive than traditional rejection measurements such as the First Rank Measurement and the First Two Ranks Measurement.
Abstract: In document recognition, it is often important to obtain high accuracy or reliability and to reject patterns that cannot be classified with high confidence. This is the case for applications such as the processing of financial documents in which errors can be very costly and therefore far less tolerable than rejections. This paper presents a new approach based on Linear Discriminant Analysis (LDA) to reject less reliable classifier outputs. To implement the rejection, which can be considered a two-class problem of accepting the classification result or otherwise, an LDA-based measurement is used to determine a new rejection threshold. This measurement (LDAM) is designed to take into consideration the confidence values of the classifier outputs and the relations between them, and it represents a more comprehensive measurement than traditional rejection measurements such as First Rank Measurement and First Two Ranks Measurement. Experiments are conducted on the CENPARMI database of numerals, the CENPARMI Arabic Isolated Numerals Database, and the numerals in the NIST Special Database 19. The results show that LDAM is more effective, and it can achieve a higher reliability while maintaining a high recognition rate on these databases of very different origins and sizes.
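A hedged sketch of an LDA-based rejection rule follows: the top classifier confidences and their gap form a measurement vector, a two-class LDA (correct vs. incorrect) is fit on validation outputs, and its posterior is thresholded to accept or reject. The feature choice and threshold here are illustrative, not the paper's exact LDAM.

```python
# Sketch of an LDA-based rejection rule (LDAM-like, not the paper's exact
# formulation): measurements derived from the classifier's top confidences
# are projected by a two-class LDA (correct vs. incorrect) and thresholded.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def _measurements(confidences):
    """confidences: (n, k) class scores sorted in decreasing order per sample."""
    return np.column_stack([confidences[:, 0],                       # first-rank score
                            confidences[:, 0] - confidences[:, 1]])  # gap to second rank

def fit_rejection(confidences, correct):
    """Fit the accept/reject LDA on validation outputs (correct: bool array)."""
    return LinearDiscriminantAnalysis().fit(_measurements(confidences),
                                            correct.astype(int))

def accept(lda, confidences, threshold=0.5):
    """Accept a classification when the LDA posterior of 'correct' is high enough."""
    return lda.predict_proba(_measurements(confidences))[:, 1] >= threshold
```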

10 citations


Journal ArticleDOI
TL;DR: The proposed NIC-NPL-QI algorithm obtains better quality in synthesizing face images compared with state-of-the-art algorithms and can handle harmonic lighting and shadows.

Proceedings ArticleDOI
18 Sep 2011
TL;DR: This paper summarizes an in-depth inquiry into the following topics: impact of fonts on digital publishing and display, the influence of typographic features on reading, the role of fonts in reading, effect of spacing on reading speed and comprehension, and machine reading of early styles of ancient Chinese characters.
Abstract: Advances in digital technology have greatly facilitated the design of new type fonts. Today, hundreds of thousands of fonts can be found in various visual appearances or styles, which are used in digital publishing and information display. As a result, it has become important to find ways of evaluating their impact on our daily lives: (1) ease in reading, (2) comprehension of the texts, and (3) eye-strain. This paper summarizes an in-depth inquiry into the following topics: (a) impact of fonts on digital publishing and display, (b) the influence of typographic features on reading, (c) the role of fonts in reading, (d) effect of spacing on reading speed and comprehension, and (e) machine reading of early styles of ancient Chinese characters. Several insightful questions on this subject are asked, and answers have been provided through this paper and the oral presentations. A comprehensive list of references is included at the end of each section for further studies and research.

Proceedings ArticleDOI
18 Sep 2011
TL;DR: This paper presents methods of evaluating digital Chinese fonts and their typeface characteristics to seek a good way to enhance the character recognition rate; the relationships among legibility, eye-strain, and myopia are also discussed.
Abstract: More and more fonts have sprung up in recent years in the digital publishing industry and on reading devices. In this paper, we focus on methods of evaluating digital Chinese fonts and their typeface characteristics to seek a good way to enhance the character recognition rate. To accomplish this, we combined psychological analysis methods with statistical analysis, involving an extensive survey of the distinctive features of eighteen popular digital typefaces. Survey results were tabulated and analyzed statistically. Then another objective experiment was conducted using the best six fonts derived from the survey results. These experimental results reveal an effective way of choosing legible digital fonts most suitable for comfortable reading of books, magazines, and newspapers, and for displaying text on cell-phones, e-books, and digital libraries, as well as identifying the features that improve the character legibility of different Chinese typefaces. The relationships among legibility, eye-strain, and myopia are also discussed.

Proceedings ArticleDOI
23 Jan 2011
TL;DR: A novel algorithm is presented for automatic extraction of numeric strings in unconstrained handwritten document images, using probabilistic RBF networks and designed for real-world documents where letters and digits may be connected or broken.
Abstract: Numeric strings such as identification numbers carry vital pieces of information in documents. In this paper, we present a novel algorithm for automatic extraction of numeric strings in unconstrained handwritten document images. The algorithm has two main phases: pruning and verification. In the pruning phase, the algorithm first performs a new segment-merge procedure on each text line, and then using a new regularity measure, it prunes all sequences of characters that are unlikely to be numeric strings. The segment-merge procedure is composed of two modules: a new explicit character segmentation algorithm which is based on analysis of skeletal graphs and a merging algorithm which is based on graph partitioning. All the candidate sequences that pass the pruning phase are sent to a recognition-based verification phase for the final decision. The recognition is based on a coarse-to-fine approach using probabilistic RBF networks. We developed our algorithm for the processing of real-world documents where letters and digits may be connected or broken in a document. The effectiveness of the proposed approach is shown by extensive experiments done on a real-world database of 607 documents which contains handwritten, machine-printed and mixed documents with different types of layouts and levels of noise.
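As a toy illustration of the verification stage, the sketch below implements a simple probabilistic RBF classifier: per-class prototypes come from k-means, Gaussian activations are pooled per class, and normalizing them yields posterior-like scores. The coarse-to-fine network described in the paper is considerably more sophisticated than this stand-in.

```python
# Toy probabilistic RBF classifier for the verification step: per-class
# prototypes from k-means, Gaussian activations pooled per class, and
# normalized to posterior-like scores.
import numpy as np
from sklearn.cluster import KMeans

class ProbRBF:
    def __init__(self, n_proto=5, sigma=1.0):
        self.n_proto, self.sigma = n_proto, sigma

    def fit(self, X, y):
        """X: (n, d) features of segmented characters; y: class labels."""
        self.classes_ = np.unique(y)
        self.protos_ = {
            c: KMeans(n_clusters=self.n_proto, n_init=10).fit(X[y == c]).cluster_centers_
            for c in self.classes_
        }
        return self

    def predict_proba(self, X):
        scores = []
        for c in self.classes_:
            d2 = ((X[:, None, :] - self.protos_[c][None]) ** 2).sum(-1)
            scores.append(np.exp(-d2 / (2 * self.sigma ** 2)).sum(1))  # pooled activation
        scores = np.column_stack(scores)
        return scores / (scores.sum(axis=1, keepdims=True) + 1e-12)
```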

Proceedings ArticleDOI
18 Sep 2011
TL;DR: A verification post-processing module was developed to reject some false positives, reaching an overall precision of 80% and a recall of 83.3% on the test set of handwritten documents.
Abstract: In order to spot the digits in a handwritten document, each component is sent to a classifier. This is a time-consuming process because a document usually contains several hundred components. A method is presented to reduce the number of candidate components of a handwritten document sent to the classifier. Furthermore, since the classifier does not contain a rejection class, several misclassifications occur. To lessen this, a verification post-processing module was developed in order to reject some false positives. We reached an overall precision of 80% and a recall of 83.3% on our test set of handwritten documents.

Journal ArticleDOI
TL;DR: The new splitting-shooting method, the new splitting-integrating method, and their combination are proposed; their results show that the true error bound O(H) is well suited to images with all kinds of discontinuous intensity, including scattered pixels.
Abstract: For digital images and patterns under the nonlinear geometric transformation T: (ξ, η) → (x, y), this study develops splitting algorithms (i.e., pixel-division algorithms) that divide a 2D pixel into N × N subpixels, where N is a positive integer chosen as N = 2^k (k ≥ 0) in practical computations. When the true intensity values of pixels are known, this method makes it easy to compute the true intensity errors. As true intensity values are often unknown, the proposed approaches can compute the sequential intensity errors based on the differences between the two approximate intensity values at N and N/2. This article proposes the new splitting-shooting method, the new splitting-integrating method, and their combination. The approximate results of these methods show that the true errors of pixel intensity are O(H), where H is the pixel size. Note that the algorithms in this article do not produce any sequential errors as N ≥ N0, where N0 (≥ 2) is an integer independent of N and H. This is a distinctive feature compared to our previous papers on this subject. The other distinct feature of this article is that the true error bound O(H) is well suited to images with all kinds of discontinuous intensity, including scattered pixels. © 2011 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 21, 323–335, 2011.
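An illustrative sketch of the pixel-division idea follows: each target pixel is split into N × N subpixels, their centres are mapped back through an inverse transformation (assumed to be available), and the sampled intensities are averaged; comparing the outputs for N and N/2 gives a sequential intensity error when true values are unknown. This is a nearest-neighbour toy version, not the splitting-shooting or splitting-integrating algorithms themselves.

```python
# Toy pixel-division sketch: split each target pixel into N x N subpixels,
# pull their centres back through the inverse transformation, and average
# the sampled source intensities; the difference between the N and N/2
# results serves as the sequential intensity error.
import numpy as np

def split_integrate(src, inv_map, out_shape, N=4):
    """src: 2D array; inv_map(x, y) -> (xi, eta) source coordinates."""
    H, W = out_shape
    out = np.zeros(out_shape)
    sub = (np.arange(N) + 0.5) / N                     # subpixel centre offsets
    for i in range(H):
        for j in range(W):
            ys, xs = np.meshgrid(i + sub, j + sub, indexing="ij")
            xi, eta = inv_map(xs, ys)                  # map subpixel centres back
            r = np.clip(np.round(eta).astype(int), 0, src.shape[0] - 1)
            c = np.clip(np.round(xi).astype(int), 0, src.shape[1] - 1)
            out[i, j] = src[r, c].mean()               # average over the N*N samples
    return out

def sequential_error(src, inv_map, out_shape, N=4):
    """Max absolute difference between the N and N/2 refinements."""
    return float(np.abs(split_integrate(src, inv_map, out_shape, N) -
                        split_integrate(src, inv_map, out_shape, N // 2)).max())
```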