
Showing papers by "Ching Y. Suen published in 2012"


Journal ArticleDOI
TL;DR: A hybrid model is presented that integrates two high-performing classifiers, the Convolutional Neural Network (CNN) and the Support Vector Machine (SVM), both of which have proven results in recognizing different types of patterns.
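Only the summary is given here, but the general CNN-plus-SVM hybrid pattern is easy to illustrate: a convolutional network acts as the feature extractor and an SVM is fitted on the extracted features as the final classifier. The sketch below is a minimal illustration of that pattern, not the authors' implementation; it assumes PyTorch and scikit-learn and uses a toy network with random stand-in data in place of a real handwriting dataset.

```python
# Minimal sketch of a CNN-feature + SVM hybrid (illustration only,
# not the authors' implementation). Assumes PyTorch and scikit-learn.
import torch
import torch.nn as nn
from sklearn.svm import SVC

class TinyCNN(nn.Module):
    """A toy convolutional feature extractor for 28x28 grayscale images."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
    def forward(self, x):
        return self.features(x).flatten(1)  # (N, 16*7*7) feature vectors

# Stand-in data: 200 random "images" with 10 classes.
images = torch.randn(200, 1, 28, 28)
labels = torch.randint(0, 10, (200,))

cnn = TinyCNN().eval()
with torch.no_grad():
    feats = cnn(images).numpy()          # CNN acts as the feature extractor

svm = SVC(kernel="rbf", C=10.0)          # SVM acts as the final classifier
svm.fit(feats[:150], labels[:150].numpy())
print("held-out accuracy:", svm.score(feats[150:], labels[150:].numpy()))
```

In practice the CNN would first be trained end-to-end on the recognition task and its softmax layer then replaced by the SVM; the toy network above skips that training step for brevity.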

585 citations


Journal ArticleDOI
TL;DR: The proposed LoGID framework adapts hidden Markov model-based pattern recognition systems during both the generalization and learning phases; the evaluation shows that it can effectively improve the performance of systems created with small training sets as more data are observed over time.

48 citations


Book
07 Jan 2012
TL;DR: In this article, the authors present an overview of basic algorithms for processing speech signals and an architecture for isolated and connected word recognition in real-time speech recognition systems, as well as knowledge-based and expert systems in automatic speech recognition.
Abstract: I. Review of Basic Algorithms.- An Overview of Digital Techniques for Processing Speech Signals.- Systems for Isolated and Connected Word Recognition.- II. System Architecture and VLSI for Speech Processing.- Systolic Architectures for Connected Speech Recognition.- Computer Systems for High-Performance Speech Recognition.- VLSI Architectures for Recognition of Context-Free Languages.- Implementation of an Acoustical Front-End for Speech Recognition.- Reconfigurable Modular Architecture for a Man-Machine Vocal Communication System in Real Time.- A Survey of Algorithms & Architecture for Connected Speech Recognition.- III. Software Systems for Automatic Speech Recognition.- Knowledge-Based and Expert Systems in Automatic Speech Recognition.- The Speech Understanding and Dialog System EVAR.- A New Rule-Based Expert System for Speech Recognition.- SAY - A PC Based Speech Analysis System.- Automatic Generation of Linguistic, Phonetic and Acoustic Knowledge for a Diphone-Based Continuous Speech Recognition System.- The Use of Dynamic Frequency Warping in a Speaker-Independent Vowel Classifier.- Dynamic Time Warping Algorithms for Isolated and Connected Word Recognition.- An Efficient Algorithm for Recognizing Isolated Turkish Words.- A General Fuzzy-Parsing Scheme for Speech Recognition.- IV Speech Synthesis and Phonetics.- Linguistics and Automatic Processing of Speech.- Synthesis of Speech by Computers and Chips.- Prosodic Knowledge in the Rule-Based Synthex Expert System for Speech Synthesis.- Syntex - Unrestricted Conversion of Text to Speech for German.- Concatenation Rules for Demisyllable Speech Synthesis.- On the Use of Phonetic Knowledge for Automatic Speech Recognition.- Demisyllables as Processing Units for Automatic Speech Recognition and Lexical Access.- Detection and Recognition of Nasal Consonants in Continuous Speech - Preliminary Results.- Author Index.

46 citations



Proceedings ArticleDOI
Muna Khayyat, Louisa Lam, Ching Y. Suen, Fei Yin, Cheng-Lin Liu
27 Mar 2012
TL;DR: This paper uses morphological dilation with a dynamic adaptive mask for text line extraction; evaluation on the CENPARMI Arabic handwritten documents database, which contains multi-skewed and touching lines, demonstrates the effectiveness of the approach.
Abstract: This paper presents a robust method for handwritten text line extraction. We use morphological dilation with a dynamic adaptive mask for line extraction. Line separation occurs because of the repulsion and attraction between connected components. The characteristics of the Arabic script are considered to ensure a high performance of the algorithm. Our method is evaluated on the CENPARMI Arabic handwritten documents database, which contains multi-skewed and touching lines. With a matching score of 0.95, our method achieved precision and recall rates of 96.3% and 96.7% respectively, which demonstrate the effectiveness of our approach.
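As a rough illustration of the dilation idea (not the dynamic adaptive mask described in the paper), the sketch below dilates a binarized page with a wide, short structuring element so that components on the same line merge, then treats each merged blob as a candidate text line. It assumes NumPy and scikit-image, and the mask size is a fixed guess rather than the paper's adaptively chosen one.

```python
# Rough sketch of dilation-based text line grouping (illustration only;
# the paper uses a dynamic adaptive mask, this uses a fixed one).
import numpy as np
from skimage.morphology import binary_dilation
from skimage.measure import label, regionprops

def extract_lines(binary_page, mask_width=51, mask_height=3):
    """binary_page: 2-D bool array, True where ink is present."""
    # Wide, short mask: dilation bridges gaps between words on one line
    # but (ideally) not between adjacent lines.
    # (Older scikit-image versions call the `footprint` parameter `selem`.)
    mask = np.ones((mask_height, mask_width), dtype=bool)
    merged = binary_dilation(binary_page, footprint=mask)
    labels = label(merged)
    # Each connected component of the dilated image is one candidate line;
    # keep the original ink pixels that fall inside it.
    lines = []
    for region in regionprops(labels):
        component = (labels == region.label) & binary_page
        lines.append(component)
    return lines

page = np.zeros((100, 400), dtype=bool)
page[20:25, 30:120] = True   # word 1, line 1
page[20:25, 150:260] = True  # word 2, line 1
page[60:65, 40:200] = True   # line 2
print(len(extract_lines(page)), "lines found")  # expected: 2
```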

30 citations


Journal ArticleDOI
TL;DR: This work proposes a distance-based local binary pattern (DLBP) descriptor, a part-based pedestrian representation, and a novel CI_DLBP descriptor, which unifies the color intensity and DLBP by learning the joint distributions of the DLBP and color intensity at each channel.
Abstract: Matching pedestrians across disjoint camera views is a challenging task, since their observations are separated in time and space and their appearances may vary considerably. Recently, several approaches to matching pedestrians have been proposed. However, these approaches either use overly complex representations or consider only the color information while discarding the spatial structural information of the pedestrian. In order to describe the spatial structural information in color space, we propose a distance-based local binary pattern (DLBP) descriptor. Besides the spatial structural information, the color intensity itself is also an important feature in matching pedestrians across disjoint camera views. In order to effectively combine these two kinds of information, we further propose a novel CI_DLBP descriptor, which unifies the color intensity and DLBP by learning the joint distributions (2-D histograms) of the DLBP and color intensity at each channel. In addition, unlike previous approaches in which pedestrians are matched on their whole bodies, we develop a part-based pedestrian representation, because the color density and spatial structural information of the upper outer garment and the lower garment worn by the pedestrian usually differ. Experimental results on challenging realistic scenarios and the VIPeR dataset validate the proposed DLBP operator, the CI_DLBP descriptor, and the part-based pedestrian representation for pedestrian matching across disjoint camera views. Compared with existing methods based on color information, the new CI_DLBP approach performs better.
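The exact DLBP formulation is defined in the paper; the sketch below only illustrates the general idea of the CI_DLBP combination, i.e., forming a joint 2-D histogram of a local binary pattern code and the pixel intensity for each color channel, with separate descriptors for the upper and lower body. It uses a standard 8-neighbour LBP rather than the authors' distance-based variant and assumes NumPy and scikit-image.

```python
# Sketch of a joint (LBP code, intensity) 2-D histogram per color channel.
# Uses a standard LBP, not the paper's distance-based DLBP (illustration only).
import numpy as np
from skimage.feature import local_binary_pattern

def joint_lbp_intensity_histogram(channel, lbp_bins=256, intensity_bins=16):
    """channel: 2-D uint8 array (one color channel of a pedestrian image)."""
    lbp = local_binary_pattern(channel, P=8, R=1, method="default")
    hist, _, _ = np.histogram2d(
        lbp.ravel(), channel.ravel(),
        bins=[lbp_bins, intensity_bins],
        range=[[0, 256], [0, 256]],
    )
    hist /= hist.sum() + 1e-12            # normalize to a joint distribution
    return hist

def describe_pedestrian(image):
    """image: H x W x 3 uint8. Upper and lower body are described separately,
    loosely following the part-based representation discussed above."""
    h = image.shape[0]
    parts = [image[: h // 2], image[h // 2 :]]          # upper / lower body
    descriptor = [
        joint_lbp_intensity_histogram(part[..., c])
        for part in parts for c in range(3)
    ]
    return np.concatenate([d.ravel() for d in descriptor])

img = (np.random.rand(128, 48, 3) * 255).astype(np.uint8)
print(describe_pedestrian(img).shape)
```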

18 citations


Proceedings ArticleDOI
18 Sep 2012
TL;DR: For the first time in Arabic word spotting, language models are incorporated into the process of reconstructing words from PAWs, and a hierarchical classifier is implemented.
Abstract: With the ever-increasing amounts of published materials being made available, developing efficient means of locating target items has become a subject of significant interest. Among the approaches adopted for this purpose is word spotting, which enables the identification of documents through the use of pertinent keywords. This paper reports on an effective method of word spotting for Arabic handwritten documents that takes into consideration the nature of Arabic handwriting. Parts of Arabic Words (PAWs) form the basic components of this search process, and a hierarchical classifier (consisting of a set of classifiers each trained on a different part of the input pattern) is implemented. For the first time in Arabic word spotting, language models are incorporated into the process of reconstructing words from PAWs. Details of the method and promising experimental results are also presented.
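No implementation details are given beyond the description above, but the role of a language model in reconstructing words from PAW hypotheses can be sketched as follows: each candidate word assembled from PAW recognition hypotheses is rescored by combining the recognizer's scores with a word-level unigram prior. All lexicon entries, scores, and names below are hypothetical placeholders, and the unigram model is a stand-in for whatever language model the authors actually used.

```python
# Hedged sketch: rescoring candidate words built from PAW hypotheses with a
# simple unigram language model. All data below are hypothetical placeholders.
import math

# Hypothetical recognizer output: for each PAW slot, candidate transliterated
# PAWs with their recognition scores (probabilities).
paw_candidates = [
    [("kt", 0.7), ("kb", 0.3)],     # first PAW of the word
    [("ab", 0.6), ("at", 0.4)],     # second PAW of the word
]

# Hypothetical unigram language model over whole words (relative frequencies).
word_lm = {"ktab": 0.8, "ktat": 0.05, "kbab": 0.1, "kbat": 0.05}

def best_word(paw_candidates, word_lm, lm_weight=1.0):
    """Combine PAW recognition scores with a word-level unigram LM."""
    best, best_score = None, float("-inf")
    def expand(i, prefix, log_score):
        nonlocal best, best_score
        if i == len(paw_candidates):
            lm_prob = word_lm.get(prefix, 1e-6)       # unseen-word floor
            total = log_score + lm_weight * math.log(lm_prob)
            if total > best_score:
                best, best_score = prefix, total
            return
        for paw, p in paw_candidates[i]:
            expand(i + 1, prefix + paw, log_score + math.log(p))
    expand(0, "", 0.0)
    return best

print(best_word(paw_candidates, word_lm))  # "ktab" under these toy numbers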

16 citations


Journal ArticleDOI
TL;DR: A new iris segmentation scheme using game theory to elicit iris/pupil boundaries from a nonideal iris image is described, which is robust to noise and poor localization, and less affected by weak iris/sclera boundaries.
Abstract: Robust segmentation of an iris image plays an important role in iris recognition. However, the nonlinear deformations, pupil dilations, head rotations, motion blurs, reflections, nonuniform intensities, low image contrast, camera angles and diffusions, and presence of eyelids and eyelashes often hamper the conventional iris/pupil localization methods, which utilize the region-based or the gradient-based boundary-finding information. The novelty of this research effort is that we describe a new iris segmentation scheme using game theory to elicit iris/pupil boundaries from a nonideal iris image. We apply a parallel game-theoretic decision making procedure by modifying Chakraborty and Duncan's algorithm, which integrates (1) the region-based segmentation and gradient-based boundary-finding methods and (2) fuses the complementary strengths of each of these individual methods. This integrated scheme forms a unified approach, which is robust to noise and poor localization, and less affected by weak iris/sclera boundaries. The verification and identification performance of the proposed method are validated using the ICE 2005, the UBIRIS Version 1, WVU Nonideal, and the CASIA Version 3 data sets.
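The game-theoretic fusion itself is not reproduced here; the toy sketch below only illustrates the two kinds of evidence being combined, scoring candidate pupil circles with a region term (the interior should be dark) and a gradient term (the boundary should sit on strong edges). The weighting and candidate generation are illustrative assumptions, not the paper's procedure.

```python
# Toy sketch: scoring candidate pupil circles with both region-based evidence
# (dark interior) and gradient-based evidence (strong edges on the boundary).
# This only illustrates combining the two cues; it is not the game-theoretic
# procedure described in the paper.
import numpy as np

def circle_score(image, cx, cy, r, w_region=0.5, w_gradient=0.5):
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)
    inside = dist <= r
    ring = np.abs(dist - r) <= 1.5
    gy, gx = np.gradient(image.astype(float))
    grad_mag = np.hypot(gx, gy)
    region_term = 1.0 - image[inside].mean()        # pupil should be dark
    gradient_term = grad_mag[ring].mean()           # boundary should be edgy
    return w_region * region_term + w_gradient * gradient_term

# Synthetic eye image: dark pupil of radius 10 centred at (32, 32).
img = np.ones((64, 64))
yy, xx = np.mgrid[0:64, 0:64]
img[(xx - 32) ** 2 + (yy - 32) ** 2 <= 10 ** 2] = 0.1

candidates = [(32, 32, 10), (30, 30, 6), (40, 25, 12)]
best = max(candidates, key=lambda c: circle_score(img, *c))
print("best candidate circle:", best)   # expect (32, 32, 10)
```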

14 citations


Journal ArticleDOI
TL;DR: Fuzzy inference systems are proposed for the initialization step of the optimization underlying the joint noise removal and recognition problem, which can be solved by expectation maximization provided that the recognition engine is trained on clean images.

12 citations


Proceedings ArticleDOI
18 Sep 2012
TL;DR: The main idea behind the proposed approach is to learn the geometrical distribution of words within a sentence using a Markov chain or a Hidden Markov Model (HMM).
Abstract: We present a statistical hypothesis testing method for handwritten word segmentation algorithms. Our proposed method can be used along with any word segmentation algorithm in order to detect over-segmentation or under-segmentation errors, or to adapt the word segmentation algorithm to new data in an unsupervised manner. The main idea behind the proposed approach is to learn the geometrical distribution of words within a sentence using a Markov chain or a Hidden Markov Model (HMM). In the former, we assume all the necessary information is observable, whereas in the latter, we assume the minimum observable variables are the bounding boxes of the words and the hidden variables are the part-of-speech information. Our experimental results on a benchmark database show that not only can we achieve lower over-segmentation and under-segmentation error rates, but also a higher correct segmentation rate as a result of the proposed hypothesis testing.
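The abstract does not spell out the model, but the Markov-chain variant can be sketched as follows: the horizontal gaps between consecutive word bounding boxes are discretized into states, transition probabilities are estimated from correctly segmented training lines, and a test line whose gap sequence has unusually low likelihood is flagged as a possible over- or under-segmentation. The discretization thresholds and data below are illustrative assumptions, not the paper's values.

```python
# Sketch of a Markov-chain check on word-gap geometry (illustrative only;
# the discretization and training data are assumptions, not the paper's values).
import math
from collections import defaultdict

def discretize(gaps, small=10, large=40):
    """Map each inter-word gap (in pixels) to a coarse state."""
    return ["S" if g < small else "L" if g > large else "M" for g in gaps]

def train_chain(training_gap_sequences):
    counts = defaultdict(lambda: defaultdict(int))
    for gaps in training_gap_sequences:
        states = discretize(gaps)
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    chain = {}
    for a, nxt in counts.items():
        total = sum(nxt.values())
        chain[a] = {b: c / total for b, c in nxt.items()}
    return chain

def log_likelihood(gaps, chain, floor=1e-4):
    states = discretize(gaps)
    return sum(math.log(chain.get(a, {}).get(b, floor))
               for a, b in zip(states, states[1:]))

# Hypothetical training lines (gap widths of correctly segmented sentences).
train = [[22, 25, 30, 28], [18, 24, 26], [27, 22, 25, 23]]
chain = train_chain(train)
good, suspicious = [24, 26, 25], [3, 5, 80]   # the latter mixes tiny/huge gaps
print(log_likelihood(good, chain), log_likelihood(suspicious, chain))
```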

7 citations


Proceedings ArticleDOI
18 Sep 2012
TL;DR: A novel approach based on Optimum Paths, derived from the degree information and continuation property, is introduced to solve the segmentation ambiguities at intersection points in off-line Chinese character recognition.
Abstract: In recognition of off-line handwritten characters and signatures, stroke extraction is often a crucial step. Given the large number of Chinese handwritten characters, pattern matching based on structural decomposition and analysis is useful and essential to off-line Chinese recognition to reduce ambiguity. Two challenging problems for stroke extraction are: 1) how to extract primary strokes and 2) how to resolve the segmentation ambiguities at intersection points. In this paper, we introduce a novel Approach based on Optimum Paths (AOP) to solve these problems. The optimum paths are derived from the degree information and the continuation property, and we use them to tackle both problems. Compared with other methods, the proposed approach extracts strokes from off-line Chinese handwritten characters with better performance.
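The paper's optimum-path formulation is not reproduced here; the sketch below only illustrates the continuation idea it builds on: at an intersection point, incoming and outgoing stroke segments are paired so that the total change of direction is minimized, which resolves a crossing into two continuous strokes. The segment directions are toy values.

```python
# Toy sketch of the continuation criterion at a stroke intersection:
# pair segments so that the total direction change is minimal
# (illustration of the idea only, not the paper's optimum-path algorithm).
import math
from itertools import permutations

def angle_diff(a, b):
    """Smallest absolute difference between two directions in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def pair_segments(incoming, outgoing):
    """incoming/outgoing: lists of segment directions (degrees) at a crossing.
    Returns the pairing with the smallest total deviation from straight
    continuation (an incoming direction should continue nearly unchanged)."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(outgoing))):
        cost = sum(angle_diff(incoming[i], outgoing[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return [(i, j) for i, j in enumerate(best)]

# Two strokes crossing: one nearly horizontal, one nearly vertical.
incoming = [5, 95]      # directions (deg) of segments entering the crossing
outgoing = [93, 2]      # directions (deg) of segments leaving it
print(pair_segments(incoming, outgoing))  # [(0, 1), (1, 0)]: straight continuations
```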

Proceedings Article
01 Nov 2012
TL;DR: With properly defined potential functions in the joint probability represented by the graphical model, the disparity in tree representations caused by different image capturing conditions can be tolerated as demonstrated in the encouraging experimental results.
Abstract: A document image matching approach making use of probabilistic graphical models is proposed. The document image is first represented by a tree, with the nodes corresponding to the regions in the image and the edges indicating the parent-child relationships between them, transforming the problem into tree matching. A graphical model, i.e., a pairwise Markov Random Field, is defined on the tree, in which the nodes are considered as random variables and the edges encode the relations among these variables in the probability domain. The tree matching problem is then formulated as Maximum a Posteriori (MAP) inference over the graphical model and solved by belief propagation. Since the underlying graphical model is tree-structured, exact inference can be obtained. With properly defined potential functions in the joint probability represented by the graphical model, the disparity in tree representations caused by different image capturing conditions can be tolerated, as demonstrated by the encouraging experimental results.
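Since the graphical model is tree-structured, exact MAP inference can be done with max-product belief propagation. The sketch below shows that inference step on a tiny hand-built tree with made-up unary and pairwise potentials; it is not the document-matching system itself, only the MAP-on-a-tree computation the abstract refers to.

```python
# Max-product belief propagation on a small tree-structured pairwise MRF
# (illustrates the exact MAP inference mentioned above; potentials are made up).
import numpy as np

# Tree: node 0 is the root, nodes 1 and 2 are its children. Each node takes
# one of K=3 states (e.g., candidate region correspondences).
K = 3
unary = {0: np.array([0.2, 0.5, 0.3]),
         1: np.array([0.6, 0.2, 0.2]),
         2: np.array([0.1, 0.1, 0.8])}
# Pairwise potential favouring identical states for parent and child.
pairwise = np.full((K, K), 0.1) + 0.9 * np.eye(K)
children = {0: [1, 2], 1: [], 2: []}

def upward_message(child):
    """Message from `child` to its parent: max over the child's states."""
    belief = unary[child].copy()
    for grandchild in children[child]:
        belief *= upward_message(grandchild)
    # msg[parent_state] = max over child_state of pairwise * belief
    return (pairwise * belief[None, :]).max(axis=1)

def map_states():
    root_belief = unary[0].copy()
    for c in children[0]:
        root_belief *= upward_message(c)
    states = {0: int(root_belief.argmax())}
    # Downward pass: each child picks the state that maximized its message.
    def assign(node):
        for c in children[node]:
            belief = unary[c].copy()
            for gc in children[c]:
                belief *= upward_message(gc)
            states[c] = int((pairwise[states[node]] * belief).argmax())
            assign(c)
    assign(0)
    return states

print(map_states())   # MAP assignment for all three nodes
```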

Proceedings Article
01 Nov 2012
TL;DR: The proposed CSMA method achieves both the fastest running time and the highest accuracy in the face recognition problem compared to MPCA and other multifactor-based methods on two challenging databases, i.e., CMU-MPIE and Extended YALE-B.
Abstract: This paper proposes a novel approach named Compressed Submanifold Multifactor Analysis (CSMA) to deal concisely and precisely with multifactor analysis. Compared to the state-of-the-art MPCA method, which loses the original local geometry structures of the input factors due to its averaging process, our proposed approach can preserve their original geometry. In addition, a fast low-rank approximation of a given dataset with multiple factors is provided using Random Projection to reduce space requirements and give a more transparent representation. Our proposed method achieves both the fastest running time and the highest accuracy in the face recognition problem compared to MPCA and other multifactor-based methods on two challenging databases, i.e., CMU-MPIE and Extended YALE-B.
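The CSMA algorithm itself is not reproduced here, but the random-projection step it relies on for fast low-rank approximation can be sketched with a basic randomized range finder: project the data onto a few Gaussian random directions, orthonormalize, and work in that small subspace. This is generic randomized linear algebra, not the authors' exact procedure.

```python
# Sketch of low-rank approximation via random projection (generic randomized
# range finder; illustrates the idea used for speed, not the exact CSMA step).
import numpy as np

def randomized_low_rank(X, rank, oversample=5, seed=0):
    """Return Q (orthonormal basis) and B = Q.T @ X such that X ~= Q @ B."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    omega = rng.standard_normal((n_features, rank + oversample))
    Y = X @ omega                      # project onto random directions
    Q, _ = np.linalg.qr(Y)             # orthonormal basis of the sampled range
    B = Q.T @ X                        # small (rank+oversample) x n_features
    return Q, B

# Data that is genuinely low rank plus a little noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 200))
X += 0.01 * rng.standard_normal(X.shape)

Q, B = randomized_low_rank(X, rank=10)
err = np.linalg.norm(X - Q @ B) / np.linalg.norm(X)
print(f"relative approximation error: {err:.4f}")   # close to the noise level
```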

Book ChapterDOI
01 Jan 2012
TL;DR: A Hidden Markov Model based recognizer for Farsi handwritten word recognition systems is developed, and a first evaluation of the performance of this recognizer shows promising results.
Abstract: One of the most important script groups based on the Arabic alphabet is the Persian/Farsi script. This script is the basis of different languages used in the Middle East and Central Asian regions. For the development of Farsi handwritten word recognition systems, the CENPARMI group designed and collected a database. Based on statistical features, a Hidden Markov Model based recognizer is developed. A first evaluation of the performance of this recognizer shows promising results.
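The abstract gives only a high-level description; a common way to build such a word recognizer is to train one HMM per word class on feature-vector sequences and classify a test word by the highest model likelihood. The sketch below follows that standard recipe using hmmlearn with synthetic feature sequences; it is an assumption-laden illustration, not the CENPARMI system.

```python
# Generic one-HMM-per-word recognizer sketch (standard recipe, not the paper's
# system). Assumes hmmlearn and NumPy; features and data are synthetic.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)

def synthetic_sequences(offset, n_seq=20, length=30, dim=4):
    """Stand-in for sliding-window statistical features of one word class."""
    return [offset + rng.standard_normal((length, dim)) for _ in range(n_seq)]

def fit_word_model(sequences, n_states=3):
    X = np.vstack(sequences)
    lengths = [len(s) for s in sequences]
    model = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(X, lengths)
    return model

classes = {"word_a": synthetic_sequences(0.0), "word_b": synthetic_sequences(2.0)}
models = {w: fit_word_model(seqs) for w, seqs in classes.items()}

def recognize(sequence):
    """Pick the word whose HMM assigns the test sequence the highest likelihood."""
    return max(models, key=lambda w: models[w].score(sequence))

test = 2.0 + rng.standard_normal((30, 4))   # should look like "word_b"
print(recognize(test))
```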

Proceedings ArticleDOI
02 Jul 2012
TL;DR: A novel super-resolution approach based on the framework of the wavelet transform is presented that reconstructs a more reliable image without obvious visual artifacts.
Abstract: A novel super-resolution approach is presented. An image pyramid is built based on the framework of the wavelet transform, and the detail coefficients are used to train the neural networks. The initial high-resolution image is estimated by the trained networks and the inverse wavelet transform, and is then constrained with prior knowledge of the error function by iteration. For a magnification factor of 2^n, this process is repeated and the networks are updated. The experimental results show that our method reconstructs a more reliable image without obvious visual artifacts.
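The abstract describes the pipeline only at a high level. The sketch below follows its general shape under stated assumptions: a regressor (standing in for the trained neural networks) predicts wavelet detail coefficients from the low-resolution image, and the inverse wavelet transform produces the high-resolution estimate. It uses PyWavelets and scikit-learn with synthetic data, and it omits the iterative error-function constraint mentioned above.

```python
# Sketch of wavelet-domain super-resolution: predict detail coefficients from
# the low-resolution image and invert the transform. Simplified illustration
# (no iterative error-function constraint); assumes PyWavelets, scikit-learn.
import numpy as np
import pywt
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def make_pair(size=32):
    """Synthetic (low-res, high-res) pair: the low-res image is the
    approximation band of the high-res image's wavelet decomposition."""
    hi = rng.random((size, size))
    cA, (cH, cV, cD) = pywt.dwt2(hi, "haar")
    return cA, (cH, cV, cD), hi

# Train a regressor mapping the approximation band to the detail bands
# (this stands in for the trained neural networks in the paper); the random
# images here are placeholders, so the mapping is only a shape/pipeline demo.
X, Y = [], []
for _ in range(200):
    cA, details, _ = make_pair()
    X.append(cA.ravel())
    Y.append(np.concatenate([d.ravel() for d in details]))
net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
net.fit(np.array(X), np.array(Y))

# "Super-resolve" a new low-resolution image (here: an approximation band).
cA_new, _, hi_true = make_pair()
pred = net.predict(cA_new.ravel()[None, :])[0].reshape(3, *cA_new.shape)
hi_est = pywt.idwt2((cA_new, (pred[0], pred[1], pred[2])), "haar")
print("reconstruction error:", np.abs(hi_est - hi_true).mean())
```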