
Showing papers by "Ching Y. Suen published in 2022"


Book ChapterDOI
01 Jan 2022

3 citations


Journal ArticleDOI
TL;DR: In this article, a set of gender-related features suggested by a graphologist is proposed for detecting the gender of writers; the features include margins, space between words, pen pressure, and handwriting irregularity.

3 citations


Journal ArticleDOI
TL;DR: The experimental results show that the recognition rate of CRHM is higher than that of a convolutional neural network (CNN), while its training time is only 5% of the CNN's, confirming that the approach provides a novel recognition model that improves both effectiveness and efficiency without requiring advanced equipment.

2 citations


Proceedings ArticleDOI
21 Aug 2022
TL;DR: In this article, a complete framework is provided that enables existing real-time object detectors to integrate with another model connected to an OCR, after which shops are classified using NLP techniques.
Abstract: Many factors can influence the process of detecting and classifying stores based on their visual appearance. Previous studies built models that considered the whole storefront; however, the detection and classification results were negatively impacted by the lack of consistency in storefront design. This research focuses on store signboards, as they are much more consistent. A complete framework is provided that enables existing real-time object detectors to integrate with another model connected to an OCR, after which shops are classified using NLP techniques. The models were trained and evaluated on the ShoS dataset, which was collected from Google Street View for different research purposes. A total of 10k storefront signboards were captured and fully annotated. The outcomes of different baseline methodologies and applications on the ShoS dataset are provided to measure the performance of our work.

2 citations
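The last stage of the framework above, classifying a shop from the OCR'd signboard text, can be sketched roughly as follows. The keyword lexicon and category names are hypothetical illustrations; the paper uses trained NLP techniques, not a hand-built lexicon.

```python
# Illustrative sketch: classify a shop from OCR'd signboard text by
# keyword overlap. Lexicon and categories are hypothetical stand-ins
# for the trained NLP classifier described in the paper.
LEXICON = {
    "cafe": {"coffee", "espresso", "latte", "cafe"},
    "pharmacy": {"pharmacy", "drug", "prescription"},
    "bakery": {"bakery", "bread", "cake"},
}

def classify_sign(ocr_text: str) -> str:
    """Score each category by keyword overlap with the OCR output."""
    tokens = set(ocr_text.lower().split())
    scores = {cat: len(tokens & words) for cat, words in LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(classify_sign("Fresh Bread & Cake Daily"))  # bakery
```

In a full pipeline, the detector crops the signboard region, the OCR produces `ocr_text`, and a learned text classifier replaces this lookup.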


Book ChapterDOI
TL;DR: A Dense Attention Network (DATNet) is proposed to extract structural information from handwritten mathematical expressions, with label smoothing used as the loss to prevent the model from overfitting.
Abstract: Unlike handwritten numeral recognition and handwritten text recognition, the recognition of handwritten mathematical expressions is more difficult because of their complex two-dimensional spatial structure. Since the "watch, attend and parse (WAP)" method was proposed in 2017, encoder-decoder models have made significant progress on handwritten mathematical expression recognition. Our model is improved based on the WAP [1] model. In this paper, the attention module is reasonably added to the encoder so that the extracted features are more informative. The new network is called Dense Attention Network (DATNet), which allows for an adequate extraction of the structural information from handwritten mathematical expressions. To prevent the model from overfitting during the training process, we use label smoothing [2] as the loss. Experiments showed that our model (DATWAP) improved WAP expression recognition from 48.4%/46.8%/45.1% to 54.72%/52.83%/52.54% on the CROHME 2014/2016/2019 test sets.

1 citation
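The label-smoothing loss that DATWAP uses to curb overfitting can be sketched in a few lines of numpy. This is one common formulation (true class gets 1 − ε, the remaining ε spread uniformly over the other classes), not necessarily the exact variant used in the paper.

```python
import numpy as np

def label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target: the true class gets 1 - eps
    and the remaining eps is spread uniformly over the other classes."""
    k = logits.shape[-1]
    smoothed = np.full(k, eps / (k - 1))
    smoothed[target] = 1.0 - eps
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    return float(-np.sum(smoothed * log_probs))

logits = np.array([4.0, 1.0, 0.5])  # confident, correct prediction
hard = label_smoothing_ce(logits, target=0, eps=0.0)  # ordinary cross-entropy
soft = label_smoothing_ce(logits, target=0)
print(hard < soft)  # smoothing penalizes over-confident predictions
```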



Journal ArticleDOI
TL;DR: This study used the ResNet and GoogleNet CNN architectures as fixed feature extractors from handwriting samples, and an SVM was used to classify the writer's gender and age based on the extracted features.
Abstract: Handwriting analysis is the science of determining an individual's personality from his or her handwriting by assessing features such as slant, pen pressure, word spacing, and other factors. Handwriting analysis has a wide range of uses and applications, including dating and socialising, roommates and landlords, business and professional, employee hiring, and human resources. This study used the ResNet and GoogleNet CNN architectures as fixed feature extractors from handwriting samples. An SVM was used to classify the writer's gender and age based on the extracted features. We built an Arabic dataset named FSHS to analyse and test the proposed system. In the gender detection system, applying the automatic feature extraction method to the FSHS dataset produced accuracy rates of 84.9% and 82.2% using ResNet and GoogleNet, respectively. The age detection system using the same automatic feature extraction method achieved accuracy rates of 69.7% and 61.1% using ResNet and GoogleNet, respectively.

1 citation
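The fixed-extractor-plus-classifier pattern described above can be sketched minimally. Since the paper's actual models are pretrained ResNet/GoogleNet feeding an SVM, this numpy stand-in uses a frozen random projection as the extractor and a least-squares linear classifier in place of the SVM, on hypothetical toy data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "feature extractor": a fixed random projection stands in for a
# pretrained CNN (the paper freezes ResNet/GoogleNet weights).
W_frozen = rng.normal(size=(64, 16))

def extract_features(images):
    """images: (n, 64) flattened inputs -> (n, 16) features; never trained."""
    return np.maximum(images @ W_frozen, 0.0)  # ReLU

# Toy two-class data (e.g. labels 0/1) that is linearly separable.
X0 = rng.normal(loc=-1.0, size=(50, 64))
X1 = rng.normal(loc=+1.0, size=(50, 64))
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

# Train only the classifier head; a least-squares linear classifier
# stands in for the SVM used in the paper.
F = extract_features(X)
F1 = np.hstack([F, np.ones((F.shape[0], 1))])   # append a bias column
w, *_ = np.linalg.lstsq(F1, 2 * y - 1.0, rcond=None)

pred = (F1 @ w > 0).astype(int)
print("training accuracy:", (pred == y).mean())
```

The key property illustrated: only the small linear head is fit, so very little labeled handwriting data is needed relative to training a CNN end to end.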


Book ChapterDOI
TL;DR: An encoder-decoder method combining contrastive learning and supervised learning (CCLSL) is proposed, whose encoder is trained to learn semantic-invariant features between printed and handwritten characters effectively.
Abstract: Handwritten mathematical expressions differ considerably from ordinary linear handwritten texts, due to their two-dimensional structures plus many special symbols and characters. Hence, HMER (Handwritten Mathematical Expression Recognition) is a lot more challenging compared with normal handwriting recognition. At present, the mainstream offline recognition systems are generally built on deep learning methods, but these methods can hardly cope with HMER due to the lack of training data. In this paper, we propose an encoder-decoder method combining contrastive learning and supervised learning (CCLSL), whose encoder is trained to learn semantic-invariant features between printed and handwritten characters effectively. CCLSL improves the robustness of the model to handwriting styles. Extensive experiments on the CROHME benchmark show that without data augmentation, our model achieves an expression accuracy of 58.07% on CROHME 2014, 55.88% on CROHME 2016, and 59.63% on CROHME 2019, which is much better than all previous state-of-the-art methods. Furthermore, our ensemble model added a boost of 2.5% to 3.4% to the accuracy, achieving state-of-the-art performance on the public CROHME datasets for the first time.
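The contrastive half of CCLSL, pulling printed and handwritten embeddings of the same symbol together while pushing other symbols apart, can be illustrated with an InfoNCE-style loss. This numpy sketch is a generic contrastive objective, not the paper's exact formulation.

```python
import numpy as np

def info_nce(printed, handwritten, tau=0.1):
    """InfoNCE-style contrastive loss: row i of `printed` and row i of
    `handwritten` embed the same symbol (a positive pair); every other
    row is a negative. Lower loss = more style-invariant features."""
    p = printed / np.linalg.norm(printed, axis=1, keepdims=True)
    h = handwritten / np.linalg.norm(handwritten, axis=1, keepdims=True)
    sim = p @ h.T / tau                       # cosine similarity matrix
    sim -= sim.max(axis=1, keepdims=True)     # numerical stability
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))     # positives on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 32))                  # shared "semantic" codes
aligned = info_nce(z + 0.05 * rng.normal(size=z.shape), z)
shuffled = info_nce(z[rng.permutation(8)], z)
print(aligned < shuffled)  # matched printed/handwritten pairs score lower
```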


Journal ArticleDOI
TL;DR: The annotations of the ShoS dataset were extended to include more attributes for shop classification, and the resulting classifier outperformed human performance by about 15%.
Abstract: The rapid advancements in artificial intelligence algorithms have sharpened the focus on street signs due to their prevalence. Some street signs, such as traffic signs, have consistent shapes and pre-defined colors and fonts, while others, like shop signboards, are characterized by their visual variability. These variations create a complicated challenge for AI-based systems to classify them. In this paper, the annotations of the ShoS dataset were extended to include more attributes for shop classification. Then, two classifiers were trained and tested utilizing the extended ShoS dataset. SVM showed great performance, with an F1-score of 89.33%. The classification performance was compared with human performance, and the results showed that our classifier outperformed humans by about 15%. The results are discussed, and the factors that affect classification are identified for further enhancement.

Book ChapterDOI
01 Jan 2022
TL;DR: In this paper, the authors proposed an automated Big Five Factor Model (BFFM) system called Averaging of SMOTE multi-label SVM-CNN (AvgMlSC).
Abstract: The Big Five Factors Model (BFFM) is the most widely accepted personality theory used by psychologists today. The theory states that personality can be described with five core factors: Conscientiousness, Agreeableness, Emotional Stability, Openness to Experience, and Extraversion. In this work, we measure the five factors using handwriting analysis instead of a long personality questionnaire. Handwriting analysis is a study that merely needs a writing sample to assess the personality traits of the writer. It started manually, by interpreting extracted features such as size of writing, slant, and space between words into personality traits based on graphological rules. In this work, we propose an automated BFFM system called Averaging of SMOTE multi-label SVM-CNN (AvgMlSC). AvgMlSC constructs synthetic samples to handle imbalanced data using the Synthetic Minority Oversampling Technique (SMOTE). It averages two learning-based classifiers, a multi-label Support Vector Machine and a multi-label Convolutional Neural Network based on offline handwriting recognition, to produce one optimal predictive model. The model was trained using 1066 handwriting samples written in English, French, Chinese, Arabic, and Spanish. The results reveal that our proposed model outperformed five traditional models, i.e. Logistic Regression (LR), Naïve Bayes (NB), K-Neighbors (KN), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), with 93% predictive accuracy, 0.94 AUC, and a 90% F-score.
Keywords: Big five factor model, Handwriting analysis, Computerized, Off-line handwriting, Learning model, Multi-label, Ensemble, SMOTE, SVM, CNN
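The SMOTE step that AvgMlSC uses to balance its training data can be sketched minimally: each synthetic sample interpolates between a minority-class point and one of its nearest minority neighbours. This is a generic illustration, not the paper's implementation.

```python
import numpy as np

def smote(minority, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: each synthetic sample is a random interpolation
    between a minority point and one of its k nearest minority neighbours."""
    rng = rng or np.random.default_rng(0)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                    # interpolation factor in [0, 1)
        out.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(out)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote(minority, n_new=6)
print(synthetic.shape)  # (6, 2)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority data already occupies.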

Journal ArticleDOI
TL;DR: For a unital C*-algebra A, a linear map L: A→B(H), and a completely positive linear map φ: A→B(H), the infimum of t for which tφ ± L is completely positive is characterized in terms of minimal commutant representations L = V*TπV with isometry.
Abstract: Let A be a unital C*-algebra, let L: A→B(H) be a linear map, and let φ: A→B(H) be a completely positive linear map. We prove the following: inf{t ≥ 0 : tφ ± L is completely positive} = inf{‖T*T + TT*‖^{1/2} : L = V*TπV, which is a minimal commutant representation with isometry}. Moreover, if L = L*, then t₀φ ± L is completely positive for t₀ equal to this infimum. In the paper we also extend the result inf{t ≥ 0 : tφ + L is completely positive} = inf{‖T‖ : L = V*TπV} [3, Corollary 3.12].

Proceedings ArticleDOI
21 Aug 2022
TL;DR: In this article, different deep learning architectures are considered to learn patterns that are objects lying on the Riemannian and Grassmann manifolds, including cascades of classifier ensembles (CCEs), convolutional neural networks (CNNs), and deep neural forests (DNFs).
Abstract: This paper considers different deep learning architectures to learn patterns that are objects lying on the Riemannian and Grassmann manifolds. Among them, we considered cascades of classifier ensembles (CCEs), convolutional neural networks (CNNs), and deep neural forests (DNFs). All the aforementioned architectures have linearized and nonlinearized versions. Patterns that are objects of Riemannian manifolds are classifier prediction pairwise matrices (CPPMs), while objects of the Grassmann manifolds are obtained using decision profiles (DPs). We also compared our architectures with CCEs that operate in Euclidean geometry. As seen from the experimental results, the deep learning architectures based on CNNs provided the best results.

Book ChapterDOI
01 Jan 2023
TL;DR: In this paper, a novel network called ScriptNet is proposed, composed of two streams: a spatial stream, which captures the spatial dependencies within the image, and a visual stream, which describes the appearance of the image.
Abstract: Script identification is an essential part of a document image analysis system, since documents written in different scripts may undergo different processing methods. In this paper, we address the issue of script identification in camera-based document images, which is challenging since the camera-based document images are often subject to perspective distortions, uneven illuminations, etc. We propose a novel network called ScriptNet that is composed of two streams: a spatial stream and a visual stream. The spatial stream captures the spatial dependencies within the image, while the visual stream describes the appearance of the image. The two streams are then fused in the network, which can be trained in an end-to-end manner. Extensive experiments demonstrate the effectiveness of the proposed approach. The two streams have been shown to be complementary to each other. An accuracy of 99.1% has been achieved by our proposed network, which compares favourably with state-of-the-art methods. Besides, the proposed network achieves promising results even when it is trained with non-camera-based document images and tested on camera-based document images.

Book ChapterDOI
TL;DR: In this article, a Convolutional Neural Network (CNN) is used to recognize multi-class handwritten words written in the cursive Arabic script of the Pashto language, achieving average accuracies of 97.26%, 96.25%, and 95.84% on the three data sets.
Abstract: The inter-class word similarities, in combination with intra-class variations, make it a difficult task for an OCR or any other machine learning system to recognize handwritten characters and words with high accuracy, especially in the domain of cursive Arabic scripts. The Convolutional Neural Network was originally designed to handle the problems of shape resemblance, position shift, and distortion in the domain of handwritten character and digit recognition. In this paper, we have used a Convolutional Neural Network (CNN) to recognize multi-class handwritten words written in the cursive Arabic script of the Pashto language. The handwritten word images in the Pashto Database [3] contain a high level of shape resemblance and position shift in terms of diacritic marks. Hence it has proved to be a good source for properly analyzing the performance of any CNN-based recognition system. The model has successfully handled the problem and has not been affected by the level of complexity of the inter-class resemblances. The CNN model has been well tested on three sub data sets of the Pashto Database. From one data set to the next, the number of classes and the level of complexity (i.e., shape similarity) increase: 25, 40, and 68 classes of handwritten words. The average accuracies for the three data sets are 97.26%, 96.25%, and 95.84%, respectively.


Book ChapterDOI
TL;DR: In this paper , the authors proposed a method to detect counterfeit coins based on image content, which employed SIFT, SURF, and MSER to determine the similarity degree of their datasets and evaluated those descriptors by statistical analysis to see which one is the most effective criterion for counterfeit coin detection.
Abstract: We use coins in our daily life to pay for bus and metro tickets, vending machines, etc. The market for antique and historical coins, however, is another place, where the quality of coins and their authenticity play a significant role. Hence, researchers have considered different methods in coin detection studies. In recent years, 2-D and 3-D image processing approaches have been widely used in image-based coin detection. In this paper, we propose a method to detect counterfeit coins based on image content. We employed SIFT, SURF, and MSER to determine the similarity degree of our datasets. Then, we evaluated those descriptors by statistical analysis to see which one is the most effective criterion for counterfeit coin detection. According to the experiments, SIFT was selected as the most reliable algorithm for the Danish coin image dataset. Then, we trained an autoencoder to find anomalies in the coin images. The trained autoencoder receives a coin image as input and generates a new image. The output image is compared with a basic image using the selected criterion. If the similarity between these two images meets a threshold, then the coin is genuine. Most counterfeit coin detection methods require fake data for training; our autoencoder-based anomaly method eliminates this requirement.
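The anomaly-detection idea, train only on genuine coins and flag inputs the autoencoder reconstructs poorly, can be illustrated with a linear autoencoder fit by PCA. The data, dimensions, and threshold rule here are hypothetical stand-ins for the paper's trained autoencoder and SIFT-based similarity criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Genuine "coin images" (flattened): samples near a low-dimensional subspace.
basis = rng.normal(size=(3, 20))
genuine = rng.normal(size=(200, 3)) @ basis + 0.01 * rng.normal(size=(200, 20))

# A linear autoencoder fit by PCA stands in for the trained autoencoder:
# encode = project onto the top components, decode = project back.
mean = genuine.mean(axis=0)
_, _, Vt = np.linalg.svd(genuine - mean, full_matrices=False)
components = Vt[:3]                        # encoder/decoder weights

def reconstruction_error(x):
    code = (x - mean) @ components.T       # encode
    recon = code @ components + mean       # decode
    return np.linalg.norm(x - recon)

# Threshold set from genuine data only; a larger error flags a counterfeit.
threshold = max(reconstruction_error(g) for g in genuine) * 1.5
fake = rng.normal(size=20) * 3.0           # an off-subspace sample
print(reconstruction_error(fake) > threshold)
```

The point this illustrates is the one the abstract makes: no counterfeit samples are needed at training time, since the model only learns what genuine coins look like.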

Book ChapterDOI
01 Jan 2022
TL;DR: In this paper, a complete framework is presented, including PCC (Pearson's correlation coefficient) to extract lines and curves, SLIC for the selection of feature key points, DBSCAN for object clustering, and finally a YoloV3-SPP model for detecting shapes and objects.
Abstract: The Wartegg Test is a drawing completion task designed to reflect the personal characteristics of the testers. A complete Wartegg Test has eight 4 cm × 4 cm boxes with a printed hint in each of them. The tester is required to use a pencil to draw eight pictures in the boxes after seeing these printed hints. In recent years, the trend of utilizing high-speed hardware and deep-learning-based models for object detection has made it possible to recognize hand-drawn objects from images. However, recognizing them is not an easy task; like other hand-drawn images, the Wartegg images are abstract and diverse. Also, Wartegg Test images are multi-object images: the number of objects in one image, their distribution, and their size are all unpredictable. These factors make the recognition task on Wartegg Test images more difficult. In this paper, we present a complete framework including PCC (Pearson's Correlation Coefficient) to extract lines and curves, SLIC for the selection of feature key points, DBSCAN for object clustering, and finally a YoloV3-SPP model for detecting shapes and objects. Our system produced an accuracy of 87.9% for one-object detection and 75% for multi-object detection, which surpasses the previous results by a wide margin.
Keywords: Wartegg test, Image processing, Object detection
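The DBSCAN stage used above to cluster drawn objects can be sketched as a minimal density-based clustering pass. The toy key points and parameters below are illustrative, not the paper's settings.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch for the object-clustering step: grow clusters
    from core points (>= min_pts neighbours within eps); points reached by
    no cluster keep the label -1 (noise)."""
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbours[i]) < min_pts:
            continue
        stack = [i]                    # expand a new cluster from a core point
        while stack:
            j = stack.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                stack.extend(neighbours[j])
        cluster += 1
    return labels

# Two well-separated blobs of stroke key points plus one isolated noise point.
pts = np.array([[0, 0], [0.1, 0], [0, 0.1],
                [5, 5], [5.1, 5], [5, 5.1],
                [10, 0]])
print(dbscan(pts, eps=0.5, min_pts=2))
```

Each resulting cluster would then be cropped and passed to the detector as one candidate object.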

Proceedings ArticleDOI
01 Oct 2022
TL;DR: This conference brings together a large number of scientists from all over the world to express their innovative ideas and report on their latest achievements.
Abstract: Document analysis and recognition is a special branch of studies in the field of pattern recognition. During the past 10 years, we have witnessed revolutionary changes in both software and hardware. Today, computers equipped with cameras or optical scanners can see various types of documents and read their contents. Indeed, huge volumes of documents are being processed automatically every day, for example: the reading of barcodes on items sold in big companies and grocery stores; sorting mail; processing utility bills and bank cheques; reading financial data and business forms; extracting symbols and information from maps and engineering drawings; recognizing on-line and off-line handwritten data; and so forth. More subjects can be found in the technical program. This conference brings together a large number of scientists from all over the world to express their innovative ideas and report on their latest achievements. In addition to the rich program, the conference also features a full day of tutorials and an exhibition of demonstrations and video presentations.