Book Chapter

DocDescribor: Digits + Alphabets + Math Symbols - A Complete OCR for Handwritten Documents

22 Dec 2019, Vol. 1249, pp. 292-301
TL;DR: In this paper, a Siamese-CNN network is proposed to identify whether the two images in a pair contain similar or dissimilar characters; the network is then used to recognize different characters by character matching, where test images are compared to sample images of any target class.
Abstract: This paper presents an Optical Character Recognition (OCR) system for documents with English text and mathematical expressions. Neural network architectures using CNN layers and/or dense layers achieve high accuracy in character recognition. However, these models require a large amount of data to train the network, with a balanced number of samples per class. Recognition of mathematical symbols poses the challenge that the available training data are scarce and imbalanced. To address this issue, we pose character recognition as a Distance Metric Learning problem. We propose a Siamese-CNN Network that learns discriminative features to identify whether the two images in a pair contain similar or dissimilar characters. The network is then used to recognize different characters by character matching, where test images are compared to sample images of any target class, which may or may not have been included during training. Thus our model can scale to new symbols easily. The proposed approach is invariant to the author's handwriting. Our model has been tested on images extracted from a dataset of scanned answer scripts collected by us. Our approach achieves performance comparable to other architectures using convolutional or dense layers while using less training data.
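As a reading aid, a minimal PyTorch sketch of the pair-matching idea follows. It is not the authors' implementation: the encoder layout, the 28x28 input assumption, the Euclidean distance, and the one-exemplar-per-class `support` dictionary are all illustrative choices, and the pair loss used for training (e.g. a contrastive loss) is omitted.

```python
import torch
import torch.nn as nn

class SiameseCNN(nn.Module):
    """Twin CNN encoder; both branches share the same weights."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, embed_dim),  # assumes 28x28 grayscale inputs
        )

    def forward(self, a, b):
        # Embed both images with the shared encoder.
        za, zb = self.encoder(a), self.encoder(b)
        # Small distance => "similar" pair, large => "dissimilar".
        return torch.norm(za - zb, dim=1)

def classify(model, test_img, support):
    """Match a test image against one exemplar per candidate class.

    `support` is a dict {label: exemplar_tensor}; the predicted label is
    the class whose exemplar is nearest in the learned metric space.
    """
    model.eval()
    with torch.no_grad():
        dists = {lbl: model(test_img.unsqueeze(0), ex.unsqueeze(0)).item()
                 for lbl, ex in support.items()}
    return min(dists, key=dists.get)
```

Because classification reduces to nearest-exemplar matching in the learned metric space, adding a new symbol class only requires adding an exemplar to `support`, which is what lets such a model scale to symbols unseen during training.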
References
Journal Article
TL;DR: The proposed system is able to recognize handwritten English digits and letters with high accuracy; performance comparison with other neural network structures revealed weighted average recognition rates of 80.3%, 68.3%, and 90.4% for patternnet, feedforwardnet, and the proposed DNN, respectively.
Abstract: Owing to advances in GPUs and CPUs, Deep Neural Networks (DNNs) have in recent years become popular both for feature extraction and for classification. This paper aims to develop an offline handwritten recognition system using a DNN. First, two popular English digit and letter databases, MNIST and EMNIST, were selected to provide the datasets for the training and testing phases of the DNN. Altogether, there are 10 digits [0-9] and 52 letters [a-z, A-Z]. The proposed DNN uses two stacked autoencoder layers and one softmax layer. Recognition accuracy for English digits and letters is 97.7% and 88.8%, respectively. Performance comparison with other neural network structures revealed that the weighted average recognition rates for patternnet, feedforwardnet, and the proposed DNN were 80.3%, 68.3%, and 90.4%, respectively. This shows that our proposed system is able to recognize handwritten English digits and letters with high accuracy.
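The "two stacked autoencoder layers plus one softmax layer" recipe corresponds to greedy layer-wise pretraining followed by supervised fine-tuning. A rough Keras sketch under assumed hyperparameters (the hidden sizes, activations, and training schedule below are illustrative guesses, not values from the paper) might look like:

```python
from tensorflow import keras
from tensorflow.keras import layers

def train_autoencoder(x, hidden):
    """Greedy step: fit one autoencoder on x, return its encoder and codes."""
    inp = keras.Input(shape=(x.shape[1],))
    code = layers.Dense(hidden, activation="sigmoid")(inp)
    out = layers.Dense(x.shape[1], activation="sigmoid")(code)
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(x, x, epochs=10, batch_size=128, verbose=0)
    encoder = keras.Model(inp, code)
    return encoder, encoder.predict(x, verbose=0)

# MNIST digits, flattened to 784-dim vectors scaled to [0, 1].
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# Layer-wise pretraining of the two autoencoder layers.
enc1, h1 = train_autoencoder(x_train, 100)
enc2, _ = train_autoencoder(h1, 50)

# Stack both encoders with a softmax layer, then fine-tune end to end.
model = keras.Sequential([enc1, enc2, layers.Dense(10, activation="softmax")])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128, verbose=0)
```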

25 citations

Proceedings Article
Wenhao He, Yuxuan Luo, Fei Yin, Han Hu, Junyu Han, Errui Ding, Cheng-Lin Liu
01 Dec 2016
TL;DR: A novel end-to-end framework for mathematical expression (ME) recognition that uses a convolutional neural network to perform mathematical symbol detection and recognition simultaneously, incorporates spatial context, and can handle multi-part and touching symbols effectively.
Abstract: In this paper we propose a novel end-to-end framework for mathematical expression (ME) recognition. The method uses a convolutional neural network (CNN) to perform mathematical symbol detection and recognition simultaneously, incorporating spatial context, and can handle multi-part and touching symbols effectively. To evaluate the performance, we provide a benchmark that contains MEs from both real-life and synthetic data. Images in our dataset undergo multiple variations, such as viewpoint, illumination, and background. For training, we use purely synthetic data to save human labeling effort. The proposed method achieved a total-correct accuracy of 87% on clear images and 45% on cluttered ones.

18 citations

Proceedings Article
01 Nov 2018
TL;DR: A generic optical character recognition (OCR) system based on deep Siamese convolutional neural networks and support vector machines that, without any retraining, achieves recognition accuracy close to that of CNNs and recognition systems trained for the specific target classes.
Abstract: This paper presents a generic optical character recognition (OCR) system based on deep Siamese convolutional neural networks (CNNs) and support vector machines (SVMs). Supervised deep CNNs achieve a high level of accuracy in classification tasks. However, fine-tuning a trained model for a new set of classes requires a large amount of data to overcome the problem of dataset bias. The classification accuracy of deep neural networks (DNNs) degrades when the available dataset is insufficient. Moreover, using a trained deep neural network to classify a new class requires tuning the network architecture and retraining the model. All these limitations are handled by our proposed system. The deep Siamese CNN is trained to extract discriminative features. The training is performed once, using a group of classes. The OCR system is then used for recognizing different classes without retraining or fine-tuning the deep Siamese CNN model. Only a few samples are needed from any target class for classification. The proposed OCR system is evaluated on different domains: Arabic letters, Eastern-Arabic numerals, Hindu-Arabic numerals, and Farsi numerals, using test sets that contain printed and handwritten letters and numerals. The proposed system achieves very promising recognition accuracy, close to the results achieved by CNNs trained for the specific target classes, without the need for retraining. The system outperforms the state-of-the-art method that uses a Siamese CNN for one-shot classification by around 12%.
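The division of labour this abstract describes (discriminative features from a Siamese CNN trained once, classification by an SVM over a handful of samples from the new classes) could be wired up as in the sketch below; the `embed` callable and the linear kernel are assumptions, not details from the paper.

```python
import numpy as np
from sklearn.svm import SVC

def fit_fewshot_svm(embed, support_imgs, support_labels):
    """Fit an SVM on embeddings of a few labelled samples per target class.

    `embed` stands for the frozen Siamese-CNN feature extractor, trained once
    on a disjoint set of classes; only the lightweight SVM ever sees the new
    classes, so the deep model needs no retraining.
    """
    feats = np.stack([embed(img) for img in support_imgs])
    clf = SVC(kernel="linear")
    clf.fit(feats, support_labels)
    return clf

def predict_classes(clf, embed, test_imgs):
    """Classify test images in the same embedding space."""
    feats = np.stack([embed(img) for img in test_imgs])
    return clf.predict(feats)
```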

13 citations