Author

Ridhi Aggarwal

Bio: Ridhi Aggarwal is an academic researcher from the Indian Institute of Technology, Jodhpur. The author has contributed to research on the topics Expression (mathematics) and Character (computing), and has co-authored 3 publications.

Papers
Book Chapter
18 Dec 2018
TL;DR: A new segmentation-free approach is proposed that matches convex shape portions of symbols occurring in various layouts, such as subscripts, superscripts, and fractions, and can spot symbols present in a handwritten expression.
Abstract: Recognition of touching characters in mathematical expressions is a challenging problem in document image analysis. Various approaches for recognizing touching math symbols have been reported in the literature, but they mainly deal with printed expressions and handwritten numeral strings. In this work, a new segmentation-free approach is proposed that matches convex shape portions of symbols occurring in various layouts, such as subscripts, superscripts, and fractions, and can spot symbols present in a handwritten expression. Our contribution lies in the design of a novel feature that handles touching symbols effectively despite handwriting variations. This recognition-based approach helps in spotting symbols in an expression even when neighboring symbols create clutter.

3 citations

Book Chapter
TL;DR: A two-dimensional stochastic context-free grammar is used for the structural analysis of offline handwritten mathematical expressions in a document image; the spatial relations between characters in an expression are incorporated so that the structural variability of handwritten expressions can be handled.
Abstract: Structural analysis helps in parsing mathematical expressions. Various approaches for structural analysis have been reported in the literature, but they mainly deal with online and printed expressions. In this work, a two-dimensional stochastic context-free grammar is used for the structural analysis of offline handwritten mathematical expressions in a document image. The spatial relations between characters in an expression are incorporated so that the structural variability of handwritten expressions can be handled.
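
The idea of a spatial relation between two characters can be illustrated with a minimal sketch. It is not the method of the chapter: the `Box` type, the `spatial_relation` function, the relation names, and the thresholds are all assumptions made for illustration. A stochastic grammar would typically replace the hard decision below with a probability attached to each candidate relation.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in image coordinates (y grows downward)."""
    x0: float
    y0: float
    x1: float
    y1: float


def spatial_relation(a: Box, b: Box, tol: float = 0.25) -> str:
    """Name the position of box b relative to box a (hypothetical rule set).

    Assumes a has non-zero height. tol is an illustrative threshold on the
    vertical offset of the two centers, measured in units of a's height.
    """
    height = a.y1 - a.y0
    dy = ((b.y0 + b.y1) / 2 - (a.y0 + a.y1) / 2) / height

    # Strong horizontal overlap suggests a stacked layout such as a fraction.
    overlap = min(a.x1, b.x1) - max(a.x0, b.x0)
    if overlap > 0.5 * min(a.x1 - a.x0, b.x1 - b.x0):
        return "above" if dy < 0 else "below"

    # Otherwise b sits beside a; the vertical offset decides the relation.
    if dy < -tol:
        return "superscript"
    if dy > tol:
        return "subscript"
    return "horizontal"


base = Box(0, 10, 10, 20)
print(spatial_relation(base, Box(11, 5, 16, 11)))   # raised, to the right
print(spatial_relation(base, Box(11, 17, 16, 23)))  # lowered, to the right
print(spatial_relation(base, Box(11, 10, 21, 20)))  # same line
```
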
Book Chapter
22 Dec 2019
TL;DR: A Siamese-CNN network is proposed to identify whether the two images in a pair contain similar or dissimilar characters; the network is then used to recognize characters by matching test images against sample images of any target class.
Abstract: This paper presents an Optical Character Recognition (OCR) system for documents with English text and mathematical expressions. Neural network architectures using CNN layers and/or dense layers achieve high accuracy in character recognition tasks. However, these models require a large amount of training data, with a balanced number of samples for each class. Recognition of mathematical symbols is challenging because the available training data are imbalanced and scarce. To address this issue, we pose the character recognition problem as a Distance Metric Learning problem. We propose a Siamese-CNN network that learns discriminative features to identify whether the two images in a pair contain similar or dissimilar characters. The network is then used to recognize different characters by character matching, where test images are compared to sample images of any target class, which may or may not have been included during training. Thus our model can scale to new symbols easily. The proposed approach is invariant to the author's handwriting. Our model has been tested on images extracted from a dataset of scanned answer scripts that we collected. Our approach achieves performance comparable to other architectures using convolutional or dense layers while using less training data.
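
The character-matching step can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes the network has already mapped each image to an embedding vector, and the contrastive loss form, the margin, the function names, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same, margin=1.0):
    """One common pairwise loss for metric learning (assumed here).

    Similar pairs are pulled together; dissimilar pairs are pushed apart
    until they are at least `margin` away from each other.
    """
    d = euclidean(a, b)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def classify(query, support):
    """Return (label, distance) of the closest sample embedding.

    `support` maps each class label to a list of sample embeddings.
    """
    best_label, best_dist = None, math.inf
    for label, samples in support.items():
        for sample in samples:
            d = euclidean(query, sample)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Toy embeddings standing in for the network's output (made-up values).
support = {
    "x": [[0.0, 0.0], [0.1, 0.1]],
    "plus": [[1.0, 1.0]],
    "integral": [[3.0, 0.0]],
}
label, dist = classify([0.9, 1.2], support)
print(label)

# A new symbol class is added by supplying samples, without retraining.
support["sigma"] = [[0.0, 3.0]]
print(classify([0.2, 2.8], support)[0])
```

Because recognition reduces to a nearest-sample lookup in the embedding space, extending the symbol set only requires adding sample embeddings for the new class, which matches the abstract's claim that the model scales to new symbols.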

Cited by
Proceedings Article
04 May 2023
TL;DR: The SVR algorithm encodes the image, producing a model that fits the data better; the result is then obtained by segmenting the image character by character and comparing it with trained models.
Abstract: One of the most important tasks in the realm of document analysis and recognition is the detection of equations in documents that were acquired using a camera. The procedure includes several steps, including pre-processing of the images, segmentation, feature extraction, and classification. The suggested method comprises taking a user-provided input expression image and classifying it into one of three types of equations: simple, complex, and highly complex. By choosing a decision boundary set off from the initial hyperplane, the SVR algorithm encodes the image, producing a model that fits the data better. The result is then obtained by character-wise segmenting the image and comparing it with trained models. Two recurrent neural networks make up the RNN encoder-decoder that is used. One RNN creates a fixed-length vector representation from a sequence of symbols, and a different RNN decodes that representation into a different sequence of symbols. 1900 images containing various equations made up the dataset utilized for training, validating, and testing the SVR and RNN. The accuracy of the system was about 93.64%.