Textual Description for Mathematical Equations
Ajoy Mondal, C. V. Jawahar, +1 more
- pp 1300-1307
TLDR
In this article, a mathematical equation description (MED) model is proposed, consisting of a convolution neural network as an encoder that extracts features of input mathematical equation images and a recurrent neural network with an attention mechanism that generates the description.

Abstract
Reading mathematical expressions or equations in document images is very challenging due to the large variability of mathematical symbols and expressions. In this paper, we pose reading a mathematical equation as the task of generating a textual description that interprets the internal meaning of the equation. Inspired by the natural image captioning problem in computer vision, we present a mathematical equation description (MED) model, a novel end-to-end trainable deep-neural-network-based approach that learns to generate a textual description for reading mathematical equation images. Our MED model consists of a convolution neural network as an encoder that extracts features of input mathematical equation images and a recurrent neural network with an attention mechanism that generates descriptions related to the input mathematical equation images. Due to the unavailability of mathematical equation image data sets with textual descriptions, we generate two data sets for experimental purposes. To validate the effectiveness of our MED model, we conduct a real-world experiment to see whether students are able to write equations by only reading or listening to their textual descriptions. Experiments conclude that the students are able to write most of the equations correctly by reading their textual descriptions alone.
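The paper does not include code on this page, but the decoder's attention step it describes follows the additive (Bahdanau-style) attention cited under References. A minimal sketch in numpy, with all names and dimensions illustrative rather than taken from the paper:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_step(encoder_feats, decoder_state, W_enc, W_dec, v):
    """One additive attention step, as in a MED-style decoder:
    score each CNN feature vector against the current RNN hidden
    state, then form a weighted context vector.

    encoder_feats: (L, D) grid of CNN features from the equation image
    decoder_state: (H,)  current RNN hidden state
    W_enc: (A, D), W_dec: (A, H), v: (A,)  learned projections
    """
    # Alignment scores e_i = v^T tanh(W_enc f_i + W_dec h)
    scores = np.tanh(encoder_feats @ W_enc.T + decoder_state @ W_dec.T) @ v
    alpha = softmax(scores)            # attention weights over image regions
    context = alpha @ encoder_feats    # (D,) weighted sum of features
    return context, alpha

# Toy example: 4 feature vectors of dim 3, hidden size 2, attention dim 5
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 3))
h = rng.normal(size=2)
ctx, alpha = attention_step(feats, h,
                            rng.normal(size=(5, 3)),
                            rng.normal(size=(5, 2)),
                            rng.normal(size=5))
```

At each decoding step the context vector is fed to the RNN alongside the previously emitted word, so the model attends to different regions of the equation image as it generates each token of the description.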
Citations
Journal Article (DOI)
Automatic adaptation of open educational resources: an approach from a multilevel methodology based on students’ preferences, educational special needs, artificial intelligence, and accessibility metadata
Paola Ingavélez-Guerra, Vladimir Robles-Bykbaev, Angel Perez-Munoz, José Ramón Hilera, Salvador Otón Tortosa, +4 more
TL;DR: The research aims to contribute an automated support tool for generating accessible educational resources that are correctly labeled for search and reuse, and to support researchers in artificial intelligence applications addressing challenges and opportunities in the field of virtual education.
Book Chapter (DOI)
Classroom Slide Narration System
TL;DR: In this article, the authors proposed a Classroom Slide Segmentation Network (CSSN) that generates audio descriptions corresponding to the slide content, posed as an image-to-markup-language generation task.
References
Proceedings Article (DOI)
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
Journal Article (DOI)
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article (DOI)
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article (DOI)
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, correlates highly with human evaluation, and has little marginal cost per run.
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.