Book Chapter

Unlocking the Mechanism of Devanagari Letter Identification Using Eye Tracking

TL;DR: Upon understanding the level of distortion acceptable for correct letter recognition and the processes involved in the identification of the letters, the OCR can be made more robust and the gap between human reading and machine reading can be narrowed down.
Abstract: Present-day computers can outperform humans in many complicated tasks, performing them precisely and efficiently. However, in many scenarios such as pattern recognition and, more importantly, character recognition, a school-going child can outperform the sophisticated machines available today. Modern machines find handwritten, calligraphic text difficult to recognize because such text hardly contains rationalized straight lines or perfect loops or circles. Therefore, most optical character recognition (OCR) systems fail to recognize characters beyond certain levels of distortion and noise. On the other hand, the human brain has achieved a remarkable ability to recognize visual patterns or characters under various distortion conditions at high speed. The present work tries to understand how humans perceive, process and recognize Devanagari characters under various distortion levels. To achieve this objective, an eye-tracking experiment was performed on 20 graduate participants by presenting stimuli in decreasing levels of distortion (from highly distorted to more normal). The eye fixation patterns, along with the time course of recognition, gave us the moment-to-moment processing involved in letter identification. Upon understanding the level of distortion acceptable for correct letter recognition and the processes involved in the identification of the letters, OCR can be made more robust and the gap between human reading and machine reading can be narrowed.
Citations
Proceedings Article
01 Dec 2019
TL;DR: The results show that the performance of the network improves significantly when trained iteratively with increasing level of blur; the model was trained on gradually decreasing blurriness on the Dog vs Cat dataset for a classification task.
Abstract: It is claimed that convolutional neural networks are inspired by human vision systems. Based on the literature on the development of the human visual system, we know that a newborn child initially has blurred vision due to rapid eye movements. This rapid eye movement is termed nystagmus. This paper is concerned with a novel approach to quantifying nystagmus and implementing an artificial system that can mimic the visual learning of a newborn child or a person with nystagmus. To quantify the nystagmus, we have recorded 10 seconds of eye movement videos from 3 subjects and 10 trials. We estimate the eye movement frequency by tracking the eye pupil through image processing, which is then used to create a database. To simulate a suitable learning environment, we have trained our model on gradually decreasing blurriness on the Dog vs Cat dataset for a classification task. The novelty of the paper is in the type of training, which is elicited by the human visual learning system. The results show that the performance of the network improves significantly when trained iteratively with increasing level of blur.

1 citation


Cites background from "Unlocking the Mechanism of Devanaga..."

  • ...There are several studies using eye-movement data for reading comprehension[2], memorability prediction [3], understanding the recognition performance[4]....

Posted Content
TL;DR: In this article, the congruence of information gathering strategies between humans and deep neural networks has been examined in a character recognition task, where the authors use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions.
Abstract: Human observers engage in selective information uptake when classifying visual patterns. The same is true of deep neural networks, which currently constitute the best performing artificial vision systems. Our goal is to examine the congruence, or lack thereof, in the information-gathering strategies of the two systems. We have operationalized our investigation as a character recognition task. We have used eye-tracking to assay the spatial distribution of information hotspots for humans via fixation maps and an activation mapping technique for obtaining analogous distributions for deep networks through visualization maps. Qualitative comparison between visualization maps and fixation maps reveals an interesting correlate of congruence. The deep learning model considered similar regions in character, which humans have fixated in the case of correctly classified characters. On the other hand, when the focused regions are different for humans and deep nets, the characters are typically misclassified by the latter. Hence, we propose to use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions. We find that such supervision improves the model's performance significantly and does not require any additional parameters. This approach has the potential to find applications in diverse domains such as medical analysis and surveillance in which explainability helps to determine system fidelity.
References
Journal Article
TL;DR: The basic theme of the review is that eye movement data reflect moment-to-moment cognitive processes in the various tasks examined.
Abstract: Recent studies of eye movements in reading and other information processing tasks, such as music reading, typing, visual search, and scene perception, are reviewed. The major emphasis of the review is on reading as a specific example of cognitive processing. Basic topics discussed with respect to reading are (a) the characteristics of eye movements, (b) the perceptual span, (c) integration of information across saccades, (d) eye movement control, and (e) individual differences (including dyslexia). Similar topics are discussed with respect to the other tasks examined. The basic theme of the review is that eye movement data reflect moment-to-moment cognitive processes in the various tasks examined. Theoretical and practical considerations concerning the use of eye movement data are also discussed.

6,656 citations

Book
01 Jan 1969

2,274 citations

Proceedings Article
08 Nov 2000
TL;DR: A taxonomy of fixation identification algorithms is proposed that classifies algorithms in terms of how they utilize spatial and temporal information in eye-tracking protocols in order to evaluate and compare these algorithms with respect to a number of qualitative characteristics.
Abstract: The process of fixation identification—separating and labeling fixations and saccades in eye-tracking protocols—is an essential part of eye-movement data analysis and can have a dramatic impact on higher-level analyses. However, algorithms for performing fixation identification are often described informally and rarely compared in a meaningful way. In this paper we propose a taxonomy of fixation identification algorithms that classifies algorithms in terms of how they utilize spatial and temporal information in eye-tracking protocols. Using this taxonomy, we describe five algorithms that are representative of different classes in the taxonomy and are based on commonly employed techniques. We then evaluate and compare these algorithms with respect to a number of qualitative characteristics. The results of these comparisons offer interesting implications for the use of the various algorithms in future work.

1,809 citations
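
To make the idea of fixation identification concrete, here is a minimal sketch of one commonly used approach: labeling each gaze sample by its point-to-point velocity and grouping consecutive slow samples into fixations. It is an illustration only, not necessarily one of the five algorithms the paper evaluates. The function names, the threshold value, and the sample data are all hypothetical, and timestamps are assumed to be strictly increasing.

```python
from math import hypot


def label_samples(xs, ys, ts, velocity_threshold):
    """Label each gaze sample 'fixation' or 'saccade' by point-to-point speed.

    Assumes strictly increasing timestamps; units (e.g. pixels per ms)
    are whatever the caller's data uses.
    """
    labels = ["fixation"]  # assumption: the first sample has no predecessor
    for i in range(1, len(xs)):
        dt = ts[i] - ts[i - 1]
        speed = hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]) / dt
        labels.append("saccade" if speed > velocity_threshold else "fixation")
    return labels


def group_fixations(xs, ys, ts, labels):
    """Collapse runs of fixation samples into (centroid_x, centroid_y, start, end)."""
    fixations = []
    start = None
    # A trailing sentinel closes any run that reaches the end of the data.
    for i, label in enumerate(labels + ["saccade"]):
        if label == "fixation" and start is None:
            start = i
        elif label != "fixation" and start is not None:
            n = i - start
            cx = sum(xs[start:i]) / n
            cy = sum(ys[start:i]) / n
            fixations.append((cx, cy, ts[start], ts[i - 1]))
            start = None
    return fixations


# Hypothetical gaze trace: two clusters separated by one fast jump.
xs = [0, 1, 1, 50, 51, 51]
ys = [0, 0, 1, 50, 50, 51]
ts = [0, 10, 20, 30, 40, 50]  # milliseconds
labels = label_samples(xs, ys, ts, velocity_threshold=1.0)
fixations = group_fixations(xs, ys, ts, labels)
```

In this sketch the only tunable parameter is the velocity threshold, which uses only temporal-difference information between neighboring samples; approaches that rely on spatial dispersion or areas of interest would fall elsewhere in a taxonomy like the one the paper proposes.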

Journal Article
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.

592 citations

Proceedings Article
08 Sep 2013
TL;DR: This work investigates whether different document types can be automatically detected from visual behaviour recorded using a mobile eye tracker, and presents an initial recognition approach that uses special purpose eye movement features as well as machine learning for document type detection.
Abstract: Reading is a ubiquitous activity that many people even perform in transit, such as while on the bus or while walking. Tracking reading enables us to gain more insights about expertise level and potential knowledge of users -- towards a reading log tracking and improve knowledge acquisition. As a first step towards this vision, in this work we investigate whether different document types can be automatically detected from visual behaviour recorded using a mobile eye tracker. We present an initial recognition approach that combines special purpose eye movement features as well as machine learning for document type detection. We evaluate our approach in a user study with eight participants and five Japanese document types and achieve a recognition performance of 74% using user-independent training.

77 citations