Author

Naila Habib Khan

Bio: Naila Habib Khan is an academic researcher. The author has contributed to research in topics: Optical character recognition & Feature extraction. The author has an h-index of 4 and has co-authored 7 publications receiving 49 citations.

Papers
Journal Article
TL;DR: The OCR technology is introduced and a historical review of the OCR systems is presented, providing comparisons between the English, Arabic, and Urdu systems.
Abstract: This paper gives an across-the-board comprehensive review and survey of the most prominent studies in the field of Urdu optical character recognition (OCR). This paper introduces the OCR technology and presents a historical review of the OCR systems, providing comparisons between the English, Arabic, and Urdu systems. Detailed background and literature have also been provided for Urdu script, discussing the script’s past, OCR categories, and phases. This paper further reports all state-of-the-art studies for different phases, namely, image acquisition, pre-processing, segmentation, feature extraction, classification/recognition, and post-processing for an Urdu OCR system. In the segmentation section, the analytical and holistic approaches for Urdu text have been emphasized. In the feature extraction section, a comparison has been provided between the feature learning and feature engineering approaches. Deep learning and traditional machine learning approaches have been discussed. The Urdu numeral recognition systems have also been deliberated concisely. The research paper concludes by identifying some open problems and suggesting some future directions.

36 citations

Journal Article
TL;DR: An Urdu OCR system is proposed that recognizes Urdu text at the ligature level, avoiding the character-level segmentation problems associated with cursive scripts; it achieves 62%, 61%, 73%, and 90% accuracy with decision tree, linear discriminant analysis, naive Bayes, and K-NN classifiers, respectively.
Abstract: Optical character recognition (OCR) systems hold great significance in human-machine interaction. OCR has been the subject of intensive research, especially for Latin, Chinese, and Japanese scripts. Comparatively little work has been done on Urdu OCR, owing to the complexities and segmentation errors associated with its cursive script. This paper proposes an Urdu OCR system that aims at ligature-level recognition of Urdu text. This ligature-based recognition approach overcomes the character-level segmentation problems associated with cursive scripts. A newly developed OCR algorithm is introduced that uses semi-supervised multi-level clustering to categorize the ligatures. Classification is performed using four machine learning techniques, i.e., decision trees, linear discriminant analysis, naive Bayes, and k-nearest neighbor (K-NN). The system was implemented, and the results show 62%, 61%, 73%, and 90% accuracy for decision tree, linear discriminant analysis, naive Bayes, and K-NN, respectively.

18 citations
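
The K-NN classification step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional feature vectors, the class labels, and the function name `knn_predict` are all hypothetical, and the paper's clustering stage is not modeled here.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs; query is a feature vector.
    """
    # Rank training samples by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical features (for example, aspect ratio and ink density) for two ligature classes.
train = [
    ((0.90, 0.20), "ligature_A"),
    ((1.00, 0.25), "ligature_A"),
    ((0.95, 0.30), "ligature_A"),
    ((0.30, 0.80), "ligature_B"),
    ((0.35, 0.75), "ligature_B"),
    ((0.25, 0.90), "ligature_B"),
]

print(knn_predict(train, (0.92, 0.22)))
```

With an odd `k` and two classes the vote cannot tie; with more classes a tie-breaking rule would need to be chosen.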

Journal Article
TL;DR: The proposed deep transfer-based learning has achieved phenomenal recognition rates for Pashto ligatures on the benchmark FAST-NU Pashto dataset.
Abstract: Over the past decades, text recognition technologies have focused immensely on noncursive isolated scripts. A text recognition system for the cursive Pashto script will serve as a great contribution, allowing the traditional, cultural, and educational Pashto literature to be converted into machine-readable form. We propose the use of deep learning architectures based on transfer learning for the recognition of Pashto ligatures. For recognition analysis and evaluation, the ligature images in the dataset are preprocessed by data augmentation techniques, i.e., negatives, contours, and rotations, to increase the variation of each sample and the size of the original dataset. Rich feature representations are automatically extracted from the Pashto ligature images using the deep convolution layers of convolutional neural network (CNN) architectures with a fine-tuning approach. Pretrained CNN architectures, namely AlexNet, GoogleNet, and VGG (VGG-16 and VGG-19), are used for classification by feeding the extracted features to a fully connected layer and a softmax layer. The proposed deep transfer-based learning has achieved phenomenal recognition rates for Pashto ligatures on the benchmark FAST-NU Pashto dataset. Accuracies of 97.24%, 97.46%, and 99.03% are achieved using the AlexNet, GoogleNet, and VGGNet architectures, respectively.

11 citations
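
Two of the augmentations named in the abstract above, negatives and rotation, can be sketched on a small grayscale image stored as nested lists. This is an illustration under assumptions rather than the authors' pipeline: the abstract does not state the rotation angles used, so a single 90-degree clockwise rotation stands in, and the function names are hypothetical. Contour extraction is omitted.

```python
def negative(img):
    """Invert an 8-bit grayscale image given as a list of rows of 0-255 values."""
    return [[255 - pixel for pixel in row] for row in img]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise quarter turn.
    return [list(column) for column in zip(*img[::-1])]


def augment(img):
    """Return the original image plus its negative and rotated variants."""
    return [img, negative(img), rotate90(img)]


# Hypothetical 2x2 "ligature" image.
sample = [
    [0, 255],
    [10, 20],
]

for variant in augment(sample):
    print(variant)
```

Each input image yields three samples here, so the dataset size triples; adding further angles or contour images would increase it accordingly.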

Journal Article
TL;DR: An overview of motion estimation in general is provided, with a special focus on ego-motion estimation and its basic concepts, and the vital algorithms used for ego-motion estimation are critically discussed.
Abstract: Ego-motion technology holds great significance for computer vision applications, robotics, augmented reality, and visual simultaneous localization and mapping. This paper is a study of the basic concepts, equipment, algorithms, challenges, and real-world applications of ego-motion estimation. First, we provide an overview of motion estimation in general, with a special focus on ego-motion estimation and its basic concepts. For ego-motion estimation, it is necessary to understand the notions of independently moving objects, focus of expansion, motion field, and optical flow. Vital algorithms that are used for ego-motion estimation are critically discussed in the following section of the paper. Various camera setups and their potential weaknesses and strengths are also studied in the context of ego-motion estimation. We also briefly describe some ego-motion applications used in the real world. We conclude the paper by discussing some open problems, providing some future directions, and summarizing the entire paper.

8 citations
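
The relationship between optical flow and the focus of expansion mentioned in the abstract above can be illustrated with a small least-squares sketch. It assumes a purely translating camera and a static scene, so that every flow vector points directly away from the focus of expansion; the abstract does not name a specific algorithm, and the function name `estimate_foe` and the sample data are hypothetical.

```python
def estimate_foe(points, flows):
    """Least-squares estimate of the focus of expansion (fx, fy).

    Under pure camera translation, the flow (u, v) at image point (x, y) is
    parallel to (x - fx, y - fy), which gives one linear constraint per point:
        v * fx - u * fy = v * x - u * y
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for (x, y), (u, v) in zip(points, flows):
        a1, a2 = v, -u          # coefficients of fx and fy
        rhs = v * x - u * y     # right-hand side of the constraint
        # Accumulate the 2x2 normal equations.
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        r1 += a1 * rhs
        r2 += a2 * rhs
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("flow vectors are degenerate (all parallel or zero)")
    # Solve the 2x2 system with Cramer's rule.
    fx = (r1 * s22 - s12 * r2) / det
    fy = (s11 * r2 - s12 * r1) / det
    return fx, fy


# Synthetic flow field radiating from a hypothetical focus of expansion at (3, 2).
points = [(0.0, 0.0), (5.0, 1.0), (4.0, 6.0), (1.0, 4.0)]
flows = [(0.1 * (x - 3.0), 0.1 * (y - 2.0)) for x, y in points]

print(estimate_foe(points, flows))
```

Flow vectors from independently moving objects violate the parallelism constraint, so in practice such points would need to be excluded or down-weighted before solving.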


Cited by
Journal Article
TL;DR: A large multi-purpose and multi-format dataset containing more than ten thousand documents organized into six classes is designed, and a Single-layer Multisize Filters Convolutional Neural Network (SMFCNN) is used for classification; to the authors' knowledge, it is the first study of Urdu text document classification (TDC) using a DL model.
Abstract: The rapid growth of electronic documents is causing problems such as unstructured data, which requires more time and effort to search for a relevant document. Text Document Classification (TDC) has great significance in information processing and retrieval, where unstructured documents are organized into pre-defined classes. Urdu is the most favored research language among South Asian languages because of its complex morphology, unique features, and lack of linguistic resources such as standard datasets. Compared to short text tasks such as sentiment analysis, long text classification needs more time and effort because of its large vocabulary, greater noise, and redundant information. Machine Learning (ML) and Deep Learning (DL) models have been widely used in text processing. Despite the major limitations of ML models, such as learning directed features, they are the favored methods for Urdu TDC. To the best of our knowledge, this is the first study of Urdu TDC using a DL model. In this paper, we design a large multi-purpose and multi-format dataset that contains more than ten thousand documents organized into six classes. We use a Single-layer Multisize Filters Convolutional Neural Network (SMFCNN) for classification and compare its performance with sixteen ML baseline models on three imbalanced datasets of various sizes. Further, we analyze the effects of preprocessing methods on SMFCNN performance. SMFCNN outperformed the baseline classifiers and achieved accuracy scores of 95.4%, 91.8%, and 93.3% on the medium, large, and small datasets, respectively. The designed dataset will be publicly and freely available in different formats for future research in Urdu text processing.

56 citations

Journal Article
TL;DR: The VGG architecture outperforms the state-of-the-art techniques and a number of ConvNet architectures in Alzheimer’s disease detection, achieving test set accuracies of 99.27% (MCI/AD), 98.89% (AD/CN), and 97.06% (MCI/CN).
Abstract: Machine learning and deep learning play a crucial role in the identification of various diseases, such as neurological, skin, eye, and blood diseases and cancers. Deep learning algorithms show promise for predicting Alzheimer’s disease from MRI scans. Alzheimer’s disease is becoming more common in people aged 65 years or above. The disease becomes severe before the symptoms appear and causes a brain disorder that cannot be cured by medicines or other therapies and treatments, so early diagnosis is necessary to slow its progression. Detection and prevention of Alzheimer’s disease is currently an active research area. In this paper, we employ convolutional network architectures using frozen features extracted from the source data set ImageNet for binary and ternary classification. All experiments were carried out using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data set, consisting of MRI scans. The performance of the proposed system is demonstrated for the classification of Alzheimer’s disease versus mild cognitive impairment, normal controls versus mild cognitive impairment, and cognitively normal versus Alzheimer’s disease. The results of the proposed study show that the VGG architecture outperforms the state-of-the-art techniques and a number of ConvNet architectures (AlexNet, GoogLeNet, ResNet, DenseNet, Inceptionv3, InceptionResNet) in Alzheimer’s disease detection, achieving test set accuracies of 99.27% (MCI/AD), 98.89% (AD/CN), and 97.06% (MCI/CN).

41 citations

Journal Article
TL;DR: The use of the convolutional neural network is proposed to recognize the multifont offline Urdu handwritten characters in an unconstrained environment and a novel dataset of Urdu handwriting characters is proposed since there is no publicly-available dataset of this kind.
Abstract: In the area of pattern recognition and pattern matching, methods based on deep learning models have recently attracted many researchers by achieving magnificent performance. In this paper, we propose the use of the convolutional neural network to recognize multifont offline Urdu handwritten characters in an unconstrained environment. We also propose a novel dataset of Urdu handwritten characters, since there is no publicly available dataset of this kind. A series of experiments is performed on our proposed dataset. The accuracy achieved for character recognition is among the best when compared with those reported in the literature for the same task.

39 citations

Journal Article
TL;DR: A methodology is proposed that covers detection, orientation prediction, and recognition of Urdu ligatures in outdoor images, and the Resnet50-feature-based FasterRCNN was found to be the best detector, with an AP of 0.98.
Abstract: Urdu text is a cursive script and belongs to a non-Latin family of other cursive scripts like Arabic, Chinese, and Hindi. Urdu text poses a challenge for detection/localization in natural scene images, and consequently for recognition of individual ligatures in scene images. In this paper, a methodology is proposed that covers detection, orientation prediction, and recognition of Urdu ligatures in outdoor images. As a first step, a custom FasterRCNN algorithm is used in conjunction with well-known CNNs like Squeezenet, Googlenet, Resnet18, and Resnet50 for detection and localization in images of size $320\times 240$ pixels. For ligature orientation prediction, a custom Regression Residual Neural Network (RRNN) is trained and tested on datasets containing randomly oriented ligatures. Recognition of ligatures is done using a Two Stream Deep Neural Network (TSDNN). In our experiments, five sets of datasets containing 4.2K and 51K Urdu-text-embedded synthetic images were generated using the CLE annotation text to evaluate the different tasks of detection, orientation prediction, and recognition of ligatures. These synthetic images contain 132 and 1600 unique ligatures, corresponding to the 4.2K and 51K images respectively, with 32 variations of each ligature (4 backgrounds and 8 font-color variations). Also, 1094 real-world images containing more than 12K Urdu characters were used for TSDNN’s evaluation. Finally, all four detectors were evaluated and compared for their ability to detect/localize Urdu text using average precision (AP). The Resnet50-feature-based FasterRCNN was found to be the best detector, with an AP of 0.98, while the Squeezenet-, Googlenet-, and Resnet18-based detectors had testing APs of 0.65, 0.88, and 0.87, respectively. RRNN achieved an accuracy of 79% and 99% for the 4K and 51K images, respectively. Similarly, for classification of characters in ligatures, TSDNN attained a partial sequence recognition rate of 94.90% and 95.20% for the 4K and 51K images, respectively. A partial sequence recognition rate of 76.60% was attained for real-world images.

38 citations
