Author

Riaz Ahmad

Bio: Riaz Ahmad is an academic researcher from Shaheed Benazir Bhutto University. The author has contributed to research in topics: Cursive & Optical character recognition. The author has an h-index of 9 and has co-authored 19 publications receiving 335 citations. Previous affiliations of Riaz Ahmad include National University of Computer and Emerging Sciences & German Research Centre for Artificial Intelligence.

Papers
Journal ArticleDOI
TL;DR: This work presents a hybrid approach based on explicit feature extraction by combining convolutional and recursive neural networks for feature learning and classification of cursive Urdu Nastaliq script using the proposed hierarchical combination of CNN and MDLSTM.

95 citations

Journal ArticleDOI
TL;DR: An implicit segmentation based recognition system for Urdu text lines in Nastaliq script that relies on sliding overlapped windows on lines of text and extracting a set of statistical features is presented.

75 citations

Journal ArticleDOI
TL;DR: A robust feature extraction approach that extracts features based on a right-to-left sliding window, which significantly reduces the label error for Urdu Nasta'liq text lines and outperforms the state-of-the-art results.
Abstract: Character recognition for cursive scripts such as Arabic and handwritten English and French is a challenging task, and it becomes more complicated for Urdu Nasta'liq text because this script is more complex than Arabic. Recurrent neural networks (RNNs) have shown excellent performance for English, French, and cursive Arabic script owing to their sequence learning property. Most recent approaches perform segmentation-based character recognition; however, due to the complexity of the Nasta'liq script, the segmentation error is considerably higher than for the Arabic Naskh script. RNNs have provided promising results in such scenarios. In this paper, we achieve high accuracy for Urdu Nasta'liq using statistical features and multi-dimensional long short-term memory. We present a robust feature extraction approach that extracts features based on a right-to-left sliding window. Results show that the selected features significantly reduce the label error. For evaluation, we used the Urdu printed text images dataset and compared the proposed approach with recent work. The system achieved 94.97% recognition accuracy for unconstrained printed Nasta'liq text lines and outperforms the state-of-the-art results.

70 citations
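
The abstract above names a right-to-left sliding window and statistical features but does not list the features or the window geometry. The following is only a minimal sketch of the general idea, not the authors' method: the function name `sliding_window_features`, the window `width` and `step`, and the two features (ink density and normalised vertical centre of ink) are all assumptions chosen for illustration.

```python
def sliding_window_features(image, width=4, step=2):
    """Scan a binary text-line image from right to left.

    image: list of rows, each a list of 0/1 pixels (1 = ink).
    Returns one (ink_density, vertical_centre) tuple per window,
    ordered from the rightmost window to the leftmost.
    All parameters and features here are illustrative assumptions.
    """
    height = len(image)
    n_cols = len(image[0])
    features = []
    right = n_cols
    while right > 0:
        left = max(0, right - width)
        # Row index of every ink pixel inside the current window.
        ink_rows = [
            r
            for r in range(height)
            for c in range(left, right)
            if image[r][c]
        ]
        area = height * (right - left)
        density = len(ink_rows) / area
        # Mean ink row scaled by the image height; 0.0 for an empty window.
        centre = (sum(ink_rows) / len(ink_rows)) / height if ink_rows else 0.0
        features.append((density, centre))
        if left == 0:
            break  # the window has reached the left edge of the line
        right -= step  # overlap between windows is width - step columns
    return features


# Tiny 2 x 6 example line: ink at the top right and at the bottom left.
line = [
    [0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0],
]
feats = sliding_window_features(line, width=4, step=2)
```

Because Urdu is written from right to left, emitting the rightmost window first keeps the feature sequence in reading order, which is the order a sequence learner such as an LSTM would consume it in.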

Journal ArticleDOI
01 Oct 2016
TL;DR: This paper uses zoning features, an efficient and popular feature extraction approach, for the classification of Urdu Nasta'liq text lines, with 2-Dimensional Long Short Term Memory networks (2DLSTM) as the learning classifier.
Abstract: Recognition of Urdu cursive script is a challenging task due to the implicit complexities associated with it. The performance of a recognition system depends heavily on the extracted features. Various feature extraction approaches have been proposed in recent years. Among them, an approach based on zoning features has proved to be efficient and popular. Such zoning features represent significant information with low complexity and high speed. In this paper, we use zoning features for the classification of Urdu Nasta'liq text lines, with 2-Dimensional Long Short Term Memory networks (2DLSTM) as the learning classifier. The proposed model is evaluated on the publicly available UPTI dataset, and a character recognition rate of 93.39% is obtained.

34 citations
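
Zoning features, as mentioned in the abstract above, are commonly computed by dividing an image into a grid of zones and summarising each zone with a simple statistic. The sketch below assumes ink density per zone and a fixed grid; the function name `zoning_features` and the grid size are hypothetical, and the paper's actual zone layout is not given in the source.

```python
def zoning_features(image, zone_rows=2, zone_cols=2):
    """Split a binary image into a zone_rows x zone_cols grid and
    return the ink density of each zone in row-major order.

    image: list of rows, each a list of 0/1 pixels (1 = ink).
    The grid size and the density statistic are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    features = []
    for zr in range(zone_rows):
        top = zr * height // zone_rows
        bottom = (zr + 1) * height // zone_rows
        for zc in range(zone_cols):
            left = zc * width // zone_cols
            right = (zc + 1) * width // zone_cols
            ink = sum(
                image[r][c]
                for r in range(top, bottom)
                for c in range(left, right)
            )
            area = (bottom - top) * (right - left)
            # Guard against empty zones when the grid is finer than the image.
            features.append(ink / area if area else 0.0)
    return features


glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
zones = zoning_features(glyph, zone_rows=2, zone_cols=2)
```

Each zone contributes a single number, which is why such features are cheap to compute: the feature vector length is the number of zones, independent of the image resolution.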

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This paper presents the first Pashto text image database for scientific research and thereby the first dataset with complete handwritten and printed text line images, which ultimately covers all alphabets of the Arabic and Persian languages.
Abstract: This paper presents the first Pashto text image database for scientific research and thereby the first dataset with complete handwritten and printed text line images, which ultimately covers all alphabets of the Arabic and Persian languages. A language like Pashto, written in a complex way by calligraphers, still requires a mature Optical Character Recognition (OCR) system. Although 50 million people use this language for both oral and written communication, no significant effort has been devoted to the recognition of Pashto script. A real dataset of 17,015 images of Pashto text lines is introduced. The images were acquired by scanning hand-scribed Pashto books. Further, in this work, we evaluate the performance of deep learning based models such as Bidirectional and Multi-Dimensional Long Short Term Memory (BLSTM and MDLSTM) networks on Pashto text and provide a baseline character error rate of 9.22%.

28 citations
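
The abstract above reports a baseline character error rate but does not say how it was computed. A common definition, assumed here, is the Levenshtein edit distance between the recognised text and the ground truth divided by the ground-truth length. The function names below are hypothetical and the example strings are made up for illustration.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning one string
    into the other (dynamic programming, one row at a time)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,                           # delete ref_char
                curr[j - 1] + 1,                       # insert hyp_char
                prev[j - 1] + (ref_char != hyp_char),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of four.
cer = character_error_rate("abcd", "abxd")
```

Under this definition the rate can exceed 1.0 when the recogniser outputs far more characters than the ground truth contains, so it is an error rate rather than a bounded percentage of wrong characters.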


Cited by
Journal ArticleDOI
TL;DR: A comprehensive review of LSTM’s formulation and training, relevant applications reported in the literature and code resources implementing this model for a toy example are presented.
Abstract: Long short-term memory (LSTM) has transformed both the machine learning and neurocomputing fields. According to several online sources, this model has improved Google's speech recognition, greatly improved machine translations on Google Translate, and improved the answers of Amazon's Alexa. This neural system is also employed by Facebook, reaching over 4 billion LSTM-based translations per day as of 2017. Interestingly, recurrent neural networks had shown rather modest performance until LSTM appeared. One reason for the success of this recurrent network lies in its ability to handle the exploding/vanishing gradient problem, which is a difficult issue to circumvent when training recurrent or very deep neural networks. In this paper, we present a comprehensive review that covers LSTM's formulation and training, relevant applications reported in the literature, and code resources implementing this model for a toy example.

412 citations

Journal ArticleDOI
TL;DR: The proposed framework conducts three studies using three architectures of convolutional neural networks (AlexNet, GoogLeNet, and VGGNet) to classify brain tumors such as meningioma, glioma, and pituitary and achieves highest accuracy up to 98.69 in terms of classification and detection.
Abstract: Brain tumors are among the most destructive diseases, leading to a very short life expectancy in their highest grade. Misdiagnosis of brain tumors results in the wrong medical intervention and reduces the patients' chance of survival. Accurate diagnosis of a brain tumor is key to planning proper treatment and improving the lives of patients with brain tumors. Computer-aided tumor detection systems and convolutional neural networks have produced success stories and made important strides in the field of machine learning. Deep convolutional layers extract important and robust features automatically from the input space, in contrast to the layers of traditional predecessor neural networks. In the proposed framework, we conduct three studies using three architectures of convolutional neural networks (AlexNet, GoogLeNet, and VGGNet) to classify brain tumors such as meningioma, glioma, and pituitary. Each study then explores transfer learning techniques, i.e., fine-tuning and freezing, using MRI slices from the Figshare brain tumor dataset. Data augmentation techniques are applied to the MRI slices to generalize the results, increase the number of dataset samples, and reduce the chance of over-fitting. In the proposed studies, the fine-tuned VGG16 architecture attained the highest accuracy, up to 98.69, in terms of classification and detection.

277 citations

Journal ArticleDOI
TL;DR: The problem foundation of manipulator control and the theoretical ideas on using neural network to solve this problem are analyzed and then the latest progresses on this topic in recent years are described and reviewed in detail.

209 citations

Journal ArticleDOI
TL;DR: This review article presents state-of-the-art results and techniques on OCR and provides research directions by highlighting research gaps.
Abstract: Given the ubiquity of handwritten documents in human transactions, Optical Character Recognition (OCR) of documents has invaluable practical worth. Optical character recognition is a science that enables the translation of various types of documents or images into analyzable, editable and searchable data. During the last decade, researchers have used artificial intelligence/machine learning tools to automatically analyze handwritten and printed documents in order to convert them into electronic format. The objective of this review paper is to summarize research that has been conducted on character recognition of handwritten documents and to provide research directions. In this Systematic Literature Review (SLR) we collected, synthesized and analyzed research articles on the topic of handwritten OCR (and closely related topics) which were published between 2000 and 2019. We searched widely used electronic databases following a pre-defined review protocol. Articles were searched using keywords, forward reference searching and backward reference searching in order to find all the articles related to the topic. After carefully following the study selection process, 176 articles were selected for this SLR. This review article presents state-of-the-art results and techniques on OCR and provides research directions by highlighting research gaps.

139 citations

Journal ArticleDOI
01 Mar 2014
TL;DR: A survey of OCR for the Urdu, Pushto, and Sindhi languages, focusing on the Nasta'liq and Naskh scripts and on the preprocessing, segmentation, feature extraction, classification, and recognition stages of an OCR pipeline.
Abstract: We survey the optical character recognition (OCR) literature with reference to the Urdu-like cursive scripts. In particular, the Urdu, Pushto, and Sindhi languages are discussed, with the emphasis being on the Nasta'liq and Naskh scripts. Before detailing the OCR works, the peculiarities of the Urdu-like scripts are outlined, followed by a presentation of the available text image databases. For the sake of clarity, the various attempts are grouped into three parts, namely: (a) printed, (b) handwritten, and (c) online character recognition. Within each part, the works are analyzed with respect to a typical OCR pipeline, with an emphasis on preprocessing, segmentation, feature extraction, classification, and recognition.
Highlights: A literature review of the Nasta'liq and Naskh cursive script OCR. The peculiarities and challenges are described a priori. Printed, handwritten and online OCR efforts are being explored. Analyses based on the stages of a typical OCR pipeline.

121 citations