Author

Ramzi Salah

Bio: Ramzi Salah is an academic researcher. The author has contributed to research in topics: Deep learning & Recurrent neural network. The author has an h-index of 1 and has co-authored 1 publication, which has received 1 citation.

Papers
Journal ArticleDOI
TL;DR: In this paper, a Quranic optical character recognition (OCR) system based on a convolutional neural network (CNN) followed by a recurrent neural network (RNN) is introduced, and six deep learning models are built to study the effect of different representations of the input and output on the accuracy and performance of the models.
Abstract: A Quranic optical character recognition (OCR) system based on a convolutional neural network (CNN) followed by a recurrent neural network (RNN) is introduced in this work. Six deep learning models are built to study the effect of different representations of the input and output on the accuracy and performance of the models, and to compare long short-term memory (LSTM) and gated recurrent unit (GRU) layers. A new Quranic OCR dataset is developed based on the most famous printed version of the Holy Quran (Mushaf Al-Madinah), and page- and line-level text images with the corresponding labels are prepared. This work's contribution is a Quranic OCR model capable of recognizing the diacritized text of Quranic images. A high word recognition rate (WRR) and character recognition rate (CRR) are achieved in the experiments. The LSTM and GRU are compared in the Arabic text recognition domain. In addition, a public database is built for research purposes in Arabic text recognition; it contains the diacritics and the Uthmanic script and is large enough to be used with deep learning models. The outcome of this work shows that the proposed system obtains an accuracy of 98% on the validation data, a WRR of 95%, and a CRR of 99% on the test dataset.
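The abstract describes the architecture only at a high level; the following PyTorch sketch illustrates the general CNN-followed-by-RNN line-recognition pipeline it refers to. It is not the authors' code: the layer sizes, the 32-pixel line height, the LSTM/GRU switch, and the per-time-step outputs intended for CTC-style training are illustrative assumptions.

import torch
import torch.nn as nn

class LineRecognizer(nn.Module):
    """CNN feature extractor followed by a recurrent layer (LSTM or GRU)."""
    def __init__(self, num_classes, rnn_type="lstm", hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
        )
        rnn_cls = nn.LSTM if rnn_type == "lstm" else nn.GRU  # the paper compares both
        # feature size per time step = channels * reduced height (assumes 32-px-high line images)
        self.rnn = rnn_cls(128 * 8, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_classes)  # num_classes would include a CTC blank

    def forward(self, x):                      # x: (B, 1, 32, W) grayscale line image
        f = self.cnn(x)                        # (B, 128, 8, W/4)
        f = f.permute(0, 3, 1, 2).flatten(2)   # (B, W/4, 1024): one time step per width slice
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(-1)    # per-time-step character log-probabilities

Training such a model on line images with unsegmented diacritized labels would typically apply a CTC loss over these per-time-step log-probabilities.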

15 citations


Cited by
Journal ArticleDOI
TL;DR: Li et al. proposed a real-time lane detection system based on CNN Encoder-Decoder and Long Short-Term Memory (LSTM) networks for dynamic environments and complex road conditions.
Abstract: In recent years, lane detection has become one of the most important factors in the progress of intelligent vehicles. To deal with the challenging problems of low detection precision and poor real-time performance in most traditional systems, we propose a real-time deep lane detection system based on CNN Encoder–Decoder and Long Short-Term Memory (LSTM) networks for dynamic environments and complex road conditions. The CNN Encoder network is used to extract deep features from a dataset and to reduce their dimensionality. A corresponding decoder network maps the low-resolution encoder feature maps to dense feature maps that correspond to the road lane. The LSTM network processes historical data to improve the detection rate by removing the influence of false-alarm patches on the detection results. We propose three network architectures to predict the road lane: a CNN Encoder–Decoder network, a CNN Encoder–Decoder network with Dropout layers, and a CNN Encoder–LSTM–Decoder network, which are trained and tested on a public dataset comprising 12764 road images under different conditions. Experimental results show that the proposed hybrid CNN Encoder–LSTM–Decoder network, which we have integrated into a Lane-Departure-Warning-System (LDWS), achieves high prediction performance, namely an average accuracy of 96.36%, a Recall of 97.54%, and an F1-score of 97.42%. An NVIDIA Jetson Xavier NX has been used, for its performance and efficiency, to realize an embedded deep LDWS.
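As a rough illustration of the Encoder–LSTM–Decoder idea (temporal fusion of a short frame history at the encoder bottleneck before decoding a lane map), here is a minimal PyTorch sketch. It is not the published architecture: the layer counts, hidden sizes, and per-pixel LSTM fusion at the bottleneck are assumptions made for brevity.

import torch
import torch.nn as nn

class LaneEncoderLSTMDecoder(nn.Module):
    """Encoder CNN per frame, LSTM over the frame history at each bottleneck
    location, transposed-conv decoder producing a lane probability map."""
    def __init__(self, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(hidden, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, frames):                              # frames: (B, T, 3, H, W), H and W divisible by 4
        b, t, c, h, w = frames.shape
        feat = self.encoder(frames.reshape(b * t, c, h, w)) # (B*T, 64, H/4, W/4)
        _, cf, hf, wf = feat.shape
        seq = feat.reshape(b, t, cf, hf, wf).permute(0, 3, 4, 1, 2).reshape(b * hf * wf, t, cf)
        out, _ = self.lstm(seq)                             # fuse the T frames at each spatial cell
        last = out[:, -1].reshape(b, hf, wf, -1).permute(0, 3, 1, 2)
        return torch.sigmoid(self.decoder(last))            # lane probability map, (B, 1, H, W)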

31 citations

Journal ArticleDOI
27 Sep 2021-Energies
TL;DR: This work proposes an encoder–decoder model based on convolutional long short-term memory networks (ConvLSTM) for energy load forecasting, whose forecasts compare favorably with the existing state of the art and outperform conventional methods with the lowest error rate.
Abstract: An efficient energy management system is integrated with the power grid to collect information about energy consumption and provide the appropriate control to optimize the supply–demand pattern. Therefore, there is a need for intelligent decisions on the generation and distribution of energy, which is only possible by making correct future predictions. In the energy market, future knowledge of the energy consumption pattern helps the end-user decide when to buy or sell energy to reduce the energy cost and decrease peak consumption. The Internet of Things (IoT) and energy data analytic techniques have made it convenient to collect data from end devices on a large scale and to manipulate all the recorded data. Forecasting an electric load is fairly challenging due to the high uncertainty and dynamic nature of spatiotemporal consumption patterns. Existing conventional forecasting models lack the ability to deal with spatio-temporally varying data. To overcome the above-mentioned challenges, this work proposes an encoder–decoder model based on convolutional long short-term memory networks (ConvLSTM) for energy load forecasting. The proposed architecture uses an encoder consisting of multiple ConvLSTM layers to extract the salient features in the data and to learn the sequential dependency, and then passes the output to the decoder, which has LSTM layers, to produce the forecast. The forecasting results produced by the proposed approach compare favorably with the existing state of the art and are better than conventional methods, with the lowest error rate. Quantitative analyses show that the proposed forecasting model obtains a mean absolute percentage error (MAPE) of 6.966% for household energy consumption and 16.81% for city-wide energy consumption, in comparison with existing encoder–decoder-based deep learning models on two real-world datasets.
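PyTorch has no built-in ConvLSTM layer, so the sketch below defines a minimal 1D ConvLSTM cell (convolutional input-to-state and state-to-state transitions) and wires it into an encoder-decoder in the spirit of the paper: a ConvLSTM encoder over daily sub-windows of the load series, followed by an LSTM decoder and a linear forecast head. The windowing (hourly values grouped by day), channel sizes, and forecast horizon are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

class ConvLSTM1dCell(nn.Module):
    """Minimal 1D ConvLSTM cell: gates are computed by a convolution over
    the concatenated input and hidden state instead of a dense layer."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv1d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):                             # x: (B, in_ch, L)
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class LoadForecaster(nn.Module):
    """ConvLSTM encoder over daily sub-windows, LSTM decoder, linear forecast head."""
    def __init__(self, hid=32, horizon=24):
        super().__init__()
        self.enc = ConvLSTM1dCell(1, hid)
        self.dec = nn.LSTM(hid, 64, batch_first=True)
        self.out = nn.Linear(64, horizon)

    def forward(self, x):                                   # x: (B, days, 24) hourly consumption
        b, d, l = x.shape
        h = x.new_zeros(b, self.enc.hid_ch, l)
        c = x.new_zeros(b, self.enc.hid_ch, l)
        for t in range(d):                                  # recur over days, convolve within each day
            h, c = self.enc(x[:, t].unsqueeze(1), h, c)
        dec_out, _ = self.dec(h.transpose(1, 2))            # (B, 24, 64)
        return self.out(dec_out[:, -1])                     # next-horizon load forecast, (B, horizon)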

4 citations

Journal ArticleDOI
TL;DR: An end‐to‐end trainable CRNN architecture consisting of CNN, RNN (LSTM), and CTC layers for the Ottoman OCR problem is proposed and the Osmanlica.com Hybrid model outperforms the next best model by a clear margin.
Abstract: The Ottoman OCR is an open problem because OCR models for Arabic do not perform well on Ottoman. Models specifically trained on Ottoman documents have not produced satisfactory results either. We present a deep learning model and an OCR tool using that model for the OCR of printed Ottoman documents in the naksh font. We propose an end-to-end trainable CRNN architecture consisting of CNN, RNN (LSTM), and CTC layers for the Ottoman OCR problem. An experimental comparison of this model, called Osmanlica.com, with the Tesseract Arabic, the Tesseract Persian, Abby Finereader, Miletos, and Google Docs OCR tools or models was performed using a test dataset of 21 pages of original documents. With 88.86% raw-text, 96.12% normalized-text, and 97.37% joined-text character recognition accuracy, the Osmanlica.com Hybrid model outperforms the others by a marked difference. Our model outperforms the next best model by a clear margin of 4%, which is a significant improvement considering the difficulty of the Ottoman OCR problem and the huge size of the Ottoman archives to be processed. The hybrid model also achieves 58% word recognition accuracy on normalized text, the only rate above 50%.
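The CRNN part of this model is the same CNN+RNN family sketched for the Quranic system above, so the sketch below shows only the CTC piece: how per-time-step log-probabilities would be trained with PyTorch's nn.CTCLoss and decoded greedily. The blank index of 0 and the tensor shapes are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn as nn

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

def ctc_step(log_probs, targets, input_lengths, target_lengths):
    """log_probs: (B, T, C) per-time-step character log-probabilities from the CRNN;
    targets: concatenated label indices for the whole batch."""
    return ctc(log_probs.permute(1, 0, 2), targets, input_lengths, target_lengths)

def greedy_decode(log_probs, blank=0):
    """Best-path CTC decoding: collapse repeated symbols, then drop blanks."""
    best = log_probs.argmax(-1)               # (B, T)
    decoded = []
    for seq in best.tolist():
        out, prev = [], blank
        for s in seq:
            if s != prev and s != blank:
                out.append(s)
            prev = s
        decoded.append(out)
    return decoded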

3 citations

Journal ArticleDOI
TL;DR: In this article, the authors propose to improve the capacity of the bidirectional gated recurrent unit (BGRU) to recognize handwritten Arabic text, in contrast to other methods that can achieve a high success rate but are expensive in terms of time and memory.
Abstract: Handwriting recognition is a challenge that interests many researchers around the world. Handwritten Arabic script in particular poses many obstacles that remain to be overcome, given its complex shapes, its number of letter forms, which exceeds 100, and its cursive nature. Over the past few years, good results have been obtained, but at a high cost in memory and execution time. In this paper we propose to improve the capacity of the bidirectional gated recurrent unit (BGRU) to recognize Arabic text. The advantage of using BGRUs is the execution time compared with other methods that can achieve a high success rate but are expensive in terms of time and memory. To test the recognition capacity of the BGRU, the proposed architecture is composed of 6 convolutional neural network (CNN) blocks for feature extraction and 1 BGRU followed by 2 dense layers for learning and testing. The experiment is carried out on the entire Institut für Nachrichtentechnik / École Nationale d'Ingénieurs de Tunis (IFN/ENIT) database without any preprocessing or data selection. The obtained results show the ability of BGRUs to recognize handwritten Arabic script.
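As a concrete reading of the stated layer counts (six CNN blocks, one BGRU, two dense layers), here is a minimal PyTorch sketch that classifies a fixed-height word image into word classes. The channel sizes, the 64-pixel input height, and the use of the final BGRU step for word-level classification are assumptions; the authors' exact configuration is not given in the abstract.

import torch
import torch.nn as nn

def cnn_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2, 2),
    )

class BGRURecognizer(nn.Module):
    """Six CNN blocks for feature extraction, one bidirectional GRU, two dense layers."""
    def __init__(self, num_classes, hidden=128):
        super().__init__()
        chans = [1, 16, 32, 64, 64, 128, 128]
        self.cnn = nn.Sequential(*[cnn_block(chans[i], chans[i + 1]) for i in range(6)])
        self.bgru = nn.GRU(128, hidden, bidirectional=True, batch_first=True)
        self.dense = nn.Sequential(nn.Linear(2 * hidden, 256), nn.ReLU(),
                                   nn.Linear(256, num_classes))

    def forward(self, x):                    # x: (B, 1, 64, W), W a multiple of 64
        f = self.cnn(x)                      # (B, 128, 1, W/64): six poolings collapse the height
        seq = f.squeeze(2).transpose(1, 2)   # (B, W/64, 128)
        out, _ = self.bgru(seq)
        return self.dense(out[:, -1])        # word-class scores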

2 citations

Journal ArticleDOI
TL;DR: In this paper, the authors propose a hybrid model for Arabic word recognition that combines a deep convolutional neural network (DCNN) working as a classifier with a generative adversarial network (GAN) used as a data augmentation technique, yielding a robust hybrid model.
Abstract: The recognition of printed Arabic words remains an open area for research, since Arabic is among the most complex languages. Prior research has shown that few efforts have been made to develop accurate Arabic recognition models, as most of these models have faced increasing performance complexity and a lack of benchmark Arabic datasets. Meanwhile, deep learning models, such as Convolutional Neural Networks (CNNs), have been shown to be beneficial in reducing the error rate and enhancing accuracy in Arabic character recognition systems. The reliability of these models increases with the depth of the layers, but the essential condition for more layers is an extensive amount of data. Since a CNN generates features by analysing large amounts of data, its performance is directly proportional to the volume of data, as DL models are considered data-hungry algorithms. Nevertheless, this technique suffers from poor generalisation ability and overfitting issues, which affect the accuracy of Arabic recognition models. These issues are due to the limited availability of Arabic databases in terms of accessibility and size, which is a central problem facing the Arabic language nowadays. Therefore, Arabic character recognition models still have gaps that need to be bridged. Deep learning techniques also need to be improved to increase accuracy, by exploiting the strengths of the neural network to handle the lack of datasets and to improve the generalisation ability of the network during model building. To solve these problems, this study proposes a hybrid model for Arabic word recognition that adapts a deep convolutional neural network (DCNN) to work as a classifier and a generative adversarial network (GAN) to work as a data augmentation technique, in order to develop a robust hybrid model with improved accuracy and generalisation ability. Each proposed model is separately evaluated and compared with other state-of-the-art models. These models are tested on the Arabic printed text image dataset (APTI). The proposed hybrid deep learning model shows excellent performance regarding accuracy, with a score of 99.76% compared to 94.81% for the proposed DCNN model on the APTI dataset. The proposed model indicates highly competitive performance and enhanced accuracy compared to the existing state-of-the-art Arabic printed word recognition models. The results demonstrate that the generalisation of the networks and the handling of overfitting have also improved. This study's output is comparable to other competitive models and contributes an enhanced Arabic recognition model to the body of knowledge.
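The hybrid idea pairs a DCNN classifier with GAN-generated samples used as extra training data. The sketch below illustrates only the augmentation side: a small DCGAN-style generator and a helper that mixes its synthetic images into a training batch. It is a sketch under assumptions (64x64 grayscale word images, latent size 100, an unconditional generator); in practice a class-conditional GAN would be needed so that generated images carry reliable class labels.

import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator mapping a latent vector to a 64x64 grayscale word image."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),    # 8x8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),      # 16x16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),       # 32x32
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),                            # 64x64
        )

    def forward(self, z):                    # z: (B, z_dim, 1, 1)
        return self.net(z)

def augment_batch(images, labels, generator, fake_label, n_fake=16, z_dim=100):
    """Append GAN-generated samples to a real batch before a DCNN training step.
    With an unconditional generator all synthetic images share one placeholder label;
    a conditional GAN would generate per-class samples instead."""
    z = torch.randn(n_fake, z_dim, 1, 1)
    fake = generator(z).detach()
    images = torch.cat([images, fake])
    labels = torch.cat([labels, torch.full((n_fake,), fake_label, dtype=labels.dtype)])
    return images, labels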

1 citation