Proceedings Article

Parallelization of digit recognition system using Deep Convolutional Neural Network on CUDA

04 May 2017, pp. 379-383
TL;DR: A Compute Unified Device Architecture (CUDA) implementation of Deep Convolutional Neural Network (DCNN) for a digit recognition system is proposed to reduce the computation time of ANN and achieve high accuracy.
Abstract: A Compute Unified Device Architecture (CUDA) implementation of a Deep Convolutional Neural Network (DCNN) for a digit recognition system is proposed to reduce the computation time of the ANN while achieving high accuracy. A neural network with three convolutional layers and two fully connected layers is developed by building input, hidden and output neurons to achieve improved accuracy. The network is parallelized on a dedicated GPU on the CUDA platform using the TensorFlow library. A comparative analysis of accuracy and computation time is performed for sequential and parallel execution of the network on three systems: a dual-core CPU (4 logical processors), an octa-core CPU (16 logical processors) alone, and an octa-core CPU (16 logical processors) with a GPU. The MNIST (Modified National Institute of Standards and Technology) and EMNIST (Extended MNIST) databases are used for both training and testing. MNIST has 55000 training samples, 10000 testing samples and 5000 validation samples. EMNIST consists of 235000 training, 40000 testing and 5000 validation samples. The designed network is computationally intensive, and hence parallelizing it yields a significant improvement in execution time.
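
The abstract attributes the speed-up to parallelizing a computation-heavy convolutional network. As a minimal illustration of why convolution parallelizes well (not the authors' TensorFlow/CUDA implementation), the Python sketch below computes a "valid" 2D cross-correlation, the operation CNN libraries usually call convolution, once sequentially and once with the output rows distributed over worker threads. Each output element depends only on the input and the kernel, so rows can be computed in any order. The function names and the thread-pool approach are assumptions made for illustration; in pure Python the threads demonstrate independence of the work items rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def conv_row(image, kernel, r):
    """Compute output row r of a 'valid' 2D cross-correlation.

    image and kernel are lists of lists of numbers (hypothetical
    helper, not taken from the paper).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    return [
        sum(image[r + i][c + j] * kernel[i][j]
            for i in range(kh) for j in range(kw))
        for c in range(out_w)
    ]


def conv2d_sequential(image, kernel):
    """Compute every output row one after another."""
    out_h = len(image) - len(kernel) + 1
    return [conv_row(image, kernel, r) for r in range(out_h)]


def conv2d_parallel(image, kernel, workers=4):
    """Compute output rows as independent tasks on a thread pool.

    No row reads another row's result, which is the property a GPU
    exploits by assigning output elements to separate threads.
    """
    out_h = len(image) - len(kernel) + 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of the row indices.
        return list(pool.map(lambda r: conv_row(image, kernel, r),
                             range(out_h)))
```

Calling both functions on the same image and kernel should return identical results, which is a quick way to check that splitting the work did not change the computation.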
Citations
Journal Article
TL;DR: A deep, fast convolutional neural network based on an extreme learning machine and a fixed bank of filters is proposed, achieving superior accuracy and competitive training time even relative to approaches that use GPU processing.

29 citations

Journal Article
TL;DR: Results show that both the use of neuroevolved committees and the application of topology transfer learning are successful: committees of convolutional neural networks are able to improve classification results when compared to single models, and topologies learned for one problem can be reused for a different problem and data with a good performance.
Abstract: Neuroevolution is the field of study that uses evolutionary computation to optimize certain aspects of the design of neural networks, most often their topology and hyperparameters. The field was introduced in the late 1980s, but only in recent years has it become mature enough to enable the optimization of deep learning models, such as convolutional neural networks. In this paper, we rely on previous work to apply neuroevolution in order to optimize the topology of deep neural networks that can be used to solve the problem of handwritten character recognition. Moreover, we take advantage of the fact that evolutionary algorithms optimize a population of candidate solutions by combining a set of the best evolved models, resulting in a committee of convolutional neural networks. This process is enhanced by using specific mechanisms to preserve the diversity of the population. Additionally, in this paper, we address one of the disadvantages of neuroevolution: the process is very expensive in terms of computational time. To lessen this issue, we explore the performance of topology transfer learning: whether the best topology obtained using neuroevolution for a certain domain can be successfully applied to a different domain. By doing so, the expensive process of neuroevolution can be reused to tackle different problems, turning it into a more appealing approach for optimizing the design of neural network topologies. After evaluating our proposal, results show that both the use of neuroevolved committees and the application of topology transfer learning are successful: committees of convolutional neural networks are able to improve classification results when compared to single models, and topologies learned for one problem can be reused for a different problem and data with good performance. Additionally, both approaches can be combined by building committees of transferred topologies, and this combination attains results that combine the best of both approaches.

28 citations

Proceedings Article
14 Jun 2020
TL;DR: Experimental results demonstrate that the introduced hybrid optical-digital implementation of a convolutional neural network based on engineering of the point spread function of an optical imaging system yields more than two orders of magnitude reduction in the computational cost while achieving near-state-of-the-art accuracy.
Abstract: Despite the substantial progress made in deep learning in recent years, advanced approaches remain computationally intensive. The trade-off between accuracy and computation time and energy limits their use in real-time applications on low power and other resource-constrained systems. In this paper, we tackle this fundamental challenge by introducing a hybrid optical-digital implementation of a convolutional neural network (CNN) based on engineering of the point spread function (PSF) of an optical imaging system. This is done by coding an imaging aperture such that its PSF replicates a large convolution kernel of the first layer of a pre-trained CNN. As the convolution takes place in the optical domain, it has zero cost in terms of energy consumption and has zero latency independent of the kernel size. Experimental results on two datasets demonstrate that our approach yields more than two orders of magnitude reduction in the computational cost while achieving near-state-of-the-art accuracy, or equivalently, better accuracy at the same computational cost.

19 citations


Additional excerpts

  • ...95M E Parallelized CNN[34] 99....


Journal Article
02 Dec 2019, PeerJ
TL;DR: An end-to-end face recognition system based on 3D face texture is proposed, combining geometric invariants, histograms of oriented gradients and fine-tuned residual neural networks; it requires fewer iterations than traditional methods.
Abstract: As the technology for 3D photography has developed rapidly in recent years, an enormous number of 3D images have been produced, and face recognition is one of the research directions that draws on them. Improving the accuracy of a number of data is crucial in 3D face recognition problems. Traditional machine learning methods can be used to recognize 3D faces, but the face recognition rate has declined rapidly with the increasing number of 3D images. As a result, classifying large amounts of 3D image data is time-consuming, expensive, and inefficient. Deep learning methods have become the focus of attention in 3D face recognition research. In our experiment, an end-to-end face recognition system based on 3D face texture is proposed, combining geometric invariants, histograms of oriented gradients and fine-tuned residual neural networks. The research shows that, when performance is evaluated on the FRGC-v2 dataset, as the number of fine-tuned ResNet layers is increased, the best Top-1 accuracy reaches 98.26% and the Top-2 accuracy 99.40%. The proposed framework requires fewer iterations than traditional methods. The analysis suggests that processing a large amount of 3D face data with the proposed recognition framework could significantly improve recognition decisions in realistic 3D face scenarios.

6 citations

Proceedings Article
01 Jan 2020
TL;DR: Presents a deep analysis of recent developments in scene text detection and recognition, compares their performance, and highlights real modern applications.
Abstract: Text has been vitally important in human life from the very beginning. Compared with vision-based applications, preference is always given to the precise and productive information embodied in text. Given this importance, the detection and recognition of text are equally important in human life. This paper presents a deep analysis of recent developments in scene text, compares their performance, and highlights real modern applications. Potential future directions of scene text detection and recognition are also discussed.

5 citations

References
Proceedings Article
01 Jan 1989
TL;DR: Minimal preprocessing of the data was required, but the architecture of the network was highly constrained and specifically designed for the task; the method has a 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.
Abstract: We present an application of back-propagation networks to handwritten digit recognition. Minimal preprocessing of the data was required, but the architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has a 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.

3,324 citations

Proceedings Article
17 Feb 2017
TL;DR: A variant of the full NIST dataset is introduced, which is called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset, and shares the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems.
Abstract: The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements, and the accessibility and ease of use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19, which contains digits and uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.

550 citations


"Parallelization of digit recognitio..." refers methods in this paper

  • ...[13] provided the EMNIST database deriving it from NIST library and then processing it....


Journal Article
TL;DR: A novel neural classifier, LImited Receptive Area (LIRA), is proposed for image recognition; it contains three neuron layers (sensor, associative and output) and shows sufficiently good results in the task of pin-hole position estimation.

175 citations


"Parallelization of digit recognitio..." refers background in this paper

  • ...[10] developed a neural classifier Limited Receptive Area for the image recognition achieving a recognition rate of 99....


Proceedings Article
01 Dec 2008
TL;DR: This paper proposes a quicker and more efficient implementation of neural networks on both GPU and multi-core CPU, using CUDA (Compute Unified Device Architecture), which can be easily programmed due to its simple C language-like style, instead of GPU to solve the first problem.
Abstract: Many algorithms for image processing and pattern recognition have recently been implemented on GPU (graphics processing unit) for faster computational times. However, implementation on a GPU encounters two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between CPU and GPU, which is usual in image processing and pattern recognition, contrary to the graphics area, the CPU should generate as much raw feature data for GPU processing as possible to utilize GPU performance effectively. This paper proposes a quicker and more efficient implementation of neural networks on both GPU and multi-core CPU. We use CUDA (Compute Unified Device Architecture), which can be easily programmed due to its simple C language-like style, instead of GPU to solve the first problem. Moreover, OpenMP (Open Multi-Processing) is used to concurrently process multiple data with a single instruction on a multi-core CPU, which results in effectively utilizing the memories of the GPU. In the experiments, we implemented a neural network-based text detection system using the proposed architecture, and it ran about 15 times faster than an implementation using the CPU and about 4 times faster than an implementation on the GPU only, without OpenMP.

142 citations

Proceedings Article
06 Aug 2002
TL;DR: The latest results of handwritten digit recognition on well-known image databases using the state-of-the-art feature extraction and classification techniques are presented and they provide a baseline for evaluation of future works.
Abstract: This paper presents the latest results of handwritten digit recognition on well-known image databases using the state-of-the-art feature extraction and classification techniques. The tested databases are CENPARMI, CEDAR, and MNIST. On the test dataset of each database, 56 recognition accuracies are given by combining 7 classifiers with 8 feature vectors. All the classifiers and feature vectors give high accuracies. Among the features, the chain-code feature and gradient feature show advantages, and the profile structure feature shows efficiency as a complementary feature. In comparison of classifiers, the support vector classifier with RBF kernel gives the highest accuracy but is extremely expensive in storage and computation. Among the non-SV classifiers, the polynomial classifier performs best, followed by a learning quadratic discriminant function classifier. The results are competitive compared to previous ones and they provide a baseline for evaluation of future works.

49 citations


"Parallelization of digit recognitio..." refers methods in this paper

  • ...[11], implemented the state of the art feature extraction and classification techniques for the digit recognition on three different database....
