Parallelization of digit recognition system using Deep Convolutional Neural Network on CUDA

doi:10.1109/SSPS.2017.8071623

Proceedings Article

Srishti Singh¹, Amrit Paul¹, M. Arun¹

VIT University¹

04 May 2017, pp. 379-383

TL;DR: A Compute Unified Device Architecture (CUDA) implementation of Deep Convolutional Neural Network (DCNN) for a digit recognition system is proposed to reduce the computation time of ANN and achieve high accuracy.

Abstract: A Compute Unified Device Architecture (CUDA) implementation of a Deep Convolutional Neural Network (DCNN) for a digit recognition system is proposed to reduce the computation time of the artificial neural network (ANN) while achieving high accuracy. A neural network with three convolutional layers and two fully connected layers is developed by building input, hidden and output neurons to achieve improved accuracy. The network is parallelized on a dedicated GPU on the CUDA platform using the TensorFlow library. A comparative analysis of accuracy and computation time is performed for sequential and parallel execution of the network on three systems: a dual-core CPU (4 logical processors), an octa-core CPU alone (16 logical processors), and an octa-core CPU (16 logical processors) with a GPU. The MNIST (Modified National Institute of Standards and Technology) and EMNIST (Extended MNIST) databases are used for both training and testing. MNIST has 55,000 training, 10,000 testing and 5,000 validation samples. EMNIST consists of 235,000 training, 40,000 testing and 5,000 validation samples. The network requires heavy computation, and hence parallelizing it yields a significant improvement in execution time.
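
The abstract gives only the layer counts (three convolutional, two fully connected), so the shape arithmetic for such a network can only be sketched under assumed hyperparameters. The Python sketch below traces a 28×28 single-channel MNIST-style input through the five layers and reports each layer's output shape and parameter count. The channel widths, kernel sizes, strides, padding and the 128-unit hidden layer are illustrative assumptions, not values from the paper, and `conv_output_size` and `describe_network` are hypothetical helper names.

```python
# Hypothetical layer sizes: the abstract states only the layer counts
# (three convolutional, two fully connected), not kernels or widths.


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution, using floor division."""
    return (size + 2 * padding - kernel) // stride + 1


def describe_network(input_size=28, in_channels=1, num_classes=10):
    """Return (name, output_shape, parameter_count) for each layer."""
    # Assumed (out_channels, kernel, stride, padding) per conv layer.
    conv_layers = [(8, 5, 1, 2), (16, 5, 2, 2), (32, 3, 2, 1)]
    hidden_units = 128  # assumed width of the first fully connected layer

    rows = []
    size, channels = input_size, in_channels
    for i, (out_ch, k, s, p) in enumerate(conv_layers, start=1):
        # One k x k filter per (input channel, output channel) pair, plus biases.
        params = out_ch * (channels * k * k) + out_ch
        size = conv_output_size(size, k, s, p)
        channels = out_ch
        rows.append((f"conv{i}", (channels, size, size), params))

    # Flatten the last feature map before the fully connected layers.
    flat = channels * size * size
    rows.append(("fc1", (hidden_units,), flat * hidden_units + hidden_units))
    rows.append(("fc2", (num_classes,), hidden_units * num_classes + num_classes))
    return rows


if __name__ == "__main__":
    for name, shape, params in describe_network():
        print(f"{name:6s} output={shape} params={params}")
```

Under these assumed sizes, the first fully connected layer holds most of the parameters. Changing `num_classes` adapts the output layer to an EMNIST split with more classes.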

Citations

Journal Article

Deep convolutional extreme learning machines: Filters combination and error model validation

Michel M. dos Santos¹, Abel Guilhermino da Silva Filho¹, Wellington Pinheiro dos Santos¹

Federal University of Pernambuco¹

15 Feb 2019, Neurocomputing

TL;DR: A deep, fast convolutional neural network based on an extreme learning machine and a fixed bank of filters is proposed; it achieves superior accuracy and competitive training time, even relative to approaches that use GPU processing.

29 citations

Journal Article

Hybridizing Evolutionary Computation and Deep Neural Networks: An Approach to Handwriting Recognition Using Committees and Transfer Learning

Alejandro Baldominos¹, Yago Saez¹, Pedro Isasi¹

Charles III University of Madrid¹

26 Mar 2019, Complexity

TL;DR: Results show that both the use of neuroevolved committees and the application of topology transfer learning are successful: committees of convolutional neural networks are able to improve classification results when compared to single models, and topologies learned for one problem can be reused for a different problem and data with a good performance.

Abstract: Neuroevolution is the field of study that uses evolutionary computation to optimize certain aspects of the design of neural networks, most often their topology and hyperparameters. The field was introduced in the late 1980s, but only in recent years has it become mature enough to enable the optimization of deep learning models, such as convolutional neural networks. In this paper, we rely on previous work to apply neuroevolution in order to optimize the topology of deep neural networks that can be used to solve the problem of handwritten character recognition. Moreover, we take advantage of the fact that evolutionary algorithms optimize a population of candidate solutions, by combining a set of the best evolved models resulting in a committee of convolutional neural networks. This process is enhanced by using specific mechanisms to preserve the diversity of the population. Additionally, in this paper, we address one of the disadvantages of neuroevolution: the process is very expensive in terms of computational time. To lessen this issue, we explore the performance of topology transfer learning: whether the best topology obtained using neuroevolution for a certain domain can be successfully applied to a different domain. By doing so, the expensive process of neuroevolution can be reused to tackle different problems, turning it into a more appealing approach for optimizing the design of neural network topologies. After evaluating our proposal, results show that both the use of neuroevolved committees and the application of topology transfer learning are successful: committees of convolutional neural networks are able to improve classification results when compared to single models, and topologies learned for one problem can be reused for a different problem and data with a good performance. Additionally, both approaches can be combined by building committees of transferred topologies, and this combination attains results that combine the best of both approaches.

28 citations

Proceedings Article

Efficient Neural Vision Systems Based on Convolutional Image Acquisition

Pedram Pad, Simon Narduzzi, Clement Kundig, Engin Türetken, Siavash Arjomand Bigdeli, L. Andrea Dunbar

14 Jun 2020

TL;DR: Experimental results demonstrate that the introduced hybrid optical-digital implementation of a convolutional neural network based on engineering of the point spread function of an optical imaging system yields more than two orders of magnitude reduction in the computational cost while achieving near-state-of-the-art accuracy.

Abstract: Despite the substantial progress made in deep learning in recent years, advanced approaches remain computationally intensive. The trade-off between accuracy and computation time and energy limits their use in real-time applications on low power and other resource-constrained systems. In this paper, we tackle this fundamental challenge by introducing a hybrid optical-digital implementation of a convolutional neural network (CNN) based on engineering of the point spread function (PSF) of an optical imaging system. This is done by coding an imaging aperture such that its PSF replicates a large convolution kernel of the first layer of a pre-trained CNN. As the convolution takes place in the optical domain, it has zero cost in terms of energy consumption and has zero latency independent of the kernel size. Experimental results on two datasets demonstrate that our approach yields more than two orders of magnitude reduction in the computational cost while achieving near-state-of-the-art accuracy, or equivalently, better accuracy at the same computational cost.

19 citations

Additional excerpts

...95M E Parallelized CNN[34] 99....

Journal Article

3D texture-based face recognition system using fine-tuned deep residual networks.

Siming Zheng, Rahmita Wirza O. K. Rahmat, Fatimah Khalid, Nurul Amelina Nasharuddin

02 Dec 2019, PeerJ

TL;DR: An end-to-end face recognition system based on 3D face texture is proposed, combining geometric invariants, histogram of oriented gradients and fine-tuned residual neural networks; it requires fewer iterations than traditional methods.

Abstract: As the technology for 3D photography has developed rapidly in recent years, an enormous number of 3D images has been produced, and face recognition is one of the research directions that use them. Improving accuracy on large amounts of data is crucial in 3D face recognition problems. Traditional machine learning methods can be used to recognize 3D faces, but the face recognition rate has declined rapidly with the increasing number of 3D images. As a result, classifying large amounts of 3D image data is time-consuming, expensive, and inefficient. Deep learning methods have become the focus of attention in 3D face recognition research. In our experiment, an end-to-end face recognition system based on 3D face texture is proposed, combining geometric invariants, histogram of oriented gradients and fine-tuned residual neural networks. The research shows that, evaluated on the FRGC-v2 dataset, as the number of fine-tuned ResNet layers is increased, the best Top-1 accuracy reaches 98.26% and the Top-2 accuracy is 99.40%. The proposed framework requires fewer iterations than traditional methods. The analysis suggests that applying the proposed recognition framework to a large amount of 3D face data could significantly improve recognition decisions in realistic 3D face scenarios.

6 citations

Proceedings Article

End-to-End Analysis for Text Detection and Recognition in Natural Scene Images

Ahlam Alnefaie, Deepak Gupta, Monowar H. Bhuyan¹, Imran Razzak², Prashant K. Gupta³, Mukesh Prasad

Umeå University¹, Deakin University², Guru Gobind Singh Indraprastha University³

01 Jan 2020

TL;DR: A deep analysis of recent developments in scene text detection and recognition is presented, comparing their performance and highlighting real modern applications.

Abstract: Text has been vitally important in human life from the very beginning. Compared with vision-based applications, preference is always given to the precise and productive information embodied in text. Given this importance, the detection and recognition of text is equally important in human life. This paper presents a deep analysis of recent developments in scene text, compares their performance, and highlights real modern applications. Potential future directions of scene text detection and recognition are also discussed.

5 citations

References

Proceedings Article

Handwritten Digit Recognition with a Back-Propagation Network

Yann LeCun¹, Bernhard E. Boser², John S. Denker³, John S. Denker², D. Henderson¹, Richard Howard², W. Hubbard², Lawrence D. Jackel¹

Bell Labs¹, Alcatel-Lucent², AT&T³

01 Jan 1989

TL;DR: Minimal preprocessing of the data was required, but architecture of the network was highly constrained and specifically designed for the task, and has 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.

Abstract: We present an application of back-propagation networks to handwritten digit recognition. Minimal preprocessing of the data was required, but architecture of the network was highly constrained and specifically designed for the task. The input of the network consists of normalized images of isolated digits. The method has 1% error rate and about a 9% reject rate on zipcode digits provided by the U.S. Postal Service.

3,324 citations

Proceedings Article

EMNIST: an extension of MNIST to handwritten letters

Gregory Cohen¹, Saeed Afshar¹, Jonathan Tapson¹, André van Schaik¹

University of Sydney¹

17 Feb 2017

TL;DR: A variant of the full NIST dataset is introduced, which is called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset, and shares the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems.

Abstract: The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.

550 citations

"Parallelization of digit recognitio..." refers methods in this paper

...[13] provided the EMNIST database deriving it from NIST library and then processing it....

Journal Article

Improved method of handwritten digit recognition tested on MNIST database

Ernst Kussul¹, T. Baidyk¹

National Autonomous University of Mexico¹

01 Oct 2004, Image and Vision Computing

TL;DR: A novel neural classifier, LImited Receptive Area (LIRA), is proposed for image recognition; it contains three neuron layers (sensor, associative and output) and shows sufficiently good results in the task of pin-hole position estimation.

175 citations

"Parallelization of digit recognitio..." refers background in this paper

...[10] developed a neural classifier Limited Receptive Area for the image recognition achieving a recognition rate of 99....

Proceedings Article

Neural Network Implementation Using CUDA and OpenMP

Hong-Hoon Jang¹, Anjin Park¹, Keechul Jung¹

Soongsil University¹

01 Dec 2008

TL;DR: This paper proposes a faster and more efficient implementation of neural networks on both GPU and multi-core CPU, using CUDA (compute unified device architecture), which can be programmed easily thanks to its simple C language-like style and so avoids the need to master graphics shading languages.

Abstract: Many algorithms for image processing and pattern recognition have recently been implemented on GPU (graphic processing unit) for faster computational times. However, the implementation using GPU encounters two problems. First, the programmer should master the fundamentals of the graphics shading languages, which require prior knowledge of computer graphics. Second, in a job which needs much cooperation between CPU and GPU, which is usual in image processing and pattern recognition contrary to the graphics area, CPU should generate raw feature data for GPU processing as much as possible to effectively utilize GPU performance. This paper proposes a faster and more efficient implementation of neural networks on both GPU and multi-core CPU. To solve the first problem, we use CUDA (compute unified device architecture), which can be easily programmed due to its simple C language-like style, instead of graphics shading languages. Moreover, OpenMP (Open Multi-Processing) is used to concurrently process multiple data with single instruction on multi-core CPU, which results in effectively utilizing the memories of GPU. In the experiments, we implemented a neural network-based text detection system using the proposed architecture, and the computation was about 15 times faster than the implementation using CPU and about 4 times faster than the implementation on only GPU without OpenMP.

142 citations

Proceedings Article

Handwritten digit recognition using state-of-the-art techniques

Cheng-Lin Liu¹, Kazuki Nakashima¹, Hiroshi Sako¹, Hiromichi Fujisawa¹

Hitachi¹

06 Aug 2002

TL;DR: The latest results of handwritten digit recognition on well-known image databases using the state-of-the-art feature extraction and classification techniques are presented and they provide a baseline for evaluation of future works.

Abstract: This paper presents the latest results of handwritten digit recognition on well-known image databases using the state-of-the-art feature extraction and classification techniques. The tested databases are CENPARMI, CEDAR, and MNIST. On the test dataset of each database, 56 recognition accuracies are given by combining 7 classifiers with 8 feature vectors. All the classifiers and feature vectors give high accuracies. Among the features, the chain-code feature and gradient feature show advantages, and the profile structure feature shows efficiency as a complementary feature. In comparison of classifiers, the support vector classifier with RBF kernel gives the highest accuracy but is extremely expensive in storage and computation. Among the non-SV classifiers, the polynomial classifier performs best, followed by a learning quadratic discriminant function classifier. The results are competitive compared to previous ones and they provide a baseline for evaluation of future works.

49 citations

"Parallelization of digit recognitio..." refers methods in this paper

...[11] implemented state-of-the-art feature extraction and classification techniques for digit recognition on three different databases....