Proceedings ArticleDOI

NeuroEvolution : Using Genetic Algorithm for optimal design of Deep Learning models.

TL;DR: A genetic algorithm solution is proposed that designs its own optimal CNN architecture; the evolved models proved useful in predicting time series data and performed comparably to state-of-the-art optimal RNN models.
Abstract: Convolutional Neural Networks (CNNs) have proved influential in image classification. However, CNNs have not proved as useful in time series prediction because of the causal nature of time series data. With proper modifications to its architecture, a CNN can perform well on time series prediction too, but such modification and design require expertise and knowledge of the data. To tackle this problem, a genetic algorithm solution is proposed that designs its own optimal CNN architecture and proved useful in predicting time series data. The robustness and accuracy of the model were tested on our dataset and compared with an RNN architecture. The CNN model was found to perform well irrespective of the prediction horizon, with results comparable to state-of-the-art optimal RNN models.
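
To make the paper's approach concrete, here is a minimal sketch of a GA loop over CNN-architecture genomes. The genome format, operators, and parameters below are illustrative assumptions, not the authors' exact method; fitness would be validation error of the decoded CNN on the time series task.

```python
# Illustrative GA over CNN architectures; all operators are assumptions.
import random

def random_genome():
    # A genome: list of (num_filters, kernel_size) conv-layer genes.
    depth = random.randint(1, 5)
    return [(random.choice([8, 16, 32, 64]), random.choice([2, 3, 5]))
            for _ in range(depth)]

def mutate(genome):
    g = [list(gene) for gene in genome]
    i = random.randrange(len(g))
    g[i][0] = random.choice([8, 16, 32, 64])    # perturb filter count
    return [tuple(gene) for gene in g]

def crossover(a, b):
    # One-point crossover over variable-length layer lists.
    return a[:random.randint(1, len(a))] + b[random.randint(0, len(b)):]

def evolve(fitness, pop_size=20, generations=30):
    # `fitness` trains the decoded CNN briefly and returns validation loss;
    # a real implementation would cache evaluations, omitted for brevity.
    pop = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                   # lower loss = fitter
        parents = pop[:pop_size // 2]           # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return min(pop, key=fitness)
```
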
Citations
Journal ArticleDOI
TL;DR: This work comprehensively reviews and critically examines contributions made so far along three axes - optimization and taxonomy, critical analysis, and challenges - which together outline a complete vision of the merger of the two technologies and draw up an exciting future for this area of fusion research.

15 citations

Journal ArticleDOI
TL;DR: The authors present a comprehensive review of the state-of-the-art encodings for CNNs, compiling the most widely used encoding methods.
Abstract: Convolutional neural networks (CNNs) have shown outstanding results in different application tasks. However, the best performance is obtained when customized CNNs architectures are designed, which is labor intensive and requires highly specialized knowledge. Over three decades, neuroevolution (NE) has studied the application of evolutionary computation to optimize artificial neural networks (ANNs) at different levels. It is well known that the encoding of ANNs highly impacts the complexity of the search space and the optimization algorithms’ performance as well. As NE has rapidly advanced toward the optimization of CNNs topologies, researchers face the challenging duty of representing these complex networks. Furthermore, a compilation of the most widely used encoding methods is nonexistent. In response, we present a comprehensive review on the state-of-the-art of encodings for CNNs.

15 citations
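
As a concrete instance of the direct-encoding family this review covers, the sketch below maps a flat genome of layer genes one-to-one onto a CNN (assuming PyTorch; the gene format and decoder are illustrative, not any specific method from the review):

```python
# Direct encoding: the genome is a flat list of layer genes that maps
# one-to-one onto a concrete network. Gene format is an assumption.
import torch.nn as nn

# genome: list of ("conv", out_channels, kernel) or ("pool",) genes
genome = [("conv", 16, 3), ("pool",), ("conv", 32, 3), ("pool",)]

def decode(genome, in_channels=1):
    layers, c = [], in_channels
    for gene in genome:
        if gene[0] == "conv":
            _, out_c, k = gene
            layers += [nn.Conv2d(c, out_c, k, padding=k // 2), nn.ReLU()]
            c = out_c
        else:                        # "pool" gene
            layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

model = decode(genome)   # evolutionary operators act on `genome`, not weights
```

Indirect encodings compress this representation at the cost of a more complex genotype-to-phenotype mapping, which is precisely the search-space trade-off the review analyzes.
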

Posted Content
TL;DR: The authors comprehensively review and critically examine contributions made so far along three axes, each addressing a fundamental question in this research avenue: optimization and taxonomy (Why?), including a historical perspective, definitions of optimization problems in deep learning, and a taxonomy tied to an in-depth analysis of the literature; critical methodological analysis (How?), which, together with two case studies, yields learned lessons and recommendations for good practice; and challenges and new directions of research (What can be done, and what for?).
Abstract: Much has been said about the fusion of bio-inspired optimization algorithms and Deep Learning models for several purposes: from the discovery of network topologies and hyper-parametric configurations with improved performance for a given task, to the optimization of the model's parameters as a replacement for gradient-based solvers. Indeed, the literature is rich in proposals showcasing the application of assorted nature-inspired approaches for these tasks. In this work we comprehensively review and critically examine contributions made so far based on three axes, each addressing a fundamental question in this research avenue: a) optimization and taxonomy (Why?), including a historical perspective, definitions of optimization problems in Deep Learning, and a taxonomy associated with an in-depth analysis of the literature, b) critical methodological analysis (How?), which, together with two case studies, allows us to address learned lessons and recommendations for good practices following the analysis of the literature, and c) challenges and new directions of research (What can be done, and what for?). In summary, these three axes - optimization and taxonomy, critical analysis, and challenges - outline a complete vision of the merger of the two technologies and draw up an exciting future for this area of fusion research.

12 citations

Journal ArticleDOI
TL;DR: A comprehensive review of the evolutionary design of neural network architectures, with special emphasis on how evolutionary computation techniques were adopted and which encoding strategies were proposed, intended to provide a complete reference of works in this subject and to guide researchers toward promising directions.
Abstract: We present a comprehensive review of the evolutionary design of neural network architectures. This work is motivated by the fact that the success of an Artificial Neural Network (ANN) highly depends on its architecture, and among many approaches Evolutionary Computation, a set of global-search methods inspired by biological evolution, has proved to be an efficient approach for optimizing neural network structures. Initial attempts at automating architecture design by applying evolutionary approaches started in the late 1980s, and the topic has attracted significant interest until today. In this context, we examined the historical progress and analyzed all relevant scientific papers, with a special emphasis on how evolutionary computation techniques were adopted and which encoding strategies were proposed. We summarized key aspects of methodology, discussed common challenges, and investigated the works in chronological order by dividing the entire timeframe into three periods. The first period covers early works focusing on the optimization of simple ANN architectures, with a variety of solutions proposed for chromosome representation. The second period surveys the rise of more powerful methods and hybrid approaches. In parallel with recent advances, the last period covers the Deep Learning Era, in which the research direction has shifted toward configuring advanced models of deep neural networks. Finally, we propose open problems for future research in the field of neural architecture search and provide insights for fully automated machine learning. Our aim is to provide a complete reference of works on this subject and to guide researchers toward promising directions.

11 citations

Proceedings ArticleDOI
19 Jul 2020
TL;DR: The obtained results show that the proposed variable-length GA can evolve DRNN architectures that significantly outperform many state-of-the-art systems on most of the datasets.
Abstract: Deep Recurrent Neural Networks (DRNNs) are an effective deep learning method with a wide variety of applications. Manually designing the architecture of a DRNN for any specific task requires expert knowledge, and the optimal DRNN architecture can vary substantially across tasks. This paper focuses on developing an algorithm that automatically evolves task-specific DRNN architectures using a Genetic Algorithm (GA). A variable-length encoding strategy is developed to represent DRNNs of different depths, because the required depth of a DRNN cannot be determined in advance. Activation functions play an important role in the performance of DRNNs and must be used carefully in these networks. Driven by this understanding, knowledge-driven crossover and mutation operators are proposed to carefully control the use of activation functions in the GA so that the algorithm evolves the best-performing DRNNs. The algorithm focuses particularly on evolving DRNN architectures that use Long Short-Term Memory (LSTM) units. As a leading type of DRNN, LSTM-based DRNNs can effectively handle long-term dependencies, achieving cutting-edge performance on various sequential data. Three types of publicly available benchmark datasets, covering both classification and regression tasks, were considered in our experiments. The results show that the proposed variable-length GA can evolve DRNN architectures that significantly outperform many state-of-the-art systems on most of the datasets.

7 citations


Cites background from "NeuroEvolution : Using Genetic Algo..."

  • ...Evolutionary Computation (EC) [11] approaches show great potential for optimizing the design of neural network architectures in a fully automated and effective way [12]....

    [...]
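
The variable-length encoding and knowledge-driven operators described in this citing paper can be sketched roughly as follows. This is a hedged approximation: the whitelist, gene fields, and operator details are illustrative, not the paper's exact design.

```python
# Variable-length genome: a list of layer genes (hidden size + activation),
# so depth can grow or shrink during evolution. The "knowledge-driven"
# constraint is approximated by mutating activations only within a whitelist.
import random

ALLOWED_ACTS = ["tanh", "sigmoid", "relu"]   # assumed whitelist

def random_layer():
    return {"units": random.choice([32, 64, 128]),
            "act": random.choice(ALLOWED_ACTS)}

def mutate(genome):
    g = [dict(layer) for layer in genome]
    op = random.choice(["add", "remove", "act"])
    if op == "add":
        g.insert(random.randint(0, len(g)), random_layer())   # deepen
    elif op == "remove" and len(g) > 1:
        g.pop(random.randrange(len(g)))                       # shrink
    else:
        random.choice(g)["act"] = random.choice(ALLOWED_ACTS) # constrained
    return g

def crossover(a, b):
    # Cut points chosen independently, so child depth can differ from both
    # parents - the core of the variable-length encoding.
    return a[:random.randint(1, len(a))] + b[random.randint(0, len(b)):]
```
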

References
Proceedings Article
03 Dec 2012
TL;DR: A deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet, as discussed by the authors.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers, we employed a recently developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.

73,978 citations
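
A condensed PyTorch sketch of the architecture the abstract describes: five convolutional layers, some followed by max-pooling, three fully-connected layers, non-saturating ReLU units, and dropout in the fully-connected layers. Channel and kernel sizes follow common reimplementations and are not guaranteed to match the paper exactly.

```python
# AlexNet-style network for 3x224x224 inputs; sizes are abbreviated.
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 64, 11, stride=4, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(64, 192, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),   # 1000-way output; softmax applied in the loss
)
```
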

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

55,235 citations


"NeuroEvolution : Using Genetic Algo..." refers methods in this paper

  • ...There have been various CNN variants introduced recently like AlexNet [2], VGG [6], GoogleNet [7] to name a few....

    [...]
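
The abstract's central design rule - gaining depth through stacks of very small 3x3 filters - reduces to a repeated block, sketched here in PyTorch. Block widths below are illustrative of a 16-weight-layer configuration, not the paper's exact tables.

```python
import torch.nn as nn

def vgg_block(in_c, out_c, num_convs):
    # Two stacked 3x3 convs cover a 5x5 receptive field, three cover 7x7,
    # with fewer parameters and more nonlinearities than one large filter.
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_c, out_c, 3, padding=1), nn.ReLU()]
        in_c = out_c
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

# Stacking such blocks is how the 16-19 weight-layer configs are reached.
features = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2),
    vgg_block(128, 256, 3), vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)
```
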

Journal ArticleDOI
01 Jan 1998
TL;DR: Gradient-based learning applied to convolutional neural networks is shown to synthesize complex decision surfaces that classify high-dimensional patterns such as handwritten characters with minimal preprocessing, and a new learning paradigm, graph transformer networks (GTNs), is proposed for globally training multi-module document recognition systems.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations
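
The convolutional recognizer the abstract describes is the LeNet family; a minimal modern restatement for 28x28 digit images is sketched below. Layer sizes are in the spirit of LeNet-5 but simplified, and are assumptions rather than the paper's exact configuration.

```python
import torch.nn as nn

lenet_like = nn.Sequential(
    nn.Conv2d(1, 6, 5, padding=2), nn.Tanh(), nn.AvgPool2d(2),  # 28 -> 14
    nn.Conv2d(6, 16, 5), nn.Tanh(), nn.AvgPool2d(2),            # 14 -> 5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
    nn.Linear(120, 84), nn.Tanh(),
    nn.Linear(84, 10),   # 10 digit classes
)
```
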

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception, as discussed by the authors, is a deep convolutional neural network architecture that achieves a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations
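
The "increased depth and width at constant computational budget" idea rests on the Inception module: parallel 1x1, 3x3, and 5x5 branches plus a pooled branch, concatenated along the channel dimension, with 1x1 convolutions reducing channel counts first. A minimal PyTorch sketch; branch widths are illustrative.

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_c, c1, c3_red, c3, c5_red, c5, pool_c):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_c, c1, 1), nn.ReLU())
        self.b2 = nn.Sequential(nn.Conv2d(in_c, c3_red, 1), nn.ReLU(),
                                nn.Conv2d(c3_red, c3, 3, padding=1), nn.ReLU())
        self.b3 = nn.Sequential(nn.Conv2d(in_c, c5_red, 1), nn.ReLU(),
                                nn.Conv2d(c5_red, c5, 5, padding=2), nn.ReLU())
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_c, pool_c, 1), nn.ReLU())

    def forward(self, x):
        # Multi-scale processing: every branch preserves spatial size,
        # so outputs can be concatenated along channels.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], 1)
```
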