Selective Layer Tuning and Performance Study of Pre-Trained Models Using Genetic Algorithm
Jae Cheol Jeong, Gwang-Hyun Yu, Min-Gyu Song, Dang Thanh Vu, Le Hoang Anh, Young-Ae Jung, Yoona Choi, Tai-Won Um, Jinyoung Kim
TLDR
This paper proposes tuning trainable layers using a genetic algorithm on a pre-trained model that is fine-tuned on single-channel image datasets for a classification task.
Abstract:
Utilizing pre-trained models involves fully or partially using pre-trained parameters as initialization. In general, configuring a pre-trained model demands either the practitioner's knowledge of the problem or exhaustive trial-and-error experimentation for a given task. In this paper, we propose tuning trainable layers using a genetic algorithm on a pre-trained model that is fine-tuned on single-channel image datasets for a classification task. The single-channel datasets comprise grayscale images and preprocessed audio signals transformed into log-Mel spectrograms. The four deep-learning models used in the experimental evaluation were pre-trained on the ImageNet dataset. The proposed genetic algorithm searches for the highest fitness in every generation to determine which layers of the pre-trained models to tune. Compared to the conventional fine-tuning method and random layer search, our proposed selective layer search with a genetic algorithm achieves higher accuracy, on average, by 9.7% and 1.88% (MNIST-Fashion), 1.31% and 1.14% (UrbanSound8k), and 2.2% and 0.29% (HospitalAlarmSound), respectively. In addition, our search method can naturally be applied to various datasets of the same task without prior knowledge about the dataset of interest.
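The layer search described in the abstract can be illustrated with a minimal sketch. The abstract does not state the chromosome encoding or the genetic operators, so everything below is an assumption made for illustration: a binary mask with one gene per layer (1 = trainable, 0 = frozen), tournament selection, one-point crossover, bit-flip mutation, and single-individual elitism. The names `evolve_layer_mask` and `toy_fitness` are hypothetical, and `toy_fitness` is only a stand-in for the real fitness, which would presumably be the validation accuracy obtained after fine-tuning with a given mask.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # one gene per layer: 1 = trainable, 0 = frozen


def evolve_layer_mask(
    num_layers: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Search for a high-fitness layer mask (requires num_layers >= 2)."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    population = [
        [rng.randint(0, 1) for _ in range(num_layers)] for _ in range(pop_size)
    ]
    best_mask: Mask = list(population[0])
    best_fit = float("-inf")

    for _ in range(generations):
        # Evaluate every individual and remember the best one seen so far.
        scored = [(fitness(mask), mask) for mask in population]
        for fit, mask in scored:
            if fit > best_fit:
                best_fit, best_mask = fit, list(mask)

        def tournament() -> Mask:
            # Pick two distinct individuals at random; the fitter one wins.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        next_population = [list(best_mask)]  # elitism: carry the best forward
        while len(next_population) < pop_size:
            parent1, parent2 = tournament(), tournament()
            cut = rng.randint(1, num_layers - 1)  # one-point crossover
            child = parent1[:cut] + parent2[cut:]
            # Bit-flip mutation: toggle each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population

    return best_mask, best_fit


def toy_fitness(mask: Mask) -> float:
    # Placeholder only: rewards agreement with an arbitrary target mask.
    # A real fitness would fine-tune the model with this mask and return
    # its validation accuracy.
    target = [0, 0, 0, 0, 1, 1, 1, 1]
    return sum(int(g == t) for g, t in zip(mask, target)) / len(target)


if __name__ == "__main__":
    mask, score = evolve_layer_mask(num_layers=8, fitness=toy_fitness)
    print(mask, score)
```

In a real setting, each fitness evaluation requires a fine-tuning run, so population size and generation count dominate the search cost. With a framework such as PyTorch, a mask could be applied by setting `requires_grad` on the parameters of each layer before training.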
Citations
Journal Article
Genetic Algorithm-Based Hyperparameter Optimization for Convolutional Neural Networks in the Classification of Crop Pests
Journal Article
Power Optimization in Multi-Tier Heterogeneous Networks Using Genetic Algorithm
TL;DR: In this paper, a power optimization model utilizing a modified genetic algorithm is proposed to manage power resources efficiently and reduce high power consumption. Each access point computes the optimal power using the modified GA until it meets the fitness criteria and assigns it to each cellular user.
References
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.
Proceedings Article
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Journal Article
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.