Proceedings ArticleDOI

Optuna: A Next-generation Hyperparameter Optimization Framework

TLDR
Optuna is next-generation hyperparameter optimization software with a define-by-run API that lets users construct the parameter search space dynamically.
Abstract
The purpose of this study is to introduce new design criteria for next-generation hyperparameter optimization software. The criteria we propose include (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-set-up, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via interactive interfaces. To prove our point, we introduce Optuna, an optimization framework that is the culmination of our effort to develop next-generation optimization software. As optimization software designed on the define-by-run principle, Optuna is in particular the first of its kind. We present the design techniques that became necessary in developing software that meets the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).


Citations
Posted Content

Graph Neural Networks Exponentially Lose Expressive Power for Node Classification.

TL;DR: This paper investigates the expressive power of graph neural networks via their asymptotic behavior as the number of layers tends to infinity, and shows that the proposed weight scaling enhances the predictive performance of GCNs on real data.
Posted Content

Data Augmentation for Graph Neural Networks

TL;DR: This work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure, and introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction.
Journal ArticleDOI

Deep physical neural networks trained with backpropagation

TL;DR: This work introduces physical neural networks, which train the functionality of sequences of real physical systems directly using backpropagation, the same technique used for modern deep neural networks, and demonstrates the approach on three diverse physical systems: optical, mechanical, and electrical.
References
Journal ArticleDOI

ImageNet classification with deep convolutional neural networks

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Automatic differentiation in PyTorch

TL;DR: Describes the automatic differentiation module of PyTorch, a library designed to enable rapid research on machine learning models that performs differentiation of purely imperative programs, with a focus on extensibility and low overhead.
Journal ArticleDOI

Completely Derandomized Self-Adaptation in Evolution Strategies

TL;DR: This paper puts forward two useful methods for self-adaptation of the mutation distribution, the concepts of derandomization and cumulation, and reveals the local and global search properties of the evolution strategy with and without covariance matrix adaptation.
Journal ArticleDOI

Taking the Human Out of the Loop: A Review of Bayesian Optimization

TL;DR: This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.
Proceedings Article

Algorithms for Hyper-Parameter Optimization

TL;DR: This work contributes novel techniques for making response surface models P(y|x) in which many elements of hyper-parameter assignment (x) are known to be irrelevant given particular values of other elements.