Automated Machine Learning in Practice:
State of the Art and Recent Results
Lukas Tuggener (1,2), Mohammadreza Amirian (1,3), Katharina Rombach (1), Stefan Lörwald (4), Anastasia Varlet (4), Christian Westermann (4), and Thilo Stadelmann (1)

(1) ZHAW Datalab, Winterthur, Switzerland, {tugg, amir, romc, stdm}@zhaw.ch
(2) USI, Lugano, Switzerland
(3) Ulm University, Ulm, Germany
(4) PricewaterhouseCoopers AG (PwC), Zurich, Switzerland, {firstname.lastname}@ch.pwc.com
Abstract: A main driver behind the digitization of industry and society is the belief that data-driven model building and decision making can contribute to higher degrees of automation and more informed decisions. Building such models from data often involves the application of some form of machine learning. Thus, there is an ever-growing demand for a workforce with the necessary skill set to do so. This demand has given rise to a new research topic concerned with fitting machine learning models fully automatically: AutoML. This paper gives an overview of the state of the art in AutoML with a focus on practical applicability in a business context, and provides recent benchmark results of the most important AutoML algorithms.
I. INTRODUCTION
Many organisations, private and public, have understood that data analysis is a powerful tool for gaining insights into how to improve their business model, decision making and even products [1]. A central step in the analysis process is often the construction and training of a machine learning model, which itself entails several challenging steps, most notably feature preprocessing, algorithm selection, hyperparameter tuning and ensemble building. Usually, a lot of expert knowledge is necessary to perform all these steps successfully [2]. The field of automated machine learning (AutoML) aims to develop methods that build suitable machine learning models with no (or as little as possible) human intervention [3]. While there are many possible ways to state the AutoML problem, we here focus on systems that address the "Combined Algorithm Selection and Hyperparameter optimization" (CASH) problem [4]. A solver for the CASH problem aims to pick an algorithm from a list of options and then tune it to give the highest validation performance amongst all (algorithm, hyperparameter) combinations.
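Concretely, the CASH objective from [4] can be written as follows, with $\mathcal{A} = \{A^{(1)}, \dots, A^{(R)}\}$ the candidate algorithms, $\Lambda^{(j)}$ the hyperparameter space of algorithm $A^{(j)}$, and $k$ cross-validation folds:

```latex
% CASH: jointly choose the algorithm and its hyperparameters that
% minimize the average validation loss over k cross-validation folds.
A^{\star}_{\lambda^{\star}} \in
  \operatorname*{arg\,min}_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}}
  \frac{1}{k} \sum_{i=1}^{k}
  \mathcal{L}\!\left( A^{(j)}_{\lambda},\,
    \mathcal{D}^{(i)}_{\mathrm{train}},\,
    \mathcal{D}^{(i)}_{\mathrm{valid}} \right)
```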
In this paper, we give a comprehensive overview of the current state of AutoML and present new, independent benchmark results of systems available out of the box as well as of cutting-edge research. The next chapter gives insight into why AutoML is currently not only heavily researched, but also of practical relevance in business and industry. Chapter III gives an overview of the current state of AutoML by introducing the most important concepts and systems. In chapter IV, we present benchmark results comparing the scientific state of the art and an industrial prototype; we then discuss them with regard to practical AutoML system design choices in chapter V. Lastly, in chapter VI we give a summary and an outlook on approaches that will likely play a role in the next important advancements in AutoML.
II. IMPACT OF PRACTICAL AUTOMATED ML
In recent years, machine learning has been applied to more and more domains. Industrial applications, for example predictive maintenance [5], [6] and defect detection [7], enable companies to be more proactive and improve efficiency. In the area of healthcare, patient data have helped address complex diseases, such as multiple sclerosis, and support doctors in identifying the most appropriate therapy [8]. In the insurance and banking sectors, risks in loan applications [9] and claims processing [10], [11] can be estimated, enabling the automated identification of fraudulent patterns. Finally, advances in sales and revenue forecasting support supply chain optimization [12].
However, the process of building such actionable machine learning models, able to generate added value for the business, is time-consuming and error-prone if done manually: the performance of different models has to be compared, considering different algorithms, hyperparameter tuning and feature selection. This process is highly iterative and as such an ideal candidate for automation. With the use of AutoML, the data scientist is freed from this tedious task and can focus on more creative tasks, delivering more value to the company. New business cases can be identified, assessed and validated in a rapid-prototyping fashion.
In practice, AutoML can provide different kinds of insight. Already at an early stage, running different models on the input data can provide feedback on how suitable the data is for predicting the given target. When multiple models built from a wide spectrum of algorithms do not perform significantly better than the baseline, this can be seen as an indication of insufficient predictive power in the data. Ideally, however, good models will be generated, and the data scientist is left with the choice of deploying the best generated model or creating an ensemble from a selection of models. Finally, there is a by-product of AutoML when it optimizes over the feature set as well: one can derive an estimate of feature importance by statistical analysis of the model quality depending on which features are used as input to the models.
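As an illustration of this by-product, the following Python sketch derives such an estimate from hypothetical AutoML trial logs; the `trials` structure and the linear model are illustrative assumptions, not part of any of the systems discussed here.

```python
# Sketch: estimating feature importance from AutoML trial logs.
# "trials" is a hypothetical log with one entry per trained model:
# the binary feature-inclusion mask used and the resulting CV score.
import numpy as np
from sklearn.linear_model import LinearRegression

def feature_importance_from_trials(trials):
    """Regress model quality on the feature-inclusion mask; each
    coefficient then approximates the average change in quality
    when the corresponding feature is included."""
    X = np.array([mask for mask, _ in trials], dtype=float)
    y = np.array([score for _, score in trials])
    reg = LinearRegression().fit(X, y)
    return reg.coef_  # one importance estimate per feature

# Toy usage: three trials over four features.
trials = [
    ([1, 0, 1, 0], 0.81),
    ([1, 1, 0, 0], 0.84),
    ([0, 1, 1, 1], 0.78),
]
print(feature_importance_from_trials(trials))
```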
Thus, the introduction of AutoML tools in a company can drastically increase the efficiency of a data scientist's work. Consider one of our predictive maintenance projects in the area of public transport: a team of three data scientists, working full time for several weeks, developed a machine learning model with an out-of-sample area under the curve (AUC) of 0.81. A few months after that milestone, the prototype of the AutoML tool described later in Section IV (DSM) was run on the same dataset (with no other help or information) as a benchmark to evaluate the benefits of such a tool. It automatically generated a model that slightly outperformed the manually engineered one, reaching an AUC of 0.82 after a run time of half an hour.
III. STATE OF THE ART IN AUTOMATED ML
The problem of manual hyperparameter tuning [13] has inspired researchers to automate various parts of the machine learning pipeline: feature engineering [14], meta-learning [15], architecture search [16] and the full Combined Algorithm Selection and Hyperparameter optimization (CASH) problem [17] are the research lines that have attracted the most attention in recent years. We review them in this order.
Feature engineering: Feature preprocessing, representation learning and selecting the most discriminant features for a given classification or regression task are the problems targeted by this literature. Gaudel et al. [18] treat feature selection as a one-player game and train a reinforcement learning-based agent to select the best features. To do so, they first model the feature selection problem as a Markov Decision Process (MDP) and then define a reward associated with the generalization error in the final state. The agent learns an optimal policy that minimizes the final generalization error.
ExploreKit [19] not only iteratively selects features but also generates new feature candidates to obtain the most discriminant ones. Katz et al. apply normalization and discretization operators to single features to generate unary candidates, additionally combine two or more features to generate further candidates, and train a feature rank estimator on meta-features extracted from the datasets and candidates. In every iteration, the highest-ranked candidate that increases classification accuracy above a certain threshold is added to the selected feature set.
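A minimal Python sketch of this generate-and-select loop is given below; it assumes numeric features and ranks candidates directly by cross-validation gain rather than with ExploreKit's learned rank estimator, and the classifier and threshold are arbitrary choices.

```python
# Sketch of an ExploreKit-style iteration [19]: generate unary and
# pairwise feature candidates, keep the one that improves CV accuracy
# above a threshold (the real system ranks candidates with a learned
# meta-model instead of evaluating each one directly).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def candidate_features(X):
    """Unary transforms plus pairwise products of existing columns."""
    cands = [np.log1p(np.abs(X[:, j])) for j in range(X.shape[1])]
    cands += [X[:, i] * X[:, j] for i in range(X.shape[1])
              for j in range(i + 1, X.shape[1])]
    return cands

def explorekit_step(X, y, threshold=0.005):
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    base = cross_val_score(clf, X, y, cv=3).mean()
    best_gain, best = 0.0, None
    for cand in candidate_features(X):
        Xc = np.column_stack([X, cand])
        gain = cross_val_score(clf, Xc, y, cv=3).mean() - base
        if gain > max(best_gain, threshold):
            best_gain, best = gain, Xc
    return best  # None if no candidate clears the threshold
```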
To reduce the computational complexity of iterative fea-
ture selection methods, Learning Feature Engineering (LFE)
[20] learns the effectiveness of a transformation based on
previous experiments. The original feature space is subse-
quently mapped via the optimal transformation to compute
a discriminant feature representation.
AutoLearn [21] is a regression-based algorithm for automatic feature generation and selection. The algorithm starts by filtering the original features, discarding those with small information gain. Subsequently, feature pairs are filtered based on distance correlation to omit dependent pairs. New features are then generated from the remaining pairs using ridge regression. Ultimately, the best features are the ones with the highest information gain and stability [22]. AutoLearn has been applied to a range of datasets, including gene expression data.
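The pair-based generation step could look roughly as follows; this is a loose sketch, not the reference implementation: Pearson correlation stands in for the distance correlation used in the paper, and each retained pair contributes the ridge prediction and its residual as new features.

```python
# Loose sketch of AutoLearn-style feature generation [21]: for each
# sufficiently correlated feature pair, fit a ridge regression of one
# feature on the other; prediction and residual become new features.
import numpy as np
from sklearn.linear_model import Ridge

def autolearn_features(X, corr_threshold=0.3):
    n = X.shape[1]
    new_feats = []
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = np.corrcoef(X[:, i], X[:, j])[0, 1]
            if abs(r) < corr_threshold:
                continue  # skip weakly related pairs
            model = Ridge(alpha=1.0).fit(X[:, [i]], X[:, j])
            pred = model.predict(X[:, [i]])
            new_feats.append(pred)            # learned relationship
            new_feats.append(X[:, j] - pred)  # deviation from it
    return np.column_stack(new_feats) if new_feats else X
```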
Meta-learning here refers to methods that try to leverage meta-information about the problem at hand, e.g. about the dataset as well as the available algorithms and their configurations, to improve the performance of an AutoML system. This meta-information is often gathered and processed using machine learning methods, in a sense applying the discipline to itself. The meta-information about datasets often consists of basic statistical reference numbers and some landmarks, i.e. performance figures of simple algorithms [23]. Learning curve prediction attempts to learn a model that predicts how much the performance of a learner will improve if given more training time [24]. Another take on this idea are attempts to predict the running time of algorithms [25]. Instead of predicting absolute performance figures, it has sometimes proven beneficial to predict a ranking of the available algorithms to choose from [26].
Meta-learners in the context of neural networks aim to improve the optimizer of a deep or shallow (convolutional) neural network (CNN) so that it reaches a minimum as quickly as possible through automatic hyperparameter tuning. Andrychowicz et
al. [15] learn to predict the best set of hyperparameters for
optimizing neural networks with gradients and a Long Short-
Term Memory [27] network. Similarly, Chen et al. [28] train
an optimizer for simple synthetic functions such as Gaussian
Processes. They demonstrate that the optimizer generalizes
to a wide range of black-box problems. For instance, the
trained optimizer is used to tune the hyperparameters of a
Support Vector Machine [29] without accessing the gradients
of the loss function with respect to the hyperparameters.
Architecture search: This literature discusses methods for automatically finding the best-performing neural network architecture without human expert intervention. Elsken et al. [30] propose Neural Architecture Search by Hill-climbing (NASH) using local search. The algorithm starts with a well-performing (preferably pretrained) convolutional architecture (the parent). Then, two types of network morphisms (transformations) are randomly applied to generate deeper or wider child architectures from the parent network. The child architectures are trained, and the best-performing one qualifies for the next round. The algorithm iterates until the validation accuracy saturates.
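The following toy sketch mimics NASH's outer loop using scikit-learn MLPs on tabular data; unlike the real algorithm, it retrains each child from scratch instead of applying network morphisms that preserve the parent's learned function.

```python
# Toy hill-climbing sketch in the spirit of NASH [30]: mutate the
# parent architecture (deepen or widen), train the children, and
# promote the best one to be the next parent.
import random
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def mutate(arch):
    """Randomly deepen (insert a layer) or widen (double a layer)."""
    arch = list(arch)
    if random.random() < 0.5:
        arch.insert(random.randrange(len(arch) + 1), 32)
    else:
        k = random.randrange(len(arch))
        arch[k] *= 2
    return tuple(arch)

def nash_search(X, y, steps=5, n_children=4):
    parent = (32,)
    best = cross_val_score(
        MLPClassifier(parent, max_iter=300), X, y, cv=3).mean()
    for _ in range(steps):
        for child in (mutate(parent) for _ in range(n_children)):
            score = cross_val_score(
                MLPClassifier(child, max_iter=300), X, y, cv=3).mean()
            if score > best:  # child qualifies for the next round
                best, parent = score, child
    return parent, best
```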
Real et al. [31] propose an evolutionary architecture search
based on a pairwise comparison within the population: the
algorithm starts with an initial population as parents, and
every network undergoes random mutations such as adding
and removing convolutional layers and skip connections to
produce offspring. Subsequently, parent and child compete
in pairwise comparison, with the winner model staying in
the population and the loser being discarded.
In contrast to evolutionary algorithms, where larger and more accurate architectures are desired, He et al. [32] automatically search for ways to compress a given CNN for mobile and embedded applications. Their AutoML for Model Compression (AMC) algorithm trains a reinforcement learning agent to estimate the sparsity ratio of each layer and then compresses the layers sequentially.
Zoph et al. [16] train a controller using reinforcement
learning and a Recurrent Neural Network (RNN) to tune
the hyperparameters of a deep CNN architecture such as
width and height of filters and strides. The RNN controller
is trained using a policy gradient method to maximize the
network’s accuracy on a hold-out set of data.
CASH: The main focus of this paper lies on the Combined Algorithm Selection and Hyperparameter optimization problem, which can be solved by employing a combination of the building blocks mentioned above. Ultimately, a full solution finds the best machine learning pipeline for raw (un-preprocessed) feature vectors in the shortest time for a given amount of computational resources. This has inspired the series of Automated Machine Learning (AutoML) challenges held since 2015 [33]. A complete pipeline includes data cleaning, feature engineering (selection and construction), model selection, hyperparameter optimization and finally building an ensemble of the top trained models to obtain good performance on unseen test data. Optimizing the entire machine learning pipeline (which is not necessarily differentiable end-to-end) is a challenging task, and different solutions have been investigated using various approaches.
Hyperparameter optimization is a crucial step in solving the entire CASH problem, with Bayesian optimization [34] being the most prominent of the respective approaches. The goal here is to build a model of the expected loss and its variance for every input. After each optimization step, the model (or current belief) is updated using the a posteriori information (hence the name Bayesian).
An acquisition function decides at which location to sample the true loss next, trading off regions of low expected loss (exploitation) against regions of high variance (exploration). Usually, Gaussian Processes are the model of choice in Bayesian optimization; alternatively, Random Forests have been used to model the loss surface of the hyperparameters as a Gaussian distribution in Sequential Model-based optimization for general Algorithm Configuration (SMAC) [35], and the Tree-structured Parzen Estimator [36] offers yet another surrogate.
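A minimal sketch of this loop, with a Gaussian Process surrogate and the expected improvement acquisition function for a single hyperparameter; candidate points are sampled randomly here instead of optimizing the acquisition function properly.

```python
# Minimal Bayesian optimization sketch: GP surrogate plus expected
# improvement (EI) for minimizing a scalar "validation loss".
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, best):
    """EI for minimization: trade off low mean (exploitation)
    against high variance (exploration)."""
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(loss, bounds=(0.0, 1.0), n_init=3, n_iter=20):
    X = np.random.uniform(*bounds, size=(n_init, 1))
    y = np.array([loss(x[0]) for x in X])
    gp = GaussianProcessRegressor()
    for _ in range(n_iter):
        gp.fit(X, y)  # update the belief with all observations so far
        cand = np.random.uniform(*bounds, size=(256, 1))
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.vstack([X, [x_next]])
        y = np.append(y, loss(x_next[0]))
    return X[np.argmin(y)], y.min()

# Toy usage: minimize a noisy quadratic "validation loss".
x_best, y_best = bayes_opt(lambda x: (x - 0.7) ** 2 + 0.01 * np.random.randn())
```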
Model-free methods include Successive Halving [37] and, building on it, Hyperband [13], which uses real-time optimization progress to narrow down a set of competing hyperparameter configurations over the duration of a full optimization run, possibly with many restarts. A slight variation of this are evolutionary strategies that additionally allow for perturbations of the individual configurations during training [38]. In the special case where both the optimizee and the optimizer are differentiable, multiple iterations of the optimizer can be unrolled, and an update for the hyperparameters can be computed using gradient descent and backpropagation [39].
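A compact sketch of both ideas; the `evaluate(config, budget)` callback is an assumed interface that trains a configuration for the given budget and returns a validation score.

```python
# Successive halving plus a simplified Hyperband wrapper [13], [37].
import math

def successive_halving(configs, evaluate, min_budget=1):
    """Train all configurations for a small budget, keep the better
    half, double the budget, repeat until one configuration remains."""
    budget = min_budget
    while len(configs) > 1:
        scores = [(evaluate(c, budget), c) for c in configs]
        scores.sort(key=lambda s: s[0], reverse=True)
        configs = [c for _, c in scores[: max(1, len(scores) // 2)]]
        budget *= 2  # survivors get twice the training time
    return configs[0]

def hyperband(sample_config, evaluate, max_budget=16, eta=2):
    """Geometric sweep over bracket sizes: each bracket trades off
    many cheap evaluations against few long ones (simplified; the
    full algorithm also caps per-configuration budgets)."""
    s_max = int(math.log(max_budget, eta))
    best_score, best_cfg = -math.inf, None
    for s in range(s_max, -1, -1):
        n_configs = max(1, (s_max + 1) * eta ** s // (s + 1))
        configs = [sample_config() for _ in range(n_configs)]
        winner = successive_halving(configs, evaluate,
                                    min_budget=max(1, max_budget // eta ** s))
        score = evaluate(winner, max_budget)  # compare winners fairly
        if score > best_score:
            best_score, best_cfg = score, winner
    return best_cfg
```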
Pipeline Optimization: Complete machine learning pipelines, including feature preprocessing, model selection, hyperparameter optimization and ensemble building, are constructed based on different views of the entire problem. Bayesian optimization, genetic programming [40] and bandit optimization have inspired the development of various pipeline optimization frameworks.
Auto-sklearn aims to solve the CASH problem using meta-learning, Bayesian optimization and ensemble building. First, it extracts meta-features of a new dataset, such as the task type (classification or regression), the number of classes, the dimensionality of the feature vectors, the number of samples and so on. The meta-learner of Auto-sklearn uses these meta-features to initialize the optimization step based on previous experience with similar datasets (similar according to the meta-features). Then, preprocessing and model hyperparameters are iteratively improved using Bayesian optimization. Ultimately, a robust classification or regression model is built as an ensemble of models trained during the iterative optimization.
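For reference, using Auto-sklearn out of the box looks roughly as follows; argument names follow the auto-sklearn 0.x API current at the time of writing and may change in later releases, and the dataset is just an example.

```python
# Hedged usage sketch for Auto-sklearn [17] (0.x API).
import autosklearn.classification
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3600,  # overall budget in seconds
    per_run_time_limit=360,        # cap for a single pipeline fit
    ensemble_size=50,              # build an ensemble of top models
)
automl.fit(X_tr, y_tr)  # meta-learning warm start + Bayesian opt.
print(accuracy_score(y_te, automl.predict(X_te)))
```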
The Tree-based Pipeline Optimization Tool (TPOT) is an algorithm that uses genetic programming to optimize the machine learning pipeline. It searches for the best pipeline, including feature processing, model and hyperparameters, for a given classification or regression task. The feature processing module of TPOT combines feature selection and feature generation; the feature generation block applies the kernel trick [41] or dimensionality reduction methods. The optimization is done using genetic programming: first, the algorithm randomly generates a set of tree-based pipelines. Then, it selects the top 20% of the generated population based on cross-validation accuracy and produces 5 descendants from each by randomly changing a point in the pipeline. The algorithm continues until a stopping criterion is met.
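A comparable out-of-the-box TPOT run is sketched below; the population size mirrors the setting we use in Section IV, while the dataset and the remaining parameters are illustrative.

```python
# Hedged usage sketch for TPOT [45].
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tpot = TPOTClassifier(
    generations=5,       # rounds of mutation and selection
    population_size=20,  # pipelines per generation
    cv=5,                # fitness = cross-validation accuracy
    random_state=0,
)
tpot.fit(X_tr, y_tr)
print(tpot.score(X_te, y_te))
tpot.export('best_pipeline.py')  # emits the winning pipeline as code
```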
The ATM framework [42] finally uses multi-armed bandit
optimization in combination with hybrid Bayesian optimiza-
tion to find the best models.
IV. EXPERIMENTAL EVALUATION
In this section, we evaluate the usefulness of AutoML for application in business and industry by empirically comparing the most successful automated machine learning algorithms with (a) an industrial prototype as well as (b) a straightforward improvement inspired by Hyperband [13], [43]. This selection spans a wide range of different approaches for pipeline optimization (see Section III) to tackle the CASH problem: the industrial prototype DSM [44] uses random model and hyperparameter search and thus serves as a baseline; Auto-sklearn [17] has won the recent AutoML challenges [33]. Additionally, we report results for TPOT [45], which is based on genetic programming [40] instead of Auto-sklearn's Bayesian optimization. We give a brief overview of each system below:
Data Science Machine (DSM): The Data Science Machine (DSM) [44] has been developed by PwC for both in-house and client-related data science projects. DSM includes a portfolio of open-source machine learning algorithms and also offers the possibility to add custom algorithms through a language-agnostic API. Given a dataset, the tool automatically optimizes over the set of algorithms, features, and hyperparameters. The solution offers multiple optimization strategies, e.g. tuneable genetic algorithms; in the context of this paper, however, we limited it to random sampling so that it serves as a baseline.
Auto-sklearn [17]: We used the algorithm explained in Section III to find the best pipeline within the time budget that DSM needed to train 100 models. Auto-sklearn is slow to start [17]; however, it eventually reaches a solution very close to the optimum. The method benefits from meta-learning using similar datasets and is usually pretrained on OpenML datasets [46] in the challenges. We applied the base model of Auto-sklearn to compare the exploration efficiency of the different approaches.
TPOT [45]: This algorithm uses tree-based pipelines and is similar to the second-placed entry of the latest AutoML challenge [33]. TPOT differs from the other presented methods in that it uses genetic programming for optimization. We initialized the algorithm with a population of 20 tree-based pipelines and stopped the optimization after the same time budget as for DSM and Auto-sklearn.
Fig. 1. Schematic overview of the Portfolio Hyperband workflow.
Portfolio Hyperband [13], [43]: Inspired by PoSH Auto-sklearn [43], which combines a portfolio of initial configurations with successive halving (SH) and Bayesian optimization, we built a system that combines a portfolio with Hyperband [13]. Our goal was to combine the portfolio variant of meta-learning, which is very simple and fast, with Hyperband, which should give the system good asymptotic performance. At the basis of Hyperband lies the successive halving algorithm, which starts with an initial set of configurations and trains all of them for a fixed amount of time; then, the worse-performing half of the configurations is dropped. This process is repeated until only the best-performing configuration remains, hence the name successive halving. An issue with SH is that it is unclear how many initial configurations to start the process with. Hyperband performs a geometric search to explore the trade-off between individual training time and the number of different initial configurations.
In order to create a collection of promising configuration candidates, we surveyed all the meta-data available on OpenML [46] and extracted a list of configurations that worked well for binary classification. This list is used to seed Hyperband. Every run also uses some random initializations to improve long-term performance. To test the viability of Portfolio Hyperband, we applied it to binary classification only.
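In pseudocode terms, the seeding amounts to a biased configuration sampler that can be plugged into the Hyperband sketch from Section III; the portfolio entries below are illustrative stand-ins, not the actual configurations mined from OpenML.

```python
# Sketch of seeding Hyperband with a portfolio of configurations.
import random
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression

PORTFOLIO = [  # (estimator class, hyperparameters) that did well before
    (RandomForestClassifier, {"n_estimators": 500, "max_features": "sqrt"}),
    (GradientBoostingClassifier, {"n_estimators": 200, "learning_rate": 0.1}),
    (LogisticRegression, {"C": 1.0, "max_iter": 1000}),
]

def sample_config(p_random=0.3):
    """Mostly draw from the portfolio; occasionally sample at random
    to improve long-term performance through exploration."""
    if random.random() < p_random:
        return (RandomForestClassifier,
                {"n_estimators": random.randrange(50, 1000)})
    return random.choice(PORTFOLIO)
```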
To compare the presented automated machine learning approaches, we selected datasets with various classification and regression tasks from previous AutoML challenges [33]; we chose all datasets that are supported by DSM, TPOT and Auto-sklearn. Because the official test labels are not public, we used our own training, validation and test split. For DSM (random search as a baseline), we randomly pick a set of 100 models and hyperparameters, train the models and extract the best-performing one for each dataset. The time budget DSM consumes during the experiments is recorded and used as the reference value for the other optimization methods: Auto-sklearn, TPOT and Portfolio Hyperband subsequently use this time budget to find the best pipeline. We start training from scratch without pretraining on any similar dataset; therefore, the meta-learning block of Auto-sklearn is not well tuned. Since Portfolio Hyperband turned out to find good initial models very fast, we also report its performance after a considerably shortened period of 10 minutes.
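The baseline protocol can be summarized in a few lines (a sketch: the model families and hyperparameter ranges shown here are illustrative, as DSM draws from its own algorithm portfolio).

```python
# Sketch of the random-search baseline protocol: train 100 randomly
# sampled (model, hyperparameter) configurations, keep the best, and
# record the elapsed wall-clock time as the budget for the others.
import random
import time
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def random_search_baseline(X, y, n_models=100):
    start = time.time()
    best_score, best_model = -1.0, None
    for _ in range(n_models):
        model = random.choice([
            RandomForestClassifier(n_estimators=random.randrange(10, 500)),
            SVC(C=10 ** random.uniform(-3, 3)),
        ])
        score = cross_val_score(model, X, y, cv=3).mean()
        if score > best_score:
            best_score, best_model = score, model
    return best_model, best_score, time.time() - start  # time = budget
```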
The results of the benchmark are presented in Table I. The numerical evaluation suggests that all three sophisticated approaches consistently outperform DSM in baseline mode (random search only), but not by a large margin. Moreover, the accuracies of the three approaches from current research are quite similar, while Portfolio Hyperband appears to be especially quick.
V. DISCUSSION
As we showed in the previous section, none of the presented methods is clearly superior to all of the others. We can identify two major take-aways. First, we note that random search (DSM) is still quite competitive, especially when it is constrained to a relatively small set of tried-and-true options. It falls short compared to systems that leverage meta-learning, especially under very constrained time budgets. A take-away from this is that it can sometimes make sense to invest in faster and more hardware for parallel search rather than in a very sophisticated AutoML solution.
Second, we find that the use of meta-data to guide the search, or even of pre-trained models, is one of the most potent ways to speed up AutoML. Working with completely unrelated data (e.g. from OpenML) already yields a sizeable speed-up in the Portfolio Hyperband system.

| Dataset    | Task                      | Metric                            | DSM Test | DSM Time | Auto-sklearn Test | Auto-sklearn Time | TPOT Test | TPOT Time | Portfolio Hyperband Test (full time) | Portfolio Hyperband Test (10 min) |
|------------|---------------------------|-----------------------------------|----------|----------|-------------------|-------------------|-----------|-----------|--------------------------------------|-----------------------------------|
| Cadata     | Regression                | R2 (coefficient of determination) | 0.7119   | 55.0     | 0.7327            | 54.9              | 0.7989    | 54.6      | -                                    | -                                 |
| Christine  | Binary classification     | Balanced accuracy score           | 0.7146   | 99.4     | 0.7392            | 99.3              | 0.7442    | 105.1     | 0.753                                | 0.744                             |
| Digits     | Multiclass classification | Balanced accuracy score           | 0.8751   | 201.2    | 0.9542            | 201.2             | 0.9476    | 207.2     | -                                    | -                                 |
| Fabert     | Multiclass classification | Accuracy score                    | 0.8665   | 77.5     | 0.8908            | 77.4              | 0.8835    | 78.5      | -                                    | -                                 |
| Helena     | Multiclass classification | Balanced accuracy score           | 0.2103   | 190.2    | 0.3235            | 216.4             | 0.3470    | 197.5     | -                                    | -                                 |
| Jasmine    | Binary classification     | Balanced accuracy score           | 0.8371   | 24.1     | 0.8214            | 24.0              | 0.8326    | 25.9      | -                                    | -                                 |
| Madeline   | Binary classification     | Balanced accuracy score           | 0.7686   | 48.3     | 0.8896            | 48.2              | 0.8684    | 53.0      | 0.868                                | 0.848                             |
| Philippine | Binary classification     | Balanced accuracy score           | 0.7406   | 56.3     | 0.7634            | 56.2              | 0.7703    | 56.4      | 0.741                                | 0.753                             |
| Sylvine    | Binary classification     | Balanced accuracy score           | 0.9233   | 28.9     | 0.9350            | 28.9              | 0.9415    | 29.0      | 0.947                                | 0.916                             |
| Volkert    | Multiclass classification | Accuracy score                    | 0.8154   | 122.3    | 0.8880            | 122.2             | 0.8720    | 125.5     | -                                    | -                                 |
| Average    |                           |                                   | 0.7463   | 90.31    | 0.7938            | 92.85             | 0.8006    | 93.26     | -                                    | -                                 |

TABLE I. Performance of selected automated machine learning algorithms on AutoML challenge datasets [33]. "Test" refers to our own test split; times are in minutes; Portfolio Hyperband was only evaluated on binary classification tasks.
The closer in distribution the data on which the meta-learner is trained (offline data) and the data to be analyzed (online data) are, the stronger this effect gets, up to the point where the model can be almost fully trained on the offline data and then only fine-tuned using the online data. Meta-learning is therefore especially attractive for business cases where a continuous data-generating process (e.g. monthly reports, continuous sensor feedback) produces new data that is similar in distribution to the data already seen from the same source.
VI. SUMMARY AND OUTLOOK
As the amount of digital data rapidly grows, data-driven approaches such as shallow and deep machine learning are increasingly used, which in turn increases the demand for efficient and generalizable AutoML. In this paper, we have presented an independent evaluation of current approaches in the field of shallow AutoML and have introduced our own version of Portfolio Hyperband, which shows promising results in terms of computational efficiency while being on par with the state of the art in terms of accuracy.
Current AutoML approaches can be summarized as carefully engineered systems based on a collection of well-established ideas. In this context, the use of meta-information for efficiency reasons, i.e. making the best use of the available compute time, seems to be the idea with the most potential for further development. It bears similarity to the way human machine learning experts use their experience to solve a machine learning task, and we have surveyed above some very successful attempts to utilize previously trained meta-knowledge in AutoML setups [17]. Yet, the effectiveness of the meta-information is highly dependent on the similarity of the tasks at hand to the ones encountered in the meta-learning step, as well as on the metric used to determine this similarity.
Future improvements in the field will thus likely be based on more general concepts rather than further engineering alone. Of special interest in this regard is a line of meta-learning research that aims at learning to exploit the intrinsic structure of the problems of interest in an automatic way [15]. Such a meta-learner is trained on a variety of objective functions (formulated directly or indirectly) and learns how to move efficiently on an objective function's response surface in order to find an optimum. In other words, the efficient exploration of high-dimensional spaces is learned.
This concept of learning to optimize is currently absent from the AutoML frameworks introduced in Section III. However, the approach has been shown to match the performance of heavily engineered Bayesian optimization solutions such as Spearmint, SMAC and TPE while being massively faster [28]. Thus, we consider the concept of learning to optimize a promising direction for future work. As it delivers a very general black-box optimizer, it is well suited to be applied both to AutoML as described in this paper and to architecture search in deep learning, an increasingly relevant research topic.
Current approaches to learning to optimize require access to the objective function's gradient either in the meta-training phase or in both the meta-training phase and during execution [15], [28], [47]. As AutoML is the task of optimizing a non-smooth and not explicitly known objective function, this gradient is often not accessible. Thus, future work will pursue the idea of learning to optimize, but with the meta-training paradigm changed to reinforcement learning in order to enable training on more realistic, i.e. non-smooth, objective functions.
ACKNOWLEDGEMENT
We are grateful for support by Innosuisse grant 25948.1 PFES "Ada" and for helpful discussions with Martin Jaggi.
REFERENCES
[1] M. Braschler, K. Stockinger, and T. Stadelmann (Eds.), Applied Data Science: Lessons Learned for the Data-Driven Business. Springer International Publishing, 2019.
[2] B. B. Meier, I. Elezi, M. Amirian, O. Dürr, and T. Stadelmann, "Learning neural models for end-to-end clustering," in IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pp. 126-138, Springer, 2018.
[3] F. Hutter, L. Kotthoff, and J. Vanschoren, Automated Machine Learning: Methods, Systems, Challenges. Challenges in Machine Learning, Springer, 2019.
[4] C. Thornton, F. Hutter, H. H. Hoos, and K. Leyton-Brown, "Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms," in Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 847-855, ACM, 2013.
