Open Access, Journal Article

A Bayesian machine scientist to aid in the solution of challenging scientific problems.

TL;DR
In this paper, the authors introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions.
Abstract
Closed-form, interpretable mathematical models have been instrumental for advancing our understanding of the world; with the data revolution, we may now be in a position to uncover new such models for many systems from physics to the social sciences. However, to deal with increasing amounts of data, we need “machine scientists” that are able to extract these models automatically from data. Here, we introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions. It explores the space of models using Markov chain Monte Carlo. We show that this approach uncovers accurate models for synthetic and real data and provides out-of-sample predictions that are more accurate than those of existing approaches and of other nonparametric methods.
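
The idea of scoring models by an approximate posterior and exploring them with Markov chain Monte Carlo can be illustrated with a small sketch. This is not the authors' implementation. It assumes a tiny, fixed set of hypothetical candidate expressions (`CANDIDATES`) instead of an open-ended space of expressions, uses the Bayesian information criterion (BIC) under Gaussian noise as one possible approximation to the marginal likelihood, and substitutes a flat placeholder prior for one learned from a corpus. The names `bic`, `description_length`, and `metropolis` are illustrative only.

```python
import math
import random

import numpy as np

# Hypothetical candidate models (an assumption of this sketch). Each is
# linear in its parameters, so ordinary least squares is enough to fit it.
# Each entry maps an expression to a function that builds its design matrix.
CANDIDATES = {
    "a": lambda x: np.column_stack([np.ones_like(x)]),
    "a + b*x": lambda x: np.column_stack([np.ones_like(x), x]),
    "a + b*x + c*x^2": lambda x: np.column_stack([np.ones_like(x), x, x**2]),
    "a + b*sin(x)": lambda x: np.column_stack([np.ones_like(x), np.sin(x)]),
}


def bic(design, y):
    """BIC under Gaussian noise, up to a constant shared by all models."""
    n, k = design.shape
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * math.log(rss / n) + k * math.log(n)


def description_length(name, x, y, log_prior=None):
    """Score = BIC/2 - log prior; lower means more plausible.

    log_prior is a placeholder dict {name: log p(model)}; a flat prior
    is used when it is omitted (an assumption of this sketch).
    """
    log_p = 0.0 if log_prior is None else log_prior[name]
    return bic(CANDIDATES[name](x), y) / 2.0 - log_p


def metropolis(x, y, steps=5000, seed=0):
    """Metropolis sampling over the candidate set; returns visits and scores."""
    rng = random.Random(seed)
    names = list(CANDIDATES)
    dl = {m: description_length(m, x, y) for m in names}
    current = names[0]
    visits = dict.fromkeys(names, 0)
    for _ in range(steps):
        proposal = rng.choice(names)  # uniform, hence symmetric, proposal
        delta = dl[proposal] - dl[current]
        # Accept with probability min(1, exp(-delta)).
        if delta <= 0 or rng.random() < math.exp(-delta):
            current = proposal
        visits[current] += 1
    return visits, dl
```

Calling `metropolis(x, y)` on one-dimensional NumPy arrays returns how often each candidate was visited together with its score; the visit frequencies approximate the relative plausibility of the candidates under the stated assumptions.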


Citations
Journal Article

The turning point and end of an expanding epidemic cannot be precisely forecast

Abstract: Epidemic spread is characterized by exponentially growing dynamics, which are intrinsically unpredictable. The time at which the growth in the number of infected individuals halts and starts decreasing cannot be calculated with certainty before the turning point is actually attained; neither can the end of the epidemic after the turning point. A susceptible-infected-removed (SIR) model with confinement (SCIR) illustrates how lockdown measures inhibit infection spread only above a threshold that we calculate. The existence of that threshold has major effects in predictability: A Bayesian fit to the COVID-19 pandemic in Spain shows that a slowdown in the number of newly infected individuals during the expansion phase allows one to infer neither the precise position of the maximum nor whether the measures taken will bring the propagation to the inhibition regime. There is a short horizon for reliable prediction, followed by a dispersion of the possible trajectories that grows extremely fast. The impossibility to predict in the midterm is not due to wrong or incomplete data, since it persists in error-free, synthetically produced datasets and does not necessarily improve by using larger datasets. Our study warns against precise forecasts of the evolution of epidemics based on mean-field, effective, or phenomenological models and supports that only probabilities of different outcomes can be confidently given.
Posted Content

Discovering Symbolic Models from Deep Learning with Inductive Biases

TL;DR: This paper proposes a general approach to distilling symbolic representations of a learned deep model by introducing strong inductive biases. The approach, however, is restricted to Graph Neural Networks (GNNs).
Journal Article

Accelerating organic solar cell material's discovery: high-throughput screening and big data.

TL;DR: In this article, the authors present some of the computational (pre)screening approaches performed prior to experimentation to select the most promising molecular candidates from the available materials libraries or, alternatively, generate molecules beyond human intuition.
Journal Article

Predicting the photocurrent–composition dependence in organic solar cells

TL;DR: Training artificial intelligence algorithms with self-consistent datasets consisting of thousands of data points obtained by high-throughput evaluation methods identifies highly predictive models that only employ the materials band gaps, thus largely simplifying the rationale of the photocurrent–composition space.
Journal Article

Performance of Metal-Catalyzed Hydrodebromination of Dibromomethane Analyzed by Descriptors Derived from Statistical Learning

TL;DR: In this article, a combinatorial strategy for semi hydrogenation of dibromomethane (CH2Br2) to methyl bromide (CH3Br) is presented.
References
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Journal Article

Estimating the Dimension of a Model

TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.

Journal Article

Equation of state calculations by fast computing machines

TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system with a set of interacting individual molecules, and the results are compared to free volume equations of state and a four-term virial coefficient expansion.