A Bayesian machine scientist to aid in the solution of challenging scientific problems.
Roger Guimerà, Ignasi Reichardt, Antoni Aguilar-Mogas, Francesco Alessandro Massucci, Manuel Miranda, Jordi Pallares, Marta Sales-Pardo
TL;DR: In this paper, the authors introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions.
Abstract:
Closed-form, interpretable mathematical models have been instrumental for advancing our understanding of the world; with the data revolution, we may now be in a position to uncover new such models for many systems from physics to the social sciences. However, to deal with increasing amounts of data, we need “machine scientists” that are able to extract these models automatically from data. Here, we introduce a Bayesian machine scientist, which establishes the plausibility of models using explicit approximations to the exact marginal posterior over models and establishes its prior expectations about models by learning from a large empirical corpus of mathematical expressions. It explores the space of models using Markov chain Monte Carlo. We show that this approach uncovers accurate models for synthetic and real data and provides out-of-sample predictions that are more accurate than those of existing approaches and of other nonparametric methods.
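To make the two ingredients named in the abstract concrete (an explicit approximation to the marginal posterior over models, and Markov chain Monte Carlo over model space), here is a minimal Python sketch. It is a toy under stated assumptions and not the authors' implementation: the candidate list, the uniform prior over candidates, the use of the Bayesian information criterion (BIC) as the approximation, and the names CANDIDATES, bic, description_length, sample_models and demo are all hypothetical choices made for illustration. The paper explores a space of expressions and learns its prior from a corpus; this sketch only samples from a fixed list of four models.

```python
import numpy as np

# Hypothetical candidate models. Each is linear in its parameters, so
# ordinary least squares gives the maximum-likelihood fit.
CANDIDATES = {
    "a*x": lambda x: np.column_stack([x]),
    "a*x + b": lambda x: np.column_stack([x, np.ones_like(x)]),
    "a*x**2 + b": lambda x: np.column_stack([x**2, np.ones_like(x)]),
    "a*sin(x) + b": lambda x: np.column_stack([np.sin(x), np.ones_like(x)]),
}


def bic(name, x, y):
    """BIC for Gaussian noise, up to an additive constant shared by all models."""
    design = CANDIDATES[name](x)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    n, k = design.shape
    return n * np.log(rss / n) + k * np.log(n)


def description_length(name, x, y):
    """Approximate negative log posterior: BIC/2 minus the log prior.

    Assumption: the prior is uniform over candidates, so its log is a
    constant and is dropped here.
    """
    return 0.5 * bic(name, x, y)


def sample_models(x, y, steps=2000, seed=0):
    """Metropolis sampling over the candidate list with a uniform proposal."""
    rng = np.random.default_rng(seed)
    names = list(CANDIDATES)
    dl = {name: description_length(name, x, y) for name in names}
    current = names[0]
    counts = {name: 0 for name in names}
    for _ in range(steps):
        proposal = names[rng.integers(len(names))]
        # Accept with probability min(1, exp(dl[current] - dl[proposal])).
        u = 1.0 - rng.random()  # uniform draw in (0, 1], avoids log(0)
        if np.log(u) < dl[current] - dl[proposal]:
            current = proposal
        counts[current] += 1
    return counts


def demo():
    """Return the most visited model on synthetic sinusoidal data."""
    rng = np.random.default_rng(1)
    x = np.linspace(-3.0, 3.0, 60)
    y = 2.0 * np.sin(x) + 0.5 + rng.normal(scale=0.1, size=x.size)
    counts = sample_models(x, y)
    return max(counts, key=counts.get)
```

Calling demo() generates noisy data from 2 sin(x) + 0.5 and returns the candidate the chain visited most often. Because the proposal is symmetric, the acceptance rule depends only on the difference in approximate description length, so better-supported models are visited more often.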
Citations
Journal Article
The turning point and end of an expanding epidemic cannot be precisely forecast
Abstract: Epidemic spread is characterized by exponentially growing dynamics, which are intrinsically unpredictable. The time at which the growth in the number of infected individuals halts and starts decreasing cannot be calculated with certainty before the turning point is actually attained; neither can the end of the epidemic after the turning point. A susceptible-infected-removed (SIR) model with confinement (SCIR) illustrates how lockdown measures inhibit infection spread only above a threshold that we calculate. The existence of that threshold has major effects in predictability: A Bayesian fit to the COVID-19 pandemic in Spain shows that a slowdown in the number of newly infected individuals during the expansion phase allows one to infer neither the precise position of the maximum nor whether the measures taken will bring the propagation to the inhibition regime. There is a short horizon for reliable prediction, followed by a dispersion of the possible trajectories that grows extremely fast. The impossibility to predict in the midterm is not due to wrong or incomplete data, since it persists in error-free, synthetically produced datasets and does not necessarily improve by using larger datasets. Our study warns against precise forecasts of the evolution of epidemics based on mean-field, effective, or phenomenological models and supports that only probabilities of different outcomes can be confidently given.
Posted Content
Discovering Symbolic Models from Deep Learning with Inductive Biases
Miles Cranmer, Alvaro Sanchez-Gonzalez, Peter W. Battaglia, Rui Xu, Kyle Cranmer, David N. Spergel, Shirley Ho
TL;DR: This paper proposes a general approach for distilling symbolic representations from a learned deep model by introducing strong inductive biases. The approach is restricted to Graph Neural Networks (GNNs).
Journal Article
Accelerating organic solar cell material's discovery: high-throughput screening and big data.
TL;DR: In this article, the authors present some of the computational (pre)screening approaches performed prior to experimentation to select the most promising molecular candidates from the available materials libraries or, alternatively, generate molecules beyond human intuition.
Journal Article
Predicting the photocurrent–composition dependence in organic solar cells
Xabier Rodríguez-Martínez, Enrique Pascual-San-José, Zhuping Fei, Martin Heeney, Roger Guimerà, Mariano Campoy-Quiles
TL;DR: Training artificial intelligence algorithms with self-consistent datasets consisting of thousands of data points obtained by high-throughput evaluation methods identifies highly predictive models that only employ the materials band gaps, thus largely simplifying the rationale of the photocurrent–composition space.
Journal Article
Performance of Metal-Catalyzed Hydrodebromination of Dibromomethane Analyzed by Descriptors Derived from Statistical Learning
Ali J. Saadun, Sergio Pablo-García, Vladimir Paunović, Qiang Li, Albert Sabadell-Rendón, Kevin Kleemann, Frank Krumeich, Núria López, Javier Pérez-Ramírez
TL;DR: In this article, a combinatorial strategy for semi hydrogenation of dibromomethane (CH2Br2) to methyl bromide (CH3Br) is presented.
References
Journal Article
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Journal Article
Estimating the Dimension of a Model
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
Journal Article
Equation of state calculations by fast computing machines
TL;DR: In this article, a modified Monte Carlo integration over configuration space is used to investigate the properties of a two-dimensional rigid-sphere system with a set of interacting individual molecules, and the results are compared to free volume equations of state and a four-term virial coefficient expansion.
Posted Content
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.