Author

Vincent Dubourg

Bio: Vincent Dubourg is an academic researcher from International Facility Management Association. The author has contributed to research in topics: Kriging & Reliability (statistics). The author has an h-index of 8 and has co-authored 16 publications receiving 63374 citations.

Papers
Journal Article
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Posted Content
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.
Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

28,898 citations

Journal ArticleDOI
TL;DR: In this paper, the authors propose to use a Kriging surrogate for the performance function as a means to build a quasi-optimal importance sampling density, which can be applied to analytical and finite element reliability problems and proves efficient up to 100 basic random variables.

389 citations
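
The importance-sampling idea in the entry above can be illustrated with a minimal sketch. It does not build a Kriging surrogate: the sampling density is simply a Gaussian shifted towards the failure region, and the one-dimensional limit-state function `limit_state`, the `shift` parameter, and the sample size are all hypothetical choices made for illustration only.

```python
import math
import random


def limit_state(x: float) -> float:
    """Hypothetical performance function: failure occurs when g(x) <= 0."""
    return 3.0 - x


def importance_sampling_pf(n_samples: int, shift: float, seed: int = 0) -> float:
    """Estimate P[g(X) <= 0] for X ~ N(0, 1) by sampling from N(shift, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)
        if limit_state(x) <= 0.0:
            # Likelihood ratio p(x) / q(x) of N(0, 1) over N(shift, 1).
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def exact_pf() -> float:
    """Closed-form reference for this toy case: P[X >= 3], X standard normal."""
    return 0.5 * math.erfc(3.0 / math.sqrt(2.0))


if __name__ == "__main__":
    print("importance sampling:", importance_sampling_pf(20_000, shift=3.0))
    print("exact value:        ", exact_pf())
```

Centering the sampling density near the failure boundary means most samples fall in the failure region, so far fewer evaluations of the limit-state function are needed than with sampling from the original distribution.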

Journal ArticleDOI
TL;DR: The aim of the present paper is to develop a strategy for solving reliability-based design optimization (RBDO) problems that remains applicable when the performance models are expensive to evaluate.
Abstract: The aim of the present paper is to develop a strategy for solving reliability-based design optimization (RBDO) problems that remains applicable when the performance models are expensive to evaluate. Starting with the premise that simulation-based approaches are not affordable for such problems, and that the most-probable-failure-point-based approaches do not permit to quantify the error on the estimation of the failure probability, an approach based on both metamodels and advanced simulation techniques is explored. The kriging metamodeling technique is chosen in order to surrogate the performance functions because it allows one to genuinely quantify the surrogate error. The surrogate error onto the limit-state surfaces is propagated to the failure probabilities estimates in order to provide an empirical error measure. This error is then sequentially reduced by means of a population-based adaptive refinement technique until the kriging surrogates are accurate enough for reliability analysis. This original refinement strategy makes it possible to add several observations in the design of experiments at the same time. Reliability and reliability sensitivity analyses are performed by means of the subset simulation technique for the sake of numerical efficiency. The adaptive surrogate-based strategy for reliability estimation is finally involved into a classical gradient-based optimization algorithm in order to solve the RBDO problem. The kriging surrogates are built in a so-called augmented reliability space thus making them reusable from one nested RBDO iteration to the other. The strategy is compared to other approaches available in the literature on three academic examples in the field of structural mechanics.

354 citations

Dissertation
05 Dec 2011
TL;DR: This thesis addresses reliability-based design optimization, a probabilistic design approach that accounts for the uncertainty attached to the system of interest in order to provide optimal and safe solutions, and proposes a surrogate-based strategy in which the limit-state function is progressively replaced by a Kriging meta-model.
Abstract: This thesis is a contribution to the resolution of the reliability-based design optimization problem. This probabilistic design approach is aimed at considering the uncertainty attached to the system of interest in order to provide optimal and safe solutions. The safety level is quantified in the form of a probability of failure. Then, the optimization problem consists in ensuring that this failure probability remains less than a threshold specified by the stakeholders. The resolution of this problem requires a high number of calls to the limit-state design function underlying the reliability analysis. Hence it becomes cumbersome when the limit-state function involves an expensive-to-evaluate numerical model (e.g. a finite element model). In this context, this manuscript proposes a surrogate-based strategy where the limit-state function is progressively replaced by a Kriging meta-model. A special interest has been given to quantifying, reducing and eventually eliminating the error introduced by the use of this meta-model instead of the original model. The proposed methodology is applied to the design of geometrically imperfect shells prone to buckling.

136 citations
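
To make the failure-probability constraint in the abstract above concrete, here is a minimal sketch of a crude Monte Carlo estimate checked against a threshold. The limit-state function, the standard normal load, the capacity of 3.0, and the threshold of 0.01 are hypothetical stand-ins; the thesis itself targets expensive numerical models, for which this many calls would not be affordable.

```python
import random
from typing import Tuple


def limit_state(load: float, capacity: float = 3.0) -> float:
    """Hypothetical limit-state function: failure occurs when g <= 0."""
    return capacity - load


def crude_monte_carlo_pf(n_samples: int, seed: int = 0) -> Tuple[float, float]:
    """Return (failure probability estimate, its standard error)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        load = rng.gauss(0.0, 1.0)  # assumed standard normal load
        if limit_state(load) <= 0.0:
            failures += 1
    pf = failures / n_samples
    std_err = (pf * (1.0 - pf) / n_samples) ** 0.5
    return pf, std_err


def satisfies_constraint(pf: float, threshold: float = 0.01) -> bool:
    """Reliability constraint: the failure probability must not exceed the threshold."""
    return pf <= threshold


if __name__ == "__main__":
    pf, err = crude_monte_carlo_pf(200_000)
    print(f"pf = {pf:.5f} +/- {err:.5f}, feasible: {satisfies_constraint(pf)}")
```

Each sample costs one call to the limit-state function, and small failure probabilities need many samples for a usable standard error, which is the cost the surrogate-based strategy is meant to avoid.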


Cited by
Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost is a scalable end-to-end tree boosting system that introduces a sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and is widely used to achieve state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

13,333 citations

Journal ArticleDOI
TL;DR: SciPy is an open source scientific computing library for the Python programming language; it includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.
Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.

12,774 citations

Proceedings ArticleDOI
13 Aug 2016
TL;DR: The authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around the prediction, together with a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framed as a submodular optimization problem.
Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

11,104 citations
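
The idea of learning an interpretable model locally around a prediction, described in the abstract above, can be sketched in one dimension. This is not the LIME library or the authors' algorithm: the black-box function, the Gaussian perturbations, the exponential proximity kernel, and the `kernel_width` value are all assumptions chosen to keep the example small.

```python
import math
import random
from typing import Tuple


def black_box(x: float) -> float:
    """Hypothetical opaque model whose behaviour we want to explain."""
    return x * x


def local_linear_explanation(
    x0: float, n_samples: int = 2000, kernel_width: float = 0.5, seed: int = 0
) -> Tuple[float, float]:
    """Fit a proximity-weighted line to the black box around x0.

    Returns (slope, intercept) of the local surrogate.
    """
    rng = random.Random(seed)
    # Perturb the instance and query the black box on each perturbation.
    zs = [rng.gauss(x0, 1.0) for _ in range(n_samples)]
    ys = [black_box(z) for z in zs]
    # Weight each perturbation by its proximity to x0.
    ws = [math.exp(-((z - x0) ** 2) / (2.0 * kernel_width ** 2)) for z in zs]

    # Closed-form weighted least squares for a single feature.
    w_sum = sum(ws)
    z_mean = sum(w * z for w, z in zip(ws, zs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (z - z_mean) * (y - y_mean) for w, z, y in zip(ws, zs, ys))
    var = sum(w * (z - z_mean) ** 2 for w, z in zip(ws, zs))
    slope = cov / var
    intercept = y_mean - slope * z_mean
    return slope, intercept


if __name__ == "__main__":
    slope, intercept = local_linear_explanation(2.0)
    print(f"local surrogate near x0=2: y = {slope:.2f} * x + {intercept:.2f}")
```

For this toy black box the fitted slope should come out close to the true local derivative at `x0`, which is the sense in which the linear surrogate is faithful only locally.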