Author

Tom Dhaene

Bio: Tom Dhaene is an academic researcher from Ghent University. The author has contributed to research in the topics Parametric statistics and Surrogate model. The author has an h-index of 44 and has co-authored 433 publications receiving 8,012 citations. Previous affiliations of Tom Dhaene include Eindhoven University of Technology and Agilent Technologies.


Papers
Journal ArticleDOI
TL;DR: A new visualization technique called FlowSOM is introduced, which analyzes flow or mass cytometry data using a Self-Organizing Map; two-level clustering and star charts provide a clear overview of how all markers behave on all cells and help detect subsets that might otherwise be missed.
Abstract: The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers and therefore, relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes Flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.

1,109 citations
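The FlowSOM entry above describes a two-level approach: a self-organizing map is trained on the per-cell marker intensities, and the resulting nodes are then meta-clustered into candidate cell subsets. Below is a minimal Python sketch of that idea; it is not the authors' implementation (that is the R package linked above), and the third-party minisom package, the 10x10 grid size, and the random stand-in data are assumptions for illustration.

# Hedged sketch of FlowSOM-style two-level clustering (not the authors' R code).
# Level 1: train a self-organizing map on marker intensities.
# Level 2: meta-cluster the SOM node weights to obtain candidate cell subsets.
import numpy as np
from minisom import MiniSom                      # assumed third-party package
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
cells = rng.random((5000, 12))                   # placeholder: 5000 cells x 12 markers

som = MiniSom(10, 10, cells.shape[1], sigma=1.0, learning_rate=0.5, random_seed=1)
som.train_random(cells, 10000)                   # level 1: 10x10 grid of prototype nodes

nodes = som.get_weights().reshape(-1, cells.shape[1])
meta = AgglomerativeClustering(n_clusters=10).fit_predict(nodes)   # level 2: meta-clusters

# map each cell to its winning node, then to that node's meta-cluster
node_of_cell = np.array([np.ravel_multi_index(som.winner(c), (10, 10)) for c in cells])
cell_cluster = meta[node_of_cell]
print(np.bincount(cell_cluster))                 # number of cells per meta-cluster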

Journal Article
TL;DR: This paper presents a mature, flexible, and adaptive machine learning toolkit for regression modeling and active learning to tackle issues of computational cost and model accuracy.
Abstract: An exceedingly large number of scientific and engineering fields are confronted with the need for computer simulations to study complex, real world phenomena or solve challenging design problems. However, due to the computational cost of these high fidelity simulations, the use of neural networks, kernel methods, and other surrogate modeling techniques has become indispensable. Surrogate models are compact and cheap to evaluate, and have proven very useful for tasks such as optimization, design space exploration, prototyping, and sensitivity analysis. Consequently, in many fields there is great interest in tools and techniques that facilitate the construction of such regression models, while minimizing the computational cost and maximizing model accuracy. This paper presents a mature, flexible, and adaptive machine learning toolkit for regression modeling and active learning to tackle these issues. The toolkit brings together algorithms for data fitting, model selection, sample selection (active learning), hyperparameter optimization, and distributed computing in order to empower a domain expert to efficiently generate an accurate model for the problem or data at hand.

490 citations
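As a rough illustration of the adaptive sampling idea described in this abstract, the following Python sketch fits a Gaussian-process surrogate and repeatedly samples the candidate point where the model is most uncertain. It is not the toolkit's own code; the toy simulator, candidate grid, and all names are illustrative assumptions.

# Hedged sketch of an active-learning surrogate modeling loop (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulation(x):                     # placeholder for a costly solver
    return np.sin(8 * x[:, 0]) + 0.1 * x[:, 0]

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (5, 1))                    # small initial design
y = expensive_simulation(X)

candidates = np.linspace(0, 1, 200).reshape(-1, 1)
for _ in range(15):                              # active-learning iterations
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2)).fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_new = candidates[[np.argmax(std)]]         # sample where the surrogate is least certain
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_simulation(x_new))

print("final surrogate trained on", len(X), "samples")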

Journal ArticleDOI
TL;DR: A robust approach is presented which removes the sparsity of the block-structured least-squares equations by a direct application of the QR decomposition; considerable savings in computation time and memory requirements are obtained.
Abstract: Broadband macromodeling of large multiport systems by vector fitting can be time consuming and resource demanding when all elements of the system matrix share a common set of poles. This letter presents a robust approach which removes the sparsity of the block-structured least-squares equations by a direct application of the QR decomposition. A 60-port printed circuit board example illustrates that considerable savings in terms of computation time and memory requirements are obtained.

473 citations
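The abstract above hinges on the block structure of the pole-identification equations when all matrix elements share a common pole set. The numpy sketch below shows that idea under simplifying assumptions (synthetic responses, complex-valued solve, illustrative names): a QR decomposition of each element's block keeps only the rows that couple to the shared unknowns, so the final stacked system stays small.

# Hedged sketch of QR-based compression in vector fitting pole identification (not the authors' code).
import numpy as np

rng = np.random.default_rng(0)
F, N, P = 200, 6, 4                              # frequency samples, common poles, matrix elements
s = 1j * np.linspace(1.0, 10.0, F)               # frequency samples
a = -np.linspace(1.0, 6.0, N)                    # common starting poles (real, stable)
H = rng.standard_normal((P, F)) + 1j * rng.standard_normal((P, F))   # stand-in responses

Phi = 1.0 / (s[:, None] - a[None, :])            # partial-fraction basis, F x N

R22, rhs = [], []
for i in range(P):
    Ai = np.hstack([Phi, -H[i][:, None] * Phi])  # least-squares block for element i, F x 2N
    Q, R = np.linalg.qr(Ai)                      # thin QR of the block
    R22.append(R[N:, N:])                        # rows acting only on the shared unknowns
    rhs.append((Q.conj().T @ H[i])[N:])

d = np.linalg.lstsq(np.vstack(R22), np.concatenate(rhs), rcond=None)[0]
print("shared sigma residues:", d.shape)         # one small solve instead of one huge one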

Journal ArticleDOI
TL;DR: The authors propose the efficient multiobjective optimization (EMO) algorithm which uses Kriging models and multiobjective versions of the probability of improvement and expected improvement criteria to identify the Pareto front with a minimal number of expensive simulations.
Abstract: The use of surrogate-based optimization (SBO) is widely spread in engineering design to reduce the number of computationally expensive simulations. However, "real-world" problems often consist of multiple, conflicting objectives leading to a set of competitive solutions (the Pareto front). The objectives are often aggregated into a single cost function to reduce the computational cost, though a better approach is to use multiobjective optimization methods to directly identify a set of Pareto-optimal solutions, which can be used by the designer to make more efficient design decisions (instead of weighting and aggregating the costs upfront). Most of the work in multiobjective optimization is focused on multiobjective evolutionary algorithms (MOEAs). While MOEAs are well-suited to handle large, intractable design spaces, they typically require thousands of expensive simulations, which is prohibitively expensive for the problems under study. Therefore, the use of surrogate models in multiobjective optimization, denoted as multiobjective surrogate-based optimization, may prove to be even more worthwhile than SBO methods to expedite the optimization of computationally expensive systems. In this paper, the authors propose the efficient multiobjective optimization (EMO) algorithm which uses Kriging models and multiobjective versions of the probability of improvement and expected improvement criteria to identify the Pareto front with a minimal number of expensive simulations. The EMO algorithm is applied on multiple standard benchmark problems and compared against the well-known NSGA-II, SPEA2 and SMS-EMOA multiobjective optimization methods.

197 citations
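The sketch below illustrates the flavour of such a surrogate-assisted multiobjective loop: one Gaussian-process (Kriging) model per objective and a simplified probability-of-improvement score over the current Pareto front. It is not the EMO algorithm itself; the paper's criteria integrate the predictive density over the full improvement region, whereas this stand-in only scores dominance against individual Pareto points, and the toy problem and names are assumptions.

# Hedged sketch of a Kriging-assisted multiobjective infill loop (simplified, illustrative only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objectives(x):                               # toy bi-objective problem (minimize both)
    return np.column_stack([x[:, 0] ** 2, (x[:, 0] - 1.0) ** 2])

def pareto_front(Y):                             # keep points not dominated by any other point
    keep = [not np.any(np.all(Y <= y, axis=1) & np.any(Y < y, axis=1)) for y in Y]
    return Y[np.array(keep)]

rng = np.random.default_rng(1)
X = rng.uniform(-1, 2, (8, 1))
Y = objectives(X)
candidates = np.linspace(-1, 2, 200).reshape(-1, 1)

for _ in range(10):                              # infill iterations
    models = [GaussianProcessRegressor().fit(X, Y[:, j]) for j in range(2)]
    front = pareto_front(Y)
    mu_sig = [m.predict(candidates, return_std=True) for m in models]
    poi = np.zeros(len(candidates))
    for p in front:                              # P(candidate dominates p) = prod_j P(Y_j < p_j)
        prob = np.ones(len(candidates))
        for j, (mu, sig) in enumerate(mu_sig):
            prob *= norm.cdf((p[j] - mu) / np.maximum(sig, 1e-9))
        poi = np.maximum(poi, prob)
    x_new = candidates[[np.argmax(poi)]]         # most promising candidate
    X = np.vstack([X, x_new])
    Y = np.vstack([Y, objectives(x_new)])

print("Pareto points found:", len(pareto_front(Y)))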

Journal ArticleDOI
TL;DR: The Orthonormal Vector Fitting (OVF) technique is presented, which uses orthonormal rational functions to improve the numerical stability of the method; this significantly reduces the sensitivity of the system equations to the choice of starting poles and limits the overall macromodeling time.
Abstract: Vector Fitting is widely accepted as a robust macromodeling tool for approximating frequency domain responses of complex physical structures. In this paper, the Orthonormal Vector Fitting technique is presented, which uses orthonormal rational functions to improve the numerical stability of the method. This reduces the numerical sensitivity of the system equations to the choice of starting poles significantly and limits the overall macromodeling time.

192 citations
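For context, the sketch below runs the ingredient that Orthonormal Vector Fitting builds on: the classical vector fitting pole-relocation iteration with a plain partial-fraction basis. OVF's actual contribution, the orthonormal rational basis, is not reproduced here; the synthetic two-pole response and all names are illustrative assumptions.

# Hedged sketch of one classical vector fitting iteration for a scalar response (illustrative only).
import numpy as np

s = 1j * np.linspace(1.0, 100.0, 400)            # frequency samples
H = 2.0 / (s + 3.0) + 5.0 / (s + 20.0)           # "measured" response with known poles -3, -20
a = np.array([-1.0, -10.0])                      # starting poles

for _ in range(10):                              # pole-relocation iterations
    Phi = 1.0 / (s[:, None] - a[None, :])        # partial-fraction basis
    # unknowns: residues c of the H-fit and residues d of sigma(s) = 1 + sum d_k / (s - a_k)
    A = np.hstack([Phi, -H[:, None] * Phi])
    A_ri = np.vstack([A.real, A.imag])           # stack real/imag parts for real unknowns
    b_ri = np.concatenate([H.real, H.imag])
    x = np.linalg.lstsq(A_ri, b_ri, rcond=None)[0]
    d = x[len(a):]
    # zeros of sigma become the relocated poles: eigenvalues of diag(a) - ones * d^T
    a = np.linalg.eigvals(np.diag(a) - np.outer(np.ones_like(a), d))

print("recovered poles:", a)                     # should approach the true poles -3 and -20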


Cited by
Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Journal ArticleDOI

6,278 citations

01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations
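A compact Python sketch of the basic self-organizing map update that such an overview covers (illustrative only, not Kohonen's reference implementation; the grid size, decay schedules, and data below are assumptions):

# Hedged sketch of self-organizing map training: each input pulls its best-matching
# node and that node's grid neighbours towards it.
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((1000, 3))                     # e.g. RGB colours to organize
grid = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
W = rng.random((100, 3))                         # node weights on a 10x10 grid

for t, x in enumerate(data):
    lr = 0.5 * np.exp(-t / 500)                  # decaying learning rate
    radius = 3.0 * np.exp(-t / 500)              # decaying neighbourhood radius
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best-matching unit
    dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)  # grid distance to the BMU
    h = np.exp(-dist2 / (2 * radius ** 2))       # Gaussian neighbourhood function
    W += lr * h[:, None] * (x - W)               # pull nodes towards the input

print("trained", len(W), "nodes")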

Journal Article
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!

1,992 citations
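The effect described above is easy to reproduce qualitatively. The short Python sketch below draws i.i.d. uniform data and shows the farthest-to-nearest distance ratio from a random query point collapsing towards 1 as dimensionality grows (an illustrative experiment, not the paper's exact workloads).

# Hedged sketch of distance concentration in high dimensions.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 15, 100):
    data = rng.random((10000, d))                # i.i.d. uniform points
    query = rng.random(d)                        # random query point
    dist = np.linalg.norm(data - query, axis=1)
    print(f"d={d:>3}: farthest/nearest distance ratio = {dist.max() / dist.min():.2f}")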