Author

Paul H. C. Eilers

Bio: Paul H. C. Eilers is an academic researcher from Erasmus University Rotterdam. The author has contributed to research in topics including smoothing and mixed models. The author has an h-index of 56 and has co-authored 212 publications receiving 15,481 citations. Previous affiliations of Paul H. C. Eilers include Rijkswaterstaat and Wageningen University and Research Centre.


Papers
Journal ArticleDOI
TL;DR: This paper proposes using a relatively large number of knots together with a difference penalty on the coefficients of adjacent B-splines, and shows connections to the familiar spline penalty on the integral of the squared second derivative.
Abstract: B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We propose to use a relatively large number of knots and a difference penalty on coefficients of adjacent B-splines. We show connections to the familiar spline penalty on the integral of the squared second derivative. A short overview of B-splines, of their construction and of penalized likelihood is presented. We discuss properties of penalized B-splines and propose various criteria for the choice of an optimal penalty parameter. Nonparametric logistic regression, density estimation and scatterplot smoothing are used as examples. Some details of the computations are presented.
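The penalized B-spline ("P-spline") idea described in the abstract can be sketched in a few lines. The following is an illustrative Python translation, assuming SciPy's `BSpline` for the basis; the function names and default parameters here are ours, not the authors'.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, xl, xr, nseg, deg=3):
    # Equidistant knots, extended past [xl, xr] so every x is covered.
    knots = np.linspace(xl - deg * (xr - xl) / nseg,
                        xr + deg * (xr - xl) / nseg,
                        nseg + 2 * deg + 1)
    n = len(knots) - deg - 1
    B = np.empty((len(x), n))
    for j in range(n):
        c = np.zeros(n)
        c[j] = 1.0
        B[:, j] = BSpline(knots, c, deg)(x)
    return B

def pspline_fit(x, y, nseg=20, deg=3, lam=0.1, pord=2):
    # Penalized least squares: minimize ||y - B a||^2 + lam * ||D a||^2,
    # with D the pord-th order difference matrix on the coefficients.
    B = bspline_basis(x, x.min(), x.max(), nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    a = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ a

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(100)
yhat = pspline_fit(x, y)
```

Note how the difference penalty replaces careful knot placement: many equidistant knots give flexibility, and `lam` alone controls smoothness.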

3,512 citations

Journal ArticleDOI
TL;DR: This paper presents a smoother based on penalized least squares, extending ideas presented by Whittaker 80 years ago, which is extremely fast, gives continuous control over smoothness, interpolates automatically, and allows fast leave-one-out cross-validation.
Abstract: The well-known and popular Savitzky−Golay filter has several disadvantages. A very attractive alternative is a smoother based on penalized least squares, extending ideas presented by Whittaker 80 years ago. This smoother is extremely fast, gives continuous control over smoothness, interpolates automatically, and allows fast leave-one-out cross-validation. It can be programmed in a few lines of Matlab code. Theory, implementation, and applications are presented.
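The paper's version is a few lines of Matlab; a rough Python equivalent (our sketch, not the published code) solves the penalized least-squares system (I + λ DᵀD) z = y with sparse matrices:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def whittaker_smooth(y, lam=100.0, d=2):
    # Solve (I + lam * D'D) z = y, where D is the d-th order difference matrix.
    m = len(y)
    D = sparse.eye(m, format="csr")
    for _ in range(d):
        D = D[1:] - D[:-1]
    A = sparse.eye(m, format="csr") + lam * (D.T @ D)
    return spsolve(A.tocsc(), y)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 500)
y = np.sin(t) + 0.2 * rng.standard_normal(500)
z = whittaker_smooth(y, lam=1000.0)
```

Because A is banded, the solve is linear-time in the signal length; the leave-one-out cross-validation the abstract mentions can reuse the same system for different λ, though λ is fixed here for brevity.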

998 citations

Journal ArticleDOI
TL;DR: The steady-state properties of a dynamic model of the relationship between light intensity and the rate of photosynthesis in phytoplankton are studied, and a production curve is derived from the model that makes it possible to treat the effect of temperature in a mechanistic way.

904 citations

Journal ArticleDOI
TL;DR: A parametric model is proposed for the warping function when aligning chromatograms, allowing batches of chromatograms to be aligned using warping functions fitted to a limited number of calibration samples.
Abstract: A parametric model is proposed for the warping function when aligning chromatograms. A very fast and stable algorithm results that consumes little memory and avoids the artifacts of dynamic time warping. The parameters of the warping function are useful for quality control. They also are easily interpolated, allowing alignment of batches of chromatograms based on warping functions for a limited number of calibration samples.
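The core idea — warp the time axis with a low-order polynomial rather than the point-by-point path of dynamic time warping — can be illustrated minimally as follows. This is our sketch, not the paper's algorithm: the function names are invented, and the exhaustive search over a single shift term stands in for the paper's fitting of the full polynomial.

```python
import numpy as np

def warp_signal(signal, coeffs):
    # Evaluate the signal on a polynomially warped time axis.
    t = np.arange(len(signal), dtype=float)
    w = np.polyval(coeffs, t)          # e.g. [1.0, a0] gives w(t) = t + a0
    return np.interp(w, t, signal)

# Reference peak and a late-arriving copy of it.
t = np.arange(200, dtype=float)
ref = np.exp(-0.5 * ((t - 100) / 5.0) ** 2)
sample = np.exp(-0.5 * ((t - 110) / 5.0) ** 2)   # peak arrives 10 points late

# Tiny grid search over the shift term of a linear warp w(t) = t + a0,
# maximizing overlap with the reference.
shifts = np.arange(-15, 16)
best = max(shifts, key=lambda a0: np.dot(warp_signal(sample, [1.0, a0]), ref))
aligned = warp_signal(sample, [1.0, best])
```

With a handful of polynomial coefficients instead of a free warping path, the fit is fast, memory-light, and the coefficients themselves are interpretable quality-control quantities, as the abstract notes.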

572 citations

Journal ArticleDOI
TL;DR: The data provide compelling evidence that expression profiling is a more accurate and objective method to classify gliomas than histologic classification, and molecular classification therefore may aid diagnosis and can guide clinical decision making.
Abstract: Gliomas are the most common primary brain tumors with heterogeneous morphology and variable prognosis. Treatment decisions in patients rely mainly on histologic classification and clinical parameters. However, differences between histologic subclasses and grades are subtle, and classifying gliomas is subject to a large interobserver variability. To improve current classification standards, we have performed gene expression profiling on a large cohort of glioma samples of all histologic subtypes and grades. We identified seven distinct molecular subgroups that correlate with survival. These include two favorable prognostic subgroups (median survival, >4.7 years), two with intermediate prognosis (median survival, 1-4 years), two with poor prognosis (median survival, <1 year), and one control group. The intrinsic molecular subtypes of glioma are different from histologic subgroups and correlate better to patient survival. The prognostic value of molecular subgroups was validated on five independent sample cohorts (The Cancer Genome Atlas, Repository for Molecular Brain Neoplasia Data, GSE12907, GSE4271, and Li and colleagues). The power of intrinsic subtyping is shown by its ability to identify a subset of prognostically favorable tumors within an external data set that contains only histologically confirmed glioblastomas (GBM). Specific genetic changes (epidermal growth factor receptor amplification, IDH1 mutation, and 1p/19q loss of heterozygosity) segregate in distinct molecular subgroups. We identified a subgroup with molecular features associated with secondary GBM, suggesting that different genetic changes drive gene expression profiles. Finally, we assessed response to treatment in molecular subgroups. Our data provide compelling evidence that expression profiling is a more accurate and objective method to classify gliomas than histologic classification. Molecular classification therefore may aid diagnosis and can guide clinical decision making.

558 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and methods for combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
TL;DR: For the next few weeks the course explores a field that is actually older than classical population genetics, although the approach taken to it uses population-genetic machinery.
Abstract: So far in this course we have dealt entirely with the evolution of characters that are controlled by simple Mendelian inheritance at a single locus. There are notes on the course website about gametic disequilibrium and how allele frequencies change at two loci simultaneously, but we didn’t discuss them. In every example we’ve considered we’ve imagined that we could understand something about evolution by examining the evolution of a single gene. That’s the domain of classical population genetics. For the next few weeks we’re going to be exploring a field that’s actually older than classical population genetics, although the approach we’ll be taking to it involves the use of population genetic machinery. If you know a little about the history of evolutionary biology, you may know that after the rediscovery of Mendel’s work in 1900 there was a heated debate between the “biometricians” (e.g., Galton and Pearson) and the “Mendelians” (e.g., de Vries, Correns, Bateson, and Morgan). Biometricians asserted that the really important variation in evolution didn’t follow Mendelian rules. Height, weight, skin color, and similar traits seemed to

9,847 citations

Journal ArticleDOI
TL;DR: The Central Brain Tumor Registry of the United States (CBTRUS), in collaboration with the Centers for Disease Control and Prevention and National Cancer Institute, is the largest population-based registry focused exclusively on primary brain and other central nervous system (CNS) tumors in the US.
Abstract: The Central Brain Tumor Registry of the United States (CBTRUS), in collaboration with the Centers for Disease Control (CDC) and National Cancer Institute (NCI), is the largest population-based registry focused exclusively on primary brain and other central nervous system (CNS) tumors in the United States (US) and represents the entire US population. This report contains the most up-to-date population-based data on primary brain tumors (malignant and non-malignant) and supersedes all previous CBTRUS reports in terms of completeness and accuracy. All rates (incidence and mortality) are age-adjusted using the 2000 US standard population and presented per 100,000 population. The average annual age-adjusted incidence rate (AAAIR) of all malignant and non-malignant brain and other CNS tumors was 23.79 (Malignant AAAIR=7.08, non-Malignant AAAIR=16.71). This rate was higher in females compared to males (26.31 versus 21.09), Blacks compared to Whites (23.88 versus 23.83), and non-Hispanics compared to Hispanics (24.23 versus 21.48). The most commonly occurring malignant brain and other CNS tumor was glioblastoma (14.5% of all tumors), and the most common non-malignant tumor was meningioma (38.3% of all tumors). Glioblastoma was more common in males, and meningioma was more common in females. In children and adolescents (age 0-19 years), the incidence rate of all primary brain and other CNS tumors was 6.14. An estimated 83,830 new cases of malignant and non-malignant brain and other CNS tumors are expected to be diagnosed in the US in 2020 (24,970 malignant and 58,860 non-malignant). There were 81,246 deaths attributed to malignant brain and other CNS tumors between 2013 and 2017. This represents an average annual mortality rate of 4.42. The 5-year relative survival rate following diagnosis of a malignant brain and other CNS tumor was 23.5% and for a non-malignant brain and other CNS tumor was 82.4%.

9,802 citations

Journal ArticleDOI
TL;DR: VOSviewer’s ability to handle large maps is demonstrated by using the program to construct and display a co-citation map of 5,000 major scientific journals.
Abstract: We present VOSviewer, a freely available computer program that we have developed for constructing and viewing bibliometric maps. Unlike most computer programs that are used for bibliometric mapping, VOSviewer pays special attention to the graphical representation of bibliometric maps. The functionality of VOSviewer is especially useful for displaying large bibliometric maps in an easy-to-interpret way. The paper consists of three parts. In the first part, an overview of VOSviewer’s functionality for displaying bibliometric maps is provided. In the second part, the technical implementation of specific parts of the program is discussed. Finally, in the third part, VOSviewer’s ability to handle large maps is demonstrated by using the program to construct and display a co-citation map of 5,000 major scientific journals.

7,719 citations

Journal ArticleDOI
Simon N. Wood
TL;DR: In this article, a Laplace approximation is used to obtain an approximate restricted maximum likelihood (REML) or marginal likelihood (ML) for smoothing parameter selection in semiparametric regression.
Abstract: Summary. Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selection in semiparametric regression. However, existing REML or marginal likelihood (ML) based methods for semiparametric generalized linear models (GLMs) use iterative REML or ML estimation of the smoothing parameters of working linear approximations to the GLM. Such indirect schemes need not converge and fail to do so in a non-negligible proportion of practical analyses. By contrast, very reliable prediction error criteria smoothing parameter selection methods are available, based on direct optimization of GCV, or related criteria, for the GLM itself. Since such methods directly optimize properly defined functions of the smoothing parameters, they have much more reliable convergence properties. The paper develops the first such method for REML or ML estimation of smoothing parameters. A Laplace approximation is used to obtain an approximate REML or ML for any GLM, which is suitable for efficient direct optimization. This REML or ML criterion requires that Newton–Raphson iteration, rather than Fisher scoring, be used for GLM fitting, and a computationally stable approach to this is proposed. The REML or ML criterion itself is optimized by a Newton method, with the derivatives required obtained by a mixture of implicit differentiation and direct methods. The method will cope with numerical rank deficiency in the fitted model and in fact provides a slight improvement in numerical robustness on the earlier method of Wood for prediction error criteria based smoothness selection. 
Simulation results suggest that the new REML and ML methods offer some improvement in mean-square error performance relative to GCV or Akaike's information criterion in most cases, without the small number of severe undersmoothing failures to which Akaike's information criterion and GCV are prone. This is achieved at the same computational cost as GCV or Akaike's information criterion. The new approach also eliminates the convergence failures of previous REML- or ML-based approaches for penalized GLMs and usually has lower computational cost than these alternatives. Example applications are presented in adaptive smoothing, scalar on function regression and generalized additive model selection.
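For contrast with the prediction-error route this paper improves on, selection of a smoothing parameter by GCV for a simple penalized least-squares smoother can be sketched as follows. This is illustrative only: Wood's method optimizes a Laplace-approximate REML/ML criterion rather than GCV, and the function names and grid here are ours.

```python
import numpy as np

def gcv_score(y, lam, d=2):
    # GCV(lam) = m * RSS / (m - tr(H))^2 for the smoother z = H y,
    # H = (I + lam * D'D)^{-1}, D the d-th order difference matrix.
    m = len(y)
    D = np.diff(np.eye(m), n=d, axis=0)
    H = np.linalg.inv(np.eye(m) + lam * D.T @ D)
    z = H @ y
    rss = np.sum((y - z) ** 2)
    edf = np.trace(H)                  # effective degrees of freedom
    return m * rss / (m - edf) ** 2

rng = np.random.default_rng(2)
y = np.sin(np.linspace(0.0, 2.0 * np.pi, 80)) + 0.3 * rng.standard_normal(80)
lams = 10.0 ** np.arange(-2, 6)
best_lam = min(lams, key=lambda lam: gcv_score(y, lam))
```

Because GCV is a properly defined function of λ, direct optimization like this converges reliably — the property the paper extends to REML/ML criteria for generalized models.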

4,846 citations