Showing papers in "Statistica Neerlandica in 2010"


Journal ArticleDOI
TL;DR: In this article, the pseudobalanced estimation method, ML(R), and two diagonally weighted least squares (DWLS) methods are compared by simulating a multilevel factor model with unbalanced data.
Abstract: Multilevel structural equation modeling (multilevel SEM) has become an established method to analyze multilevel multivariate data. The first useful estimation method was the pseudobalanced method. This method is approximate because it assumes that all groups have the same size, and ignores unbalance when it exists. In addition, full information maximum likelihood (ML) estimation is now available, which is often combined with robust chi-squares and standard errors to accommodate unmodeled heterogeneity (MLR). In addition, diagonally weighted least squares (DWLS) methods have become available as estimation methods. This article compares the pseudobalanced estimation method, ML(R), and two DWLS methods by simulating a multilevel factor model with unbalanced data. The simulations included different sample sizes at the individual and group levels and different intraclass correlation (ICC). The within-group part of the model posed no problems. In the between part of the model, the different ICC sizes had no effect. There is a clear interaction effect between number of groups and estimation method. ML reaches unbiasedness fastest, then the two DWLS methods, then MLR, and then the pseudobalanced method (which needs more than 200 groups). We conclude that both ML(R) and DWLS are genuine improvements on the pseudobalanced approximation. With small sample sizes, the robust methods are not recommended.

233 citations
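The role of unbalanced group sizes and the intraclass correlation (ICC) can be illustrated with a much simpler setting than the multilevel factor model studied in the paper. The following minimal sketch assumes a single observed variable with a random group intercept and total variance 1, and uses the classical one-way ANOVA estimator of the ICC with an effective group size for unbalanced data. The function names and the simulation design are hypothetical and are not taken from the article.

```python
import random


def simulate_two_level(group_sizes, icc, seed=0):
    """Simulate y_ij = u_j + e_ij with total variance 1 and the given ICC."""
    rng = random.Random(seed)
    sd_between = icc ** 0.5
    sd_within = (1.0 - icc) ** 0.5
    groups = []
    for n_j in group_sizes:
        u_j = rng.gauss(0.0, sd_between)  # group-level random intercept
        groups.append([u_j + rng.gauss(0.0, sd_within) for _ in range(n_j)])
    return groups


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC that allows unequal group sizes."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective group size; equals the common size when the data are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    var_between = (ms_between - ms_within) / n0
    return var_between / (var_between + ms_within)


# Unbalanced design: 400 groups with sizes between 5 and 25 (illustrative values).
sizes = [5 + (j % 21) for j in range(400)]
print(round(anova_icc(simulate_two_level(sizes, icc=0.3)), 3))
```

A pseudobalanced analysis would instead treat every group as if it had one common size, which is the approximation the abstract describes.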


Journal ArticleDOI
TL;DR: This paper gives a short survey of limit theorems for certain functionals of semimartingales observed at high frequency, and explains the main ideas of the theory to a broader audience.
Abstract: This paper presents a short survey on limit theorems for certain functionals of semimartingales, which are observed at high frequency. Our aim is to explain the main ideas of the theory to a broader audience. We introduce the concept of stable convergence, which is crucial for our purpose. We show some laws of large numbers (for the continuous and the discontinuous case) that are the most interesting from a practical point of view, and demonstrate the associated stable central limit theorems. Moreover, we give a simple sketch of the proofs and some examples.

93 citations


Journal ArticleDOI
TL;DR: In this paper, an estimator of the Lévy-Khinchine characteristics of a Lévy process observed at time points of distance Δ until time T is constructed, and optimal rates of convergence are derived simultaneously in T and Δ, covering the usual low- and high-frequency assumptions as well as the mid-frequency regime.
Abstract: A Lévy process is observed at time points of distance Δ until time T. We construct an estimator of the Lévy-Khinchine characteristics of the process and derive optimal rates of convergence simultaneously in T and Δ. Thereby, we encompass the usual low- and high-frequency assumptions and also obtain asymptotics in the mid-frequency regime.

90 citations


Journal ArticleDOI
TL;DR: By incorporating surgeon factors, the accuracy of out‐of‐sample prediction of case durations of about 1250 surgical operations in 2009 is improved by up to more than 15% compared with current planning procedures.
Abstract: Accurate prediction of medical operation times is of crucial importance for cost-efficient operation room planning in hospitals. This paper investigates the possible dependence of procedure times on surgeon factors like age, experience, gender and team composition. The effect of these factors is estimated for over 30 different types of medical operations in two hospitals, by means of ANOVA models for logarithmic case durations. The estimation data set contains about 30,000 observations from 2005 to 2008. The relevance of surgeon factors depends on the type of operation. The factors found most often to be significant are team composition, experience and time of the day. Contrary to widespread opinions among surgeons, gender almost never has a significant effect. By incorporating surgeon factors, the accuracy of out-of-sample prediction of case durations of about 1250 surgical operations in 2009 is improved by up to more than 15% compared with current planning procedures.

61 citations


Journal ArticleDOI
TL;DR: This paper considers non-parametric maximum likelihood and least squares estimators of a k-monotone density $g_0$, proves existence of the estimators and gives characterizations.
Abstract: Shape constrained densities are encountered in many nonparametric estimation problems. The classes of monotone or convex (and monotone) densities can be viewed as special cases of the classes of k-monotone densities. A density g is said to be k-monotone if and only if $(-1)^l g^{(l)}$ is nonnegative, nonincreasing and convex for $l = 0, \ldots, k-2$ if $k \ge 2$, and g is simply nonincreasing if $k = 1$. These classes of shape constrained densities bridge the gap between the classes of monotone (1-monotone) and convex decreasing (2-monotone) densities for which asymptotic results are known, and the class of completely monotone ($\infty$-monotone) densities. It is well-known that a density is completely monotone if and only if it is a scale mixture of exponential densities (Bernstein’s theorem). Thus one motivation for studying the problem of estimation of a k-monotone density is to try to gain insight into the problem of estimating a completely monotone density. In this series of four papers we consider both (nonparametric) Maximum Likelihood estimators and Least Squares estimators of a k-monotone density. In this first part (part 1), we prove existence of the estimators and give careful characterizations. We also establish consistency properties, and show that the estimators are splines of order k (degree $k-1$) with simple knots. We further provide asymptotic minimax risk lower

53 citations


Journal ArticleDOI
TL;DR: In this paper, a time-dependent criterion for comparing the residual lifetimes of two systems is introduced that takes the age of the systems into account and gives, at time t, the probability that the residual lifetime X_t is greater than the residual lifetime Y_t.
Abstract: Let the random variables X and Y denote the lifetimes of two systems. In reliability theory, there are several approaches to comparing the lifetimes X and Y. Among the most popular are comparisons of the survival functions, the failure rates and the mean residual lifetime functions of X and Y. Assume that both systems are operating at time t > 0. Then their residual lifetimes are X_t = (X − t | X > t) and Y_t = (Y − t | Y > t), respectively. In this paper, we introduce a time-dependent criterion for comparing these residual lifetimes that takes the age of the systems into account. In other words, we concentrate on the function R(t) := P(X_t > Y_t), which gives, at time t, the probability that the residual lifetime X_t is greater than the residual lifetime Y_t. Brown and Rutemiller (IEEE Transactions on Reliability, 22, 1973) mention that probabilities of the type R(t) are important for designing as long-lived a product as possible. Several properties of R(t) and its connection with well-known reliability measures are investigated. The estimation of R(t) based on samples from X and Y is also discussed.

43 citations
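One simple way to estimate R(t) from data can be sketched as follows. The sketch assumes that X and Y are independent and that a complete (uncensored) sample from each is available; neither assumption is stated in the abstract, and the estimator shown is a plain empirical proportion rather than necessarily the one studied in the paper. Because both residual lifetimes subtract the same t, the event that the residual lifetime of X exceeds that of Y reduces to X > Y among the units that survive past t.

```python
import bisect
import random


def estimate_R(x_sample, y_sample, t):
    """Empirical estimate of R(t) = P(X - t > Y - t | X > t, Y > t).

    Assumes independent, fully observed samples (an assumption of this sketch).
    """
    xs = [x for x in x_sample if x > t]
    ys = sorted(y for y in y_sample if y > t)
    if not xs or not ys:
        raise ValueError("no units surviving past t in at least one sample")
    # For each surviving x, count the surviving y values strictly below it.
    wins = sum(bisect.bisect_left(ys, x) for x in xs)
    return wins / (len(xs) * len(ys))


# Illustration with exponential lifetimes (rates 1 and 2, chosen arbitrarily).
# By the memoryless property R(t) is then constant in t and equals 2/3.
rng = random.Random(1)
x_sample = [rng.expovariate(1.0) for _ in range(20000)]
y_sample = [rng.expovariate(2.0) for _ in range(20000)]
for t in (0.0, 0.5):
    print(t, round(estimate_R(x_sample, y_sample, t), 3))
```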


Journal ArticleDOI
TL;DR: This article gives an overview of recent research on the performance evaluation and design of carousel systems: it discusses picking strategies for problems involving one carousel, considers the throughput of the system for problems involving two carousels, and presents an extensive literature review.
Abstract: This paper gives an overview of recent research on the performance evaluation and design of carousel systems. We discuss picking strategies for problems involving one carousel, consider the throughput of the system for problems involving two carousels, give an overview of related problems in this area and present an extensive literature review. Emphasis is placed on future research directions in this area.

37 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the conditional mode function when the covariates take values in some abstract function space, and establish the almost complete convergence and the asymptotic normality of the kernel estimator of the conditional mode when the process is assumed to be strongly mixing and under the concentration property over the functional regressors.
Abstract: We consider the estimation of the conditional mode function when the covariates take values in some abstract function space. The main goal of this paper is to establish the almost complete convergence and the asymptotic normality of the kernel estimator of the conditional mode when the process is assumed to be strongly mixing and under the concentration property over the functional regressors. Some applications are given. This approach can be applied in time-series analysis to prediction and to the construction of confidence bands. We illustrate our methodology by using El Niño data.

37 citations


Journal ArticleDOI
TL;DR: In this article, a new bivariate generalized Poisson distribution (GPD) that allows any type of correlation is defined and studied, and the marginal distributions of the bivariate model are the univariate GPDs.
Abstract: In this paper, a new bivariate generalized Poisson distribution (GPD) that allows any type of correlation is defined and studied. The marginal distributions of the bivariate model are the univariate GPDs. The parameters of the bivariate distribution are estimated by using the moment and maximum likelihood methods. Some test statistics are discussed and one numerical data set is used to illustrate the applications of the bivariate model.

34 citations


Journal ArticleDOI
TL;DR: In this article, a general asymptotic theory for the estimation of strictly stationary and ergodic time series models is developed, under simple conditions that are straightforward to check.
Abstract: This paper develops a general asymptotic theory for the estimation of strictly stationary and ergodic time-series models. Under simple conditions that are straightforward to check, we establish the strong consistency, the rate of strong convergence and the asymptotic normality of a general class of estimators that includes LSE, MLE and some M-type estimators. As an application, we verify the assumptions for the long-memory fractional ARIMA model. Other examples include the GARCH(1,1) model, random coefficient AR(1) model and the threshold MA(1) model.

30 citations


Journal ArticleDOI
TL;DR: In this paper, the effects of the introduction of 5-year impact factors were investigated for all disciplines available in the Journal Citation Reports, and the results showed that the statistics discipline indeed benefits from the 5 year window, that is, the impact factor increases.
Abstract: In this paper, we investigate the effects of the introduction of 5-year impact factors. We collect impact factor data for all disciplines available in the Journal Citation Reports. For all these categories, we give insights into the relationship between the traditional 2-year impact factor and the new 5-year impact factor. Our main focus is to investigate whether the traditionally low impact factors for statistics journals improve with this new measure. The results show that the statistics discipline indeed benefits from the 5-year window, that is, the impact factor increases. This appears to be true for most disciplines, although the statistics discipline ranks among the top 15 (out of 171) disciplines in this respect.
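The difference between the two measures can be made concrete with a small calculation. The sketch below assumes the usual Journal Citation Reports definition: the impact factor for year Y with window w is the number of citations received in Y by items published in the w preceding years, divided by the number of citable items published in those years. The journal data are invented purely for illustration and do not come from the paper.

```python
def impact_factor(citations_in_year, items_published, year, window):
    """Impact factor for `year` using the `window` preceding publication years.

    citations_in_year: dict mapping publication year -> citations received
                       in `year` by items published in that year.
    items_published:   dict mapping publication year -> number of citable items.
    """
    years = range(year - window, year)
    cites = sum(citations_in_year[y] for y in years)
    items = sum(items_published[y] for y in years)
    return cites / items


# Hypothetical journal whose articles are cited slowly (made-up numbers).
items_published = {2004: 40, 2005: 40, 2006: 50, 2007: 50, 2008: 60}
citations_in_2009 = {2004: 60, 2005: 56, 2006: 60, 2007: 45, 2008: 21}

if2 = impact_factor(citations_in_2009, items_published, 2009, window=2)
if5 = impact_factor(citations_in_2009, items_published, 2009, window=5)
print(round(if2, 3), round(if5, 3))  # 0.6 1.008
```

In this invented example the older articles attract more citations per item than the most recent ones, so the 5-year window gives the higher value.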

Journal ArticleDOI
TL;DR: In this article, likelihood inference is considered under a multi-sample set-up for a simple step-stress experiment with exponentially distributed lifetimes when time constraints are in place in the experimentation.
Abstract: In the context of accelerated life testing, a step-stress model allows for testing under different conditions at various intermediate stages of the experiment. The goal is to develop inference for the mean lifetime at each stress level. The maximum likelihood estimates (MLEs) exist only when at least one failure is observed at each stress level. This limitation can be tackled by a multi-sample step-stress model, which imposes a weaker condition for the existence of the MLEs, i.e. at each stress level, at least one failure must be observed in at least one of the samples. The step-stress experiment with multiple samples at the same stress levels was introduced by Kateri et al. (Journal of Statistical Planning and Inference, 139, 2009a). In this article, we focus on the likelihood inference under such a multi-sample set-up for the case of a simple step-stress experiment under exponentially distributed lifetimes when time constraints are in place in the experimentation.
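The existence condition can be illustrated with a minimal sketch. It assumes a cumulative exposure model with exponential lifetimes, so that the hazard is constant within each stress level, a stress change at time tau1 and termination of the experiment at time tau2; under these assumptions the MLE of the mean lifetime at a stress level is the pooled total time on test at that level divided by the pooled number of failures. The abstract does not spell out these details, so the function name, the parameters and the data below are hypothetical.

```python
def pooled_step_stress_mle(samples, tau1, tau2):
    """Pooled MLEs (theta1, theta2) of the mean lifetimes at two stress levels.

    samples: list of (n_units, failure_times) pairs, one per sample.
             Failure times at or beyond tau2 are treated as censored.
    Assumes exponential lifetimes with a piecewise-constant hazard
    (an assumption of this sketch, not taken from the abstract).
    """
    total_time = [0.0, 0.0]
    failures = [0, 0]
    for n_units, failure_times in samples:
        observed = [t for t in failure_times if t < tau2]
        first = [t for t in observed if t < tau1]
        second = [t for t in observed if t >= tau1]
        censored = n_units - len(observed)
        # Every unit that survives level 1 contributes tau1 to its exposure.
        total_time[0] += sum(first) + (n_units - len(first)) * tau1
        total_time[1] += sum(t - tau1 for t in second) + censored * (tau2 - tau1)
        failures[0] += len(first)
        failures[1] += len(second)
    if 0 in failures:
        raise ValueError("no failure observed at some stress level in any sample")
    return total_time[0] / failures[0], total_time[1] / failures[1]


# Sample B alone has no failure before tau1, yet the pooled estimates exist.
sample_a = (4, [1.0, 3.0])
sample_b = (3, [2.5, 4.0])
theta1, theta2 = pooled_step_stress_mle([sample_a, sample_b], tau1=2.0, tau2=5.0)
print(round(theta1, 3), round(theta2, 3))  # 13.0 4.167
```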

Journal ArticleDOI
TL;DR: A randomized two-stage adaptive design for allocation of patients to treatments and comparison in a phase III clinical trial with survival time as treatment responses and the possibility of several covariates is considered.
Abstract: A randomized two-stage adaptive design is proposed and studied for allocation of patients to treatments and comparison in a phase III clinical trial with survival time as treatment responses. We consider the possibility of several covariates in the design and analysis. Several exact and limiting properties of the design and the follow-up inference are studied, both numerically and theoretically. The applicability of the proposed methodology is illustrated by using some real data.

Journal ArticleDOI
TL;DR: In this paper, nonparametric estimation of the Lévy density for pure jump Lévy processes is studied for discrete time observations that may be irregularly sampled or possibly corrupted by a small noise independent of the main process.
Abstract: In this paper, we study nonparametric estimation of the Lévy density for pure jump Lévy processes. We consider $n$ discrete time observations that may be irregularly sampled or possibly corrupted by a small noise independent of the main process. The case of non-noisy observations with regular sampling interval has been studied by the authors in previous works which are the benchmark for the extensions proposed here. We study first the case of a regular sampling interval and noisy data, then the case of irregular sampling for non-noisy data. In each case, non-adaptive and adaptive estimators are proposed and risk bounds are derived.

Journal ArticleDOI
TL;DR: In this article, the authors review the construction and properties of some popular approaches to modeling LIBOR rates and discuss the following frameworks: classical LIBOR market models, forward price models and Markov functional models.
Abstract: In this article, we review the construction and properties of some popular approaches to modeling LIBOR rates. We discuss the following frameworks: classical LIBOR market models, forward price models and Markov-functional models. We close with the recently developed affine LIBOR models.

Journal ArticleDOI
TL;DR: This article shows that there is no formal statistical testing method to support combining categories in a standard ordered regression model, and discusses the practical implications of this result.
Abstract: We show that there is no formal statistical testing method to support combining categories in a standard ordered regression model. We discuss practical implications of this result.

Journal ArticleDOI
TL;DR: This analysis suggests that minimal correlation exists between reported malaria rates and climate in western Africa, and negates the idea that climate change will increase malaria transmission in this region.
Abstract: Malaria is a leading cause of infectious disease and death worldwide. As a common example of a vector-borne disease, malaria could be greatly affected by the influence of climate change. Climate impacts the transmission of malaria in several ways, affecting all stages of the disease's development. Using various weather-related factors that influence climate change, this study utilizes statistical analysis to determine the effect of climate change on reported malaria rates in an African region with endemic malaria. It examines the relationship between malaria prevalence and climate in western Africa using spatial regression modeling and tests for correlation. Our analysis suggests that minimal correlation exists between reported malaria rates and climate in western Africa. This analysis further contradicts the prevailing theory that climate and malaria prevalence are closely linked and negates the idea that climate change will increase malaria transmission in this region.

Journal ArticleDOI
TL;DR: In this article, the authors present an alternative and apparently simple theory to derive the moment convergence of Z-estimators, in which parameters with different rates of convergence can be treated easily and smoothly and the large deviation type inequalities needed for the same result for M-estimators do not appear.
Abstract: The problem of establishing the asymptotic distribution of statistical estimators as well as the moment convergence of such estimators has been recognized as an important issue in advanced theories of statistics. This problem has been deeply studied for M-estimators for a wide range of models by many authors. The purpose of this paper is to present an alternative and apparently simple theory to derive the moment convergence of Z-estimators. In the proposed approach, the cases of parameters with different rates of convergence can be treated easily and smoothly, and the large deviation type inequalities necessary for the same result for M-estimators do not appear. Applications to the model of i.i.d. observations, Cox’s regression model as well as some diffusion processes are discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce and study multivariate elliptic processes, thus providing a dynamic counterpart to the (static) multivariate elliptic distributions, and discuss discrete versus continuous time modelling, jump processes versus diffusions, and semimartingales.
Abstract: We introduce and study multivariate elliptic processes, thus providing a dynamic counterpart to the (static) multivariate elliptic distributions. We pay special attention to the dynamics for Lévy processes and diffusions. We also discuss discrete versus continuous time modelling, jump processes versus diffusions, and semimartingales. Some data analysis illustrates the theory.

Journal ArticleDOI
TL;DR: In this article, the authors introduce a methodology for the semiparametric or non-parametric two-sample equivalence problem when the effects are specified by statistical functionals, which is invariant under strictly monotone transformations of the data.
Abstract: The present paper introduces a methodology for the semiparametric or non-parametric two-sample equivalence problem when the effects are specified by statistical functionals. The mean relative risk functional of two populations is given by the average of the time-dependent risk. This functional is a meaningful non-parametric quantity, which is invariant under strictly monotone transformations of the data. In the case of proportional hazard models, the functional determines just the proportional hazard risk factor. It is shown that an equivalence test of the type of the two-sample Savage rank test is appropriate for this functional. Under proportional hazards, this test can be carried out as an exact level α test. It also works quite well under other semiparametric models. Similar results are presented for a Wilcoxon rank-sum test for equivalence based on the Mann–Whitney functional given by the relative treatment effect.

Journal ArticleDOI
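TL;DR: In 1998, the International Skating Union and the International Olympic Committee decided to skate the 500-m twice during World Single Distances Championships, Olympic Games, and World Cups; this article shows that, since the introduction of the clap skate, the difference between inner- and outer-lane starts that motivated the decision is no longer significant.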
Abstract: In 1998, the International Skating Union and the International Olympic Committee decided to skate the 500-m twice during World Single Distances Championships, Olympic Games, and World Cups. The decision was based on a study by the Norwegian statistician N. L. Hjort, who showed that in the period 1984-1994, there was a significant difference between 500-m times skated with a start in the inner and outer lanes. Since the introduction of the clap skate in the season 1997-1998, however, there has been a general feeling that this difference is no longer significant. In this article we show that this is, in fact, the case.

Journal ArticleDOI
TL;DR: In this article, the authors introduce a third-order characteristic for multi-type point processes, based on the number of r-close triples of points, where the three points are of three different types (species).
Abstract: The description and analysis of spatial point patterns have mainly been based on first- and second-order characteristics. However, and especially when analyzing complex and multivariate point patterns, the use of higher-order characteristics would be more informative. In this paper, we introduce a third-order characteristic for multi-type point processes, which is based on the number of r-close triples of points, where the three points are of three different types (species). This characteristic is useful, when the second-order characteristics indicate that the three point patterns are pairwise uncorrelated but there is some relationship between triples of points. Furthermore, we conjecture that the new statistic can be used to test independence between the three point processes.
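The raw count underlying such a characteristic can be sketched in a few lines. The sketch assumes that a triple is "r-close" when all three pairwise distances are at most r, which is one natural reading that the abstract does not spell out, and it ignores edge correction and normalization by the intensities, both of which a proper estimator would need. The function name and the toy patterns are hypothetical.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def count_r_close_triples(points_a, points_b, points_c, r):
    """Count triples (a, b, c), one point per type, with all pairwise distances <= r.

    Brute-force O(|A| * |B| * |C|) sketch without edge correction.
    """
    count = 0
    for a in points_a:
        for b in points_b:
            if dist(a, b) > r:
                continue  # this pair can never be part of an r-close triple
            for c in points_c:
                if dist(a, c) <= r and dist(b, c) <= r:
                    count += 1
    return count


# Toy patterns for three types (species); coordinates are made up.
species_a = [(0.0, 0.0)]
species_b = [(0.5, 0.0), (3.0, 3.0)]
species_c = [(0.0, 0.5), (0.5, 0.5)]
print(count_r_close_triples(species_a, species_b, species_c, r=1.0))  # 2
print(count_r_close_triples(species_a, species_b, species_c, r=0.6))  # 0
```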

Journal ArticleDOI
TL;DR: In this article, a class of weighted profile least squares estimators (WPLSEs) was proposed for the parametric components, which achieve the semiparametric efficiency bound and are asymptotically normal.
Abstract: This paper is concerned with the statistical inference on seemingly unrelated varying coefficient partially linear models. By combining the local polynomial and profile least squares techniques, and estimating the contemporaneous correlation, we propose a class of weighted profile least squares estimators (WPLSEs) for the parametric components. It is shown that the WPLSEs achieve the semiparametric efficiency bound and are asymptotically normal. For the non-parametric components, by applying the undersmoothing technique, and taking the contemporaneous correlation into account, we propose an efficient local polynomial estimation. The resulting estimators are shown to have mean-squared errors smaller than those estimators that neglect the contemporaneous correlation. In addition, a class of variable selection procedures is developed for simultaneously selecting significant variables and estimating unknown parameters, based on the non-concave penalized and weighted profile least squares techniques. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures perform as efficiently as if one knew the true submodels. The proposed methods are evaluated using wide simulation studies and applied to a set of real data.

Journal ArticleDOI
TL;DR: In this paper, the authors put most of the algorithms in one framework, using classical Operations Research paradigms such as backtracking, depth-first and breadth-first search and branch-and-bound.
Abstract: Mining frequent itemsets is a flourishing research area. Many papers on this topic have been published and even some contests have been held. Most papers focus on speed and introduce ad hoc algorithms and data structures. In this paper we put most of the algorithms in one framework, using classical Operations Research paradigms such as backtracking, depth-first and breadth-first search and branch-and-bound. Moreover, we present experimental results where the different algorithms are implemented under similar designs.
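The backtracking and depth-first view of frequent itemset mining can be sketched as follows. This is an Eclat-style illustration that stores, for each item, the set of transactions containing it; it is not the framework or any specific algorithm from the paper, and the function name and toy data are hypothetical. The bounding step relies on the standard fact that every superset of an infrequent itemset is itself infrequent.

```python
def frequent_itemsets(transactions, min_support):
    """Depth-first search with backtracking over itemsets.

    Returns a dict mapping each frequent itemset (a sorted tuple) to its support,
    where support is the number of transactions containing the itemset.
    """
    # Vertical representation: item -> ids of the transactions containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    result = {}

    def extend(prefix, prefix_tids, start):
        for idx in range(start, len(items)):
            item = items[idx]
            tids = tidsets[item] if prefix_tids is None else prefix_tids & tidsets[item]
            if len(tids) < min_support:
                continue  # bound: no extension of this itemset can be frequent
            itemset = prefix + (item,)
            result[itemset] = len(tids)
            extend(itemset, tids, idx + 1)  # go deeper, then backtrack

    extend((), None, 0)
    return result


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
for itemset, support in sorted(frequent_itemsets(baskets, min_support=3).items()):
    print(itemset, support)
```

A breadth-first variant would instead generate all candidates of size k before moving on to size k + 1.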

Journal ArticleDOI
TL;DR: A novel formulation of a shared random effects model is presented and shown to provide a dropout selection parameter with a meaningful interpretation; simulations show a large reduction of bias for the semiparametric model relative to the parametric model at times where the dropout rate is high or the dropout model is misspecified.
Abstract: We propose a family of regression models to adjust for nonrandom dropouts in the analysis of longitudinal outcomes with fully observed covariates. The approach conceptually focuses on generalized linear models with random effects. A novel formulation of a shared random effects model is presented and shown to provide a dropout selection parameter with a meaningful interpretation. The proposed semiparametric and parametric models are made part of a sensitivity analysis to delineate the range of inferences consistent with observed data. Concerns about model identifiability are addressed by fixing some model parameters to construct functional estimators that are used as the basis of a global sensitivity test for parameter contrasts. Our simulation studies demonstrate a large reduction of bias for the semiparametric model relative to the parametric model at times where the dropout rate is high or the dropout model is misspecified. The methodology's practical utility is illustrated in a data analysis.

Journal ArticleDOI
TL;DR: In this article, the asymptotic behavior of the power variation of processes of the form, where Sα is an α-stable process with index of stability 0 <α<2 and the integral is an Ito integral, is investigated.
Abstract: In this article we consider the asymptotic behavior of the power variation of processes of the form , where Sα is an α-stable process with index of stability 0<α<2 and the integral is an Itô integral. We establish stable convergence of corresponding fluctuations. These results provide statistical tools to infer the process u from discrete observations.
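The basic statistic behind such results, the power variation of a discretely observed path, is easy to compute. The sketch below shows only the unnormalized sum of absolute increments raised to a power p; the normalization needed for the limit theorems depends on α and p and is not given in the abstract, so it is deliberately left out. The function name and the toy path are hypothetical.

```python
def power_variation(path, p):
    """Unnormalized power variation: sum of |X_{t_i} - X_{t_{i-1}}|^p.

    path: observations X_{t_0}, X_{t_1}, ..., X_{t_n} on a time grid.
    """
    if len(path) < 2:
        raise ValueError("need at least two observations")
    return sum(abs(b - a) ** p for a, b in zip(path, path[1:]))


# Toy path with increments 1, -2 and 3.
observations = [0.0, 1.0, -1.0, 2.0]
print(power_variation(observations, p=1))  # 6.0
print(power_variation(observations, p=2))  # 14.0
```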