Top 47 Statistical Methods and Applications papers published in 2015

Journal Article•DOI•

Multivariate functional outlier detection

[...]

Mia Hubert¹, Peter J. Rousseeuw¹, Pieter Segaert¹•Institutions (1)

03 Mar 2015-Statistical Methods and Applications

TL;DR: A taxonomy of functional outliers is set up, and new numerical and graphical techniques for the detection of outliers in multivariate functional data are constructed, with univariate curves included as a special case.

...read moreread less

Abstract: Functional data are occurring more and more often in practice, and various statistical techniques have been developed to analyze them. In this paper we consider multivariate functional data, where for each curve and each time point a $p$-dimensional vector of measurements is observed. For functional data the study of outlier detection has started only recently, and was mostly limited to univariate curves $(p=1)$. In this paper we set up a taxonomy of functional outliers, and construct new numerical and graphical techniques for the detection of outliers in multivariate functional data, with univariate curves included as a special case. Our tools include statistical depth functions and distance measures derived from them. The methods we study are affine invariant in $p$-dimensional space, and do not assume elliptical or any other symmetry.

...read moreread less

156 citations

Journal Article•DOI•

Cluster-weighted t-factor analyzers for robust model-based clustering and dimension reduction

[...]

Sanjeena Subedi¹, Antonio Punzo², Salvatore Ingrassia², Paul D. McNicholas³•Institutions (3)

University of Guelph¹, University of Catania², McMaster University³

01 Mar 2015-Statistical Methods and Applications

TL;DR: Artificial and real data show that these cluster-weighted models have very good clustering performance and that the algorithm is able to recover the parameters very well.

...read moreread less

Abstract: Cluster-weighted models represent a convenient approach for model-based clustering, especially when the covariates contribute to defining the cluster-structure of the data. However, applicability may be limited when the number of covariates is high and performance may be affected by noise and outliers. To overcome these problems, common/uncommon $t$-factor analyzers for the covariates, and a $t$-distribution for the response variable, are here assumed in each mixture component. A family of twenty parsimonious variants of this model is also presented and the alternating expectation-conditional maximization algorithm, for maximum likelihood estimation of the parameters of all models in the family, is described. Artificial and real data show that these models have very good clustering performance and that the algorithm is able to recover the parameters very well.

...read moreread less

52 citations

Journal Article•DOI•

On the Gini coefficient normalization when attributes with negative values are considered

[...]

Emanuela Raffinetti¹, Elena Siletti¹, Achille Vernizzi¹•Institutions (1)

University of Milan¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: A reformulation of the Gini coefficient with respect to that proposed in the literature is presented and discussed in light of the negative income issue, revealing more coherence with the classical situation of maximum inequality.

...read moreread less

Abstract: Typically, inequality indices appear both as basic concepts in the analysis of welfare economics and as technical tools applied to income or other transferable attributes. Several findings in such research fields are provided by the standard Gini coefficient, traditionally introduced for incomes taking non-negative values. Even if negative income can appear as an unfamiliar concept, it can arise in real surveys, especially when assessing families’ financial assets. The main troubles associated with the treatment of negative income regards the violation of the normalization principle. The inclusion of income taking negative values can yield for the standard Gini coefficient achieving values $$>$$ 1. The Gini coefficient then has to be adjusted in order to ensure that its range is bounded between 0 and 1. In this paper, a reformulation of the Gini coefficient with respect to that proposed in the literature is presented and discussed in light of the negative income issue. In particular, a new definition of the Gini coefficient normalization term, revealing more coherence with the classical situation of maximum inequality, is provided. Finally, an empirical application based on the Survey of Household Income and Wealth data of the Bank of Italy (2012) further validates the actual attitude of the new Gini coefficient in catching inequality in the distribution of the attribute.

...read moreread less

43 citations

Journal Article•DOI•

Analysis of spatio-temporal mobile phone data: a case study in the metropolitan area of Milan

[...]

Piercesare Secchi¹, Simone Vantini¹, Valeria Vitelli²•Institutions (2)

Polytechnic University of Milan¹, University of Oslo²

15 Jan 2015-Statistical Methods and Applications

TL;DR: This novel approach integrates the treelet decomposition with a proper treatment of spatial dependence, obtained through a Bagging Voronoi strategy, and points out some interesting temporal patterns interpretable in terms of population density mobility.

...read moreread less

Abstract: We analyze geo-referenced high-dimensional data describing the use over time of the mobile-phone network in the urban area of Milan, Italy. Aim of the analysis is to identify subregions of the metropolitan area of Milan sharing a similar pattern along time, and possibly related to activities taking place in specific locations and/or times within the city. To tackle this problem, we develop a non-parametric method for the analysis of spatially dependent functional data, named Bagging Voronoi Treelet analysis. This novel approach integrates the treelet decomposition with a proper treatment of spatial dependence, obtained through a Bagging Voronoi strategy. The latter relies on the aggregation of different replicates of the analysis, each involving a set of functional local representatives associated to random Voronoi-based neighborhoods covering the investigated area. Results clearly point out some interesting temporal patterns interpretable in terms of population density mobility (e.g., daily work activities in the tertiary district, leisure activities in residential areas in the evenings and in the weekend, commuters movements along the highways during rush hours, and localized mob concentrations related to occasional events). Moreover we perform simulation studies, aimed at investigating the properties and performances of the method, and whose description is available online as Supplementary material.

...read moreread less

39 citations

Journal Article•DOI•

On the efficiency of Gini’s mean difference

[...]

Carina Gerstenberger¹, Daniel Vogel²•Institutions (2)

Ruhr University Bochum¹, University of Aberdeen²

07 May 2015-Statistical Methods and Applications

TL;DR: The efficiency of Gini’s mean difference (the mean of all pairwise distances) is examined, an analytic expression for the finite-sample variance of Ginis mean difference at the normal mixture model is derived by means of the residue theorem, and the contamination fraction in Tukey's 1:3 normal mixture distribution is determined.

...read moreread less

Abstract: The asymptotic relative eciency of the mean deviation with respect to the standard deviation is 88% at the normal distribution. In his seminal 1960 paper A survey of sampling from contaminated distributions, J. W. Tukey points out that, if the normal distribution is contaminated by a small -fraction of a normal distribution with three times the standard deviation, the mean deviation is more ecient than the standard deviation| already for < 1%. This came as a surprise to most statisticians at the time, and the publication is today considered as one of the main pioneering works in the development of robust statistics. In the present article, we examine the eciency of the mean deviation and Gini's mean dierence

...read moreread less

37 citations

Journal Article•DOI•

On a new absolutely continuous bivariate generalized exponential distribution

[...]

Seyed Mohsen Mirhosseini¹, Mohammad Amini¹, Debasis Kundu², Ali Dolati³•Institutions (3)

Ferdowsi University of Mashhad¹, Indian Institute of Technology Kanpur², Yazd University³

01 Mar 2015-Statistical Methods and Applications

TL;DR: This paper studied a three-parameter absolutely continuous bivariate distribution whose marginals are generalized exponential distributions, and proposes to use EM algorithm to compute the maximum likelihood estimators, which can be implemented quite conveniently.

...read moreread less

Abstract: In this paper we studied a three-parameter absolutely continuous bivariate distribution whose marginals are generalized exponential distributions. The proposed three-parameter bivariate distribution can be used quite effectively as an alternative to the Block and Basu bivariate exponential distribution. The joint probability density function, the joint cumulative distribution function and its associated copula have simple forms. We derive different properties of this new distribution. The maximum likelihood estimators of the unknown parameters can be obtained by solving simultaneously three non-linear equations. We propose to use EM algorithm to compute the maximum likelihood estimators, which can be implemented quite conveniently. One data set has been analyzed for illustrative purposes. Finally we propose some generalization of the proposed model.

...read moreread less

24 citations

Journal Article•DOI•

Exploring copulas for the imputation of complex dependent data

[...]

F. Marta L. Di Lascio¹, Simone Giannerini², Alessandra Reale³•Institutions (3)

Free University of Bozen-Bolzano¹, University of Bologna², National Institute of Statistics³

01 Mar 2015-Statistical Methods and Applications

TL;DR: A copula-based method for imputing missing data by using conditional density functions of the missing variables given the observed ones and its results indicate that the proposal compares favourably with classical methods in terms of preservation of microdata, margins and dependence structure.

...read moreread less

Abstract: In this work we introduce a copula-based method for imputing missing data by using conditional density functions of the missing variables given the observed ones. In theory, such functions can be derived from the multivariate distribution of the variables of interest. In practice, it is very difficult to model joint distributions and derive conditional distributions, especially when the margins are different. We propose a natural solution to the problem by exploiting copulas so that we derive conditional density functions through the corresponding conditional copulas. The approach is appealing since copula functions enable us (1) to fit any combination of marginal distribution functions, (2) to take into account complex multivariate dependence relationships and (3) to model the marginal distributions and the dependence structure separately. We describe the method and perform a Monte Carlo study in order to compare it with two well-known imputation techniques: the nearest neighbour donor imputation and the regression imputation by EM algorithm. Our results indicate that the proposal compares favourably with classical methods in terms of preservation of microdata, margins and dependence structure.

...read moreread less

20 citations

Journal Article•DOI•

A new method for adding two parameters to a family of distributions with application to the normal and exponential families

[...]

Haroon M. Barakat¹•Institutions (1)

Zagazig University¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: A new method is introduced to add two parameters to a family of distributions that may be positively–negatively asymmetric and leptokurtic–platykurtic and may be symmetric and has non-constant hazard rate function.

...read moreread less

Abstract: In this paper we introduce a new method to add two parameters to a family of distributions. Through the additional parameters we can fully control the skewness and kurtosis of the resulting family. This method is applied to yield a new two-parameter extension of the standard normal distribution, which may be positively–negatively asymmetric and leptokurtic–platykurtic. In addition, this method is applied to yield a new three-parameter extension of the exponential distribution, which may be symmetric and has non-constant hazard rate function.

...read moreread less

17 citations

Journal Article•DOI•

Multiple seasonal cycles forecasting model: the Italian electricity demand

[...]

Mauro Bernardi¹, Lea Petrella²•Institutions (2)

University of Padua¹, Sapienza University of Rome²

02 Jun 2015-Statistical Methods and Applications

TL;DR: This work model the time series of Italian electricity consumption from 2004 to 2014 using an exponential smoothing approach and demonstrates that the proposed model performs remarkably well, in terms of lower root mean squared error and mean absolute percentage error criteria, in both short term and medium term forecasting horizons.

...read moreread less

Abstract: Forecasting energy load demand data based on high frequency time series has become of primary importance for energy suppliers in nowadays competitive electricity markets. In this work, we model the time series of Italian electricity consumption from 2004 to 2014 using an exponential smoothing approach. Data are observed hourly showing strong seasonal patterns at different frequencies as well as some calendar effects. We combine a parsimonious model representation of the intraday and intraweek cycles with an additional seasonal term that captures the monthly variability of the series. Irregular days, such as public holidays, are modelled separately by adding a specific exponential smoothing seasonal term. An additive ARMA error term is then introduced to lower the volatility of the estimated trend component and the residuals’ autocorrelation. The forecasting exercise demonstrates that the proposed model performs remarkably well, in terms of lower root mean squared error and mean absolute percentage error criteria, in both short term and medium term forecasting horizons.

...read moreread less

16 citations

Journal Article•DOI•

Local polynomial modelling of the conditional quantile for functional data

[...]

Fatiha Messaci, Nahima Nemouchi, Idir Ouassou¹, Mustapha Rachdi²•Institutions (2)

École Normale Supérieure¹, Pierre Mendès-France University²

01 Feb 2015-Statistical Methods and Applications

TL;DR: This paper deduces the uniform almost-complete convergence of the obtained local linear conditional quantile estimator of the local linear nonparametric estimation of the quantile of a scalar response variable given a functional covariate.

...read moreread less

Abstract: As the problem of prediction is of great interest, several tools based on different methods and devoted to various contexts, have been developed in the statistical literature. The contribution of this paper is to focus on the study of the local linear nonparametric estimation of the quantile of a scalar response variable given a functional covariate. In fact, the covariate is a random variable taking values in a semi-metric space which can have an infinite dimension in order to permit to deal with curves. We first establish pointwise and uniform almost-complete convergences, with rates, of the conditional distribution function estimator. Then, we deduce the uniform almost-complete convergence of the obtained local linear conditional quantile estimator. We also bring out the application of our results to the multivariate case as well as to the particular case of the kernel method. Moreover, a real data study allows to place our conditional median estimator in relation to several other predictive tools.

...read moreread less

14 citations

Journal Article•DOI•

Blending Bayesian and frequentist methods according to the precision of prior information with applications to hypothesis testing

[...]

David R. Bickel¹•Institutions (1)

University of Ottawa¹

14 Feb 2015-Statistical Methods and Applications

TL;DR: In this article, a minimax procedure for hypothesis testing is proposed, which blends strict Bayesian methods with p values and confidence intervals or with default-prior methods, depending on the prior knowledge about the prior.

...read moreread less

Abstract: The proposed minimax procedure blends strict Bayesian methods with p values and confidence intervals or with default-prior methods. Two applications to hypothesis testing bring some implications to light. First, the blended probability that a point null hypothesis is true is equal to the p value or a lower bound of an unknown posterior probability, whichever is greater. As a result, the p value is reported instead of any posterior probability in the case of complete prior ignorance but is ignored in the case of a fully known prior. In the case of partial knowledge about the prior, the possible posterior probability that is closest to the p value is used for inference. The second application provides guidance on the choice of methods used for small numbers of tests as opposed to those appropriate for large numbers. Whereas statisticians tend to prefer a multiple comparison procedure that adjusts each p value for small numbers of tests, large numbers instead lead many to estimate the local false discovery rate (LFDR), a posterior probability of hypothesis truth. Each blended probability reduces to the LFDR estimate if it can be estimated with sufficient accuracy or to the adjusted p value otherwise.

...read moreread less

Journal Article•DOI•

Nonparametric estimation of general multivariate tail dependence and applications to financial time series

[...]

Yuri Salazar¹, Wing Lon Ng²•Institutions (2)

Macquarie University¹, University of Essex²

01 Mar 2015-Statistical Methods and Applications

TL;DR: This work presents and studies several nonparametric estimators of general tail dependence functions, and employs selected estimators in two empirical applications to detect and measure the general multivariate non-positive tail dependence in financial data, which popular parametric copula models commonly applied in the financial literature fail to capture.

...read moreread less

Abstract: In order to analyse the entire tail dependence structure among random variables in a multidimensional setting, we present and study several nonparametric estimators of general tail dependence functions. These estimators measure tail dependence in different orthants, complementing the commonly studied positive (lower and upper) tail dependence. This approach is in line with the parametric analysis of general tail dependence. Under this unifying approach the different dependencies are analysed using the associated copulas. We generalise estimators of the lower and upper tail dependence coefficient to the general multivariate tail dependence function and study their statistical properties. Tail dependence measures come as a response to the incapability of the correlation coefficient as an extreme dependence measure. We run a Monte Carlo simulation study to assess the performance of the nonparametric estimators. We also employ selected estimators in two empirical applications to detect and measure the general multivariate non-positive tail dependence in financial data, which popular parametric copula models commonly applied in the financial literature fail to capture.

...read moreread less

Journal Article•DOI•

Inference on the parameters of two Weibull distributions based on record values

[...]

Hojatollah Zakerzadeh¹, Ali Akbar Jafari¹•Institutions (1)

Yazd University¹

01 Mar 2015-Statistical Methods and Applications

TL;DR: A simple exact method is proposed for testing and constructing confidence interval for the ratio of shape parameters in two Weibull distributions and a generalized approach is given for inference about the common scale parameter.

...read moreread less

Abstract: The Weibull distribution is a very applicable model for the lifetime data. Therefore, comparing the parameters of two Weibull distributions is very important. However, there is not an appropriate method for comparing the shape or scale parameters in the literatures based on record values. In this paper, we have proposed a simple exact method for testing and constructing confidence interval for the ratio of shape parameters in two Weibull distributions. In addition, a simple exact method is proposed for inference about the common shape parameter. For comparing the scale parameters, we use the concepts of generalized confidence interval and generalized p value, and derive approaches when the shape parameters are equal or unequal. Also, we give a generalized approach for inference about the common scale parameter. At the end, we investigate inference about stress–strength reliability. Simulation results show that proposed approaches are satisfactory. All approaches are illustrated using a real example.

...read moreread less

Journal Article•DOI•

Discussion of “Multivariate functional outlier detection”

[...]

Ana Arribas-Gil¹, Juan Romo¹•Institutions (1)

Charles III University of Madrid¹

01 Aug 2015-Statistical Methods and Applications

TL;DR: The paper discusses carefully the problem of outlier detection in this setting and starts by establishing a classification of different outlying behaviours, and compares the proposed taxonomy of functional outliers with the classification currently adopted in the literature.

...read moreread less

Abstract: This interesting paper adds on the authors long and outstanding trajectory in robust statistics and , more specifically, on robust functional data analysis. We congratulate Mia Hubert, Peter Rousseeuw and Pieter Segaert for this important contribution. As the authors point out, most of the literature to date on functional outlier detection deals with univariate functional data (one curve observed by individual). This work considers the case of p-variate functional data (p curves observed by individual). The paper discusses carefully the problem of outlier detection in this setting and starts by establishing a classification of different outlying behaviours. Then, several p-variate functional depths and distance functions are defined by integrating over time the existing or newly defined p-variate counterparts. Finally, by combining these measures, several graphical diagnostic tools are proposed. We would like to contribute to the discussion by focusing on two aspects. Firstly, we will compare the proposed taxonomy of functional outliers with the classification currently adopted in the literature. Secondly, we will comment on the differences between the proposed collection of methods and the outliergram (Arribas-Gil and Romo 2014), the recent procedure to detect shape outliers. We compare it with the proposed methodology in several examples.

...read moreread less

Journal Article•DOI•

Adaptive designs with arbitrary dependence structure based on Fisher’s combination test

[...]

Rene Schmidt¹, Andreas Faldum¹, Joachim Gerß¹•Institutions (1)

University of Münster¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: It turns out that considerable inflation of the type I error rate can occur and this emphasizes that the examination of the true dependence structure between the stage-wise $$p$$p-values and an adequate choice of the conditional error function is crucial when adaptive designs are used.

...read moreread less

Abstract: Adaptive designs were originally developed for independent and uniformly distributed p-values. However, in general the type I error rate of a given adaptive design depends on the true dependence structure between the stage-wise p-values. Since there are settings, where the p-values of the stages might be dependent with even unknown dependence structure, it is of interest to consider the most adverse dependence structure maximizing the type I error rate of a given adaptive design (worst case). In this paper, we explicitly study the type I error rate in the worst case for adaptive designs without futility stop based on Fisher's combination test. Potential inflation of the type I error rate is studied if the dependence structure between the p-valuesofthestagesisnottakenintoaccountadequately.Itturnsoutthatconsiderable inflation of the type I error rate can occur. This emphasizes that the examination of the true dependence structure between the stage-wise p-values and an adequate choice of the conditional error function is crucial when adaptive designs are used.

...read moreread less

Journal Article•DOI•

Saddlepoint expansions for GEL estimators

[...]

Gubhinder Kundhi¹, Paul Rilstone²•Institutions (2)

Memorial University of Newfoundland¹, York University²

01 Mar 2015-Statistical Methods and Applications

TL;DR: A simple saddlepoint approximation for the distribution of generalized empirical likelihood (GEL) estimators is derived and trials compare the performance of the SP and other methods such as the Edgeworth and the bootstrap for special cases of GEL: continuous updating, empirical likelihood and exponential tilting estimators.

...read moreread less

Abstract: A simple saddlepoint (SP) approximation for the distribution of generalized empirical likelihood (GEL) estimators is derived. Simulations compare the performance of the SP and other methods such as the Edgeworth and the bootstrap for special cases of GEL: continuous updating, empirical likelihood and exponential tilting estimators.

...read moreread less

Journal Article•DOI•

Rejoinder to ‘multivariate functional outlier detection’

[...]

Mia Hubert¹, Peter J. Rousseeuw¹, Pieter Segaert¹•Institutions (1)

Katholieke Universiteit Leuven¹

21 Jul 2015-Statistical Methods and Applications

TL;DR: The authors were surprised by the number of invited comments and grateful to their contributing authors, all of whom raised important points and/or offered valuable suggestions.

...read moreread less

Abstract: First of all we would like to thank the editor, Professor Andrea Cerioli, for inviting us to submit our work and for requesting comments from some esteemed colleagues. We were surprised by the number of invited comments and grateful to their contributing authors, all of whom raised important points and/or offered valuable suggestions. We are happy for the opportunity to rejoin the discussion. Rather than addressing the comments in turn we will organize our rejoinder by topic, starting with comments directly related to concepts we proposed in the paper and continuing with some extensions.

...read moreread less

Journal Article•DOI•

Childcare and participation at work in North-East Italy: Why do Italian and foreign mothers behave differently?

[...]

Anna Giraldo¹, Gianpiero Dalla-Zuanna¹, Enrico Rettore¹•Institutions (1)

University of Padua¹

13 Mar 2015-Statistical Methods and Applications

TL;DR: In this article, the authors examined two decision-making processes following the birth of a child: whether a working mother should continue with her job, and whether the couple should provide the child with formal childcare.

...read moreread less

Abstract: This paper examines two of the decision-making processes following the birth of a child: whether a working mother should continue with her job, and whether the couple should provide the child with formal childcare. Focusing on Padova and its district, this paper discusses differences in the strategies used by Italian and foreign mothers, controlling for socio-economic status and opinions on women’s roles, according to the Blinder–Oaxaca decomposition technique. Six to thirty-six months after the birth of a child, the proportion of foreign mothers who are not employed is more than double that of Italian mothers (51 vs. 21 %; pre-birth 40 vs. 12). In addition, 25 % of Italian women entrust their children to the care of their parents and in-laws, versus only 13 % of foreign women. Although there are differences in the effects of individual characteristics on participation at work across the two groups, what matters most is the different composition of the Italian and foreign women’s groups, especially in regard to education, partners’ characteristics and attitudes towards the job market and motherhood. Regarding the maximum price a couple is willing to pay for formal childcare, intended to represent parents’ preferences for formal childcare, the differences between the two groups are also mainly explained by differences in composition.

...read moreread less

Journal Article•DOI•

A group VISA algorithm for variable selection

[...]

Abdallah Mkhadri¹, Mohamed Ouhourane¹•Institutions (1)

Cadi Ayyad University¹

01 Mar 2015-Statistical Methods and Applications

TL;DR: A novel sparse regression method, called the Group-VISA (GVISA), which extends the VISA effect to grouped variables and establishes a theoretical property on sparsity inequality of GVISA estimator that is a non-asymptotic bound on the estimation error.

...read moreread less

Abstract: We consider the problem of selecting grouped variables in a linear regression model based on penalized least squares. The group-Lasso and the group-Lars procedures are designed for automatically performing both the shrinkage and the selection of important groups of variables. However, since the same tuning parameter is used (as in Lasso or Lars ) for both group variable selection and shrinkage coefficients, it can lead to over shrinkage the significant groups of variables or inclusion of many irrelevant groups of predictors. This situation occurs when the true number of non-zero groups of coefficients is small relative to the number $$p$$ of variables. We introduce a novel sparse regression method, called the Group-VISA (GVISA), which extends the VISA effect to grouped variables. It combines the idea of VISA algorithm which avoids the over shrinkage problem of regression coefficients and the idea of the GLars-type estimator which shrinks and selects the members of the group together. Hence, GVISA is able to select a sparse group model by avoiding the over shrinkage of GLars-type estimator. We distinguish two variants of the GVISA algorithm, each one is associated with each version of GLars (I and II). Moreover, we provide a path algorithm, similar to GLars, for efficiently computing the entire sample path of GVISA coefficients. We establish a theoretical property on sparsity inequality of GVISA estimator that is a non-asymptotic bound on the estimation error. A detailed simulation study in small and high dimensional settings is performed, which illustrates the advantages of the new approach in relation to several other possible methods. Finally, we apply GVISA on two real data sets.

...read moreread less

Journal Article•DOI•

Expressions for moments of order statistics and records from the skew-normal distribution in terms of multivariate normal orthant probabilities

[...]

Mahdi Salehi¹, Mahdi Doostparast¹•Institutions (1)

Ferdowsi University of Mashhad¹

21 Apr 2015-Statistical Methods and Applications

TL;DR: In this paper, the authors derived explicit expressions for central moments of order statistics coming from the skew-normal (SN) distribution and used them for prediction purposes such as predictive maintenance in a repairable system and prediction of performance of a transmission pipeline.

...read moreread less

Abstract: In designs of experiments and reliability analyze, order statistics (OS) are used for various purposes including model checking, estimations of parameters and prediction. Most of these procedures are defined on the basis of expectations of OS problems. In this paper, explicit expressions for central moments of OS coming from the skew-normal (SN) distribution are derived. The SN model enjoys interesting properties from the normal distribution while captures asymmetric behavior in the parent population. Another important topic which is related to OS is record statistics. These data are arising in some practical situations including shock models, sports and epoch times of a non-homogeneous Poisson processes. Here, we derive moments of the upper and the lower record values arising from the skew normal distribution. The obtained results may be used for prediction purposes such as predictive maintenance in a repairable system and prediction of performance of a transmission oil pipeline. Some real data sets are analyzed using the results obtained for illustration purposes.

...read moreread less

Journal Article•DOI•

D-optimal experimental designs for a growth model applied to a Holstein-Friesian dairy farm

[...]

Santiago Campos-Barreiro¹, Jesús López-Fidalgo¹•Institutions (1)

University of Castilla–La Mancha¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: Design issues of West's ontogenetic growth model applied to a Holstein-Friesian dairy farm in the northwest of Spain are discussed and a robust correlation structure is chosen that provides a methodology that can be used for any correlation structure.

...read moreread less

Abstract: The body mass growth of organisms is usually represented in terms of what is known as ontogenetic growth models, which represent the relation of dependence between the mass of the body and time This paper discusses design issues of West’s ontogenetic growth model applied to a Holstein-Friesian dairy farm in the northwest of Spain D-optimal experimental designs were computed to obtain an optimal fitting of the model A correlation structure has been included in the statistical model due to the fact that observations on a particular animal are not independent The choice of a robust correlation structure is an important contribution of this paper; it provides a methodology that can be used for any correlation structure The experimental designs undertaken provide a tool to control the proper weight of heifers, which will help improve their productivity and, by extension, the competitiveness of the dairy farm

...read moreread less

Journal Article•DOI•

Reconciliation of systems of time series according to a growth rates preservation principle

[...]

Tommaso Di Fonzo¹, Marco Marini²•Institutions (2)

University of Padua¹, International Monetary Fund²

12 May 2015-Statistical Methods and Applications

TL;DR: The experiments show that the nonlinear GRP problem can be efficiently solved through an interior-point optimization algorithm, and GRP-based procedures preserve better the growth rates than PFD solutions, especially for series with high temporal discrepancy and high volatility.

...read moreread less

Abstract: We propose new simultaneous and two-step procedures for reconciling systems of time series subject to temporal and contemporaneous constraints according to a growth rates preservation (GRP) principle. The techniques exploit the analytic gradient and Hessian of the GRP objective function, making full use of all the derivative information at disposal. We apply the new GRP procedures to two systems of economic series, and compare the results with those of reconciliation procedures based on the proportional first differences (PFD) principle, widely used by data-producing agencies. Our experiments show that (1) the nonlinear GRP problem can be efficiently solved through an interior-point optimization algorithm, and (2) GRP-based procedures preserve better the growth rates than PFD solutions, especially for series with high temporal discrepancy and high volatility.

...read moreread less

Journal Article•DOI•

A randomized two stage allocation for continuous response clinical trials

[...]

Rahul Bhattacharya¹, Madhumita Shome•Institutions (1)

University of Calcutta¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: A randomized two treatment allocation design, conducted in two stages, is proposed for a class of continuous response trials and relevant properties of the proposed allocation design are investigated and compared with suitable competitors.

...read moreread less

Abstract: A randomized two treatment allocation design, conducted in two stages, is proposed for a class of continuous response trials. Patients are assigned to each treatment in equal numbers in the first stage and p value of a test of equality of treatment effects based on these data is used to determine the assignment probability of second stage patients. Relevant properties of the proposed allocation design are investigated and compared with suitable competitors.

...read moreread less

Journal Article•DOI•

Wild bootstrap tests for unit root in ESTAR models

[...]

Daiki Maki¹•Institutions (1)

Ryukoku University¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: Monte Carlo simulations show that in asymptotic tests, severe over-rejection of the null hypothesis occurs under heteroskedastic variances, whereas the proposed wild bootstrap tests have reasonable size and power properties.

...read moreread less

Abstract: This paper introduces wild bootstrap tests for unit root in exponential smooth transition autoregressive (ESTAR) models. Asymptotic unit root tests in ESTAR models have severe size distortions in the presence of heteroskedastic variances such as generalized autoregressive conditional heteroskedasticity and stochastic volatility, and hence, to improve these distortions, we use a wild bootstrap. Monte Carlo simulations show that in asymptotic tests, severe over-rejection of the null hypothesis occurs under heteroskedastic variances, whereas the proposed wild bootstrap tests have reasonable size and power properties.

...read moreread less

Journal Article•DOI•

A new powerful version of the BUS test of normality

[...]

Aldo Goia¹, Ernesto Salinelli¹, Pascal Sarda²•Institutions (2)

University of Eastern Piedmont¹, Institut de Mathématiques de Toulouse²

01 Sep 2015-Statistical Methods and Applications

TL;DR: A modified version of the BUS test is introduced, which is called NBUS (New Borovkov–Utev Statistic), which defines a family of goodness of fit tests that can be used to detect normality against alternative hypothesis of which all moments up to the fifth exist.

...read moreread less

Abstract: In this paper we introduce a modified version of the BUS test, which we call NBUS (New Borovkov–Utev Statistic). This latter defines a family of goodness of fit tests that can be used to detect normality against alternative hypothesis of which all moments up to the fifth exist. The test statistic depends on empirical moments and real parameters that have to be chosen appropriately. The good abilities of the NBUS with respect to BUS and other powerful normality tests are illustrated by means of a Monte Carlo experiment for finite samples. Besides, we show how an adaptation of NBUS for testing departing from normality due only to kurtosis, leads to comparable performances with classical tests based on the fourth moment.

...read moreread less

Journal Article•DOI•

Analysis of an outcome-dependent enriched sample: hypothesis tests

[...]

C. I. Vahl¹, Qing Kang•Institutions (1)

Kansas State University¹

01 Sep 2015-Statistical Methods and Applications

TL;DR: The theoretical derivation and simulation show that tests based on the PL- and WL-based families of tests on the model parameter from an ODE sample carry nominal type I error and good power.

...read moreread less

Abstract: An outcome-dependent sample is generated by a stratified survey design where the stratification depends on the outcome. It is also known as a case–control sample in epidemiological studies and a choice-based sample in econometrical studies. An outcome-dependent enriched sample (ODE) results from combining an outcome-dependent sample with an independently collected random sample. Consider the situation where the conditional probability of a categorical outcome given its covariates follows an explicit model with an unknown parameter whereas the marginal probability of the outcome and its covariates are left unspecified. Profile-likelihood (PL) and weighted-likelihood (WL) methods have been employed to estimate the model parameter from an ODE sample. This article develops the PL- and WL-based families of tests on the model parameter from an ODE sample. Asymptotic properties of their test statistics are derived. The PL likelihood-ratio, Wald and score tests are shown to obey classical inference, i.e. their test statistics are asymptotically equivalent and Chi-squared distributed. In contrast, the WL likelihood-ratio statistic asymptotically has a weighted Chi-squared distribution and is not equivalent to the WL Wald and score statistics. Our theoretical derivation and simulation show that tests based on these new statistics carry nominal type I error and good power. Advantages of ODE sampling together with the implementation of the PL and WL methods are demonstrated in an illustrative example.

...read moreread less

Journal Article•DOI•

A first step to implement Gillespie’s algorithm with rejection sampling

[...]

Qihong Duan¹, Junrong Liu²•Institutions (2)

Xi'an Jiaotong University¹, Northwest University (China)²

01 Mar 2015-Statistical Methods and Applications

TL;DR: An exact implementation of a continuous-time Markov chain with bounded intensity which can simulate the process at given time points is presented.

...read moreread less

Abstract: It is well known that firings of a well-stirred chemically reacting system can be described by a continuous-time Markov chain. The currently-used exact implementations of Gillespie’s algorithm simulate every reaction event individually and thus the computational cost is inevitably high. In this paper, we present an exact implementation of a continuous-time Markov chain with bounded intensity which can simulate the process at given time points. The implementation involves rejection sampling, with a trajectory either accepted or rejected based on just a few reaction events. A simulation study on the Schlogl model is presented and supplementary materials for this article are available online.

...read moreread less

Journal Article•DOI•

Further results on orthogonal arrays for the estimation of global sensitivity indices based on alias matrix

[...]

Xue-Ping Chen¹, Xue-Ping Chen², Jin-Guan Lin¹, Xiao-di Wang³, Xing-Fang Huang¹ - Show less +1 more•Institutions (3)

Southeast University¹, Jiangsu University², Central University of Finance and Economics³

01 Sep 2015-Statistical Methods and Applications

TL;DR: This paper first generalizes the alias matrix for ANOVA high-dimensional model representation based on matrix image, and then by sequentially minimizing the squared alias degrees, a approach for the estimation of sensitivity indices is presented.

...read moreread less

Abstract: In this paper, the use of orthogonal arrays with strength $$s

...read moreread less

Journal Article•DOI•

Discussion of Multivariate functional outlier detection by M. Hubert, P. Rousseeuw and P. Segaert

[...]

Sara López-Pintado¹•Institutions (1)

Columbia University¹

16 May 2015-Statistical Methods and Applications

TL;DR: The authors define and classify rigorously different types of functional outliers and propose several techniques for detecting them in multivariate functional data and develop statistical and graphical tools to detect and visualize potential outliers.

...read moreread less

Abstract: I would like to congratulate M. Hubert, P. Rousseeuw and P. Segaert for this stimulating and useful work on outlier detection methods for multivariate functional data. They define and classify rigorously different types of functional outliers and propose several techniques for detecting them in multivariate functional data. These authors use the notion of data depth and distances derived from them to develop statistical and graphical tools to detect and visualize potential outliers. Several important ideas are emphasized in this paper. First, it is shown that there are many different ways of being outlier when dealing with multivariate functional data and that multivariate outliers do not necessarily have to be marginal outliers. Second, depth alone does not always provide sufficient information for detecting outliers, and some distance-based ranking is needed. The authors present several approaches for outlier detection and visualization. One uses a proposed bagdistance based on the center and dispersion provided by the multivariate funtional halfspace depth introduced in Claeskens et al. (2014). Another approach uses a skew-adjusted projection depth that can be extended to multivariate functional data. Several diagnostic and exploratory tools based on these notions are proposed and analyzed.

...read moreread less

Journal Article•DOI•

P. Secchi, S. Vantini and V. Vitelli: Analysis of spatio-temporal mobile phone data: a case study in the metropolitan area of Milan

[...]

Piotr Kokoszka¹•Institutions (1)

Colorado State University¹

28 Mar 2015-Statistical Methods and Applications

TL;DR: Commentary on a new tool set for the analysis of functional data indexed by spatial locations, which reviews research on geostatistical functional data, point processes associated withfunctional data, and functional areal data.

...read moreread less

Abstract: I thank the authors and the editorial board for inviting me to comment on this work. I would like to congratulate the authors on bringing in a new tool set for the analysis of functional data indexed by spatial locations. Data of this type have been attracting attention of the FDA research community for some time, but the focus has been of functional principal component (FPC) expansions to form a suitable temporal basis. My comments will focus on providing several relevant references to my own research, which are related to this work. Many other researchers have contributed to this field, but I do not aim at a comprehensive review. Delicado et al. (2010) provide a very good review. They review research on geostatistical functional data (focusing on kriging), point processes associated with functional data, and functional areal data. My own work in this area has focused on geostatistical functional data, which can be denoted as X (sk; t), where sk is the spatial location and t is time. Using this notation, the Erlang data can be written as E(xk; t), where the xk are the locations with almost complete records. Gromenko et al. (2012) study the estimation of the mean function μ and the FPC’s v j in the Karhunen–Loeve representation

...read moreread less

Showing papers in "Statistical Methods and Applications in 2015"