
Showing papers in "Statistics in 2006"


Journal ArticleDOI
TL;DR: In this article, the authors study some properties for multivariate weighted distributions related to reliability measures, ordering, characterization and dependence properties, by compiling and extending previous results given by different authors.
Abstract: We study some properties for multivariate weighted distributions related to reliability measures, ordering, characterization and dependence properties, by compiling and extending previous results given by different authors. We pay special attention to the multivariate size biased and equilibrium distributions, and propose a new definition for the multivariate equilibrium distribution.

48 citations


Journal ArticleDOI
TL;DR: In this paper, the authors presented the goodness-of-fit tests for the Laplace distribution based on its maximum entropy characterization result, and the critical values of the test statistics estimated by Monte Carlo simulations are tabulated for various window and sample sizes.
Abstract: This article presents the goodness-of-fit tests for the Laplace distribution based on its maximum entropy characterization result. The critical values of the test statistics estimated by Monte Carlo simulations are tabulated for various window and sample sizes. The test statistics use an entropy estimator depending on the window size; so, the choice of the optimal window size is an important problem. The window sizes for yielding the maximum power of the tests are given for selected sample sizes. Power studies are performed to compare the proposed tests with goodness-of-fit tests based on the empirical distribution function. Simulation results report that entropy-based tests have consistently higher power than EDF tests against almost all alternatives considered.

43 citations
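
The abstract above does not name the entropy estimator it uses. As a minimal sketch, assuming a Vasicek-type m-spacing estimator (a common choice that depends on a window size m), the statistic could be computed as below. The function names and the form of the statistic (estimated entropy power divided by the mean absolute deviation about the median) are illustrative assumptions, not the authors' definitions. The Laplace law has entropy 1 + log(2θ) with θ = E|X − μ|, so values of the statistic well below 2e would point away from the Laplace model.

```python
import math


def vasicek_entropy(sample, m):
    """Vasicek-type m-spacing entropy estimate; m is the window size (assumed estimator)."""
    x = sorted(sample)
    n = len(x)
    if not 1 <= m < n / 2:
        raise ValueError("window size m must satisfy 1 <= m < n/2")
    total = 0.0
    for i in range(n):
        lo = x[max(i - m, 0)]        # order statistics below the first are set to the minimum
        hi = x[min(i + m, n - 1)]    # order statistics above the last are set to the maximum
        total += math.log(n / (2 * m) * (hi - lo))
    return total / n


def laplace_entropy_statistic(sample, m):
    """Hypothetical test statistic: exp(entropy estimate) / mean absolute deviation about the median."""
    x = sorted(sample)
    n = len(x)
    median = (x[(n - 1) // 2] + x[n // 2]) / 2
    scale = sum(abs(v - median) for v in x) / n
    return math.exp(vasicek_entropy(x, m)) / scale
```

Critical values would still have to come from Monte Carlo simulation for each window and sample size, as the abstract describes; the sketch only computes the statistic and assumes the sample has no tied values.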


Journal ArticleDOI
TL;DR: In this paper, maximum likelihood and Bayes estimates for one and two parameters and the reliability function of the Rayleigh distribution are obtained on the basis of progressively type-II censored samples.
Abstract: Maximum likelihood and Bayes estimates for one and two parameters and the reliability function of the Rayleigh distribution are obtained on the basis of progressively type-II censored samples. The inverted gamma conjugate prior density is assumed for the one-parameter case, whereas the joint prior density of the two-parameter case is composed of the inverted gamma and the uniform densities. A comparison between the obtained estimators is made through a Monte Carlo simulation study.

32 citations
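
For the one-parameter case, the maximum likelihood estimate has a closed form. The sketch below assumes the parametrization f(x) = (x/σ²) exp(−x²/(2σ²)), which the abstract does not specify, and a progressively type-II censored sample in which R_i surviving units are withdrawn at the i-th observed failure. Under that assumption the likelihood is proportional to the product of f(x_i) S(x_i)^{R_i}, and maximizing it gives σ² = Σ (1 + R_i) x_i² / (2m). The function names are hypothetical.

```python
import math


def rayleigh_mle_progressive(failures, removals):
    """MLE of the Rayleigh scale sigma from a progressively type-II censored sample.

    failures: the m observed failure times x_1, ..., x_m
    removals: R_1, ..., R_m, the number of survivors withdrawn at each failure
    Assumes the density f(x) = (x / sigma^2) * exp(-x^2 / (2 sigma^2)).
    """
    if not failures or len(failures) != len(removals):
        raise ValueError("failures and removals must be non-empty and of equal length")
    m = len(failures)
    weighted = sum((1 + r) * x * x for x, r in zip(failures, removals))
    return math.sqrt(weighted / (2 * m))


def rayleigh_reliability(t, sigma):
    """Reliability (survival) function S(t) = exp(-t^2 / (2 sigma^2))."""
    return math.exp(-t * t / (2 * sigma * sigma))
```

With all R_i equal to zero the estimate reduces to the usual complete-sample Rayleigh MLE. The Bayes estimates under the inverted gamma prior are not reproduced here.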


Journal ArticleDOI
TL;DR: In this article, simple expressions for marginal density functions of multiply censored generalized order statistics based on continuous distribution functions are obtained, and it is shown that generalized order statistics are multivariate totally positive and, thus, associated.
Abstract: In this article, simple expressions for marginal density functions of multiply censored generalized order statistics based on continuous distribution functions are obtained. Moreover, it is shown that generalized order statistics are multivariate totally positive and, thus, associated. This property is applied to show that regressions of generalized order statistics are nondecreasing under weak conditions.

30 citations


Journal ArticleDOI
TL;DR: In this paper, a new family of asymmetric distributions depending on two parameters, α and β, is introduced; in the special case where β = 0, it reduces to the skew-normal (SN) distribution considered by Azzalini (1985).
Abstract: In this article, we introduce a new family of asymmetric distributions, which depends on two parameters, namely α and β, and in the special case where β = 0, the skew-normal (SN) distribution considered by Azzalini [Azzalini, A., 1985, A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 171–178.] is obtained. Basic properties such as a stochastic representation and the derivation of maximum likelihood and moment estimators are studied. The asymptotic behaviour of both types of estimators is also investigated. Results of a small-scale simulation study are provided, illustrating the usefulness of the new model. An application to a real data set is reported, showing that it can present a better fit than the SN distribution.

26 citations


Journal ArticleDOI
TL;DR: In this paper, the asymptotic normality of the least squares estimator of the unknown slope parameter $\beta$ and of the nonparametric estimator of $g$ is studied under appropriate conditions, and strong convergence rates of these estimators are considered.
Abstract: In this article, we are concerned with the regression model $y_i = x_i\beta + g(t_i) + V_i$ $(1 \le i \le n)$, where the known design points $(x_i, t_i)$, the unknown slope parameter $\beta$, and the nonparametric component $g$ are non-random and where the correlated errors $V_i$, with negatively associated $e_i$, are random variables. Under appropriate conditions, we study the asymptotic normality for the least squares estimator of $\beta$ and the nonparametric estimator of $g(\cdot)$. Moreover, strong convergence rates of these estimators are considered. Our results show that the nonparametric estimator of $g(\cdot)$ can attain the optimal convergence rate.

24 citations


Journal ArticleDOI
TL;DR: In this paper, the optimal size of the choice sets in generic choice experiments for asymmetric attributes when estimating main effects only was established, and an upper bound for the determinant of the information matrix was given.
Abstract: In this paper, we establish the optimal size of the choice sets in generic choice experiments for asymmetric attributes when estimating main effects only. We give an upper bound for the determinant of the information matrix when estimating main effects and all two-factor interactions for binary attributes. We also derive the information matrix for a choice experiment in which the choice sets are of different sizes and use this to determine the optimal sizes for the choice sets.

19 citations


Journal ArticleDOI
TL;DR: In this article, the influence analysis of nonlinear reproductive dispersion mixed models (NRDMMs) is discussed, and the equivalence of case-deletion models and mean-shift outlier models in NRDMMs is investigated.
Abstract: The class of nonlinear reproductive dispersion mixed models (NRDMMs) is an extension of nonlinear reproductive dispersion models and generalized linear mixed models. This paper discusses the influence analysis of the model based on Laplace approximation. The equivalence of case-deletion models and mean-shift outlier models in NRDMMs is investigated, and some diagnostic measures are proposed via the case-deletion method. We also investigate the assessment of local influence of various perturbation schemes. The proposed method is illustrated with an example.

18 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of finding D-optimal or D-efficient designs in the presence of covariates is considered under a completely randomized design set-up with v treatments, k covariates and N experimental units.
Abstract: The problem of finding D-optimal or D-efficient designs in the presence of covariates is considered under a completely randomized design set-up with v treatments, k covariates and N experimental units. In contrast to Lopes Troya [Lopes Troya, J., 1982, Optimal designs for covariates models. Journal of Statistical Planning and Inference, 6, 373–419.], who considered this problem in the equireplicate case, we do not assume that N/v is an integer, and this allows us to study situations where no equireplicate design exists. Even when N/v is an integer, it is seen quite counter-intuitively that there are situations where a non-equireplicate design outperforms the best equireplicate design under the D-criterion.

18 citations


Journal ArticleDOI
TL;DR: In this article, an easily computable estimation method based on moment estimation is introduced for estimating the support of a distribution function when the empirical data are contaminated, and the practical performance of the estimation procedure is studied in a simulation section.
Abstract: This note deals with the problem of estimating the support of a distribution function when the empirical data are contaminated. An easily computable estimation method based on moment estimation is introduced. Unlike earlier approaches, these estimators also cover situations in which the sharpness of the target distribution at the support boundaries is arbitrarily low and in which the distribution is discontinuous at several points within the support. General consistency results for the estimators are achieved and rates of convergence are studied in the case of normal measurement error. The practical performance of the estimation procedure is studied in a simulation section.

17 citations


Journal ArticleDOI
TL;DR: In this paper, sharp upper mean-variance bounds on expectations of generalized order statistics are derived for distributions with decreasing failure rate and with decreasing failure rate on the average.
Abstract: We present the sharp upper mean-variance bounds on expectations of generalized order statistics based on the distributions with decreasing failure rate and with decreasing failure rate on the average. Also we specify the distributions for which the bounds are attained. The bounds are derived by the application of the projection method.

Journal ArticleDOI
TL;DR: In this article, the mean μ of a normal population with unknown mean μ and unknown variance σ² is estimated under an asymmetric LINEX loss function such that the associated risk is bounded from above by a known quantity w.
Abstract: Consider a normal population with unknown mean μ and unknown variance σ². We estimate μ under an asymmetric LINEX loss function such that the associated risk is bounded from above by a known quantity w. This necessitates the use of a random number (N) of observations. Under a fairly broad set of assumptions on N, we derive the asymptotic second-order expansion of the associated risk function. Some examples have been included involving accelerated sequential and three-stage sampling techniques. Performance comparisons of these procedures are considered using a Monte-Carlo study.

Journal ArticleDOI
TL;DR: In this article, the exact distribution of the sum of the largest n−k out of n normally distributed random variables, with differing mean values, was derived for an application in electrical engineering.
Abstract: Motivated by an application in Electrical Engineering, we derive the exact distribution of the sum of the largest n−k out of n normally distributed random variables, with differing mean values. Comparisons are made with two normal approximations to this distribution—one arising from the asymptotic negligibility of the omitted order statistics and one from the theory of L-statistics. The latter approximation is found to be in excellent agreement with the exact distribution.

Journal ArticleDOI
TL;DR: In this paper, discriminant analysis for axial data is considered, with the Watson distribution on the hypersphere used to model the data.
Abstract: The Watson distribution defined on the hypersphere is one of the most used distributions for modelling axial data. In this paper, we consider the discriminant analysis for axial data assumed to com...

Journal ArticleDOI
TL;DR: In this article, the authors investigate the efficiency of score tests for testing a censored Poisson regression model against censored negative binomial regression alternatives and find that bootstrap methods keep the significance level close to the nominal one and have greater power uniformly than does the normal approximation for testing the hypothesis.
Abstract: In this article, we investigate the efficiency of score tests for testing a censored Poisson regression model against censored negative binomial regression alternatives. Based on the results of a simulation study, score tests using the normal approximation underestimate the nominal significance level. To remedy this problem, bootstrap methods are proposed. We find that bootstrap methods keep the significance level close to the nominal one and have uniformly greater power than the normal approximation for testing the hypothesis.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the model of samples drawn from finite populations with replacement and established optimal, lower non-negative and upper non-positive bounds on the expectations of linear combinations of order statistics centred about the population mean in units generated by the population central absolute moments of various orders.
Abstract: We consider the model of samples drawn from finite populations with replacement. For the model, we establish optimal, lower non-negative and upper non-positive bounds on the expectations of linear combinations of order statistics centred about the population mean in units generated by the population central absolute moments of various orders. We show that the bounds tend to zero in increasing populations with a possible exception of ones expressed in terms of the mean absolute deviation units. We also specify the general results for important examples of sample extremes, range and Gini mean differences. The article completes the results of Rychlik [Rychlik, T., 2004, Optimal bounds on L-statistics based on samples drawn with replacement from finite populations. Statistics, 38, 391–412], in which sharp negative lower and positive upper bounds on the combinations were presented for the drawing with replacement scheme.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the log-concavity of the survival function of the minimum and maximum from Gumbel bivariate exponential models, through the log-concavity of generalized mixtures of four or fewer exponential distributions, extending the papers of Baggs and Nagaraja and of Franco and Vivo.
Abstract: In the classical risk theory, it is often used that different type dimensions can be aggregated into a single-dimensional statistic, as well as the assumption of properties on log-concavity of this aggregation. The extreme-order statistics, minimum and maximum, might be used as aggregate statistics. In this paper, we discuss the log-concavity of the survival function of the minimum and maximum from Gumbel bivariate exponential models, through the log-concavity of generalized mixtures of four or fewer exponential distributions, extending the papers of Baggs and Nagaraja [Baggs, G.E. and Nagaraja, H.N., 1996, Reliability properties of order statistics from bivariate exponential distributions. Communications in Statistics—Stochastic Models, 12, 611–631] and Franco and Vivo [Franco, M. and Vivo, J.M., 2002, Reliability properties of series and parallel systems from bivariate exponential models. Communications in Statistics—Theory and Methods, 31, 2349–2360] devote to the log-concavity for generalized mixtures...

Journal ArticleDOI
TL;DR: In this article, the problem of approximating an image S from a random sample of points is considered, and a plausible estimator of S is defined as the union of the marked bins (those containing a sample point).
Abstract: The problem of approximating an ‘image’ $S \subset \mathbb{R}^d$ from a random sample of points is considered. If S is included in a grid of square bins, a plausible estimator of S is defined as the union of the ‘marked’ bins (those containing a sample point). We obtain convergence rates for this estimator and study its performance in the approximation of the border of S. The practical aspects of implementation are discussed, including some technical improvements on the estimator, whose performance is checked through a real data example.
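
The marked-bin estimator described above can be sketched in a few lines. The code below is an illustration only: the grid is assumed to be anchored at the origin with a common bin width in every coordinate, and the function names are hypothetical. The technical improvements mentioned in the abstract are not reproduced.

```python
import math


def marked_bins(points, bin_width):
    """Return the set of grid bins (as integer index tuples) that contain at least one sample point."""
    return {tuple(math.floor(c / bin_width) for c in p) for p in points}


def in_estimate(point, bins, bin_width):
    """True if the point falls in the union of marked bins, i.e. in the estimate of S."""
    return tuple(math.floor(c / bin_width) for c in point) in bins


def estimated_volume(bins, bin_width, dim):
    """Lebesgue measure of the union of marked bins."""
    return len(bins) * bin_width ** dim
```

For example, the sample (0.1, 0.1), (0.15, 0.12), (0.9, 0.9) with bin width 0.5 marks the two bins with indices (0, 0) and (1, 1), so the estimate is the union of the squares [0, 0.5)² and [0.5, 1)². The bin width plays the role of a smoothing parameter: it would normally shrink as the sample size grows.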

Journal ArticleDOI
TL;DR: In this paper, a Hoeffding-like inequality for the survival function of a sum of symmetric independent identically distributed random variables, taking values in a segment [−b, b] of the reals, is shown.
Abstract: In this paper, we prove a Hoeffding-like inequality for the survival function of a sum of symmetric independent identically distributed random variables, taking values in a segment [−b, b] of the reals. The symmetric case is relevant to the auditing practice and is an important case study for further investigations. The bounds as given by Hoeffding in 1963 cannot be improved upon unless we restrict the class of random variables, for instance, by assuming the law of the random variables to be symmetric with respect to their mean, which we may assume to be zero. The main result in this paper is an improvement of the Hoeffding bound for i.i.d. random variables which are bounded and have a (upper bound for the) variance by further assuming that they have a symmetric law.
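
As a point of reference for the result above, the classical Hoeffding (1963) bound for a sum S_n of n independent zero-mean variables with values in [−b, b] is P(S_n ≥ t) ≤ exp(−t²/(2nb²)). The sketch below evaluates that classical bound and compares it with the exact tail for one symmetric law, the two-point distribution on {−b, b}. It does not implement the improved bound of the paper, and the function names are hypothetical.

```python
import math


def hoeffding_bound(n, b, t):
    """Classical Hoeffding bound exp(-2 t^2 / (n (2b)^2)) for zero-mean variables in [-b, b]."""
    return math.exp(-t * t / (2 * n * b * b))


def exact_tail_two_point(n, b, t):
    """Exact P(S_n >= t) when each summand equals -b or +b with probability 1/2.

    Here S_n = b * (2K - n) with K ~ Binomial(n, 1/2), so S_n >= t iff K >= (n + t/b) / 2.
    """
    k_min = max(math.ceil((n + t / b) / 2 - 1e-12), 0)  # small tolerance against rounding
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n
```

With n = 10, b = 1 and t = 4, the exact tail is 176/1024 ≈ 0.172, while the classical bound is exp(−0.8) ≈ 0.449. The gap between the two illustrates why a sharper bound under a symmetry assumption is of interest.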

Journal ArticleDOI
TL;DR: In this article, a linear regression model with an unbalanced 1-fold nested error structure, where group effect and error are from nonnormal universes, is considered, and the limiting distribution of the F-statistic in this model is derived.
Abstract: We consider a linear regression model with an unbalanced 1-fold nested error structure, where group effect and error are from nonnormal universes. The limiting distribution of the F-statistic in this model is derived, as the sample size is large and group sizes take values from a finite set of distinct integers. The result is used to approximate the F-distribution quantile and to test the significance of the random effect variance component. Results are also applicable to the F-statistic in the one-way random-effects model. The effects of departure from normality on the F-statistic distribution are given.

Journal ArticleDOI
TL;DR: In this paper, the exact distributions of P = XY and the corresponding moment properties are derived when X and Y follow six flexible bivariate exponential distributions, and the expressions turn out to involve several special functions.
Abstract: The distribution of products of random variables is of interest in many areas of the sciences, engineering and medicine. This has increased the need to have available the widest possible range of statistical results on products of random variables. In this article, the exact distributions of P = XY and the corresponding moment properties are derived when X and Y follow six flexible bivariate exponential distributions. The expressions turn out to involve several special functions.

Journal ArticleDOI
TL;DR: In this paper, asymptotic properties of procedures for selecting linear models, which are based on certain data-dependent criteria such as Mallows' $C_p$, cross-validation and the generalized information criterion, are investigated.
Abstract: In the regression analysis, there is typically a large collection of competing models available from which we want to select an appropriate one. This article is concerned with asymptotic properties of procedures for selecting linear models, which are based on certain data-dependent criteria such as Mallows’ $C_p$, cross-validation and the generalized information criterion. We avoid the assumption of an adequate (‘correct’) model and allow the maximal model dimension to increase with the sample size. General asymptotic concepts are introduced, covering the usual ones of consistency and asymptotic optimality. The focus is on conditions for penalizing the model complexity, which are necessary to obtain the different optimality properties. For example, the consistency of a procedure is decided by the interplay between these penalties, the complexity of the class of model candidates, and some quantity describing the ability to identify ‘wrong’ (pseudo-inadequate) models. Many results known from the literature a...

Journal ArticleDOI
TL;DR: In this article, a nonparametric Bayesian insurance risk model is considered, where the claims are seen as a marked point process $(T_i, Y_i)$, where $T_i$ is the time of occurrence of the $i$th claim and $Y_i$ is its size.
Abstract: We consider a nonparametric Bayesian insurance risk model. The claims are seen as a marked point process $(T_i, Y_i)$, where $T_i$ is the time of occurrence of the $i$th claim and $Y_i$ is its size. We assume that this is a nonhomogeneous Poisson process on $\mathbb{R}_+^2$ with intensity measure $P \times \Theta$. Here $P$ describes the exposure to risk and it is known, whereas $\Theta$ is regarded as an unknown risk characteristic. According to the Bayesian paradigm, we assume that the measure $\Theta$ is random. Processes with independent increments are used as prior distributions. In particular, Gamma processes are conjugate priors. The problem is to predict the sum of future claims in a given period, given the past of the process. We consider the asymmetric criterion LINEX (linear-exponential) that penalizes underestimation of claims more severely than overestimation. For the conjugate Gamma prior, we construct the best predictor. Under a relaxed assumption on the prior distribution, we construct the best linear predictor.
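
The asymmetry of the LINEX criterion can be made concrete with a short sketch. It assumes the usual form L(Δ) = exp(aΔ) − aΔ − 1 with Δ = prediction − actual, which the abstract does not spell out. A negative shape parameter a makes underestimation (Δ < 0) the more costly error. Minimizing the expected loss over a constant predictor gives δ = −(1/a) log E[exp(−aY)], approximated here by an average over draws of Y. The function names and the use of simulated draws are illustrative assumptions, not the construction used in the paper.

```python
import math


def linex_loss(error, a):
    """LINEX loss exp(a * error) - a * error - 1, with error = prediction - actual."""
    return math.exp(a * error) - a * error - 1.0


def linex_predictor(draws, a):
    """Constant predictor minimizing the average LINEX loss over the given draws of Y.

    Closed form: delta = -(1/a) * log(mean(exp(-a * y))).
    """
    mean_exp = sum(math.exp(-a * y) for y in draws) / len(draws)
    return -math.log(mean_exp) / a
```

With a = −1, an underestimate by one unit costs about 0.718 while an overestimate by one unit costs about 0.368, and the optimal predictor sits above the mean of the draws, which is the behaviour the abstract describes.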

Journal ArticleDOI
TL;DR: In this article, a class of generalized maximum likelihood asymptotic power one tests for detection of various types of changes in a linear regression model is proposed and examined for detecting the existence and estimation of a possible unknown threshold.
Abstract: We propose and examine a class of generalized maximum likelihood asymptotic power one tests for detection of various types of changes in a linear regression model. In economic and epidemiologic studies, such segmented regression models often occur as threshold models, where it is assumed that the exposure has no influence on the response up to a possible unknown threshold. An important task of such studies is testing the existence and estimation of this threshold. Guaranteed non-asymptotic upper bounds for the significance levels of these tests are presented. We demonstrate how the proposed tests were applied toward solving an actual problem encountered with real data.

Journal ArticleDOI
TL;DR: In this article, the structural skew-normal probit model is extended by considering that the unobserved covariate follows a skew normal distribution, and the likelihood function is obtained analytically which can be maximized by using existing statistical software.
Abstract: In this paper, we extend the structural probit measurement error model by considering that the unobserved covariate follows a skew-normal distribution. The new model is termed the structural skew-normal probit model. As in the normal case, the likelihood function is obtained analytically which can be maximized by using existing statistical software. A Bayesian approach using Markov chain Monte Carlo techniques to generate from the posterior distributions is also developed. A simulation study demonstrates the usefulness of the approach in avoiding attenuation which is the case with the naive procedure and it seems to be more efficient than using the structural probit model when the distribution of the covariate (predictor) is skew.

Journal ArticleDOI
TL;DR: In this paper, explicit additive expressions are established for nonparametric estimators of the survival function S(t) = P(T ≥ t) of a partially observed time variable T, which have previously been defined by several methods, in particular by integral self-consistency equations.
Abstract: Nonparametric estimators of the survival function S(t) = P(T ≥ t) for a partially observed time variable T have been defined by several methods, in particular, by integral self-consistency equations. The author establishes explicit expressions of the estimators in an additive form and extends this approach to several cases: a left-truncated and right-censored variable and the left-censored or left-truncated sojourn times of a right-censored semi-Markov process. These estimators are always identical to the product-limit estimators if hazard functions may be defined.

Journal ArticleDOI
TL;DR: In this article, it was shown that when the coefficients $C_m$, $m \neq 0$, are close to 0, this PCA is close to the usual PCA, that is, the PCA in the temporal domain.
Abstract: The principal components analysis (PCA) in the frequency domain of a stationary p-dimensional time series $(X_n)_{n \in \mathbb{Z}}$ leads to a summarizing time series written as a linear combination series $X'_n = \sum_m C_m \circ X_{n-m}$. Therefore, we observe that, when the coefficients $C_m$, $m \neq 0$, are close to 0, this PCA is close to the usual PCA, that is, the PCA in the temporal domain. When the coefficients tend to 0, the corresponding limit is said to satisfy a property denoted $\mathcal{P}$, of which we will study the consequences. Finally, we will examine, for any series, the proximity between the two PCAs.

Journal ArticleDOI
TL;DR: In this article, a class of skew-normal distributions driven by the convolution of two independent random variables, one normal and one beta distributed, is studied, motivated by the numerical simulation of the oviductal egg transport in mammals.
Abstract: In this paper, we study a class of skew-normal distributions driven by the convolution of two independent random variables: a normal and a beta-distributed random variable. This problem is motivated by the numerical simulation of the oviductal egg transport in mammals, expressed as a series of microsphere instant velocities regulated by ovarian hormones including estradiol. We propose a closed-form convolution formula, represented in terms of an infinite series expanded using Hermite polynomials. We also analyse the convergence of such series and perform numerical experiments to illustrate these formulae.

Journal ArticleDOI
TL;DR: In this article, classical perturbation theory is employed to produce theoretical and empirical influence functions for partial least squares under the constraint of uncorrelated scores, and these influence functions are carefully interpreted and then applied to a protein analysis problem.
Abstract: Influence theory has been studied extensively in multivariate analysis and detailed results are available for a host of multivariate techniques, including principal components, canonical correlations, and linear discrimination. In this article, the first such results are derived for partial least squares (PLS). In particular, classical perturbation theory is employed to produce theoretical and empirical influence functions for PLS under the constraint of uncorrelated scores. These influence functions are carefully interpreted and then applied to a protein analysis problem.

Journal ArticleDOI
TL;DR: In this article, the authors introduce estimators which are based on direct approximations of the (nonobservable) best linear unbiased estimator of the coefficient β in the regression model where the regression function f is similar to the covariance kernel R of the error process N.
Abstract: Consider the problem of estimating the coefficient β in the regression model where the regression function f is similar to the covariance kernel R of the error process N, i.e., f is an element of the reproducing kernel Hilbert space associated with R. Conventional approaches discuss asymptotically optimal estimators if the kernel satisfies certain regularity conditions and if f is expressible as the image of R under an appropriate linear transformation. This paper introduces estimators which are based on direct approximations of the (nonobservable) best linear unbiased estimator of β. Regularity conditions are not required, the representation of f may also depend on derivatives of R, and particular emphasis is laid on computational stability.