
Showing papers in "Metrika in 2002"


Journal ArticleDOI
01 Jan 2002-Metrika
TL;DR: Consistency factors make the least trimmed squares and minimum covariance determinant estimators consistent at the normal model but not unbiased for small data sets; this article constructs simulation-based formulas that give small-sample correction factors for all sample sizes and dimensions.
Abstract: The least trimmed squares estimator and the minimum covariance determinant estimator [6] are frequently used robust estimators of regression and of location and scatter. Consistency factors can be computed for both methods to make the estimators consistent at the normal model. However, for small data sets these factors do not make the estimator unbiased. Based on simulation studies we therefore construct formulas which allow us to compute small sample correction factors for all sample sizes and dimensions without having to carry out any new simulations. We give some examples to illustrate the effect of the correction factor.

156 citations


Journal ArticleDOI
01 Jul 2002-Metrika
TL;DR: An extension of the FGM class of bivariate distributions with given marginals is presented, which allows the new family of distributions to achieve correlation between the components greater than 0.5.
Abstract: An extension of the FGM class of bivariate distributions with given marginals is presented. For Huang-Kotz FGM distributions some theorems characterizing symmetry and conditions for independence are obtained. The new family of distributions allows us to achieve correlation between the components greater than 0.5.
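
For context on why an extension is of interest: the classical FGM (Farlie-Gumbel-Morgenstern) copula has density c(u, v) = 1 + θ(1 − 2u)(1 − 2v) with |θ| ≤ 1, and for uniform marginals its correlation equals θ/3, so it never exceeds 1/3. The sketch below checks this numerically with a midpoint rule. The function names and the grid size are illustrative choices, not taken from the paper, and the Huang-Kotz extension itself is not implemented here.

```python
def fgm_density(u, v, theta):
    """Density of the classical FGM copula on the unit square, |theta| <= 1."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


def fgm_correlation(theta, grid=200):
    """Approximate corr(U, V) for uniform marginals with a midpoint rule."""
    h = 1.0 / grid
    mids = [(i + 0.5) * h for i in range(grid)]
    e_uv = 0.0
    for u in mids:
        for v in mids:
            e_uv += u * v * fgm_density(u, v, theta) * h * h
    # Uniform(0, 1) marginals have mean 1/2 and variance 1/12.
    return (e_uv - 0.25) / (1.0 / 12.0)


if __name__ == "__main__":
    for theta in (-1.0, 0.5, 1.0):
        print(theta, round(fgm_correlation(theta), 4))
```

Even at the boundary value θ = 1 the computed correlation stays near 1/3, which is the limitation that families reaching correlations above 0.5 are designed to overcome.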

93 citations


Journal ArticleDOI
01 Oct 2002-Metrika
TL;DR: In this article, the degree of dependence between two interval-valued random sets, when the dependence is intended in the sense of an affine function relating these random elements, is determined.
Abstract: The ultimate goal of this paper is to determine a measure of the degree of dependence between two interval-valued random sets, when the dependence is intended in the sense of an affine function relating these random elements. For this purpose, a general study on the least squares fitting of an affine function for interval-valued data is first carried out, where the least squares method we will present considers that squared residuals are based on a generalized metric on the space of nonempty compact intervals, and output and input random mechanisms are modelled by means of convex compact random sets. For the general case of nondegenerate convex compact random sets, solutions are presented in an algorithmic way, and the few cases leading to nonunique solutions are characterized. On the basis of this regression study we later introduce and analyze a well-defined determination coefficient of two interval-valued random sets, which will allow us to quantify the strength of association between them, and an algorithm for the computation of the coefficient has been also designed. Finally, a real-life example illustrates the study developed in the paper.

81 citations


Journal ArticleDOI
01 Jan 2002-Metrika
TL;DR: In this article, an alternative inference technique based on a statistic which uses the highly robust MCD estimator (Rousseeuw, 1984) instead of the classical mean and covariance matrix was constructed.
Abstract: Hotelling’s \(T^2\) statistic is an important tool for inference about the center of a multivariate normal population. However, hypothesis tests and confidence intervals based on this statistic can be adversely affected by outliers. Therefore, we construct an alternative inference technique based on a statistic which uses the highly robust MCD estimator (Rousseeuw, 1984) instead of the classical mean and covariance matrix. Recently, a fast algorithm was constructed to compute the MCD (Rousseeuw and Van Driessen, 1999). In our test statistic we use the reweighted MCD, which has a higher efficiency. The distribution of this new statistic differs from the classical one. Therefore, the key problem is to find a good approximation for this distribution. Similarly to the classical \(T^2\) distribution, we obtain a multiple of a certain F-distribution. A Monte Carlo study shows that this distribution is an accurate approximation of the true distribution. Finally, the power and the robustness of the one-sample test based on our robust \(T^2\) are investigated through simulation.

58 citations


Book ChapterDOI
01 Apr 2002-Metrika
TL;DR: In this article, the authors used quantile regression to estimate the cross-sectional relationship between high school characteristics and student achievement as measured by ACT scores, where the importance of school characteristics on student achievement has been traditionally framed in terms of the effect on the expected value.
Abstract: Quantile regression is used to estimate the cross sectional relationship between high school characteristics and student achievement as measured by ACT scores. The importance of school characteristics on student achievement has been traditionally framed in terms of the effect on the expected value. With quantile regression the impact of school characteristics is allowed to be different at the mean and quantiles of the conditional distribution. Like robust estimation, the quantile approach detects relationships missed by traditional data analysis. Robust estimates detect the influence of the bulk of the data, whereas quantile estimates detect the influence of co-variates on alternate parts of the conditional distribution. Since our design consists of multiple responses (individual student ACT scores) at fixed explanatory variables (school characteristics) the quantile model can be estimated by the usual regression quantiles, but additionally by a regression on the empirical quantile at each school. This is similar to least squares where the estimate based on the entire data is identical to weighted least squares on the school averages. Unlike least squares however, the regression through the quantiles produces a different estimate than the regression quantiles.
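
As a small illustration of the building block behind regression quantiles, the sketch below uses the standard check function ρ_τ(u) = u(τ − 1{u < 0}) and shows that minimizing the summed check loss over a single constant returns an empirical quantile. Computing that quantile separately for each school is the first step of the "regression on the empirical quantile at each school" mentioned above. The school names and scores are made up for illustration, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile-regression check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_quantile(values, tau):
    """Sample point minimizing the summed check loss (an intercept-only fit)."""
    return min(values, key=lambda c: sum(check_loss(y - c, tau) for y in values))


if __name__ == "__main__":
    # Hypothetical ACT scores grouped by school (illustrative data only).
    scores_by_school = {
        "school_a": [18, 21, 24, 27, 35],
        "school_b": [15, 19, 20, 22, 23],
    }
    for school, scores in scores_by_school.items():
        print(school, [empirical_quantile(scores, t) for t in (0.1, 0.5, 0.9)])
```

The per-school quantiles could then be regressed on school characteristics; as the abstract notes, that regression generally differs from the usual regression quantile fit on the pooled data.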

58 citations


Journal ArticleDOI
M. C. Jones
01 Feb 2002-Metrika
TL;DR: In this paper, the relationships between F, skew t and beta distributions in the univariate case are extended in a natural way to the multivariate case, and two new distributions are proposed: a multivariate t/skew t distribution (on \(\mathbb{R}^m\)) and a multivariate beta distribution (on \((0,1)^m\)).
Abstract: Relationships between F, skew t and beta distributions in the univariate case are in this paper extended in a natural way to the multivariate case. The result is two new distributions: a multivariate t/skew t distribution (on \(\mathbb{R}^m\)) and a multivariate beta distribution (on \((0,1)^m\)). A special case of the former distribution is a new multivariate symmetric t distribution. The new distributions have a natural relationship to the standard multivariate F distribution (on \((\mathbb{R}^+)^m\)) and many of their properties run in parallel. We look at: joint distributions, mathematically and graphically; marginal and conditional distributions; moments; correlations; local dependence; and some limiting cases.

57 citations


Journal ArticleDOI
01 Jan 2002-Metrika
TL;DR: In this article, a construction method for mixed-level supersaturated designs consisting of two-level and three-level columns is proposed, where the \(\chi^2\) statistic is used as a measure of dependency of the design columns.
Abstract: A supersaturated design is a form of fractional factorial design in which the number of columns is greater than the number of experimental runs. Construction methods for supersaturated designs have mainly focused on two-level cases. Much practical experience, however, indicates that two levels may sometimes be inadequate. This paper proposes a construction method for mixed-level supersaturated designs consisting of two-level and three-level columns. The \(\chi^2\) statistic is used as a measure of dependency of the design columns. The dependency properties of the newly constructed designs are derived and discussed. It is shown that these new designs have low dependencies and thus can be useful in practice.

47 citations


Book ChapterDOI
01 Apr 2002-Metrika
TL;DR: In this paper, the problem of estimating risk-minimizing portfolios from a sample of historical returns is addressed, when the underlying distribution that generates returns exhibits departures from the standard Gaussian assumption.
Abstract: We address the problem of estimating risk-minimizing portfolios from a sample of historical returns, when the underlying distribution that generates returns exhibits departures from the standard Gaussian assumption. Specifically, we examine how the underlying estimation problem is influenced by marginal heavy tails, as modeled by the univariate Student-t distribution, and multivariate tail-dependence, as modeled by the copula of a multivariate Student-t distribution. We show that when such departures from normality are present, robust alternatives to the classical variance portfolio estimator have lower risk.

41 citations


Journal ArticleDOI
01 Feb 2002-Metrika
TL;DR: In this article, the authors compare \(C''_{pmk}\) with other existing generalizations of \(C_{pmk}\) on the accuracy of measuring process performance for processes with asymmetric tolerances.
Abstract: Pearn et al. (1999) considered a capability index \(C''_{pmk}\), a new generalization of \(C_{pmk}\), for processes with asymmetric tolerances. In this paper, we provide a comparison between \(C''_{pmk}\) and other existing generalizations of \(C_{pmk}\) on the accuracy of measuring process performance for processes with asymmetric tolerances. We show that the new generalization \(C''_{pmk}\) is superior to other existing generalizations of \(C_{pmk}\). Under the assumption of normality, we derive explicit forms of the cumulative distribution function and the probability density function of the estimated index \(\hat{C}''_{pmk}\). We show that the cumulative distribution function and the probability density function of the estimated index \(\hat{C}''_{pmk}\) can be expressed in terms of a mixture of the chi-square distribution and the normal distribution. The explicit forms of the cumulative distribution function and the probability density function considerably simplify the analysis of the statistical properties of the estimated index \(\hat{C}''_{pmk}\).

28 citations


Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this paper, some important randomized response strategies are compared with Warner's model, taking into account the degree of privacy protection offered to the interviewees rather than only the variances of the estimators.
Abstract: Comparisons between different randomized response strategies have already been performed by several workers, but all have concentrated solely on comparing the variances of the appropriate estimators. Very little attention has been paid by these workers to the degree of privacy protection offered to the interviewees. In the present paper, an attempt has been made in this direction and some important randomized response strategies have been compared with Warner's model, taking into account the aspect of privacy protection.

25 citations


Journal ArticleDOI
01 Oct 2002-Metrika
TL;DR: In this article, the authors consider the problem of component-wise estimation of ordered scale parameters of two gamma populations, when it is known a priori which population corresponds to each ordered parameter.
Abstract: We consider the problem of component-wise estimation of ordered scale parameters of two gamma populations, when it is known a priori which population corresponds to each ordered parameter. Under the scale equivariant squared error loss function, smooth estimators that improve upon the best scale equivariant estimators are derived. These smooth estimators are shown to be generalized Bayes with respect to a non-informative prior. Finally, using Monte Carlo simulations, these improved smooth estimators are compared with the best scale equivariant estimators, their non-smooth improvements obtained in Vijayasree, Misra & Singh (1995), and the restricted maximum likelihood estimators.

Journal ArticleDOI
01 Oct 2002-Metrika
TL;DR: A new stochastic randomized response technique is proposed that elicits greater cooperation from the respondents, and can be made more efficient than the methods of Warner (1965) and Mangat and Singh (1990) by selecting certain parameters of the proposed randomization device.
Abstract: The collection of data through personal interview surveys on sensitive issues such as induced abortions, drug abuse and family income is a serious issue in the social sciences. Warner (1965) was the first to propose an ingenious method to collect information on such questions without compromising the privacy of the respondents, which is called the randomized response technique. In the present investigation, a new stochastic randomized response technique is proposed that elicits greater cooperation from the respondents, and can be made more efficient than the methods of Warner (1965) and Mangat and Singh (1990) by selecting certain parameters of the proposed randomization device.
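
For reference, Warner's (1965) baseline design can be sketched as follows. Each respondent is shown the statement "I belong to group A" with probability p and its negation otherwise, and answers truthfully, so P(yes) = pπ + (1 − p)(1 − π), where π is the sensitive proportion. The sketch covers only this baseline, not the new technique proposed in the paper; the function names and the simulation settings are illustrative assumptions.

```python
import random


def warner_estimate(yes_count, n, p):
    """Warner (1965) estimate of the sensitive proportion pi (requires p != 0.5)."""
    if p == 0.5:
        raise ValueError("p = 0.5 carries no information about pi")
    lam_hat = yes_count / n  # observed share of "yes" answers
    return (lam_hat - (1.0 - p)) / (2.0 * p - 1.0)


def warner_variance(pi, n, p):
    """Estimator variance: sampling part plus the randomization penalty."""
    return pi * (1.0 - pi) / n + p * (1.0 - p) / (n * (2.0 * p - 1.0) ** 2)


def simulate_yes_count(pi, n, p, rng):
    """Simulate truthful answers under Warner's randomization device."""
    yes = 0
    for _ in range(n):
        in_group = rng.random() < pi
        asked_direct = rng.random() < p
        # The truthful answer is "yes" exactly when membership matches the statement.
        yes += in_group == asked_direct
    return yes


if __name__ == "__main__":
    rng = random.Random(0)
    n, p, pi = 20000, 0.7, 0.3
    yes = simulate_yes_count(pi, n, p, rng)
    print(warner_estimate(yes, n, p), warner_variance(pi, n, p))
```

The second term of the variance grows without bound as p approaches 0.5, which is the efficiency cost of stronger privacy protection that later designs try to reduce.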

Journal ArticleDOI
01 Jul 2002-Metrika
TL;DR: In this paper, the authors study regressional versions of Lukacs' characterization of the gamma law in three new schemes, considering constancy of regressions for X and Y while independence of U = X/(X + Y) and V = X + Y is assumed.
Abstract: In the paper we study regressional versions of Lukacs' characterization of the gamma law. We consider constancy of regression instead of Lukacs' independence condition in three new schemes. Up to now the constancy of regressions of U=X/(X + Y) given V=X + Y for independent X and Y has been considered in the literature. Here we are concerned with constancy of regressions for X and Y while independence of U and V is assumed instead.

Book ChapterDOI
01 Apr 2002-Metrika
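TL;DR: In this article, the authors present an algorithm for computing constrained M-estimates (CM-estimates) for regression, which requires locating the global minimum of an objective function with an inequality constraint; the algorithm can easily be modified to compute S-estimates as well and is compared with Ruppert's SURREAL algorithm.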
Abstract: Constrained M-estimators for regression were introduced by Mendes and Tyler in 1995 as an alternative class of robust regression estimators with high breakdown point and high asymptotic efficiency. To compute the CM-estimate, the global minimum of an objective function with an inequality constraint has to be localized. To find the S-estimate for the same problem, we instead restrict ourselves to the boundary of the feasible region. The algorithm presented for computing CM-estimates can easily be modified to compute S-estimates as well. Testing is carried out with a comparison to the algorithm SURREAL by Ruppert.

Journal ArticleDOI
01 Oct 2002-Metrika
TL;DR: In this article, the estimation of the population mean is studied when ranked set sampling (RSS) is used for selecting the sample and nonresponse (NR) is present, where the NR stratum is subsampled using simple random sampling with replacement.
Abstract: The estimation of the population mean is studied when ranked set sampling (RSS) is used for selecting the sample and nonresponse (NR) is present. The NR stratum is subsampled using simple random sampling with replacement. Two strategies are analyzed. One of them is based on the selection of a subsample from the NR units in each cycle. The other uses subsamples selected among the NR units in each rank.
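
The basic ranked set sampling step can be sketched as follows: in each cycle, draw as many sets as the set size, rank each set, and measure only the i-th ranked unit of the i-th set. The sketch assumes perfect ranking on the study variable itself and independent draws of the sets, and it does not model the nonresponse subsampling strategies analyzed in the paper. All names are illustrative.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Draw a balanced ranked set sample, assuming perfect ranking.

    In every cycle, set_size sets of set_size units are drawn; only the
    i-th smallest unit of the i-th set is kept for measurement.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidate_set = sorted(rng.sample(population, set_size))
            measured.append(candidate_set[rank])
    return measured


def rss_mean(sample):
    """Ranked set sampling estimate of the population mean."""
    return sum(sample) / len(sample)


if __name__ == "__main__":
    rng = random.Random(1)
    population = list(range(1, 101))  # illustrative population with mean 50.5
    sample = ranked_set_sample(population, set_size=3, cycles=4, rng=rng)
    print(len(sample), rss_mean(sample))
```

Only set_size × cycles units are measured, although set_size² × cycles units are drawn and ranked.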

Book ChapterDOI
01 Apr 2002-Metrika
TL;DR: In this article, the asymptotic behavior of a wide class of kernel estimators for estimating an unknown regression function is studied, in particular at discontinuity points of the regression function.
Abstract: We study the asymptotic behavior of a wide class of kernel estimators for estimating an unknown regression function. In particular we derive the asymptotic behavior at discontinuity points of the regression function. It turns out that some kernel estimators based on outlier robust estimators are consistent at jumps.

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this paper, an estimator of the conditional bias is proposed and conditions that guarantee its unbiasedness are studied; the results are applied in a Simple Random Sampling and in a Proportional Probability Aggregated Size Sampling, when the ratio estimator is used.
Abstract: The conditional bias has been proposed by Moreno Rebollo et al. (1999) as an influence diagnostic in survey sampling, when the inference is based on the randomization distribution generated by a random sampling. The conditional bias is a population parameter. So, from an applied point of view, it must be estimated. In this paper, we propose an estimator of the conditional bias and we study conditions that guarantee its unbiasedness. The results are applied in a Simple Random Sampling and in a Proportional Probability Aggregated Size Sampling, when the ratio estimator is used.

Journal ArticleDOI
01 Dec 2002-Metrika
TL;DR: The history of Latin and Eulerian squares from the times even before Leonhard Euler (1707–1783) to their systematic use in the design of experiments by (Sir) Ronald Aylmer Fisher (1890–1962) is described in this article.
Abstract: The article presents some aspects of the history of Latin and, especially, Eulerian squares from the times even before Leonhard Euler (1707–1783) to their systematic use in the design of experiments by (Sir) Ronald Aylmer Fisher (1890–1962).

Journal ArticleDOI
01 Dec 2002-Metrika
TL;DR: In this article, it is shown that rearranging the sequence of test treatments and the control treatment within each block of an A-optimal balanced treatment block design, so that certain conditions are satisfied, yields an A-optimal repeated measurements design when blocks are regarded as units or periods.
Abstract: If the sequence of test treatments and the control treatment within each block of an A-optimal balanced treatment block design is rearranged so that certain conditions are satisfied, the resulting design is an A-optimal repeated measurements design when blocks are regarded as units or periods. The efficiencies of designs obtained from universally optimal repeated measurements designs with test treatments only, by changing some treatment labels into the control treatment, are given.

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this paper, a closed-form expression is derived for the proportion which maximizes the maximum deviation of the scale-contaminated normal (a mixture of two normals) from the best normal distribution.
Abstract: In this paper we consider the case of the scale-contaminated normal (mixture of two normals with equal mean components but different component variances: \((1-p)N(\mu,\sigma^2)+pN(\mu,\tau^2)\) with σ and τ being non-negative and 0≤p≤1). Here \(\) is the scale error and p denotes the amount with which this error occurs. Its maximum deviation from the best normal distribution is studied and shown to be monotone increasing with increasing scale error. A closed-form expression is derived for the proportion which maximizes the maximum deviation of the mixture of normals from the best normal distribution. Implications for power studies of tests for normality are pointed out.
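
To make the object of study concrete, the sketch below evaluates the CDF of the scale-contaminated normal and scans a grid for its largest CDF gap to a normal distribution with the same mean and variance. The moment-matched normal is only a convenient stand-in and need not coincide with the "best normal distribution" in the paper's sense. The grid scan and all function names are likewise assumptions made for illustration, and σ and τ are taken to be strictly positive.

```python
import math


def normal_cdf(x, mu, sd):
    """CDF of N(mu, sd^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))


def mixture_cdf(x, mu, sigma, tau, p):
    """CDF of (1 - p) N(mu, sigma^2) + p N(mu, tau^2)."""
    return (1.0 - p) * normal_cdf(x, mu, sigma) + p * normal_cdf(x, mu, tau)


def mixture_sd(sigma, tau, p):
    """Standard deviation of the mixture (both components share the mean)."""
    return math.sqrt((1.0 - p) * sigma ** 2 + p * tau ** 2)


def max_deviation(mu, sigma, tau, p, grid=4001, width=8.0):
    """Largest CDF gap to the moment-matched normal, scanned on a grid."""
    sd = mixture_sd(sigma, tau, p)
    xs = [mu + sd * width * (2.0 * i / (grid - 1) - 1.0) for i in range(grid)]
    return max(
        abs(mixture_cdf(x, mu, sigma, tau, p) - normal_cdf(x, mu, sd)) for x in xs
    )


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.3):
        print(p, max_deviation(0.0, 1.0, 3.0, p))
```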

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this paper, a discrete model for life lengths and its properties is introduced and the corresponding maximum likelihood estimators of the parameters under Type I and Type II right-censoring are derived.
Abstract: Although there are many sophisticated models for estimation of failure rate based on censored data in continuous distributions, not much work has been done in the discrete case. We introduce a discrete model for life lengths and consider its properties. For this model, we derive the corresponding maximum likelihood estimators of the parameters under Type I and Type II right-censoring.

Journal ArticleDOI
01 Oct 2002-Metrika
TL;DR: In this article, a nonlinear regression model with N observations \(y_i=\eta(x_i,\theta)+e_i\), and with the parameter θ subject to q nonlinear constraints \(C_j(\theta)=0\), \(j=1,\dots,q\), is considered.
Abstract: The nonlinear regression model with N observations \(y_i=\eta(x_i,\theta)+e_i\), and with the parameter θ subject to q nonlinear constraints \(C_j(\theta)=0\), \(j=1,\dots,q\), is considered. As an example, the spline regression with unknown nodes is taken. Expressions for the variances (variance matrices) of the LSE are discussed. Because of the complexity of these expressions, and the singularity of the variance matrix of the LSE for θ, the optimality criteria and their properties, in particular the convexity and the equivalence theorem, are considered from different aspects. Also the possibility of restriction to designs with limited values of measures of nonlinearity is mentioned.

Journal ArticleDOI
01 Jan 2002-Metrika
TL;DR: In this paper, the authors consider M-estimators for a class of semiparametric mixed-effect models without time-dependent covariates and show that the simple marginal estimation method is generally better than the same M-estimator applied to the de-correlated response based on a known or estimated covariance matrix for each subject.
Abstract: We consider M-estimators for a class of semiparametric mixed-effect models without time-dependent covariates and show that the simple marginal estimation method is generally better than the same M-estimator applied to the de-correlated response based on a known or estimated covariance matrix for each subject.

Journal ArticleDOI
01 Apr 2002-Metrika
TL;DR: This article derives the asymptotic distributions of two robust scale estimators, the median of the absolute residuals and the median of the absolute differences of pairwise residuals, for a large class of nonlinear regression and autoregressive models when the errors are independent and identically distributed.
Abstract: Often in the robust analysis of regression and time series models there is a need for a robust estimator of a scale parameter of the errors. One often used scale estimator is the median of the absolute residuals \(s_1\). It is of interest to know its limiting distribution and the consistency rate. Its limiting distribution generally depends on the estimator of the regression and/or autoregressive parameter vector unless the errors are symmetrically distributed around zero. To overcome this difficulty it is then natural to use the median of the absolute differences of pairwise residuals, \(s_2\), as a scale estimator. This paper derives the asymptotic distributions of these two estimators for a large class of nonlinear regression and autoregressive models when the errors are independent and identically distributed. It is found that the asymptotic distribution of a suitably standardized \(s_2\) is free of the initial estimator of the regression/autoregressive parameters. A similar conclusion also holds for \(s_1\) in linear regression models through the origin and with centered designs, and in linear autoregressive models with zero mean errors.
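
The two scale estimators can be computed directly from a vector of residuals. The sketch below follows the verbal definitions in the abstract without any standardizing constant; the function names `s1` and `s2` and the example residuals are chosen for illustration.

```python
from itertools import combinations
from statistics import median


def s1(residuals):
    """Median of the absolute residuals."""
    return median(abs(r) for r in residuals)


def s2(residuals):
    """Median of the absolute differences of all pairs of residuals."""
    return median(abs(a - b) for a, b in combinations(residuals, 2))


if __name__ == "__main__":
    residuals = [1, -2, 3, -4, 10]
    shifted = [r + 5 for r in residuals]  # same residuals with a common offset
    print(s1(residuals), s2(residuals))
    print(s1(shifted), s2(shifted))
```

Adding a common offset to every residual changes `s1` but leaves `s2` untouched, because the offset cancels in each pairwise difference. This is only a finite-sample illustration of why the second estimator is less sensitive to location, not a demonstration of the asymptotic result.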

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this article, the authors considered the sequential point estimation problem of the powers of a normal scale parameter \(\sigma^r\) with \(r\neq 0\) when the loss function is squared error plus linear cost, and showed that the regret due to using their fully sequential procedure in ignorance of σ is asymptotically minimized for estimating \(\sigma^{-2}\).
Abstract: We consider the sequential point estimation problem of the powers of a normal scale parameter \(\sigma^r\) with \(r\neq 0\) when the loss function is squared error plus linear cost. It is shown that the regret due to using our fully sequential procedure in ignorance of σ is asymptotically minimized for estimating \(\sigma^{-2}\). We also propose a bias-corrected procedure to reduce the risk and show that the larger the distance between r and \(-2\) is, the more effective our bias-corrected procedure is.

Journal ArticleDOI
01 Jul 2002-Metrika
TL;DR: It is proved that, for estimating a continuous distribution function under two entropy loss functions, the best invariant estimators \(d_0\) exist and coincide with the best invariant estimator under the squared error loss function \(L(F,d)=\int |F(t)-d(t)|^2\,dF(t)\).
Abstract: For the invariant decision problem of estimating a continuous distribution function F with two entropy loss functions, it is proved that the best invariant estimators \(d_0\) exist and are the same as the best invariant estimator of a continuous distribution function under the squared error loss function \(L(F,d)=\int |F(t)-d(t)|^2\,dF(t)\). They are minimax for any sample size n≥1.

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: In this article, the nonnull distribution of the product moment correlation coefficient r is obtained when the sample is drawn from a mixture of two bivariate Gaussian distributions, and the moments of \(1-r^2\) are used to derive the nonnull density of r.
Abstract: The present paper obtains the nonnull distribution of the product moment correlation coefficient r when the sample is drawn from a mixture of two bivariate Gaussian distributions. The moments of \(1-r^2\) have been used to derive the nonnull density of r.

Journal ArticleDOI
01 Dec 2002-Metrika
TL;DR: Within the framework of the classical linear regression model, optimal design criteria of a stochastic nature are considered in this article, where particular attention is paid to the shape criterion and its limit behaviour is established.
Abstract: Within the framework of the classical linear regression model, optimal design criteria of a stochastic nature are considered. Particular attention is paid to the shape criterion. Its limit behaviour is also established, which generalizes that of the distance stochastic optimality criterion. Examples of the limit maximin criterion are considered and optimal designs for the line fit model are found.

Book ChapterDOI
01 Apr 2002-Metrika
TL;DR: Empirical results suggest that the EA is faster and more accurate than the usual p-subset algorithm for computing the least quartile difference estimate in a multiple linear regression model.
Abstract: We propose an exchange algorithm (EA) for computing the least quartile difference estimate in a multiple linear regression model. Empirical results suggest that the EA is faster and more accurate than the usual p-subset algorithm.

Journal ArticleDOI
01 Jun 2002-Metrika
TL;DR: The authors consider the design problem for specific types of SS-ANOVA models and derive, as criteria for choosing the design points, the integrated mean squared error (IMSE) of the SS-ANOVA estimate and its asymptotic approximation.
Abstract: Smoothing spline estimation of a function of several variables based on an analysis of variance decomposition (SS-ANOVA) is one modern nonparametric technique. This paper considers the design problem for specific types of SS-ANOVA models. As criteria for choosing the design points, the integrated mean squared error (IMSE) for the SS-ANOVA estimate and its asymptotic approximation are derived based on the correspondence between the SS-ANOVA model and the random effects model with a partially improper prior. Three examples for additive and interaction spline models are provided for illustration. A comparison of the asymptotic designs, the \(2^d\) factorial designs, and the glp designs is given by numerical computation.