
Showing papers in "Journal of Applied Statistics in 2000"


Journal ArticleDOI
TL;DR: This paper proposes a general methodology for bootstrapping in frontier models, extending the more restrictive method proposed in Simar & Wilson (1998) by allowing for heterogeneity in the structure of efficiency.
Abstract: The Data Envelopment Analysis method has been extensively used in the literature to provide measures of firms' technical efficiency. These measures allow rankings of firms by their apparent performance. The underlying frontier model is non-parametric since no particular functional form is assumed for the frontier model. Since the observations result from some data-generating process, the statistical properties of the estimated efficiency measures are essential for their interpretations. In the general multi-output multi-input framework, the bootstrap seems to offer the only means of inferring these properties (i.e. to estimate the bias and variance, and to construct confidence intervals). This paper proposes a general methodology for bootstrapping in frontier models, extending the more restrictive method proposed in Simar & Wilson (1998) by allowing for heterogeneity in the structure of efficiency. A numerical illustration with real data is provided to illustrate the methodology.

1,086 citations
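
Illustration (not from the paper): the mechanics of bootstrapping DEA efficiency scores can be sketched as follows, using an input-oriented, constant-returns DEA score computed by linear programming and a naive resampling bootstrap on simulated firms. Simar & Wilson's methodology is based on a smoothed bootstrap of the efficiency distribution (and, in this paper, allows for heterogeneity); the function names and toy data below are purely illustrative.

```python
# Illustrative sketch only: input-oriented, constant-returns DEA efficiency via
# linear programming, plus a naive resampling bootstrap. The Simar & Wilson
# methodology uses a smoothed bootstrap of the efficiency distribution; this
# simplified version just shows the mechanics of bootstrapping DEA scores.
import numpy as np
from scipy.optimize import linprog

def dea_efficiency(X, Y, o):
    """Input-oriented CRS efficiency of unit o, inputs X (n x p), outputs Y (n x q)."""
    n, p = X.shape
    q = Y.shape[1]
    # decision variables: [theta, lambda_1, ..., lambda_n]
    c = np.r_[1.0, np.zeros(n)]                       # minimise theta
    # inputs:  sum_j lambda_j x_ij - theta * x_io <= 0
    A_in = np.c_[-X[o].reshape(p, 1), X.T]
    # outputs: -sum_j lambda_j y_rj <= -y_ro
    A_out = np.c_[np.zeros((q, 1)), -Y.T]
    res = linprog(c, A_ub=np.r_[A_in, A_out], b_ub=np.r_[np.zeros(p), -Y[o]],
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.x[0]

def naive_bootstrap_dea(X, Y, o, B=200, seed=0):
    """Naive bootstrap of unit o's efficiency by resampling firms with replacement."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    scores = []
    for _ in range(B):
        idx = rng.integers(0, n, n)
        # keep the evaluated unit itself in every pseudo-sample
        Xb, Yb = np.r_[X[o][None, :], X[idx]], np.r_[Y[o][None, :], Y[idx]]
        scores.append(dea_efficiency(Xb, Yb, 0))
    return np.array(scores)

# toy data: 20 firms, 2 inputs, 1 output (purely hypothetical)
rng = np.random.default_rng(1)
X = rng.uniform(1, 10, (20, 2))
Y = (X.sum(axis=1) * rng.uniform(0.5, 1.0, 20)).reshape(-1, 1)
theta_hat = dea_efficiency(X, Y, 0)
boot = naive_bootstrap_dea(X, Y, 0)
print(f"efficiency: {theta_hat:.3f}, bootstrap sd: {boot.std():.3f}")
```

The bootstrap draws can then be used to estimate the bias and variance of the DEA score and to form percentile confidence intervals, which is the inferential goal described in the abstract.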


Journal ArticleDOI
TL;DR: In this paper, the authors consider various unresolved inference problems for the skew-normal distribution and give reasons as to why the direct parameterization should not be used as a general basis for estimation, and consider method of moments and maximum likelihood estimation for the distribution's centred parameterization.
Abstract: This paper considers various unresolved inference problems for the skew-normal distribution. We give reasons as to why the direct parameterization should not be used as a general basis for estimation, and consider method of moments and maximum likelihood estimation for the distribution's centred parameterization. Large sample theory results are given for the method of moments estimators, and numerical approaches for obtaining maximum likelihood estimates are discussed. Simulation is used to assess the performance of the two types of estimation. We also present procedures for testing for departures from the limiting folded normal distribution. Data on the percentage body fat of elite athletes are used to illustrate some of the issues raised.

202 citations
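
For context, the moment relations that underlie method-of-moments estimation for the skew-normal can be written in the standard textbook form below, using the direct parameterization SN(ξ, ω, α) with δ = α/√(1+α²); this is background, not the authors' exact development of the centred parameterization.

```latex
% Standard moment relations for the skew-normal SN(xi, omega, alpha),
% with delta = alpha / sqrt(1 + alpha^2) (textbook form, for context only):
\begin{aligned}
  \mathrm{E}(X)   &= \xi + \omega\delta\sqrt{2/\pi},\\
  \mathrm{Var}(X) &= \omega^{2}\Bigl(1 - \frac{2\delta^{2}}{\pi}\Bigr),\\
  \gamma_{1}      &= \frac{4-\pi}{2}\,
                     \frac{\bigl(\delta\sqrt{2/\pi}\bigr)^{3}}
                          {\bigl(1 - 2\delta^{2}/\pi\bigr)^{3/2}}.
\end{aligned}
```

Equating the sample skewness to γ₁ gives an estimate of δ, after which ω and ξ follow from the variance and mean equations; the centred parameters are essentially (E(X), Var(X)^{1/2}, γ₁) themselves, which is one reason the centred parameterization is better behaved for estimation.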


Journal ArticleDOI
TL;DR: In this paper, the size and power of various generalized tests for Granger causality in integrated-cointegrated VAR systems are examined using Monte Carlo methods.
Abstract: The size and power of various generalized tests for Granger causality in integrated-cointegrated VAR systems are considered. Using Monte Carlo methods, the properties of eight versions of the test are studied in two forms: the standard form and the modified form of Dolado & Lutkepohl (1996), whose study was confined to properties of the Wald test only. In their study, as well as in ours, both the standard and the modified Wald tests are shown to perform badly, especially in small samples. We find, however, that the corrected LR tests exhibit correct size even in small samples. The power of the tests is higher when the true VAR(2) model is estimated, since the modified test loses information by estimating the extra coefficients. The same is true for the power results in the VAR(3) model, where the power of the tests is somewhat lower than in the VAR(2).

174 citations
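
Illustration (my own sketch, not the paper's Monte Carlo design): the lag-augmented idea of Dolado & Lutkepohl can be coded directly with numpy — fit the y-equation with p+1 lags of both variables, then apply a Wald test to the first p lags of the candidate causal variable only. The toy data, lag order and function names below are assumptions made for the example.

```python
# Minimal sketch of a lag-augmented (Dolado-Lutkepohl style) Wald test for
# Granger non-causality from x to y: estimate the y-equation with p+1 lags of
# both variables, but restrict only the first p lags of x. Illustrative only.
import numpy as np
from scipy import stats

def lag_matrix(z, p):
    """Stack lags 1..p of the columns of z (T x k) -> (T-p) x (k*p)."""
    T = z.shape[0]
    return np.hstack([z[p - l : T - l] for l in range(1, p + 1)])

def lag_augmented_wald(y, x, p):
    data = np.column_stack([y, x])
    Z = lag_matrix(data, p + 1)                    # lags 1..p+1 of (y, x)
    Y = y[p + 1 :]
    X = np.column_stack([np.ones(len(Y)), Z])      # intercept + lags
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    sigma2 = resid @ resid / (len(Y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    # columns of Z come in blocks (y_{t-l}, x_{t-l}) for l = 1..p+1, so the
    # x-lag coefficient for lag l sits at column 2*l of X (after the intercept)
    x_idx = [2 * l for l in range(1, p + 1)]       # only the first p x-lags
    R = np.zeros((p, len(beta)))
    for row, j in enumerate(x_idx):
        R[row, j] = 1.0
    r = R @ beta
    W = r @ np.linalg.inv(R @ cov @ R.T) @ r
    return W, 1 - stats.chi2.cdf(W, df=p)

# toy example in which x Granger-causes y
rng = np.random.default_rng(0)
T = 200
x = rng.standard_normal(T).cumsum()                # I(1) regressor
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()
print(lag_augmented_wald(y, x, p=2))
```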


Journal ArticleDOI
TL;DR: In this article, the authors propose and analyse semiparametric and parametric transformation models for this two-sample problem, assuming that the healthy and diseased measurements have two normal distributions with different means and variances.
Abstract: A receiver operating characteristic (ROC) curve is a plot of two survival functions, derived separately from the diseased and healthy samples. A special feature is that the ROC curve is invariant to any monotone transformation of the measurement scale. We propose and analyse semiparametric and parametric transformation models for this two-sample problem. Following an unspecified or specified monotone transformation, we assume that the healthy and diseased measurements have two normal distributions with different means and variances. Maximum likelihood algorithms for estimating ROC curve parameters are developed. The proposed methods are illustrated on the marker CA125 in the diagnosis of gastric cancer.

111 citations
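
Illustration (assumed setup, not the paper's estimators): under the binormal model implied by a monotone transformation to normality, the ROC curve has the closed form ROC(t) = Φ(a + bΦ⁻¹(t)); the sketch below computes crude moment-based estimates of a, b and the AUC on simulated marker values, whereas the paper develops maximum likelihood algorithms.

```python
# Minimal sketch of the binormal ROC model: after a monotone transformation,
# healthy ~ N(mu0, sd0^2) and diseased ~ N(mu1, sd1^2), giving
# ROC(t) = Phi(a + b * Phi^{-1}(t)) with a = (mu1 - mu0)/sd1, b = sd0/sd1,
# and AUC = Phi(a / sqrt(1 + b^2)). Crude moment-based estimates, purely
# illustrative; the marker values are simulated, not the CA125 data.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
healthy = rng.lognormal(mean=1.0, sigma=0.4, size=200)   # hypothetical marker values
diseased = rng.lognormal(mean=1.6, sigma=0.5, size=150)

# log is the (here known) monotone transformation to normality
h, d = np.log(healthy), np.log(diseased)
a = (d.mean() - h.mean()) / d.std(ddof=1)
b = h.std(ddof=1) / d.std(ddof=1)

t = np.linspace(0.01, 0.99, 99)           # grid of false-positive rates
roc = norm.cdf(a + b * norm.ppf(t))       # corresponding true-positive rates
auc = norm.cdf(a / np.sqrt(1 + b ** 2))
print(f"a={a:.2f}, b={b:.2f}, AUC={auc:.3f}, TPR at FPR=0.10: {roc[9]:.3f}")
```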


Journal ArticleDOI
TL;DR: Six models are applied to data sets obtained from biological control assays for Diatraea saccharalis, a common pest in sugar cane production, based on a finite mixture model with the binomial distribution or its overdispersed version for the positive data.
Abstract: Biological control of pests is an important branch of entomology, providing environmentally friendly forms of crop protection. Bioassays are used to find the optimal conditions for the production of parasites and strategies for application in the field. In some of these assays, proportions are measured and, often, these data have an inflated number of zeros. In this work, six models will be applied to data sets obtained from biological control assays for Diatraea saccharalis, a common pest in sugar cane production. A natural choice for modelling proportion data is the binomial model. The second model will be an overdispersed version of the binomial model, estimated by a quasi-likelihood method. This model was initially built to model overdispersion generated by individual variability in the probability of success. When interest is only in the positive proportion data, a model can be based on the truncated binomial distribution and on its overdispersed version. The last two models include the zero proport...

83 citations
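
Illustration (a simplified stand-in for one of the six models): a zero-inflated binomial mixture in which a proportion π of replicates are structural zeros and the rest are Binomial(m, p), fitted by direct maximization of the log-likelihood on simulated data. The parameterization and data below are assumptions made for the sketch, not the authors' exact finite-mixture specification.

```python
# Minimal sketch of a zero-inflated binomial model: with probability pi the count
# is a structural zero, otherwise it is Binomial(m, p). Fitted by maximising the
# log-likelihood numerically on simulated data; illustrative only.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

rng = np.random.default_rng(42)
m = 10                                     # hosts per replicate (toy value)
true_pi, true_p = 0.3, 0.6
n = 200
zeros = rng.random(n) < true_pi
y = np.where(zeros, 0, rng.binomial(m, true_p, size=n))

def negloglik(theta):
    # unconstrained parameters mapped to (0, 1) with a logistic link
    pi = 1 / (1 + np.exp(-theta[0]))
    p = 1 / (1 + np.exp(-theta[1]))
    p0 = pi + (1 - pi) * binom.pmf(0, m, p)          # zero class
    pk = (1 - pi) * binom.pmf(y, m, p)               # positive counts
    return -np.sum(np.where(y == 0, np.log(p0), np.log(pk)))

fit = minimize(negloglik, x0=np.zeros(2), method="BFGS")
pi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
print(f"pi_hat={pi_hat:.2f}, p_hat={p_hat:.2f}")
```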


Journal ArticleDOI
TL;DR: In this paper, the analysis of Weibull distributed lifetime data observed under Type II progressive censoring with random removals, where the number of units removed at each failure time follows a binomial distribution, is presented; maximum likelihood estimators of the parameters and their asymptotic variances are derived.
Abstract: This paper considers the analysis of Weibull distributed lifetime data observed under Type II progressive censoring with random removals, where the number of units removed at each failure time follows a binomial distribution. Maximum likelihood estimators of the parameters and their asymptotic variances are derived. The expected time required to complete the life test under this censoring scheme is investigated.

83 citations
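
Illustration (hypothetical data, not from the paper): under Type II progressive censoring, the log-likelihood contribution of the i-th observed failure time t_i is log f(t_i) + R_i log S(t_i), where R_i units are withdrawn at that failure. The sketch below maximizes this for the Weibull model and reads rough asymptotic standard errors off the inverse Hessian, as a stand-in for the paper's analytic derivations.

```python
# Minimal sketch of maximum likelihood for Weibull data under Type II progressive
# censoring: at the i-th observed failure time t_i, R_i surviving units are removed,
# and the log-likelihood is sum_i [ log f(t_i) + R_i * log S(t_i) ] up to a constant.
# The small data set and removal counts below are hypothetical.
import numpy as np
from scipy.optimize import minimize

t = np.array([0.45, 0.91, 1.32, 1.80, 2.41, 3.05, 3.70])   # observed failure times
R = np.array([2, 0, 1, 0, 2, 0, 3])                        # units withdrawn at each failure

def negloglik(theta):
    shape, scale = np.exp(theta)                 # keep both parameters positive
    z = (t / scale) ** shape
    logf = np.log(shape / scale) + (shape - 1) * np.log(t / scale) - z
    logS = -z
    return -np.sum(logf + R * logS)

fit = minimize(negloglik, x0=np.log([1.0, 1.0]), method="BFGS")
shape_hat, scale_hat = np.exp(fit.x)
# rough asymptotic standard errors (on the log scale) from the inverse Hessian
se_log = np.sqrt(np.diag(fit.hess_inv))
print(f"shape={shape_hat:.2f}, scale={scale_hat:.2f}, se(log params)={se_log.round(3)}")
```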


Journal ArticleDOI
TL;DR: In this article, the authors employed the Burr distribution to conduct the economic-statistical design of X̄ charts for non-normal data, and the cumulative function of the Burr distribution was applied to derive the statistical constraints of the design.
Abstract: When the X̄ control chart is used to monitor a process, three parameters should be determined: the sample size, the sampling interval between successive samples, and the control limits of the chart. Duncan presented a cost model to determine the three parameters for an X̄ chart. Alexander et al. combined Duncan's cost model with the Taguchi loss function to present a loss model for determining the three parameters. In this paper, the Burr distribution is employed to conduct the economic-statistical design of X̄ charts for non-normal data. Alexander's loss model is used as the objective function, and the cumulative function of the Burr distribution is applied to derive the statistical constraints of the design. An example is presented to illustrate the solution procedure. From the results of the sensitivity analyses, we find that small values of the skewness coefficient have no significant effect on the optimal design; however, a larger value of skewness coefficient leads to a slightly larger sample siz...

76 citations


Journal ArticleDOI
TL;DR: This paper provides a simple and robust method for detecting cheating; non-cheating behaviour, rather than cheating behaviour, is modelled because this requires the fewest assumptions.
Abstract: This paper provides a simple and robust method for detecting cheating. Unlike some methods, non-cheating behaviour, rather than cheating behaviour, is modelled because this requires the fewest assumptions. The main concern is the prevention of false accusations. The model is suitable for screening large classes and the results are simple to interpret. Simulation and the Bonferroni inequality are used to prevent false accusation due to 'data dredging'. The model has received considerable application in practice and has been verified through the adjacent seating method.

69 citations


Journal ArticleDOI
TL;DR: A way to find the best partition of each variable using a simulated annealing strategy is described, and theoretical and empirical comparisons of two such additive models, one based on weights of evidence and the other based on logistic regression, are presented.
Abstract: In many domains, simple forms of classification rules are needed because of requirements such as ease of use. A particularly simple form splits each variable into just a few categories, assigns weights to the categories, sums the weights for a new object to be classified, and produces a classification by comparing the score with a threshold. Such instruments are often called scorecards. We describe a way to find the best partition of each variable using a simulated annealing strategy. We present theoretical and empirical comparisons of two such additive models, one based on weights of evidence and another based on logistic regression.

44 citations
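
Illustration (simplified): the weights-of-evidence scorecard idea — bin each variable, weight each bin by the log ratio of class-conditional frequencies, sum the weights for a new object and compare with a threshold. The quantile binning below is a fixed split chosen for the sketch; the paper's contribution is searching for the best partition with simulated annealing, which is not reproduced here.

```python
# Minimal sketch of a weights-of-evidence scorecard: split each variable into a few
# categories, weight each category by log( P(cat | class 1) / P(cat | class 0) ),
# score a new case by summing its category weights, and classify against a threshold.
import numpy as np

def fit_woe(x, y, n_bins=4, eps=0.5):
    """Return bin edges and WoE weights for one variable given binary labels y."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    weights = np.empty(n_bins)
    for b in range(n_bins):
        good = np.sum((bins == b) & (y == 1)) + eps      # eps smooths empty cells
        bad = np.sum((bins == b) & (y == 0)) + eps
        weights[b] = np.log((good / (y == 1).sum()) / (bad / (y == 0).sum()))
    return edges, weights

def score(x_new, edges, weights):
    return weights[np.digitize(x_new, edges)]

# toy data: two predictors, binary outcome
rng = np.random.default_rng(3)
n = 1000
X = rng.normal(size=(n, 2))
y = (X @ np.array([1.0, -0.7]) + rng.normal(size=n) > 0).astype(int)

models = [fit_woe(X[:, j], y) for j in range(X.shape[1])]
total = sum(score(X[:, j], e, w) for j, (e, w) in enumerate(models))
pred = (total > 0).astype(int)                           # threshold at 0
print(f"training accuracy: {(pred == y).mean():.2f}")
```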


Journal ArticleDOI
TL;DR: In this paper, a parsimonious model for the analysis of underreported Poisson count data is presented, for which analytic expressions for the key marginal posterior distributions of interest can be derived.
Abstract: In this paper we present a parsimonious model for the analysis of underreported Poisson count data. In contrast to previously developed methods, we are able to derive analytic expressions for the key marginal posterior distributions that are of interest. The usefulness of this model is explored via a re-examination of previously analysed data covering the purchasing of port wine (Ramos, 1999).

38 citations
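
For context, the thinning identity that typically motivates underreported-count models can be written as follows (background only; the paper's Bayesian specification and priors are not reproduced here):

```latex
% Poisson thinning identity commonly underlying underreported-count models:
N \sim \mathrm{Poisson}(\lambda), \qquad
Y \mid N \sim \mathrm{Binomial}(N, p)
\quad\Longrightarrow\quad
Y \sim \mathrm{Poisson}(\lambda p).
```

Since only the product λp is identified by the reported counts, informative prior structure of the kind used in the Bayesian analysis above is needed to separate the true rate from the reporting probability.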


Journal ArticleDOI
TL;DR: This paper shows that conversion to well-formed polynomials is beneficial in terms of goodness of fit, as well as giving fits invariant to linear transformation of the x variables.
Abstract: Well-formed polynomials contain the marginal terms of all terms; for example, they contain both x_1 and x_2 if x_1 x_2 is present. Such models have a goodness of fit that is invariant to linear transformations of the x variables. Recently, selection procedures have been proposed which may not give well-formed polynomials. Analysis of two data sets for which non-well-formed polynomials have been selected shows that conversion to well-formed polynomials is beneficial in terms of goodness of fit, as well as giving fits invariant to linear transformation of the x variables. It is concluded that selection procedures should search among well-formed polynomials only.
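
Illustration (my own representation of terms, not the paper's notation): the well-formedness condition can be checked mechanically — a model is well-formed if every marginal (lower-order) term of each included term is also included.

```python
# Small illustration of the well-formedness condition: a polynomial model is
# well-formed if every marginal term of each included term is also included,
# e.g. x1*x2 requires both x1 and x2, and x1^2 requires x1. Terms are represented
# here as dicts of variable -> power (a representation chosen for this sketch).
from itertools import product

def marginal_terms(term):
    """All lower-order terms obtained by reducing one or more exponents."""
    vars_, pows = zip(*term.items())
    for reduced in product(*[range(p + 1) for p in pows]):
        t = {v: r for v, r in zip(vars_, reduced) if r > 0}
        if t and t != dict(term):
            yield frozenset(t.items())

def is_well_formed(model):
    included = {frozenset(t.items()) for t in model}
    return all(m in included for t in model for m in marginal_terms(t))

model_a = [{"x1": 1}, {"x2": 1}, {"x1": 1, "x2": 1}]   # contains x1, x2, x1*x2
model_b = [{"x1": 1}, {"x1": 1, "x2": 1}]              # x2 is missing
print(is_well_formed(model_a), is_well_formed(model_b))  # True False
```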

Journal ArticleDOI
TL;DR: In this article, the authors propose an alternative class of Liu-type M-estimators, obtained by shrinking an M-estimator beta_M, instead of the OLS estimator, using the matrix (X'X + I)^(-1)(X'X + dI).
Abstract: Consider the regression model y = beta_0 1 + X beta + epsilon. Recently, the Liu estimator, which is an alternative biased estimator beta_L(d) = (X'X + I)^(-1)(X'X + dI) beta_OLS, where 0 ...
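
Illustration (assumed toy data): the ordinary Liu estimator quoted in the abstract is straightforward to compute; the paper's Liu-type M-estimator, which shrinks a robust M-estimator instead of the OLS estimator, is not implemented in this sketch, and the value of d is arbitrary.

```python
# Minimal sketch of the ordinary Liu estimator
# beta_L(d) = (X'X + I)^{-1} (X'X + d I) beta_OLS from the abstract above.
# The paper's Liu-type M-estimator would replace beta_OLS with a robust M-estimator.
import numpy as np

def liu_estimator(X, y, d):
    p = X.shape[1]
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ beta_ols)

rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
X[:, 3] = X[:, 2] + 0.05 * rng.normal(size=n)      # near-collinear columns
y = X @ np.array([1.0, 0.5, 2.0, -1.0]) + rng.normal(size=n)
print(liu_estimator(X, y, d=0.7).round(2))
```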

Journal ArticleDOI
TL;DR: The delta-corrected Kolmogorov-Smirnov test has been shown to be uniformly more powerful than the classical Kolmogorov-Smirnov test for small to moderate sample sizes, as discussed by the authors.
Abstract: The delta-corrected Kolmogorov-Smirnov test has been shown to be uniformly more powerful than the classical Kolmogorov-Smirnov test for small to moderate sample sizes. However, the delta-corrected test consists of two tests, leading to a slight inflation of the experimentwise type I error rate. The critical values of the delta-corrected test are adjusted to take into account the two-stage nature of the test, ensuring an experimentwise error rate at the nominal level. A power study confirms that the resulting so-called two-stage delta-corrected test is uniformly more powerful than the classical Kolmogorov-Smirnov test, with power improvements of up to 46 percentage points.

Journal ArticleDOI
TL;DR: The concept of an acceptance number has been incorporated into the single-level continuous sampling plan CSP-1 to achieve a reduction in the average fraction inspected at good quality levels.
Abstract: In this paper, the concept of an acceptance number has been incorporated into the single-level continuous sampling plan CSP-1. The advantage of the proposed plan, designated as the CSP-C plan, is to achieve a reduction in the average fraction inspected at good quality levels. Nomographs for the design of the proposed plan are presented. Expressions for the performance measures of this new plan, such as the OC, AOQ and AFI, are also provided.

Journal ArticleDOI
TL;DR: In this article, the problem of process monitoring when the process is of high quality and measurement values possess a certain serial dependence is examined and a Markov model for this type of process is studied, upon which suitable control procedures can be developed.
Abstract: For high-quality processes, non-conforming items are seldom observed and the traditional p (or np) charts are not suitable for monitoring the state of the process. A type of chart based on the count of cumulative conforming items has recently been introduced and it is especially useful for automatically collected one-at-a-time data. However, in such a case, it is common that the process characteristics become dependent as items produced one after another are inspected. In this paper, we study the problem of process monitoring when the process is of high quality and measurement values possess a certain serial dependence. The problem of assuming independence is examined and a Markov model for this type of process is studied, upon which suitable control procedures can be developed.

Journal ArticleDOI
TL;DR: In this paper, the authors show that when series are fractionally integrated, but unit root tests wrongly indicate that they are I(1), Johansen likelihood ratio (LR) tests tend to find too much spurious cointegration, while the Engle-Granger test presents a more robust performance.
Abstract: This paper shows that when series are fractionally integrated, but unit root tests wrongly indicate that they are I(1), Johansen likelihood ratio (LR) tests tend to find too much spurious cointegration, while the Engle-Granger test presents a more robust performance. This result holds asymptotically as well as in finite samples. The different performance of these two methods is due to the fact that they are based on different principles. The Johansen procedure is based on maximizing correlations (canonical correlation) while Engle-Granger minimizes variances (in the spirit of principal components).

Journal ArticleDOI
TL;DR: In this article, a quantile approach to produce a control chart and to estimate the median rankit for various non-normal distributions is discussed.
Abstract: It is desirable that the data for a statistical control chart be normally distributed. However, if the data are not normal, then a transformation can be used, e.g. a Box-Cox transformation, to produce a suitable control chart. In this paper we discuss a quantile approach to produce a control chart and to estimate the median rankit for various non-normal distributions. We also provide examples of logistic data to indicate how a quantile approach could be used to construct a control chart for a non-normal distribution using a median rankit.

Journal ArticleDOI
TL;DR: The two approaches to quasi-likelihood (QL) are described and contrasted, and an example is used to illustrate the advantages of the QL approach proper.
Abstract: Models described as using quasi-likelihood (QL) are often using a different approach based on the normal likelihood, which I call pseudo-likelihood. The two approaches are described and contrasted, and an example is used to illustrate the advantages of the QL approach proper.

Journal ArticleDOI
TL;DR: In this article, a generalized version of the staggered nested design is proposed, which has a simple open-ended structure and each sum of squares in the analysis of variance has almost the same degrees of freedom.
Abstract: Staggered nested experimental designs are the most popular class of unbalanced nested designs in practical fields. The most important features of the staggered nested design are that it has a very simple open-ended structure and that each sum of squares in the analysis of variance has almost the same degrees of freedom. Based on these features, a class of unbalanced nested designs that generalizes the staggered nested design is proposed in this paper. Formulae for the estimation of variance components and their sums are provided. Comparing the variances of the estimators with those for the traditional staggered nested design, it is found that some of the generalized staggered nested designs are more efficient than the traditional staggered nested design in estimating some of the variance components and their sums. An example is provided for illustration.

Journal ArticleDOI
TL;DR: In this article, the influence of observations on the parameter estimates for the simple structural errors-in-variables model with no equation error is investigated using the local influence method, which is useful for outlier detection especially when a masking phenomenon is present.
Abstract: The influence of observations on the parameter estimates for the simple structural errors-in-variables model with no equation error is investigated using the local influence method. Residuals themselves are not sufficient for detecting outliers. The likelihood displacement approach is useful for outlier detection especially when a masking phenomenon is present. An illustrative example is provided.

Journal ArticleDOI
TL;DR: In this article, a normalizing power transformation is applied so that traditional individual measurement control charts can be used with the transformed data; a comparison with the control chart using probability limits is also carried out for cases of known and estimated parameters.
Abstract: Many process characteristics follow an exponential distribution, and control charts based on such a distribution have attracted a lot of attention. However, traditional control limits may not be appropriate because of the lack of symmetry. In this paper, process monitoring through a normalizing power transformation is studied. The traditional individual measurement control charts can be used with the transformed data. The properties of this control chart are investigated. A comparison with the chart using probability limits is also carried out for cases of known and estimated parameters. Without losing much accuracy, even compared with the exact probability limits, the power transformation approach can easily be used to produce charts that can be interpreted in the same way as when the normality assumption is valid.

Journal ArticleDOI
TL;DR: In this article, the authors extend the widely used classical Brownian motion technique for monitoring clinical trial data to a larger class of stochastic processes, i.e., fractional Brownian Motion, and compare these results.
Abstract: The purpose of this paper is to extend the widely used classical Brownian motion technique for monitoring clinical trial data to a larger class of stochastic processes, i.e. fractional Brownian motion, and compare these results. The beta-blocker heart attack trial is presented as an example to illustrate both methods.

Journal ArticleDOI
TL;DR: In this article, the authors combine continuation-ratio logits and the theory for generalized linear mixed models to analyze the age composition of fishes as an ordered multinomial response, where the associated log-likelihood splits into separate terms for each category level.
Abstract: Major sources of information for the estimation of the size of the fish stocks and the rate of their exploitation are samples from which the age composition of catches may be determined. However, the age composition in the catches often varies as a result of several factors. Stratification of the sampling is desirable, because it leads to better estimates of the age composition, and the corresponding variances and covariances. The analysis is impeded by the fact that the response is ordered categorical. This paper introduces an easily applicable method to analyze such data. The method combines continuation-ratio logits and the theory for generalized linear mixed models. Continuation-ratio logits are designed for ordered multinomial response and have the feature that the associated log-likelihood splits into separate terms for each category level. Thus, generalized linear mixed models can be applied separately to each level of the logits. The method is illustrated by the analysis of age-composition data c...
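
For reference, the continuation-ratio logits have the standard form below (textbook notation, which may differ from the paper's):

```latex
% Continuation-ratio logits for an ordered response with categories j = 1, ..., J:
L_j \;=\; \log\frac{\Pr(Y = j)}{\Pr(Y > j)}
     \;=\; \log\frac{\Pr(Y = j \mid Y \ge j)}{\Pr(Y > j \mid Y \ge j)},
\qquad j = 1, \ldots, J-1.
```

Because the multinomial likelihood factorises into a product of binomial likelihoods, one for each conditioning event Y ≥ j, a binomial generalized linear mixed model can be fitted separately at each level, which is the property exploited above.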

Journal ArticleDOI
TL;DR: The computational algebraic techniques (Gröbner bases in particular) are coupled with statistical strategies, the links to more standard approaches are made, and a new method of analysing a non-orthogonal experiment based on the Gröbner basis method is introduced.
Abstract: The Gröbner basis method in experimental design (Pistone & Wynn, 1996) is developed in a practical setting. The computational algebraic techniques (Gröbner bases in particular) are coupled with statistical strategies, and the links to more standard approaches are made. A new method of analysing a non-orthogonal experiment based on the Gröbner basis method is introduced. Examples are given utilizing the approaches.

Journal ArticleDOI
TL;DR: In this paper, the average run length for a one-sided Cusum chart varies as a function of the length of the sampling interval between consecutive observations, the decision limit for the cusum statistic, and the amount of autocorrelation between successive observations.
Abstract: This paper shows how the average run length for a one-sided Cusum chart varies as a function of the length of the sampling interval between consecutive observations, the decision limit for the Cusum statistic, and the amount of autocorrelation between successive observations. It is shown that the rate of false alarms can be decreased considerably, without modifying the rate of valid alarms, by decreasing the sampling interval and appropriately increasing the decision interval. It is also shown that this can be done even when the shorter sampling interval induces moderate autocorrelation between successive observations.
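
Illustration (arbitrary parameter values): the in-control average run length of a one-sided CUSUM, C_t = max(0, C_{t-1} + x_t - k) with decision limit h, can be simulated directly, including the AR(1) autocorrelation that a shorter sampling interval may induce between successive observations.

```python
# Minimal sketch: simulate the in-control average run length (ARL) of a one-sided
# CUSUM C_t = max(0, C_{t-1} + x_t - k) with decision limit h, optionally with
# AR(1) autocorrelation phi between successive observations. Parameter values
# are arbitrary illustrations, not those studied in the paper.
import numpy as np

def cusum_run_length(k, h, phi=0.0, shift=0.0, rng=None, max_n=100_000):
    if rng is None:
        rng = np.random.default_rng()
    c, x, n = 0.0, 0.0, 0
    while n < max_n:
        n += 1
        x = phi * x + np.sqrt(1 - phi ** 2) * rng.standard_normal() + shift
        c = max(0.0, c + x - k)
        if c > h:
            return n
    return max_n

def average_run_length(k, h, phi=0.0, shift=0.0, reps=2000, seed=0):
    rng = np.random.default_rng(seed)
    return np.mean([cusum_run_length(k, h, phi, shift, rng) for _ in range(reps)])

print("in-control ARL, independent data:", average_run_length(k=0.5, h=4.0))
print("in-control ARL, AR(1) phi = 0.4 :", average_run_length(k=0.5, h=4.0, phi=0.4))
```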

Journal ArticleDOI
TL;DR: In this article, the distribution-free slope estimates are first trimmed on both sides and then the test statistic t is transformed by Johnson's method for each group to correct non-normality, and an approximate test such as the James second-order test, the Welch test, or the DeShon-Alexander test is applied to test the equality of regression slopes.
Abstract: To deal with the problem of non-normality and heteroscedasticity, the current study proposes applying approximate transformation trimmed mean methods to the test of simple linear regression slope equality. The distribution-free slope estimates are first trimmed on both sides and then the test statistic t is transformed by Johnson's method for each group to correct non-normality. Lastly, an approximate test such as the James second-order test, the Welch test, or the DeShon-Alexander test, which are robust for heterogeneous variances, is applied to test the equality of regression slopes. Bootstrap methods and Monte Carlo simulation results show that the proposed methods provide protection against both unusual y values, as well as unusual x values. The new methods are valid alternatives for testing the simple linear regression slopes when heteroscedastic variances and nonnormality are present.

Journal ArticleDOI
TL;DR: In this paper, the authors derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the aggregate data (mean values) from each study are available.
Abstract: We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. These two methods of estimation gave similar estimators using an example of individual data. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probab...

Journal ArticleDOI
TL;DR: A class of B-optimal modifications of the stepwise alternatives to Hotelling's T^2 is presented; these modifications have superior power properties and reasonable type I error control with non-normal populations.
Abstract: Hotelling's T^2 test is known to be optimal under multivariate normality and is reasonably validity-robust when the assumption fails. However, some recently introduced robust test procedures have superior power properties and reasonable type I error control with non-normal populations. These, including the tests due to Tiku & Singh (1982), Tiku & Balakrishnan (1988) and Mudholkar & Srivastava (1999b, c), are asymptotically valid but are useful with moderate size samples only if the population dimension is small. A class of B-optimal modifications of the stepwise alternatives to Hotelling's T^2 introduced by Mudholkar & Subbaiah (1980) are simple to implement and essentially equivalent to the T^2 test even with small samples. In this paper we construct and study the robust versions of these modified stepwise tests using trimmed means instead of sample means. We use the robust one- and two-sample trimmed-t procedures as in Mudholkar et al. (1991) and propose statistics based on combining them. The results o...

Journal ArticleDOI
TL;DR: In this paper, the authors define a composite quantile function estimator in order to improve the accuracy of the classical bootstrap procedure in small sample setting, which is easily programmed using standard software packages and has general applicability.
Abstract: In this note we define a composite quantile function estimator in order to improve the accuracy of the classical bootstrap procedure in small sample setting. The composite quantile function estimator employs a parametric model for modelling the tails of the distribution and uses the simple linear interpolation quantile function estimator to estimate quantiles lying between 1/(n+1) and n/(n+1). The method is easily programmed using standard software packages and has general applicability. It is shown that the composite quantile function estimator improves the bootstrap percentile interval coverage for a variety of statistics and is robust to misspecification of the parametric component. Moreover, it is also shown that the composite quantile function based approach surprisingly outperforms the parametric bootstrap for a variety of small sample situations.

Journal ArticleDOI
TL;DR: A modified tightened two-level continuous sampling plan, designated the modified MLP-T-2 plan, is considered, for which the rules concerning partial inspection depend on the length of time it takes to decide that the process quality is good enough that 100% inspection may be suspended (e.g. the time required to find i consecutive items free of defects).
Abstract: In this paper, a modification is proposed on the tightened two-level continuous sampling plan. The tightened two-level plan is one of the three tightened multi-level continuous sampling plans of Derman et al. (1957) with two sampling levels. A modified tightened two-level continuous sampling plan is considered, for which the rules concerning partial inspection depend, in part, on the length of time it takes to decide that the process quality is good enough that 100% inspection may be suspended (e.g. the time required to find i consecutive items free of defects). Using a Markov chain model, expressions for the performance measures of the modified MLP-T-2 plan are derived. The modified MLP-T-2 plan is shown to be identical to the MLP-T-2 plan. Tables are also presented for the selection of the modified MLP-T-2 plan when the AQL or LQL and AOQL are specified.