Showing papers in "Journal of Applied Statistics in 1998"
••
TL;DR: Parametric and non-parametric approaches to warping, and matching criteria, are reviewed.
Abstract: Summary Image warping is a transformation which maps all positions in one image plane to positions in a second plane. It arises in many image analysis problems, whether in order to remove optical distortions introduced by a camera or a particular viewing perspective, to register an image with a map or template, or to align two or more images. The choice of warp is a compromise between a smooth distortion and one which achieves a good match. Smoothness can be ensured by assuming a parametric form for the warp or by constraining it using differential equations. Matching can be specified by points to be brought into alignment, by local measures of correlation between images, or by the coincidence of edges. Parametric and non-parametric approaches to warping, and matching criteria, are reviewed.
337 citations
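For readers who want a concrete anchor, here is a minimal numpy sketch of the parametric route mentioned in the abstract: an affine warp fitted to matched landmark points by least squares. The landmark coordinates and helper names (`fit_affine_warp`, `apply_affine_warp`) are illustrative, not from the paper.

```python
# Estimate an affine warp (u, v) = (a0 + a1*x + a2*y, b0 + b1*x + b2*y)
# from matched landmark points by least squares, then apply it.
import numpy as np

def fit_affine_warp(src, dst):
    """Least-squares affine map taking src (n, 2) points onto dst (n, 2)."""
    n = src.shape[0]
    A = np.column_stack([np.ones(n), src[:, 0], src[:, 1]])
    coef_u, *_ = np.linalg.lstsq(A, dst[:, 0], rcond=None)
    coef_v, *_ = np.linalg.lstsq(A, dst[:, 1], rcond=None)
    return coef_u, coef_v

def apply_affine_warp(coef_u, coef_v, pts):
    A = np.column_stack([np.ones(pts.shape[0]), pts[:, 0], pts[:, 1]])
    return np.column_stack([A @ coef_u, A @ coef_v])

src = np.array([[0, 0], [1, 0], [0, 1], [1, 1.0]])            # made-up landmarks
dst = np.array([[0.1, 0.05], [1.05, 0.1], [0.0, 1.1], [1.1, 1.15]])
cu, cv = fit_affine_warp(src, dst)
print(apply_affine_warp(cu, cv, src))   # should be close to dst
```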
••
TL;DR: In this article, the authors compare properties of parameter estimators under Akaike information criterion (AIC) and consistent AIC (CAIC) model selection in a nested sequence of open population capture-recapture models.
Abstract: Summary We compare properties of parameter estimators under Akaike information criterion (AIC) and 'consistent' AIC (CAIC) model selection in a nested sequence of open population capture-recapture models. These models consist of product multinomials, where the cell probabilities are parameterized in terms of survival (φ_i) and capture (p_i) probabilities for each time interval i. The sequence of models is derived from 'treatment' effects that might be (1) absent, model H_0; (2) only acute, model H_2p; or (3) acute and chronic, lasting several time intervals, model H_3. Using a 3^5 factorial design, 1000 repetitions were simulated for each of 243 cases. The true number of parameters ranged from 7 to 42, and the sample size ranged from approximately 470 to 55 000 per case. We focus on the quality of the inference about the model parameters and model structure that results from the two selection criteria. We use achieved confidence interval coverage as an integrating metric to judge what constitutes a ...
255 citations
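The two criteria being compared have simple closed forms: AIC = -2 ln L + 2K and Bozdogan's 'consistent' CAIC = -2 ln L + K(ln n + 1). A small sketch, with invented log-likelihoods and parameter counts standing in for the fitted capture-recapture models:

```python
# AIC vs CAIC for a set of hypothetical nested fits; K = parameter count.
import math

def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k

def caic(loglik, k, n):
    return -2.0 * loglik + k * (math.log(n) + 1.0)

# Invented (loglik, K) pairs for illustration, with n = 5000 captures:
fits = {"H0": (-1250.0, 7), "H2p": (-1238.0, 12), "H3": (-1234.5, 20)}
n = 5000
for name, (ll, k) in fits.items():
    print(f"{name}: AIC={aic(ll, k):.1f}  CAIC={caic(ll, k, n):.1f}")
# CAIC penalizes parameters more heavily, so it tends to select smaller models.
```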
••
TL;DR: The robustness of the t-test was evaluated by repeated computer testing for differences between samples from two populations of equal means but non-normal distributions and with different variances and sample sizes as discussed by the authors.
Abstract: Summary When the assumptions of parametric statistical tests for the difference between two means are violated, it is commonly advised that non-parametric tests are a more robust substitute. The history of the investigation of this issue is summarized. The robustness of the t-test was evaluated by repeated computer testing for differences between samples from two populations of equal means but non-normal distributions and with different variances and sample sizes. Two common alternatives to t, Welch's approximate t and the Mann-Whitney U-test, were evaluated in the same way. The t-test is sufficiently robust for use in all likely cases, except when skew is severe or when population variances and sample sizes both differ. The Welch test satisfactorily addressed the latter problem, but was itself sensitive to departures from normality. Contrary to its popular reputation, the U-test showed a dramatic 'lack of robustness' in many cases, largely because it is sensitive to population differences other than b...
91 citations
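The simulation design is easy to reproduce in miniature. A hedged sketch of one cell: empirical type I error of the pooled t-test, Welch's t-test and the Mann-Whitney U-test when the populations share a mean but differ in variance and sample size. The distributions, sizes and replication count are illustrative choices, not the paper's design.

```python
# Monte Carlo size comparison of three two-sample tests under unequal
# variances and unequal sample sizes (both populations have mean 0).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n1, n2, reps, alpha = 10, 40, 5000, 0.05
rejections = {"pooled t": 0, "Welch t": 0, "Mann-Whitney U": 0}
for _ in range(reps):
    x = rng.normal(0.0, 3.0, n1)   # small sample, large variance
    y = rng.normal(0.0, 1.0, n2)   # large sample, small variance
    if stats.ttest_ind(x, y, equal_var=True).pvalue < alpha:
        rejections["pooled t"] += 1
    if stats.ttest_ind(x, y, equal_var=False).pvalue < alpha:
        rejections["Welch t"] += 1
    if stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha:
        rejections["Mann-Whitney U"] += 1
for name, r in rejections.items():
    print(f"{name}: empirical size = {r / reps:.3f} (nominal 0.05)")
```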
••
TL;DR: A new index C″pk is introduced, which is shown to be superior to the existing generalizations of Cpk, and the statistical properties of the natural estimator of C″pk are investigated, assuming that the process is normally distributed.
Abstract: Summary The process capability index Cpk has been widely used in manufacturing industry to provide numerical measures of process potential and performance. As noted by many quality control researchers and practitioners, Cpk is yield-based and is independent of the target T. This fails to account for process centering with symmetric tolerances, and presents an even greater problem with asymmetric tolerances. To overcome the problem, several generalizations of Cpk have been proposed to handle processes with asymmetric tolerances. Unfortunately, these generalizations understate or overstate the process capability in many cases, so reflect the process potential and performance inaccurately. In this paper, we first introduce a new index C″pk, which is shown to be superior to the existing generalizations of Cpk. We then investigate the statistical properties of the natural estimator of C″pk, assuming that the process is normally distributed.
63 citations
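For context, the yield-based index the paper generalizes is Cpk = min(USL - μ, μ - LSL)/(3σ), alongside Cp = (USL - LSL)/(6σ). The sketch below estimates these classical indices from data; the paper's new index C″pk is defined in the paper itself and is not reproduced here.

```python
# Classical Cp and Cpk estimated from a sample of process measurements.
import numpy as np

def cp_cpk(x, lsl, usl):
    mu, sigma = np.mean(x), np.std(x, ddof=1)
    cp = (usl - lsl) / (6.0 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3.0 * sigma)
    return cp, cpk

rng = np.random.default_rng(7)
x = rng.normal(10.2, 0.5, 200)        # simulated process, off-centre mean
print(cp_cpk(x, lsl=8.5, usl=11.5))   # Cpk < Cp because the process is off-centre
```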
••
TL;DR: In this article, the authors discuss the use of the process capability indexes Cp and Cpk when the process data are autocorrelated, and propose an interval estimation procedure for each of them.
Abstract: Process capability indexes are widely used in the manufacturing industries and by supplier companies in process assessments and in the evaluation of purchasing decisions. One concern about using the process capability indexes is the assumption of the mutual independence of the process data, because, in process industries, process data are often autocorrelated. This paper discusses the use of the process capability indexes Cp and Cpk when the process data are autocorrelated. Interval estimation procedures for Cp and Cpk are proposed and their properties are studied.
61 citations
••
TL;DR: This article investigated the relationship between economic growth and carbon dioxide emissions and found that richer countries exhibit technical progress in a way that economizes on CO2 emissions but that poorer countries do not, and there is no indication that the growth process is leading poorer countries to move towards the same pollution-ameliorating technology as characterizes richer countries.
Abstract: This paper uses data for 44 countries from 1970-1990, to investigate the relationship between economic growth and carbon dioxide emissions. Empirical results are obtained from a structural model from the empirical growth literature modified to include environmental 'bads'. Results suggest that richer countries exhibit technical progress in a way that economizes on carbon dioxide emissions but that poorer countries do not. Furthermore, there is no indication that the growth process is leading poorer countries to move towards the adoption of the same pollution-ameliorating technology as characterizes richer countries.
54 citations
••
TL;DR: In this paper, the effect of stochastic measurement error (gauge imprecision) on the performance of Shewhart-type X-S control charts was examined and shown that gauge imprecision may seriously affect the ability of the chart to detect process disturbances quickly or, depending on the point in time when the error occurs, the probability of erroneously signalling an out-of-control process state.
Abstract: This paper examines the effect of stochastic measurement error (gauge imprecision) on the performance of Shewhart-type X-S control charts. It is shown that gauge imprecision may seriously affect the ability of the chart to detect process disturbances quickly or, depending on the point in time when the error occurs, the probability of erroneously signalling an out-of-control process state.
50 citations
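The mechanism is simple to quantify under textbook assumptions: gauge noise inflates the observed standard deviation, which widens the control limits relative to any true process shift. A sketch for a 3-sigma X-bar chart with normal errors; the parameter values are illustrative, not the paper's.

```python
# Detection probability of a mean shift on an X-bar chart when each
# measurement carries independent gauge noise.
from scipy.stats import norm

def detect_prob(shift_sd, n, gauge_ratio):
    """P(signal) for a shift of shift_sd process-sigmas with subgroups of n,
    when gauge sd = gauge_ratio * process sd and limits use the observed sd."""
    infl = (1.0 + gauge_ratio**2) ** 0.5        # observed sd / process sd
    d = shift_sd * n**0.5 / infl                # standardized shift on the chart
    return norm.sf(3.0 - d) + norm.cdf(-3.0 - d)

for r in (0.0, 0.5, 1.0):
    print(f"gauge/process sd = {r}: P(signal | 1.5-sigma shift) = "
          f"{detect_prob(1.5, 5, r):.3f}")
```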
••
TL;DR: In this paper, the concept of chain sampling is extended to variables inspection when the standard deviation of the normally distributed characteristic is known, and a discussion of the shape of the known sigma single-sampling variables plan is given.
Abstract: Summary This paper extends the concept of chain sampling to variables inspection when the standard deviation of the normally distributed characteristic is known. A discussion of the shape of the known sigma single-sampling variables plan is given. The chain sampling plan for variables inspection will be useful when testing is costly or destructive.
44 citations
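As background for the chained plan, the known-sigma single-sampling variables plan accepts a lot when (USL - x̄)/σ ≥ k; at incoming fraction nonconforming p, (USL - μ)/σ = Φ⁻¹(1 - p), so the operating characteristic is P(accept) = Φ(√n (z_{1-p} - k)). A sketch with example n and k (not values from the paper):

```python
# OC curve of the known-sigma single-sampling variables plan.
import numpy as np
from scipy.stats import norm

def oc_known_sigma(p, n, k):
    zp = norm.ppf(1.0 - p)              # standardized distance of mu from USL
    return norm.cdf(np.sqrt(n) * (zp - k))

n, k = 10, 2.0                          # illustrative plan parameters
for p in (0.001, 0.01, 0.05, 0.10):
    print(f"p = {p:.3f}: P(accept) = {oc_known_sigma(p, n, k):.3f}")
```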
••
TL;DR: In this paper, the authors present reliability sampling plans for the two-parameter exponential distribution under progressive censoring, which are quite useful to practitioners, because they provide savings in resources and in total test time.
Abstract: This paper presents reliability sampling plans for the two-parameter exponential distribution under progressive censoring. These sampling plans are quite useful to practitioners, because they provide savings in resources and in total test time. Furthermore, they offer the flexibility to remove functioning test specimens from further testing at various stages of the experimentation. In the construction of these sampling plans, the operating characteristic curve is derived using the exact distributional properties of maximum likelihood estimators. An example is given to illustrate the application of the proposed sampling plans.
40 citations
••
TL;DR: In this article, the authors proposed a generalization of Tomizawa's measures by using the average of the power divergence of Cressie and Read, or the average of the diversity index of Patil and Taillie.
Abstract: For square contingency tables that have nominal categories, Tomizawa considered two kinds of measure to represent the degree of departure from symmetry. This paper proposes a generalization of those measures. The proposed measure is expressed by using the average of the power divergence of Cressie and Read, or the average of the diversity index of Patil and Taillie. Special cases of the proposed measure include Tomizawa's measures. The proposed measure would be useful for comparing the degree of departure from symmetry in several tables.
39 citations
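A hedged sketch of the power-divergence building block: compare each off-diagonal cell probability with its symmetrized counterpart (p_ij + p_ji)/2 under the Cressie-Read divergence. The normalization that yields the proposed measure (and Tomizawa's measures as special cases) is defined in the paper and omitted here.

```python
# Cressie-Read power divergence of a square table from its symmetrized form:
# I_lam(p:q) = sum p*((p/q)^lam - 1) / (lam*(lam + 1)); zero iff symmetric.
import numpy as np

def power_divergence_from_symmetry(table, lam=2.0 / 3.0):
    p = table / table.sum()
    i, j = np.triu_indices_from(p, k=1)
    pij, pji = p[i, j], p[j, i]
    obs = np.concatenate([pij, pji])
    sym = np.concatenate([(pij + pji) / 2, (pij + pji) / 2])
    mask = obs > 0                       # skip empty cells
    return (np.sum(obs[mask] * ((obs[mask] / sym[mask]) ** lam - 1.0))
            / (lam * (lam + 1.0)))

table = np.array([[20, 10, 5], [4, 30, 8], [2, 6, 25]], dtype=float)
print(power_divergence_from_symmetry(table))   # 0 iff the table is symmetric
```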
••
TL;DR: In this paper, a non-linear model for examining genotypic responses across an array of environments is contrasted with the joint regression formulation, and a rigorous approach to hypothesis testing using the conditional error principle is demonstrated.
Abstract: Summary A non-linear model for examining genotypic responses across an array of environments is contrasted with the 'joint regression' formulation, and a rigorous approach to hypothesis testing using the conditional error principle is demonstrated. The model is extended to cater for situations where single straight-line response patterns fail to characterize genotypic behaviors over an environmental array: a combination of two straight lines, with slope β_1 in below-average and β_2 in above-average environments, is offered as the simplest representation of convex and concave patterns. A protocol for classifying genotypes according to the results of hypothesis tests, i.e. H(β_1 = β_2) and H(β_1 = β_2 = 1), is presented. A doubly desirable response pattern is convex (β_2 > 1 > β_1).
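The two-line response is straightforward to fit by least squares once the environmental index is centred. A sketch on simulated data (the conditional-error hypothesis tests themselves are not reproduced):

```python
# Fit y = a + beta1*min(e, 0) + beta2*max(e, 0), e = centred environment index.
import numpy as np

def fit_two_lines(env, y):
    e = env - env.mean()
    X = np.column_stack([np.ones_like(e), np.minimum(e, 0), np.maximum(e, 0)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # (intercept, beta1, beta2)

rng = np.random.default_rng(3)
env = np.linspace(-2, 2, 40)
y = 5 + 0.6 * np.minimum(env, 0) + 1.4 * np.maximum(env, 0) \
    + rng.normal(0, 0.2, 40)
a, b1, b2 = fit_two_lines(env, y)
print(f"beta1 = {b1:.2f}, beta2 = {b2:.2f}")   # convex if beta2 > beta1
```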
••
TL;DR: In this article, the expected experiment times for Weibull-distributed lifetimes under type II progressive censoring, with the numbers of removals being random, were investigated.
Abstract: Summary This paper considers the expected experiment times for Weibull-distributed lifetimes under type II progressive censoring, with the numbers of removals being random. The formula to compute the expected experiment times is given. A detailed numerical study of this expected time is carried out for different combinations of model parameters. Furthermore, the ratio of the expected experiment time under this type of progressive censoring to the expected experiment time under complete sampling is studied.
••
TL;DR: In this article, the authors consider the case in which stress levels are changed at a finite rate, and develop two types of ALT plan under the assumptions of exponential lifetimes of test units and type I censoring.
Abstract: Summary Most of the previous work on optimal design of accelerated life test (ALT) plans has assumed instantaneous changes in stress levels, which may not be possible or desirable in practice, because of the limited capability of test equipment, possible stress shocks or the presence of undesirable failure modes. We consider the case in which stress levels are changed at a finite rate, and develop two types of ALT plan under the assumptions of exponential lifetimes of test units and type I censoring. One type of plan is the modified step-stress ALT plan, and the other type is the modified constant-stress ALT plan. These two plans are compared in terms of the asymptotic variance of the maximum likelihood estimator of the log mean lifetime for the use condition (i.e. Avar[ln θ̂(0)]). Computational results indicate that, for both types of plan, Avar[ln θ̂(0)] is not sensitive to the stress-increasing rate R, if R is greater than or equal to 10, say, in the standardized scale. This implies that the proposed str...
••
TL;DR: In this article, the authors have estimated vector autoregression (VAR), BVAR, and vector error-correction models (VECMs) using annual time-series data of South Korea for 1950-94.
Abstract: In this paper, we have estimated vector autoregression (VAR), Bayesian vector autoregression (BVAR) and vector error-correction models (VECMs) using annual time-series data of South Korea for 1950-94. We find evidence supporting the view that growth of real per-capita income has been aided by income, investment and export growth, as well as government spending and exchange rate policies. The VECMs provide better forecasts of growth than do the VAR and BVAR models for both short-term and long-term predictions.
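A minimal statsmodels sketch of the model classes being compared, on synthetic series standing in for the Korean data (the variable names, lag settings and cointegration rank are illustrative choices):

```python
# VAR in differences vs a vector error-correction model on two cointegrated
# synthetic series sharing a stochastic trend.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR
from statsmodels.tsa.vector_ar.vecm import VECM

rng = np.random.default_rng(0)
t = 45                                     # roughly the 1950-94 span
common = rng.normal(size=t).cumsum()       # shared stochastic trend
data = pd.DataFrame({
    "income":  common + rng.normal(scale=0.5, size=t),
    "exports": common + rng.normal(scale=0.5, size=t),
})

diffs = np.diff(data.values, axis=0)
var_res = VAR(diffs).fit(maxlags=2)                      # VAR in differences
vecm_res = VECM(data, k_ar_diff=1, coint_rank=1).fit()   # error-correction form
print(var_res.forecast(diffs[-2:], steps=3))
print(vecm_res.predict(steps=3))
```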
••
TL;DR: In this paper, a simple non-parametric two-stage procedure based on the sign test and a percentile-modified two-sample Wilcoxon test is proposed for testing symmetry of continuous distributions about a known center.
Abstract: In recent years, McWilliams and Tajuddin have proposed new and more powerful non-parametric tests of symmetry for continuous distributions about a known center. In this paper, we propose a simple non-parametric two-stage procedure based on the sign test and a percentile-modified two-sample Wilcoxon test. The small-sample properties of this test, Tajuddin's test, McWilliams' test and a modified runs test of Modarres and Gastwirth are investigated in a Monte Carlo simulation study. The simulations indicate that, for a wide variety of asymmetric alternatives in the lambda family, the hybrid test is more powerful than are existing tests in the literature.
••
TL;DR: In this paper, the authors examined the control procedures based on the conforming unit run lengths applied to near-zero-defect processes in the presence of serial correlation and derived control limits.
Abstract: High-yield production processes that involve a low fraction non-conforming are becoming more common, and the limitations of the standard control charting procedures for such processes are well known. This paper examines the control procedures based on the conforming unit run lengths applied to near-zero-defect processes in the presence of serial correlation. Using a correlation binomial model, a few control schemes are investigated and control limits are derived. The results reduce to the traditional case when the measurements are independent. However, it is shown that the false alarm rate cannot be reduced to below the amount of serial correlation present in the process.
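As background, in the independent case the conforming run length is geometric, so probability limits for a run-length chart follow directly; the serial-correlation adjustment is the paper's contribution and is not reproduced here. A sketch:

```python
# Two-sided probability limits for the conforming run length when items are
# independent with fraction nonconforming p (CCC-type chart, independent case).
import math

def ccc_limits(p, alpha=0.0027):
    q = 1.0 - p
    lcl = math.log(1.0 - alpha / 2.0) / math.log(q)   # short runs: quality worse
    ucl = math.log(alpha / 2.0) / math.log(q)         # long runs: quality better
    return lcl, ucl

print(ccc_limits(p=0.0005))   # near-zero-defect process
```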
••
TL;DR: The viability index Vr, as mentioned in this paper, is an intuitively appealing measure of the capability potential of a process; it is related to the well-known index Cp but has some advantages over it, and it extends naturally to the multivariate index Vrn.
Abstract: The viability index Vr is introduced as an intuitively appealing measure of the capability potential of a process. It is related to the well-known index Cp but has some advantages over it. The statistical properties of Vr are readily obtainable and, unlike Cp, it extends naturally to multi-response processes. The multivariate viability index Vrn is defined, discussed and illustrated using an example from the minerals sector.
••
TL;DR: This paper studies the application of genetic algorithms to the construction of exact D-optimal experimental designs for three different types of model, and compares their performance with that of the modified Fedorov algorithm.
Abstract: This paper studies the application of genetic algorithms to the construction of exact D-optimal experimental designs. The concept of genetic algorithms is introduced in the general context of the problem of finding optimal designs. The algorithm is then applied specifically to finding exact D-optimal designs for three different types of model. The performance of genetic algorithms is compared with that of the modified Fedorov algorithm in terms of computing time and relative efficiency. Finally, potential applications of genetic algorithms to other optimality criteria and to other types of model are discussed, along with some open problems for possible future research.
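A toy version of the idea is compact enough to sketch: a genetic algorithm searching for an exact n-point design maximizing det(X'X), here for a one-factor quadratic model on a grid. The operators, rates and population sizes are illustrative choices, not the authors' algorithm.

```python
# Genetic algorithm for an exact 6-point D-optimal design, quadratic model
# y = b0 + b1*x + b2*x^2 on candidate points in [-1, 1].
import numpy as np

rng = np.random.default_rng(42)
candidates = np.linspace(-1, 1, 21)
n_points, pop_size, n_gen, p_mut = 6, 40, 200, 0.1

def fitness(design):
    X = np.column_stack([np.ones_like(design), design, design**2])
    return np.linalg.det(X.T @ X)        # D-criterion

pop = [rng.choice(candidates, n_points) for _ in range(pop_size)]
for _ in range(n_gen):
    scores = np.array([fitness(d) for d in pop])
    order = np.argsort(scores)[::-1]
    parents = [pop[i] for i in order[: pop_size // 2]]   # truncation selection
    children = []
    while len(children) < pop_size - len(parents):
        a, b = rng.choice(len(parents), 2, replace=False)
        cut = rng.integers(1, n_points)                  # one-point crossover
        child = np.concatenate([parents[a][:cut], parents[b][cut:]])
        mut = rng.random(n_points) < p_mut               # point mutation
        child[mut] = rng.choice(candidates, mut.sum())
        children.append(child)
    pop = parents + children

best = max(pop, key=fitness)
print(sorted(best), fitness(best))   # mass at -1, 0, 1 is the known optimum
```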
••
TL;DR: In this paper, a new tightening concept has been incorporated into the single-level continuous sampling plan CSP-1, such that quality degradation will warrant sampling inspection to cease beyond a certain number of sampled items, until new evidence of good quality is established.
Abstract: In this paper, a new tightening concept has been incorporated into the single-level continuous sampling plan CSP-1, such that quality degradation will warrant sampling inspection to cease beyond a certain number of sampled items, until new evidence of good quality is established. The expressions of the performance measures for this new plan, such as the operating characteristic, average outgoing quality and average fraction inspected, are derived using a Markov chain model. The advantage of the tightened CSP-1 plan is that it is possible to lower the average outgoing quality limit.
••
TL;DR: In this paper, a variety of tests of univariate normality are applied, both to the original lead isotope ratios and to transformations of them based on principal component analysis; this is not an optimal approach, but is sufficient in the cases considered to suggest that fields are, in fact, "non-normal".
Abstract: Samples from ore bodies, mined for copper in antiquity, can be characterized by measurements on three lead isotope ratios. Given sufficient samples, it is possible to estimate the lead isotope field-a three-dimensional construct-that characterizes the ore body. For the purposes of estimating the extent of a field, or assessing whether bronze artefacts could have been made using copper from a particular field, it is often assumed that fields have a trivariate normal distribution. Using recently published data, for which the sample sizes are larger than usual, this paper casts doubt on this assumption. A variety of tests of univariate normality are applied, both to the original lead isotope ratios and to transformations of them based on principal component analysis; the paper can be read as a case study in the use of tests of univariate normality for assessing multivariate normality. This is not an optimal approach, but is sufficient in the cases considered to suggest that fields are, in fact, 'non-normal'....
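The testing strategy is easy to mimic: apply a univariate normality test to each raw ratio and to each principal component score. A sketch on simulated data standing in for the isotope ratios (the contamination step simply manufactures non-normality for illustration):

```python
# Shapiro-Wilk tests on raw variables and on principal component scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 120
ratios = rng.multivariate_normal([2.06, 0.84, 18.7],
                                 np.diag([1e-4, 1e-5, 1e-2]), size=n)
ratios[:, 0] += 0.02 * (rng.random(n) < 0.2)      # contamination -> non-normal

centred = ratios - ratios.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
scores = centred @ vt.T                           # principal component scores

for i in range(3):
    p_raw = stats.shapiro(ratios[:, i]).pvalue
    p_pc = stats.shapiro(scores[:, i]).pvalue
    print(f"variable {i}: raw p = {p_raw:.3f}, PC score p = {p_pc:.3f}")
```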
••
TL;DR: In this article, the authors consider quantile regression to combine quantile estimates and show that a combination with zero constant and weights that sum to unity is not necessarily unbiased, and establish necessary and sufficient conditions for unbiasedness of a quantile estimate.
Abstract: A novel proposal for combining forecast distributions is to use quantile regression to combine quantile estimates. We consider the usefulness of the resultant linear combining weights. If the quantile estimates are unbiased, then there is strong intuitive appeal for omitting the constant and constraining the weights to sum to unity in the quantile regression. However, we show that suppressing the constant renders one of the main attractive features of quantile regression invalid. We establish necessary and sufficient conditions for unbiasedness of a quantile estimate, and show that a combination with zero constant and weights that sum to unity is not necessarily unbiased.
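A sketch of the combining scheme under discussion: regress realizations on two competing quantile forecasts with statsmodels' QuantReg, once with a constant and once with the constant suppressed. The forecast series are simulated for illustration only.

```python
# Quantile-regression combination of two 90%-quantile forecasts.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

rng = np.random.default_rng(2)
n, tau = 500, 0.9
y = rng.gamma(2.0, 1.0, n)                          # realizations
q1 = np.quantile(y, tau) + rng.normal(0, 0.3, n)    # two noisy quantile
q2 = np.quantile(y, tau) + rng.normal(0, 0.5, n)    # forecasts of y

X = np.column_stack([q1, q2])
with_const = QuantReg(y, sm.add_constant(X)).fit(q=tau)
no_const = QuantReg(y, X).fit(q=tau)                # constant suppressed
print("with constant:   ", with_const.params)
print("without constant:", no_const.params)
```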
••
TL;DR: A design for skip-lot sampling inspection plans with the double-sampling plan as the reference plan is presented, so as to reduce the sample size and produce more efficient plans in return for the same sampling effort.
Abstract: This paper presents a design for skip-lot sampling inspection plans with the double-sampling plan as the reference plan, so as to reduce the sample size and produce more efficient plans in return for the same sampling effort. The efficiency of the proposed plan compared with that of the conventional double-sampling plan is also discussed. The need for smaller acceptance numbers under the plan is highlighted. Methods of selecting the plan indexed by the acceptable quality level and limiting quality level, and by the acceptable quality level and average outgoing quality level are also presented.
••
TL;DR: In this paper, the simple null distributions of T_k and T*_k are found for all possible values of k, and percentage points are tabulated for k = 1, 2, …, 8.
Abstract: Summary T_k = [x_(n-k+1) + … + x_(n)]/Σx_i (T*_k = [x_(1) + … + x_(k)]/Σx_i) is the maximum likelihood ratio test statistic for k upper (lower) outliers in an exponential sample x_1, …, x_n. The null distributions of T_k for k = 1, 2 were given by Fisher and by Kimber and Stevens, while those of T*_k (k = 1, 2) were given by Lewis and Fieller. In this paper, the simple null distributions of T_k and T*_k are found for all possible values of k, and percentage points are tabulated for k = 1, 2, …, 8. In addition, we find a way of determining k, which can reduce the masking or 'swamping' effects.
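The statistics just defined are one-liners to compute; the percentage points come from the paper's tables, not from this sketch.

```python
# T_k and T*_k for k upper (lower) outliers in an exponential sample.
import numpy as np

def T_upper(x, k):
    """Sum of the k largest order statistics / sum of all observations."""
    xs = np.sort(x)
    return xs[-k:].sum() / xs.sum()

def T_lower(x, k):
    """Sum of the k smallest order statistics / sum of all observations."""
    xs = np.sort(x)
    return xs[:k].sum() / xs.sum()

rng = np.random.default_rng(11)
x = np.concatenate([rng.exponential(1.0, 18), [9.0, 11.0]])  # two upper outliers
print(T_upper(x, 2), T_lower(x, 2))
```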
••
TL;DR: In this paper, the gamma and normal probability models were used to predict the outcome of horse races, and it was found that all the models tend to overestimate the probability of a horse finishing second or third when the horse had a high probability of such a result, but underestimate the probability when this probability is low.
Abstract: Summary A number of models have been examined for modelling probability based on rankings. Most prominent among these are the gamma and normal probability models. The accuracy of these models in predicting the outcomes of horse races is investigated in this paper. The parameters of these models are estimated by the maximum likelihood method, using the information on win pool fractions. These models are used to estimate the probabilities that race entrants finish second or third in a race. These probabilities are then compared with the corresponding objective probabilities estimated from actual race outcomes. The data are obtained from over 15 000 races. It is found that all the models tend to overestimate the probability of a horse finishing second or third when the horse has a high probability of such a result, but underestimate the probability of a horse finishing second or third when this probability is low.
••
TL;DR: In this article, the authors present two ways of determining k, free from the effects of masking and swamping, when testing upper (lower) outliers in normal samples.
Abstract: Summary The discordancy test for multiple outliers is complicated by problems of masking and swamping. The key to the settlement of the question lies in the determination of k , i.e. the number of 'contaminants' in a sample. Great efforts have been made to solve this problem in recent years, but no effective method has been developed. In this paper, we present two ways of determining k , free from the effects of masking and swamping, when testing upper (lower) outliers in normal samples. Examples are given to illustrate the methods.
••
TL;DR: In this article, Kunsch's blockwise bootstrap is used to estimate the variability of parameter estimates in a harmonic analysis via block subsampling of residuals from a least-squares fit.
Abstract: We analyze tidal data from Port Mansfield, TX, using Kunsch's blockwise bootstrap in the regression setting. In particular, we estimate the variability of parameter estimates in a harmonic analysis via block subsampling of residuals from a least-squares fit. We see that naive least-squares variance estimates can be either too large or too small, depending on the strength of correlation and the design matrix. We argue that the block bootstrap is a simple, omnibus method of accounting for correlation in a regression model with correlated errors.
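A condensed version of the approach on simulated data: least-squares harmonic regression, then a Kunsch-style moving-block bootstrap of the residuals to get standard errors that respect serial correlation. The block length and the AR(1) noise below are illustrative choices, not the Port Mansfield analysis.

```python
# Moving-block bootstrap of residuals for a harmonic regression.
import numpy as np

rng = np.random.default_rng(8)
n, period, block_len, n_boot = 480, 24.0, 20, 500
t = np.arange(n)
X = np.column_stack([np.ones(n),
                     np.cos(2 * np.pi * t / period),
                     np.sin(2 * np.pi * t / period)])
eps = np.zeros(n)                        # AR(1) errors standing in for tidal noise
for i in range(1, n):
    eps[i] = 0.7 * eps[i - 1] + rng.normal(0, 0.3)
y = X @ np.array([1.0, 0.8, 0.5]) + eps

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

boot_betas = np.empty((n_boot, 3))
n_blocks = int(np.ceil(n / block_len))
for b in range(n_boot):
    starts = rng.integers(0, n - block_len + 1, n_blocks)
    e_star = np.concatenate([resid[s:s + block_len] for s in starts])[:n]
    y_star = X @ beta + e_star           # resample residuals in blocks
    boot_betas[b], *_ = np.linalg.lstsq(X, y_star, rcond=None)

print("block-bootstrap SEs:", boot_betas.std(axis=0, ddof=1))
```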
••
TL;DR: In this article, the authors derived the canonical form for arbitrary m factors in a staggered nested design and applied it to obtain the expectations, variances and covariances of the mean squares.
Abstract: Staggered nested experimental designs are the most popular class of unbalanced nested designs. Using a special notation which covers the particular structure of the staggered nested design, this paper systematically derives the canonical form for arbitrary m factors. Under the normality assumption for every random variable, a vector comprising m canonical variables from each experimental unit is independently and identically normally distributed. Every sum of squares used in the analysis of variance (ANOVA) can be expressed as the sum of squares of the corresponding canonical variables. Hence, general formulae for the expectations, variances and covariances of the mean squares are directly obtained from the canonical form. Applying the formulae, the explicit forms of the ANOVA estimators of the variance components and unbiased estimators of the ratios of the variance components are introduced in this paper. The formulae are easily applied to obtain the variances and covariances of any linear combinati...
••
TL;DR: In this article, the authors review the development of adaptive linear estimators and adaptive maximum-likelihood (AML) estimators, which can be used to characterize data sets and provide valuable information regarding the data distribution.
Abstract: There are many statistics which can be used to characterize data sets and provide valuable information regarding the data distribution, even for large samples. Traditional measures, such as skewness and kurtosis, mentioned in introductory statistics courses, are rarely applied. A variety of other measures of tail length, skewness and tail weight have been proposed, which can be used to describe the underlying population distribution. Adaptive statistical procedures change the estimator of location, depending on sample characteristics. The success of these estimators depends on correctly classifying the underlying distribution model. Advocates of adaptive distribution testing propose to proceed by assuming (1) that an appropriate model, say Ω_i, is such that Ω_i ∈ {Ω_1, Ω_2, …, Ω_k}, and (2) that the character of the model selection process is statistically independent of the hypothesis testing. We review the development of adaptive linear estimators and adaptive maximum-likelihood ...
••
TL;DR: In this paper, the optimal symmetric orthogonally blocked designs within this class are determined and it is shown that even better designs are obtained for the asymmetric situation, in which some experimental blends are taken at the vertices of the experimental region.
Abstract: It is often the case in mixture experiments that some of the ingredients, such as additives or flavourings, are included with proportions constrained to lie in a restricted interval, while the majority of the mixture is made up of a particular ingredient used as a filler. The experimental region in such cases is restricted to a parallelepiped in or near one corner of the full simplex region. In this paper, orthogonally blocked designs with two experimental blends on each edge of the constrained region are considered for mixture experiments with three and four ingredients. The optimal symmetric orthogonally blocked designs within this class are determined and it is shown that even better designs are obtained for the asymmetric situation, in which some experimental blends are taken at the vertices of the experimental region. Some examples are given to show how these ideas may be extended to identify good designs in three and four blocks. Finally, an example is included to illustrate how to overcome the problems of collinearity that sometimes occur when fitting quadratic models to experimental data from mixture experiments in which some of the ingredient proportions are restricted to small values.
••
TL;DR: In this paper, the difference in the seeds of the two players was used as the basis for a test statistic, and several models for the underlying probability structure to examine the null distribution and power functions were presented.
Abstract: In a seeded knockout tournament, where teams have some preassigned strength, do we have any assurances that the best team in fact has won? Is there some insight to be gained by considering which teams beat which other teams, solely examining the seeds? We pose an answer to these questions by using the difference in the seeds of the two players as the basis for a test statistic. We offer several models for the underlying probability structure to examine the null distribution and power functions, and determine these for small tournaments (fewer than five teams). One structure each for 8 teams and 16 teams is examined, and we conjecture an asymptotic normal distribution for the test statistic.
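One way to make the construction concrete: under a null model in which every match is a fair coin flip, simulate the distribution of a seed-difference statistic over a standard 8-team bracket. Both the statistic and the null model below are illustrative readings of the abstract, not the authors' exact specification.

```python
# Null distribution of a seed-difference statistic for an 8-team knockout.
import numpy as np

rng = np.random.default_rng(9)

def play_bracket(seeds):
    """One knockout pass; returns sum over matches of (loser seed - winner seed)."""
    stat = 0
    while len(seeds) > 1:
        nxt = []
        for a, b in zip(seeds[0::2], seeds[1::2]):
            winner = a if rng.random() < 0.5 else b   # null: 50/50 outcomes
            stat += (a + b) - 2 * winner              # loser seed - winner seed
            nxt.append(winner)
        seeds = nxt
    return stat

first_round = [1, 8, 4, 5, 2, 7, 3, 6]                # standard 8-team seeding
null_stats = [play_bracket(first_round) for _ in range(20000)]
print(np.mean(null_stats), np.std(null_stats))        # roughly centred at zero
```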