
Showing papers on "Sampling distribution published in 2010"


Journal ArticleDOI
TL;DR: This paper provides an estimator of the covariance matrix of meta-regression coefficients that is applicable when there are clusters of internally correlated estimates, and demonstrates that the meta-regression coefficients are consistent and asymptotically normally distributed and that the robust variance estimator is valid even when the covariates are random.
Abstract: Conventional meta-analytic techniques rely on the assumption that effect size estimates from different studies are independent and have sampling distributions with known conditional variances. The independence assumption is violated when studies produce several estimates based on the same individuals or there are clusters of studies that are not independent (such as those carried out by the same investigator or laboratory). This paper provides an estimator of the covariance matrix of meta-regression coefficients that is applicable when there are clusters of internally correlated estimates. It makes no assumptions about the specific form of the sampling distributions of the effect sizes, nor does it require knowledge of the covariance structure of the dependent estimates. Moreover, this paper demonstrates that the meta-regression coefficients are consistent and asymptotically normally distributed and that the robust variance estimator is valid even when the covariates are random. The theory is asymptotic in the number of studies, but simulations suggest that the theory may yield accurate results with as few as 20-40 studies. Copyright © 2010 John Wiley & Sons, Ltd.
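The sandwich form of such a cluster-robust covariance estimator is easy to illustrate. The sketch below is a minimal weighted-least-squares version with simple inverse-variance weights and a toy intercept-only model; the function name and example data are hypothetical, and the paper's recommended weighting and small-sample adjustments are not reproduced here.

```python
import numpy as np

def cluster_robust_meta_regression(effects, variances, X, cluster_ids):
    """Weighted least-squares meta-regression with a cluster-robust
    (sandwich) covariance matrix for the coefficients.  Weights are simply
    1/variance here; this illustrates the general idea, not the paper's
    recommended weighting."""
    T = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    X = np.asarray(X, dtype=float)
    W = np.diag(w)

    bread = np.linalg.inv(X.T @ W @ X)
    beta = bread @ X.T @ W @ T              # WLS coefficient estimates
    resid = T - X @ beta

    # "Meat": sum over clusters of X_j' W_j e_j e_j' W_j X_j
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(cluster_ids):
        idx = np.asarray(cluster_ids) == c
        score = X[idx].T @ np.diag(w[idx]) @ resid[idx]
        meat += np.outer(score, score)

    vcov = bread @ meat @ bread             # cluster-robust covariance
    return beta, vcov

# Toy example: 6 effect sizes from 3 studies (clusters), intercept-only model.
effects = [0.2, 0.3, 0.1, 0.5, 0.4, 0.25]
variances = [0.04, 0.05, 0.04, 0.06, 0.05, 0.04]
X = np.ones((6, 1))
clusters = [1, 1, 2, 2, 3, 3]
beta, vcov = cluster_robust_meta_regression(effects, variances, X, clusters)
print(beta, np.sqrt(np.diag(vcov)))
```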

1,261 citations


Book
01 Apr 2010
TL;DR: In this article, the authors discuss data types, sources and methods of collection, the use of Excel for statistical analysis, summarising data with pivot tables and graphs, descriptive statistics (location, dispersion and skewness measures), basic probability concepts, probability distributions, sampling and sampling distributions, confidence interval estimation, hypothesis tests for a single population (means and proportions), hypothesis tests comparing two populations (means and proportions), chi-squared hypothesis tests, and analysis of variance for comparing multiple population means.
Abstract: Statistics in Management Data -- Types, Sources and Methods of Collection Using Excel for Statistical Analysis Summarising Data Pivot Tables and Graphs Descriptive Statistics -- Location Measures Descriptive Statistics -- Dispersion and Skewness Measures Basic Probability Concepts Probability Distributions Sampling and Sampling Distributions Confidence Interval Estimation Hypotheses Tests Single Population (Means and Proportions) Hypotheses Tests -- Comparison Between Two Populations (Means and Proportions) Chi-Squared Hypotheses Tests Analysis of Variance: Comparing Multiple Population Means Linear Regression and Correlation Analysis Index Numbers: Measuring Business Activity Time Series Analysis: A Forecasting Tool Financial Calculations: Interest, Annuities and NPV.

210 citations


Journal ArticleDOI
TL;DR: Estimation of disease prevalence, including construction of confidence intervals, is essential in screening surveys and in monitoring disease status, and should take into account the sensitivity and specificity of the diagnostic test.
Abstract: Estimation of prevalence of disease, including construction of confidence intervals, is essential in surveys for screening as well as in monitoring disease status. In most analyses of survey data it is implicitly assumed that the diagnostic test has a sensitivity and specificity of 100%. However, this assumption is invalid in most cases. Furthermore, asymptotic methods using the normal distribution as an approximation of the true sampling distribution may not preserve the desired nominal confidence level. Here we propose exact two-sided confidence intervals for the prevalence of disease, taking into account sensitivity and specificity of the diagnostic test. We illustrate the advantage of the methods with results of an extensive simulation study and real-life examples.
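To see how sensitivity and specificity enter a prevalence estimate, the following sketch applies the familiar Rogan-Gladen adjustment together with a naive Wald-type interval. It is illustrative only: the paper's contribution is an exact two-sided interval, precisely because this kind of normal-approximation interval can miss the nominal confidence level. The numbers in the example are made up.

```python
import numpy as np

def adjusted_prevalence(x, n, sensitivity, specificity, z=1.96):
    """Rogan-Gladen adjustment of apparent prevalence for an imperfect test,
    with a naive Wald-type interval.  Illustrative only: the paper above
    constructs exact two-sided intervals precisely because this normal
    approximation can fail to hold the nominal confidence level."""
    p_apparent = x / n
    denom = sensitivity + specificity - 1.0
    p_true = (p_apparent + specificity - 1.0) / denom
    p_true = min(max(p_true, 0.0), 1.0)          # truncate to [0, 1]

    se = np.sqrt(p_apparent * (1.0 - p_apparent) / n) / abs(denom)
    lower = max(p_true - z * se, 0.0)
    upper = min(p_true + z * se, 1.0)
    return p_true, (lower, upper)

# 60 test-positives out of 400 sampled, test sensitivity 95%, specificity 90%
print(adjusted_prevalence(x=60, n=400, sensitivity=0.95, specificity=0.90))
```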

178 citations


Journal ArticleDOI
TL;DR: In the DSGE model of Smets and Wouters (2007), for example, which involves a 36-dimensional posterior distribution, it is shown that the autocorrelations of the sampled draws from the TaRB-MH algorithm decay to zero within 30-40 lags for most parameters.

141 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose nonparametric estimators of sharp bounds on the distribution of the treatment effect of a binary treatment and establish their asymptotic distributions. They note the possible failure of the standard bootstrap with the same sample size.
Abstract: In this paper, we propose nonparametric estimators of sharp bounds on the distribution of the treatment effect of a binary treatment and establish their asymptotic distributions. We note the possible failure of the standard bootstrap with the same sample size and apply the fewer-than-n bootstrap to making inferences on these bounds. The finite sample performances of the confidence intervals for the bounds based on normal critical values, the standard bootstrap, and the fewer-than-n bootstrap are investigated via a simulation study. Finally we establish sharp bounds on the treatment effect distribution when covariates are available.
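The "fewer-than-n" (m-out-of-n) bootstrap mentioned above resamples m < n observations per replicate. Below is a generic sketch with a hypothetical choice m = n^0.7 and the sample maximum as an example of a functional for which the standard bootstrap can fail; it is not the paper's estimator of the treatment-effect bounds.

```python
import numpy as np

def m_out_of_n_bootstrap(data, statistic, m, n_boot=2000, seed=None):
    """Generic 'fewer-than-n' (m-out-of-n) bootstrap: each replicate resamples
    m < n observations with replacement and recomputes the statistic."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    return np.array([statistic(rng.choice(data, size=m, replace=True))
                     for _ in range(n_boot)])

# Example with a non-regular functional (the sample maximum), where the
# standard n-out-of-n bootstrap is known to fail; m = n**0.7 is an arbitrary
# illustrative choice of the smaller resample size.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=500)
reps = m_out_of_n_bootstrap(x, np.max, m=int(len(x) ** 0.7), seed=1)
print(np.percentile(reps, [2.5, 97.5]))
```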

118 citations


Journal ArticleDOI
TL;DR: The aim of this paper is to provide an asymptotic analysis of the conditional FDH and conditional DEA estimators, which have been applied in the literature without any theoretical background about their statistical properties.
Abstract: Cazals et al. (J. Econom. 106: 1-25, 2002), Daraio and Simar (J. Prod. Anal. 24: 93-121, 2005; Advanced Robust and Nonparametric Methods in Efficiency Analysis, 2007a; J. Prod. Anal. 28: 13-32, 2007b) developed a conditional frontier model which incorporates the environmental factors into measuring the efficiency of a production process in a fully nonparametric setup. They also provided the corresponding nonparametric efficiency measures: conditional FDH estimator, conditional DEA estimator. The two estimators have been applied in the literature without any theoretical background about their statistical properties. The aim of this paper is to provide an asymptotic analysis (i.e. asymptotic consistency and limit sampling distribution) of the conditional FDH and conditional DEA estimators.

108 citations


Book
01 Oct 2010
TL;DR: This book discusses fitting statistical distributions, including a new approach to goodness-of-fit assessment that addresses conceptual and practical challenges, and the estimation of sampling distributions of the overlapping coefficient and other similarity measures.
Abstract: Overview Fitting Statistical Distributions: An Overview The Generalized Lambda Distribution The Generalized Lambda Family of Distributions Fitting Distributions and Data with the GLD via the Method of Moments The Extended GLD System, the EGLD: Fitting by the Method of Moments A Percentile-Based Approach to Fitting Distributions and Data with the GLD Fitting Distributions and Data with the GLD through L-Moments Fitting a GLD Using a Percentile-KS (P-KS) Adequacy Criterion Fitting Mixture Distributions Using a Mixture of GLDs with Computer Code GLD-2: The Bivariate GLD Fitting the GLD with Location and Scale-Free Shape Functionals Statistical Design of Experiments: A Short Review Quantile Distribution Methods Statistical Modeling Based on Quantile Distribution Functions Distribution Fitting with the Quantile Function of Response Modeling Methodology (RMM) Fitting GLDs and Mixture of GLDs to Data Using Quantile Matching Method Fitting GLD to Data Using GLDEX 1.0.4 in R Other Families of Distributions Fitting Distributions and Data with the Johnson System via the Method of Moments Fitting Distributions and Data with the Kappa Distribution through L-Moments and Percentiles Weighted Distributional Lalpha Estimates A Multivariate Gamma Distribution for Linearly Related Proportional Outcomes The Generalized Bootstrap and Monte Carlo Methods The Generalized Bootstrap (GB) and Monte Carlo (MC) Methods The GB: A New Fitting Strategy and Simulation Study Showing Advantage over Bootstrap Percentile Methods GB Confidence Intervals for High Quantiles Assessment of the Quality of Fits Goodness-of-Fit Criteria Based on Observations Quantized by Hypothetical and Empirical Percentiles Evidential Support Continuum (ESC): A New Approach to Goodness-of-Fit Assessment, which Addresses Conceptual and Practical Challenges Estimation of Sampling Distributions of the Overlapping Coefficient and Other Similarity Measures Applications Fitting Statistical Distribution Functions to Small Datasets Mixed Truncated Random Variable Fitting with the GLD, and Applications in Insurance and Inventory Management Distributional Modeling of Pipeline Leakage Repair Costs for a Water Utility Company Use of the GLD in Materials Science, with Examples in Fatigue Lifetime, Fracture Mechanics, Polycrystalline Calculations, and Pitting Corrosion Fitting Statistical Distributions to Data in Hurricane Modeling A Rainfall-Based Model for Predicting the Regional Incidence of Wheat Seed Infection by Stagonospora nodorum in New York Reliability Estimation Using Univariate Dimension Reduction and Extended GLD Statistical Analyses of Environmental Pressure Surrounding Atlantic Tropical Cyclones Simulating Hail Storms Using Simultaneous Efficient Random Number Generators Appendices Programs and Their Documentation Table B-1 for GLD Fits: Method of Moments Table C-1 for GBD Fits: Method of Moments Tables D-1 through D-5 for GLD Fits: Method of Percentiles Tables E-1 through E-5 for GLD Fits: Method of L-Moments Table F-1 for Kappa Distribution Fits: Method of L-Moments Table G-1 for Kappa Distribution Fits: Method of Percentiles Table H-1 for Johnson System Fits in the SU Region: Method of Moments Table I-1 for Johnson System Fits in the SB Region: Method of Moments Table J-1 for p-Values Associated with Kolmogorov-Smirnov Statistics Table K-1 Normal Distribution Percentiles Index References appear at the end of each chapter.

83 citations


Journal ArticleDOI
TL;DR: A universal conditional distribution method for uniform sampling from n-spheres and n-balls is described, based on properties of a family of radially symmetric multivariate distributions, which provides a unifying view on several known algorithms as well as enabling us to construct novel variants.
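The paper's general construction is not reproduced in the summary above; as a point of reference, the sketch below shows the best-known special case of uniform sampling from spheres and balls via radially symmetric (Gaussian) vectors, one of the algorithms the unifying view mentioned above covers.

```python
import numpy as np

def sample_sphere(n_points, dim, rng=None):
    """Uniform points on the unit (dim-1)-sphere in R^dim: draw isotropic
    Gaussian vectors and normalise them to unit length."""
    rng = np.random.default_rng(rng)
    g = rng.standard_normal((n_points, dim))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def sample_ball(n_points, dim, rng=None):
    """Uniform points in the unit dim-ball: a uniform direction scaled by a
    radius drawn as U**(1/dim), the correct radial distribution."""
    rng = np.random.default_rng(rng)
    directions = sample_sphere(n_points, dim, rng)   # reuse the same generator
    radii = rng.uniform(size=(n_points, 1)) ** (1.0 / dim)
    return directions * radii

pts = sample_ball(100000, dim=3, rng=0)
print(np.mean(np.linalg.norm(pts, axis=1) <= 0.5))   # should be near 0.5**3 = 0.125
```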

69 citations


Journal ArticleDOI
TL;DR: In this paper, the bias-corrected DEA estimator is compared with the original conical-hull estimator in terms of median squared error, showing that the rate of convergence is better than for the VRS estimator.
Abstract: Nonparametric data envelopment analysis (DEA) estimators have been widely applied in analysis of productive efficiency. Typically they are defined in terms of convex-hulls of the observed combinations of $\mathrm{inputs}\times\mathrm{outputs}$ in a sample of enterprises. The shape of the convex-hull relies on a hypothesis on the shape of the technology, defined as the boundary of the set of technically attainable points in the $\mathrm{inputs}\times\mathrm{outputs}$ space. So far, only the statistical properties of the smallest convex polyhedron enveloping the data points have been considered, which corresponds to a situation where the technology presents variable returns-to-scale (VRS). This paper analyzes the case where the most common constant returns-to-scale (CRS) hypothesis is assumed. Here the DEA is defined as the smallest conical-hull with vertex at the origin enveloping the cloud of observed points. In this paper we determine the asymptotic properties of this estimator, showing that the rate of convergence is better than for the VRS estimator. We also derive its asymptotic sampling distribution, with a practical way to simulate it. This allows one to define a bias-corrected estimator and to build confidence intervals for the frontier. We compare in a simulated example the bias-corrected estimator with the original conical-hull estimator and show its superiority in terms of median squared error.

62 citations


Journal ArticleDOI
TL;DR: In this paper, the effects of serial correlation of forecasts and observations on the sampling properties of forecast verification statistics are examined for probability forecasts of dichotomous events, for both serially correlated and temporally independent forecasts. It is shown that the sampling variance of the BSS is more robust to serial correlation than that of the BS, and that hypothesis tests based on the BSS are more powerful than those based on the BS, substantially so for lower-accuracy forecasts of lower-probability events.
Abstract: Relatively little attention has been given to the effects of serial correlation of forecasts and observations on the sampling properties of forecast verification statistics. An assumption of serial independence for low-quality forecasts may be reasonable. However, forecasts of sufficient quality for autocorrelated events must themselves be autocorrelated: as quality approaches the limit of perfect forecasts, the forecasts become increasingly similar to the corresponding observations. The effects of forecast serial correlation on the sampling properties of the Brier Score (BS) and Brier Skill Score (BSS), for probability forecasts of dichotomous events, are examined here. As in other settings, the effect of serial correlation is to inflate the variances of the sampling distributions of the two statistics, so that uncorrected confidence intervals are too narrow, and uncorrected hypothesis tests yield p-values that are too small. Expressions are given for ‘effective sample size’ corrections for the sampling variances of both BS and BSS, in which it can be seen that the effects of serial correlation on the sampling variances increase with increasing forecast accuracy, and with decreasing climatological event probability. The sampling variance of BSS is more robust to serial correlation than that of BS. Hypothesis tests based on BSS are seen to be more powerful (i.e. more sensitive) than those based on BS, and substantially so for lower-accuracy forecasts of lower-probability events, for both serially correlated and temporally independent forecasts. Copyright © 2010 Royal Meteorological Society
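A minimal illustration of the effective-sample-size idea, assuming a first-order (AR(1)-style) correction n' = n(1 - r1)/(1 + r1) applied to the series of squared forecast errors; the paper derives its own expressions for BS and BSS, so treat this as a stand-in with made-up data.

```python
import numpy as np

def effective_sample_size(series):
    """First-order 'effective sample size' n' = n (1 - r1) / (1 + r1), where
    r1 is the lag-1 autocorrelation of the series.  This AR(1)-style
    correction is a stand-in for the paper's own expressions."""
    x = np.asarray(series, dtype=float) - np.mean(series)
    r1 = np.sum(x[1:] * x[:-1]) / np.sum(x ** 2)
    return len(x) * (1.0 - r1) / (1.0 + r1)

# Toy autocorrelated event/forecast series
rng = np.random.default_rng(0)
z = np.zeros(1000)
for t in range(1, 1000):
    z[t] = 0.8 * z[t - 1] + rng.standard_normal()
obs = (z > 0).astype(float)                                            # dichotomous events
fcst = 1.0 / (1.0 + np.exp(-(z + 0.5 * rng.standard_normal(1000))))    # probability forecasts

scores = (fcst - obs) ** 2
bs = scores.mean()                                            # Brier Score
n_eff = effective_sample_size(scores)
se_adjusted = scores.std(ddof=1) / np.sqrt(n_eff)             # serial-correlation-widened SE
print(round(bs, 3), round(n_eff, 1), round(se_adjusted, 4))
```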

62 citations


Journal ArticleDOI
TL;DR: This article considers the simple step-stress model under time constraint when the lifetime distributions of the different risk factors are independently exponentially distributed and derives the maximum likelihood estimators (MLEs) of the unknown mean parameters of the different causes under the assumption of a cumulative exposure model.

Posted Content
TL;DR: In this paper, the authors explore the features of the Student's $t$ statistic in the context of its application to very high dimensional problems, including feature selection and ranking, highly multiple hypothesis testing, and sparse, high dimensional signal detection.
Abstract: Student's $t$ statistic is finding applications today that were never envisaged when it was introduced more than a century ago. Many of these applications rely on properties, for example robustness against heavy tailed sampling distributions, that were not explicitly considered until relatively recently. In this paper we explore these features of the $t$ statistic in the context of its application to very high dimensional problems, including feature selection and ranking, highly multiple hypothesis testing, and sparse, high dimensional signal detection. Robustness properties of the $t$-ratio are highlighted, and it is established that those properties are preserved under applications of the bootstrap. In particular, bootstrap methods correct for skewness, and therefore lead to second-order accuracy, even in the extreme tails. Indeed, it is shown that the bootstrap, and also the more popular but less accurate $t$-distribution and normal approximations, are more effective in the tails than towards the middle of the distribution. These properties motivate new methods, for example bootstrap-based techniques for signal detection, that confine attention to the significant tail of a statistic.
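A compact sketch of the bootstrap calibration being described, for the one-sample t-ratio under a skewed sampling distribution. Resampling from the data re-centred at the null value is one standard way to set this up; the sample size and seeds below are arbitrary.

```python
import numpy as np

def bootstrap_t_pvalue(x, mu0=0.0, n_boot=5000, seed=None):
    """One-sample bootstrap-t test: resample from the data re-centred at mu0
    and refer the observed t-ratio to the bootstrap distribution of t-ratios,
    which captures skewness that normal/Student-t approximations miss in the
    tails."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    t_obs = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
    centred = x - x.mean() + mu0                 # impose the null hypothesis
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(centred, size=n, replace=True)
        t_boot[b] = (xb.mean() - mu0) / (xb.std(ddof=1) / np.sqrt(n))
    return t_obs, np.mean(np.abs(t_boot) >= abs(t_obs))   # two-sided p-value

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=30) - 1.0     # strongly skewed, true mean 0
print(bootstrap_t_pvalue(x, mu0=0.0, seed=2))
```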

Proceedings Article
21 Jun 2010
TL;DR: This work empirically study conditions under which the active risk estimate is more accurate than a standard risk estimate that draws equally many instances from the test distribution.
Abstract: We address the problem of evaluating the risk of a given model accurately at minimal labeling costs. This problem occurs in situations in which risk estimates cannot be obtained from held-out training data, because the training data are unavailable or do not reflect the desired test distribution. We study active risk estimation processes in which instances are actively selected by a sampling process from a pool of unlabeled test instances and their labels are queried. We derive the sampling distribution that minimizes the estimation error of the active risk estimator when used to select instances from the pool. An analysis of the distribution that governs the estimator leads to confidence intervals. We empirically study conditions under which the active risk estimate is more accurate than a standard risk estimate that draws equally many instances from the test distribution.
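The core mechanism is importance weighting: instances are drawn from an instrumental distribution q and the risk estimate is corrected by weights p/q. The sketch below shows only that correction with a hand-picked q; the paper's variance-minimizing choice of q is not reproduced, and the pool, losses, and proportions are made up.

```python
import numpy as np

def importance_weighted_risk(losses, p_test, q_sampling):
    """Risk estimate from instances drawn under a sampling distribution q
    instead of the test distribution p, corrected with self-normalised
    importance weights w = p / q."""
    w = np.asarray(p_test, float) / np.asarray(q_sampling, float)
    w = w / w.sum()
    return float(np.sum(w * np.asarray(losses, float)))

# Toy pool of 200 unlabeled instances; losses are large on a rare subgroup.
rng = np.random.default_rng(0)
rare = rng.uniform(size=200) < 0.1
losses = np.where(rare, rng.uniform(0.6, 1.0, 200), rng.uniform(0.0, 0.2, 200))
p = np.full(200, 1.0 / 200)                  # uniform test distribution over the pool
q = np.where(rare, 4.0, 1.0)                 # hand-picked q oversampling the rare group
q = q / q.sum()

drawn = rng.choice(200, size=50, replace=True, p=q)   # "query" 50 labels
print(importance_weighted_risk(losses[drawn], p[drawn], q[drawn]))
```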

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of estimating a modeling parameter using a weighted least squares criterion Jd(y, μ) = ∑n i=1 1 3⁄4(ti) ( y(ti), − f(ti, ))2 for given data y by introducing an abstract framework involving generalized measurement procedures characterized by probability measures.
Abstract: We consider the problem of estimating a modeling parameter μ using a weighted least squares criterion Jd(y, μ) = ∑n i=1 1 3⁄4(ti) ( y(ti) − f(ti, μ) )2 for given data y by introducing an abstract framework involving generalized measurement procedures characterized by probability measures. We take an optimal design perspective, the general premise (illustrated via examples) being that in any data collected, the information content with respect to estimating μ may vary considerably from one time measurement to another, and in this regard some measurements may be much more informative than others. We propose mathematical tools which can be used to collect data in an almost optimal way, by specifying the duration and distribution of time sampling in the measurements to be taken, consequently improving the accuracy (i.e., reducing the uncertainty in estimates) of the parameters to be estimated. We recall the concepts of traditional and generalized sensitivity functions and use these to develop a strategy to determine the “optimal” final time T for an experiment; this is based on the time evolution of the sensitivity functions and of the condition number of the Fisher information matrix. We illustrate the role of the sensitivity functions as tools in optimal design of experiments, in particular in finding “best” sampling distributions. Numerical examples are presented throughout to motivate and illustrate the ideas. AMS subject classifications: 62G08, 62H99, 90C31, 65K10, 93B51, 62B10.
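A small numerical illustration of the ingredients named above: (traditional) sensitivity functions obtained by finite differences and the condition number of the resulting Fisher information matrix as a function of the final time T. The logistic growth model and parameter values are hypothetical stand-ins, not the paper's examples.

```python
import numpy as np

def fisher_information(times, mu, model, sigma=1.0, eps=1e-6):
    """Fisher information matrix F = sum_i s(t_i) s(t_i)^T / sigma^2, with
    sensitivities s(t_i) = d f(t_i, mu) / d mu obtained by forward finite
    differences."""
    mu = np.asarray(mu, dtype=float)
    base = model(times, mu)
    S = np.empty((len(times), len(mu)))
    for j in range(len(mu)):
        bumped = mu.copy()
        bumped[j] += eps
        S[:, j] = (model(times, bumped) - base) / eps
    return (S.T / sigma ** 2) @ S

def logistic(t, mu):
    """Hypothetical logistic growth model f(t; K, r, x0), used only for illustration."""
    K, r, x0 = mu
    return K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))

mu = np.array([10.0, 0.7, 0.5])                    # (K, r, x0), arbitrary values
for T in (2.0, 5.0, 15.0):                         # candidate final times
    t = np.linspace(0.0, T, 30)                    # 30 equally spaced measurements
    F = fisher_information(t, mu, logistic)
    print(T, np.linalg.cond(F))                    # condition number guides the choice of T
```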

Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the sampling distribution in the tail areas of the spatial autocorrelation coefficient defined below, and suggest a suitable approximation to the sampling distribution of the coefficient which provides the basis for a reliable test of significance.
Abstract: Various statistics have been proposed in the literature to test for the presence of spatial autocorrelation on a random variable measured in each ‘county’ of a ‘country’. These measures are reviewed in Cliff and Ord [1]. In most empirical applications, the researcher will wish to determine the degree of significance of the calculated value of the test coefficient used. Unfortunately, very little is known about the sampling distributions of any of the spatial autocorrelation coefficients, particularly in the tail areas which are critical for significance testing, in the small and moderate sized lattices usually encountered. This is clearly a severe handicap in proposing a reliable test of significance for any of the coefficients. It is the purpose of this paper to evaluate the sampling distribution in the tail areas of the spatial autocorrelation coefficient defined below. A suitable approximation to the sampling distribution of the coefficient is suggested which provides the basis for a reliable test of significance. Monte Carlo methods are used. Consider a county system which comprises n counties. Suppose that a variate, X, has been measured in each of the n counties. Let the value of X in the ith county be x_i. The effect of county i on county j is denoted by the weight w_ij. Then the authors have proposed [1] the following measure of spatial autocorrelation between the x_i (the formula is given in the original paper).
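Since the coefficient's formula is not reproduced above, the sketch below uses a Moran-type weighted cross-product as a stand-in and illustrates the Monte Carlo (randomisation) approach to assessing significance on a small lattice; the lattice, weights, and data are invented for illustration.

```python
import numpy as np

def moran_like_statistic(x, W):
    """A Moran-type coefficient: weighted cross-product of deviations from the
    mean, normalised by the total weight and the sample variance.  Used as a
    stand-in for the coefficient defined in the paper."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def mc_pvalue(x, W, n_sim=999, seed=None):
    """Monte Carlo (randomisation) test: refer the observed coefficient to the
    distribution obtained by permuting the county values."""
    rng = np.random.default_rng(seed)
    obs = moran_like_statistic(x, W)
    sims = np.array([moran_like_statistic(rng.permutation(x), W)
                     for _ in range(n_sim)])
    return obs, (1 + np.sum(sims >= obs)) / (n_sim + 1)

# 5x5 lattice 'county system' with rook-contiguity weights
n_side = 5
n = n_side * n_side
W = np.zeros((n, n))
for i in range(n_side):
    for j in range(n_side):
        k = i * n_side + j
        if i + 1 < n_side:
            W[k, k + n_side] = W[k + n_side, k] = 1.0
        if j + 1 < n_side:
            W[k, k + 1] = W[k + 1, k] = 1.0

rng = np.random.default_rng(0)
x = rng.standard_normal(n) + np.repeat(np.linspace(0.0, 1.0, n_side), n_side)  # mild spatial trend
print(mc_pvalue(x, W, seed=1))
```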

Journal ArticleDOI
TL;DR: The Wilcoxon statistics are usually taught as nonparametric alternatives for the 1- and 2-sample Student-t statistics in situations where the data appear to arise from non-normal distributions as discussed by the authors.
Abstract: The Wilcoxon statistics are usually taught as nonparametric alternatives for the 1- and 2-sample Student-t statistics in situations where the data appear to arise from non-normal distributions, or ...

Journal ArticleDOI
TL;DR: In this paper, the authors introduce the bootstrap as a suitable method to estimate the finite sample distribution of the NPMLE under double truncation, which has no explicit form and must be approximated numerically.
Abstract: Doubly truncated data appear in a number of applications, including astronomy and survival analysis. In this paper we review the existing methods to compute the nonparametric maximum likelihood estimator (NPMLE) under double truncation, which has no explicit form and must be approximated numerically. We introduce the bootstrap as a suitable method to estimate the finite sample distribution of the NPMLE under double truncation. The performance of the bootstrap is investigated in a simulation study. The nonstandard case in which the right- and left-truncation times determine each other is covered. As an illustration, nonparametric estimation and inference on the birth process and the age at diagnosis for childhood cancer in North Portugal is considered.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the case where the most common constant returns-to-scale (CRS) hypothesis is assumed and derived the asymptotic sampling distribution of the DEA estimator.
Abstract: Non-parametric data envelopment analysis (DEA) estimators have been widely applied in analysis of productive efficiency. Typically they are defined in terms of convex-hulls of the observed combinations of inputs × outputs in a sample of enterprises. The shape of the convex-hull relies on a hypothesis on the shape of the technology, defined as the boundary of the set of technically attainable points in the inputs × outputs space. So far, only the statistical properties of the smallest convex polyhedron enveloping the data points have been considered, which corresponds to a situation where the technology presents varying returns-to-scale (VRS). This paper analyzes the case where the most common constant returns-to-scale (CRS) hypothesis is assumed. Here the DEA is defined as the smallest conical-hull with vertex at the origin enveloping the cloud of observed points. In this paper we determine the asymptotic properties of this estimator, showing that the rate of convergence is better than for the VRS estimator. We also derive its asymptotic sampling distribution, with a practical way to simulate it. This allows one to define a bias-corrected estimator and to build confidence intervals for the frontier. We compare in a simulated example the bias-corrected estimator with the original conical-hull estimator, and show its superiority in terms of median squared error.

Posted Content
TL;DR: This paper introduces the coupling optional Polya tree (COPT) prior for Bayesian nonparametric analysis, which is based on a random-partition-and-assignment procedure similar to the one that defines the standard optional Polya tree distribution but has the ability to generate multiple random distributions jointly.
Abstract: Testing and characterizing the difference between two data samples is of fundamental interest in statistics. Existing methods such as Kolmogorov-Smirnov and Cramer-von Mises tests do not scale well as the dimensionality increases and provide no easy way to characterize the difference should it exist. In this work, we propose a theoretical framework for inference that addresses these challenges in the form of a prior for Bayesian nonparametric analysis. The new prior is constructed based on a random-partition-and-assignment procedure similar to the one that defines the standard optional Polya tree distribution, but has the ability to generate multiple random distributions jointly. These random probability distributions are allowed to "couple", that is, to have the same conditional distribution, on subsets of the sample space. We show that this "coupling optional Polya tree" prior provides a convenient and effective way for both the testing of two-sample difference and the learning of the underlying structure of the difference. In addition, we discuss some practical issues in the computational implementation of this prior and provide several numerical examples to demonstrate how it works.

Journal ArticleDOI
TL;DR: Estimation of the parameters of a certain family of two-parameter lifetime distributions based on progressively Type II right-censored samples (including ordinary Type IIright censoring) is studied, including the Weibull, Gompertz, and Lomax distributions.
Abstract: In this article, estimation of the parameters of a certain family of two-parameter lifetime distributions based on progressively Type II right-censored samples (including ordinary Type II right censoring) is studied. This family, of proportional hazard distributions, includes the Weibull, Gompertz, and Lomax distributions. A type of parameter estimation named inverse estimation is investigated for both parameters. Exact confidence intervals for one of the parameters and generalized confidence intervals for the other are explored; inference for the first parameter can be accomplished by our methodology independently of the unknown value of the other parameter in this family of distributions. Derivation of the estimation method uses properties of order statistics. A simulation study concentrating mainly on the Weibull distribution illustrates the accuracy of these confidence intervals [and the shorter length of our exact confidence interval compared with an alternative of Wu (2002)] and compares inverse est...

Journal ArticleDOI
TL;DR: In this article, the authors investigate the relationship between process parameters and the sampling distribution of the index, showing that for a fixed value of the index the variance of its sample estimator is restricted to an interval; the maximal variance is used in the estimation and testing of the production yield to ensure the level of confidence.
Abstract: Numerous capability indices have been proposed to measure the performance of processes with multiple characteristics. The index provides an exact measure of the production yield of multinormal processes in which the characteristics are mutually independent. In this paper, we thoroughly investigate the relationship between process parameters and the sampling distribution of the index. Our investigation shows that for a fixed value of the index, the variance of its sample estimator is restricted to an interval. For reliability consideration, the maximal variance is used in the estimation and testing of the production yield to ensure the level of confidence. Also, information about sample sizes required for specified precision of estimation and for convergence is determined. Finally, we implement a trivariate process with data collected from a plastics manufacturing company to demonstrate the practicability of the proposed method in measuring the production yield.

Journal ArticleDOI
TL;DR: In this article, an alternative asymptotic approximation to the sampling distribution of the limited information maximum likelihood estimator and a bias corrected version of the two-stage least squares estimator is derived by allowing the number of instruments and the concentration parameter to grow at the same rate as the sample size.
Abstract: In this paper we derive an alternative asymptotic approximation to the sampling distribution of the limited information maximum likelihood estimator and a bias corrected version of the two-stage least squares estimator. The approximation is obtained by allowing the number of instruments and the concentration parameter to grow at the same rate as the sample size. More specifically, we allow for potentially nonnormal error distributions and obtain the conventional asymptotic distribution and the results of Bekker (1994, Econometrica 62, 657–681) and Bekker and Van der Ploeg (2005, Statistica Neerlandica 59, 139–267) as special cases. The results show that when the error distribution is not normal, in general both the properties of the instruments and the third and fourth moments of the errors affect the asymptotic variance. We compare our findings with those in the recent literature on many and weak instruments.

Journal ArticleDOI
TL;DR: This paper presents, for what is believed to be the first time, the analytical formulation for the joint sampling distribution of the actual and estimated errors of a classification rule, under a general parametric Gaussian assumption.
Abstract: Error estimation must be used to find the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and proteomics. This paper presents, for what is believed to be the first time, the analytical formulation for the joint sampling distribution of the actual and estimated errors of a classification rule. The analysis presented here concerns the linear discriminant analysis (LDA) classification rule and the resubstitution and leave-one-out error estimators, under a general parametric Gaussian assumption. Exact results are provided in the univariate case, and a simple method is suggested to obtain an accurate approximation in the multivariate case. It is also shown how these results can be applied in the computation of condition bounds and the regression of the actual error, given the observed error estimate. In contrast to asymptotic results, the analysis presented here is applicable to finite training data. In particular, it applies in the small-sample settings commonly found in genomics and proteomics applications. Numerical examples, which include parameters estimated from actual microarray data, illustrate the analysis throughout.
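For intuition about the two error estimators being analysed, here is a univariate sketch that computes resubstitution and leave-one-out estimates for a nearest-class-mean (LDA-type) rule on simulated Gaussian classes. It only produces the two estimates for one sample; the paper's analytical joint distribution with the true error is not attempted here, and the sample sizes and class means are arbitrary.

```python
import numpy as np

def lda_univariate_errors(x0, x1):
    """Resubstitution and leave-one-out error estimates for a univariate
    nearest-class-mean (LDA-type) rule with equal class variances."""
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))]).astype(int)

    def classify(v, m0, m1):
        return int(abs(v - m1) < abs(v - m0))       # 1 if nearer to class-1 mean

    # Resubstitution: design and evaluate the rule on the full sample
    resub = np.mean([classify(v, x0.mean(), x1.mean()) != t for v, t in zip(x, y)])

    # Leave-one-out: refit the class means with each observation held out
    errs = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        m0 = x[keep & (y == 0)].mean()
        m1 = x[keep & (y == 1)].mean()
        errs.append(classify(x[i], m0, m1) != y[i])
    return float(resub), float(np.mean(errs))

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=15)      # class 0 sample
x1 = rng.normal(1.0, 1.0, size=15)      # class 1 sample
print(lda_univariate_errors(x0, x1))    # resubstitution is typically optimistic
```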

Journal ArticleDOI
TL;DR: In this paper, a simple algorithm for sampling from the Dickman distribution is given, which is based on coupling from the past with a suitable dominating Markov chain, and it is shown that the algorithm can be used to sample from any distribution.

Journal ArticleDOI
TL;DR: The motivation behind the proposed computational approach test (CAT) is to provide the applied researchers a statistical tool to carry out a comparison of several population means, in a parametric setup, without worrying about the sampling distribution of the inherent test statistic.
Abstract: This paper deals with testing the equality of several homoscedastic normal population means. We introduce a newly developed computational approach test (CAT), which is essentially a parametric bootstrap method, and discuss its merits and demerits. In the process of studying the CAT’s usefulness, we compare it with the traditional one-way ANOVA’s F test as well as the analysis of means (ANOM) method. Further, the model robustness of the above three methods have been studied under the ‘t-model’. The motivation behind the proposed CAT is to provide the applied researchers a statistical tool to carry out a comparison of several population means, in a parametric setup, without worrying about the sampling distribution of the inherent test statistic. The CAT can be used to test the equality of several means when the populations are assumed to be heteroscedastic t-distributions.
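A parametric bootstrap in the spirit of the CAT can be sketched in a few lines: estimate the common mean and pooled variance under the null, simulate the test statistic's null distribution, and report a Monte Carlo p-value. The choice of the between-group sum of squares as the statistic here is an illustrative simplification, not necessarily the paper's statistic, and the data are simulated.

```python
import numpy as np

def parametric_bootstrap_anova(samples, n_boot=5000, seed=None):
    """Parametric-bootstrap test of equal normal means: plug in the common
    mean and pooled variance estimated under H0, simulate the null sampling
    distribution of a between-group statistic, and report a Monte Carlo
    p-value instead of consulting an F table."""
    rng = np.random.default_rng(seed)
    samples = [np.asarray(s, dtype=float) for s in samples]
    sizes = [len(s) for s in samples]
    pooled = np.concatenate(samples)

    def between_ss(groups):
        grand = np.concatenate(groups).mean()
        return sum(len(g) * (g.mean() - grand) ** 2 for g in groups)

    obs = between_ss(samples)
    pooled_sd = np.sqrt(sum((len(s) - 1) * s.var(ddof=1) for s in samples)
                        / (len(pooled) - len(samples)))
    boot = np.empty(n_boot)
    for b in range(n_boot):
        sim = [rng.normal(pooled.mean(), pooled_sd, size=n) for n in sizes]
        boot[b] = between_ss(sim)
    return obs, float(np.mean(boot >= obs))        # Monte Carlo p-value

rng = np.random.default_rng(0)
groups = [rng.normal(0.0, 1.0, 20), rng.normal(0.3, 1.0, 20), rng.normal(0.6, 1.0, 20)]
print(parametric_bootstrap_anova(groups, seed=1))
```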

Journal Article
Rochowicz, John A.
TL;DR: This paper provides various numerical approximation techniques that can be used to analyze data and make inferences about populations from samples, and discusses the application of confidence intervals to inferential statistics.
Abstract: Performing a parametric statistical analysis requires the justification of a number of necessary assumptions. If assumptions are not justified, research findings are inaccurate and open to question. What happens when assumptions are not or cannot be addressed? When a certain statistic has no known sampling distribution, what can a researcher do for statistical inference? Options are available for answering these questions and conducting valid research. This paper provides various numerical approximation techniques that can be used to analyze data and make inferences about populations from samples. The application of confidence intervals to inferential statistics is addressed. The analysis of parametric as well as nonparametric data is discussed. Bootstrapping analysis for inferential statistics is shown with the application of the INDEX function and the use of macros and the Data Analysis ToolPak in Excel. A variety of interesting observations are described.

Journal ArticleDOI
TL;DR: Zerbet and Nikulin presented the statistic $Z_k$ for detecting outliers in the exponential distribution and compared it with Dixon's statistic $D_k$; this article extends the approach to the gamma distribution and compares the result with Dixon's statistic.
Abstract: Zerbet and Nikulin presented the new statistic $Z_k$ for detecting outliers in the exponential distribution. They also compared this statistic with Dixon's statistic $D_k$. In this article, we extend this approach to the gamma distribution and compare the result with Dixon's statistic. The results show that the test based on the statistic $Z_k$ is more powerful than the test based on Dixon's statistic.

Journal ArticleDOI
TL;DR: On the 2009 AP® Statistics Exam, students were asked to create a statistic to measure skewness in a distribution; this paper explores and evaluates several popular student responses.
Abstract: On the 2009 AP® Statistics Exam, students were asked to create a statistic to measure skewness in a distribution. This paper explores several of the most popular student responses and evaluates which statistic performs best when sampling from various skewed populations.
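One way to evaluate such candidate statistics is to simulate their sampling distributions under skewed populations. The sketch below does this for two common student-style proposals (a mean-minus-median measure and the standardised third moment) using an exponential population; these choices are illustrative and are not necessarily the exam responses analysed in the paper.

```python
import numpy as np

def pearson_skew(x):
    """Mean-minus-median measure: 3 * (mean - median) / s."""
    return 3.0 * (np.mean(x) - np.median(x)) / np.std(x, ddof=1)

def moment_skew(x):
    """Standardised third sample moment."""
    z = (x - np.mean(x)) / np.std(x, ddof=1)
    return float(np.mean(z ** 3))

def sampling_distribution(statistic, population_sampler, n=30, n_sim=5000, seed=None):
    """Simulate the sampling distribution of a skewness statistic when drawing
    samples of size n from a given population."""
    rng = np.random.default_rng(seed)
    return np.array([statistic(population_sampler(rng, n)) for _ in range(n_sim)])

exp_pop = lambda rng, n: rng.exponential(1.0, n)        # a right-skewed population
for stat in (pearson_skew, moment_skew):
    d = sampling_distribution(stat, exp_pop, seed=0)
    print(stat.__name__, round(float(d.mean()), 3), round(float(d.std()), 3))
```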

Proceedings Article
06 Dec 2010
TL;DR: Conditions under which active estimates of Fα-measures are more accurate than estimates based on instances sampled from the test distribution are explored.
Abstract: We address the problem of estimating the Fα-measure of a given model as accurately as possible on a fixed labeling budget. This problem occurs whenever an estimate cannot be obtained from held-out training data; for instance, when data that have been used to train the model are held back for reasons of privacy or do not reflect the test distribution. In this case, new test instances have to be drawn and labeled at a cost. An active estimation procedure selects instances according to an instrumental sampling distribution. An analysis of the sources of estimation error leads to an optimal sampling distribution that minimizes estimator variance. We explore conditions under which active estimates of Fα-measures are more accurate than estimates based on instances sampled from the test distribution.

Journal ArticleDOI
TL;DR: In this paper, two-sample point and interval predictors of generalized order statistics are obtained when the future sample size is fixed and when it is random, and a general distribution is considered for each of the underlying population and the prior.
Abstract: Bayes two-sample point and interval predictors of generalized order statistics are obtained when the future sample size is fixed and when it is random. A general distribution is considered for each of the underlying population and the prior. Illustrative examples are given in which the underlying population distributions are specialized to Burr type XII and Weibull models. Numerical computations have been carried out for predictions of future ordinary order statistics and ordinary records.