
Showing papers on "Mathematical statistics published in 2005"



Book
01 Jan 2005
TL;DR: In this book, the authors present probabilistic models useful in extremes, procedures for model estimation, selection, and validation, exact models for order statistics and point processes, and asymptotic models for extremes and exceedances, including multivariate extremes.
Abstract: Preface. I: DATA, INTRODUCTION AND MOTIVATION. 1. Introduction and Motivation. II: PROBABILISTIC MODELS USEFUL IN EXTREMES. 2. Discrete Probabilistic Models. 3. Continuous Probabilistic Models. III: MODEL ESTIMATION, SELECTION, AND VALIDATION. 4. Model Estimation. 5. Model Selection and Validation. IV: EXACT MODELS FOR ORDER STATISTICS AND EXTREMES. 6. Order Statistics. 7. Point Processes and Exact Models. V: ASYMPTOTIC MODELS FOR EXTREMES AND EXCEEDANCES. 8. Limit Distributions of Order Statistics. 9. Limit Distributions of Exceedances. 10. Multivariate Extremes. Appendix: Statistical Tables. Bibliography. Index.

432 citations


Journal ArticleDOI
TL;DR: A key feature appears to be that the estimate of sparsity adapts to three different zones of estimation, first where the signal is not sparse enough for thresholding to be of benefit, second where an appropriately chosen threshold results in substantially improved estimation, and third where the signals are so sparse that the zero estimate gives the optimum accuracy rate.
Abstract: This paper explores a class of empirical Bayes methods for level-dependent threshold selection in wavelet shrinkage. The prior considered for each wavelet coefficient is a mixture of an atom of probability at zero and a heavy-tailed density. The mixing weight, or sparsity parameter, for each level of the transform is chosen by marginal maximum likelihood. If estimation is carried out using the posterior median, this is a random thresholding procedure; the estimation can also be carried out using other thresholding rules with the same threshold. Details of the calculations needed for implementing the procedure are included. In practice, the estimates are quick to compute and there is software available. Simulations on the standard model functions show excellent performance, and applications to data drawn from various fields of application are used to explore the practical performance of the approach. By using a general result on the risk of the corresponding marginal maximum likelihood approach for a single sequence, overall bounds on the risk of the method are found subject to membership of the unknown function in one of a wide range of Besov classes, covering also the case of f of bounded variation. The rates obtained are optimal for any value of the parameter p in (0,\infty], simultaneously for a wide range of loss functions, each dominating the L_q norm of the \sigma th derivative, with \sigma \ge 0 and 0 < q \le 2.

310 citations
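The thresholding procedure summarized above is easy to prototype. Below is a minimal sketch, assuming a zero/Laplace mixture prior with fixed scale a = 0.5, marginal maximum likelihood for each level's mixing weight, and a soft-threshold stand-in for the exact posterior-median rule; the test signal, wavelet choice, and all names are illustrative rather than the authors' implementation.

```python
import numpy as np
import pywt
from scipy.optimize import minimize_scalar, brentq
from scipy.stats import norm

def laplace_marginal(d, a=0.5):
    """Marginal density of d = mu + N(0,1) noise when mu ~ Laplace(a)."""
    d = np.abs(d)
    return 0.5 * a * np.exp(0.5 * a * a) * (
        np.exp(-a * d) * norm.cdf(d - a) + np.exp(a * d) * norm.sf(d + a))

def level_weight(d):
    """Pick the sparsity (mixing) weight w by marginal maximum likelihood."""
    def neg_loglik(w):
        return -np.sum(np.log((1 - w) * norm.pdf(d) + w * laplace_marginal(d)))
    return minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6),
                           method="bounded").x

def ebayes_shrink(c, sigma):
    d = c / sigma
    w = level_weight(d)
    # Threshold where the posterior odds of a nonzero mean reach 1 -- a
    # simple stand-in for the exact posterior-median threshold.
    f = lambda t: w * laplace_marginal(t) - (1 - w) * norm.pdf(t)
    t = brentq(f, 0.0, 50.0) if f(0.0) < 0 else 0.0
    return sigma * np.sign(d) * np.maximum(np.abs(d) - t, 0.0)  # soft rule

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 1024)
signal = np.sin(8 * np.pi * x) * (x > 0.5)        # illustrative test function
y = signal + 0.3 * rng.standard_normal(x.size)
coeffs = pywt.wavedec(y, "sym8", level=5)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745    # MAD noise estimate
denoised = pywt.waverec(
    [coeffs[0]] + [ebayes_shrink(c, sigma) for c in coeffs[1:]], "sym8")
```

Because the weight w is chosen separately for each resolution level, sparse levels get a high threshold (often zeroing every coefficient) while dense levels get a low one, which is the adaptivity across zones that the TL;DR refers to.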


Book
23 Dec 2005
TL;DR: Basic features of statistical analysis and the general linear model, multivariate analysis of variance, multiple regression, log-linear analysis, logistic regression, factor analysis, path analysis, structural equation modelling, time series analysis, facet theory and smallest space analysis, survival or failure analysis, and repertory grids.
Abstract: Basic Features of Statistical Analysis and the General Linear Model. Multivariate Analysis of Variance. Multiple Regression. Log-Linear Analysis. Logistic Regression. Factor Analysis. Path Analysis. Structural Equation Modelling. Time Series Analysis. Facet Theory and Smallest Space Analysis. Survival or Failure Analysis. Repertory Grids.

221 citations


Journal ArticleDOI
TL;DR: In this paper, a Latin square with five treatments randomized to a 5×5 array of plots is presented, where variance components have to be estimated for row, column and treatment effects.
Abstract: Bayesian inference and fixed and random effects. Professor Gelman writes "Bayesians see analysis of variance as an inflexible classical method." He adopts a hierarchical Bayesian framework to "identify ANOVA with the structuring of parameters into batches." In this framework he sidesteps "the overloaded terms fixed and random" and defines effects "as constant if they are identical for all groups in a population and varying if they are allowed to differ from group to group." Applying this approach to his first example (a Latin square with five treatments randomized to a 5×5 array of plots), variance components have to be estimated for row, column and treatment effects. In our opinion, his approach provides an insightful connection between analysis of variance and hierarchical modeling. It renders an informative and easy-to-interpret display of variance components that is a nice alternative to traditional analysis of variance. However, we wonder whether sidestepping the terms fixed and random is always wise. Furthermore, currently his approach is rather descriptive, and does not contain truly Bayesian inference. Both points will be briefly discussed in the sequel. To look into the question of fixed versus random and the use of hierarchical modeling, we carried out a small experiment. We constructed a dataset for the example in Section 2.2.2: 20 machines randomly divided into four treatment groups, with six outcome measures for each machine. We asked a statistician who is very skilled in multilevel analysis to analyze these data. The result was a hierarchical multivariate data structure with six outcomes nested within 20 machines, and the treatments coded as dummy variables at the machine level. Variance components were estimated for machines and measures. The treatment effects were tested by constraining all treatments to be equal and using a likelihood-ratio test. Comparing this procedure with the discussion of this example in Gelman's paper shows that this is not what he had in mind. It certainly contradicts

179 citations
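The constructed dataset described above (20 machines randomly divided into four treatment groups, six outcome measures per machine) is straightforward to emulate. The sketch below uses simulated numbers in place of the actual data and follows the multilevel strategy attributed to the consulted statistician: treatments coded at the machine level, machines as a random effect, and a likelihood-ratio test of equal treatment effects; all effect sizes are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

rng = np.random.default_rng(1)
machine = np.repeat(np.arange(20), 6)          # 20 machines, 6 measures each
treatment = np.repeat(np.arange(20) % 4, 6)    # 4 treatment groups
y = (0.5 * treatment                           # treatment effects (machine level)
     + rng.normal(0, 1, 20)[machine]           # machine random effects
     + rng.normal(0, 0.5, 120))                # measurement noise
df = pd.DataFrame({"y": y, "machine": machine, "treatment": treatment})

# Treatments as dummies at the machine level; machines as a random effect.
full = smf.mixedlm("y ~ C(treatment)", df, groups=df["machine"]).fit(reml=False)
null = smf.mixedlm("y ~ 1", df, groups=df["machine"]).fit(reml=False)

# Likelihood-ratio test constraining all treatment effects to be equal (3 df).
lr = 2 * (full.llf - null.llf)
print("LR =", lr, "p =", chi2.sf(lr, df=3))
```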



Journal ArticleDOI
TL;DR: In this paper, the authors studied the goodness-of-fit testing of single-index models and obtained asymptotically distribution-free maximin tests for a large class of local alternatives.
Abstract: In this paper we study goodness-of-fit testing of single-index models. The large sample behavior of certain score-type test statistics is investigated. As a by-product, we obtain asymptotically distribution-free maximin tests for a large class of local alternatives. Furthermore, characteristic function based goodness-of-fit tests are proposed which are omnibus and able to detect peak alternatives. Simulation results indicate that the approximation by the limit distribution is already acceptable for moderate sample sizes. The methods are illustrated with applications to two real data sets.

153 citations


01 Jan 2005
TL;DR: Introduction to Mathematical Statistics, 6th edition.

151 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide a thorough mathematical examination of the limiting arguments building on the orientation of Heffernan and Tawn [J. R. Stat. Soc. Ser. B Stat. A. 66 (2004) 497--546] which allows examination of distributional tails other than the joint tail.
Abstract: Models based on assumptions of multivariate regular variation and hidden regular variation provide ways to describe a broad range of extremal dependence structures when marginal distributions are heavy tailed. Multivariate regular variation provides a rich description of extremal dependence in the case of asymptotic dependence, but fails to distinguish between exact independence and asymptotic independence. Hidden regular variation addresses this problem by requiring components of the random vector to be simultaneously large but on a smaller scale than the scale for the marginal distributions. In doing so, hidden regular variation typically restricts attention to that part of the probability space where all variables are simultaneously large. However, since under asymptotic independence the largest values do not occur in the same observation, the region where variables are simultaneously large may not be of primary interest. A different philosophy was offered in the paper of Heffernan and Tawn [J. R. Stat. Soc. Ser. B Stat. Methodol. 66 (2004) 497--546] which allows examination of distributional tails other than the joint tail. This approach used an asymptotic argument which conditions on one component of the random vector and finds the limiting conditional distribution of the remaining components as the conditioning variable becomes large. In this paper, we provide a thorough mathematical examination of the limiting arguments building on the orientation of Heffernan and Tawn [J. R. Stat. Soc. Ser. B Stat. Methodol. 66 (2004) 497--546]. We examine the conditions required for the assumptions made by the conditioning approach to hold, and highlight simililarities and differences between the new and established methods.

128 citations
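For orientation, the conditioning approach examined in this entry can be written schematically: for a bivariate vector (X, Y) with standardized margins, one assumes normalizing functions a(·) and b(·) under which the law of the normalized companion component stabilizes as the conditioning variable grows. A hedged schematic (our notation, not the paper's):

```latex
\lim_{x \to \infty} P\!\left( \frac{Y - a(x)}{b(x)} \le z \,\middle|\, X = x \right) \;=\; G(z),
\qquad G \ \text{nondegenerate,}
```

which makes tails other than the joint tail accessible, since nothing here forces Y to be large when X is.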


Book
14 Jun 2005
TL;DR: In this article, a calculus-based approach to probability, statistics, and stochastic processes is presented for upper-level undergraduate courses, helping students become proficient in all three of these essential topics.
Abstract: This textbook provides a unique, balanced approach to probability, statistics, and stochastic processes. Readers gain a solid foundation in all three fields that serves as a stepping stone to more advanced investigations into each area. This text combines a rigorous, calculus-based development of theory with a more intuitive approach that appeals to readers' sense of reason and logic, an approach developed through the author's many years of classroom experience. The text begins with three chapters that develop probability theory and introduce the axioms of probability, random variables, and joint distributions. The next two chapters introduce limit theorems and simulation. Also included is a chapter on statistical inference with a section on Bayesian statistics, which is an important, though often neglected, topic for undergraduate-level texts. Markov chains in discrete and continuous time are also discussed within the book. More than 400 examples are interspersed throughout the text to help illustrate concepts and theory and to assist the reader in developing an intuitive sense of the subject. Readers will find many of the examples to be both entertaining and thought provoking. This is also true for the carefully selected problems that appear at the end of each chapter. This book is an excellent text for upper-level undergraduate courses. While many texts treat probability theory and statistical inference or probability theory and stochastic processes, this text enables students to become proficient in all three of these essential topics. For students in science and engineering who may take only one course in probability theory, mastering all three areas will better prepare them to collect, analyze, and characterize data in their chosen fields.

92 citations


Journal ArticleDOI
TL;DR: In this paper, a unified posterior analysis of classes of discrete random probability measures is developed that identifies and exploits features common to all these models and circumvents many of the difficult issues involved in Bayesian nonparametric calculus, including a combinatorial component.
Abstract: This article develops, and describes how to use, results concerning disintegrations of Poisson random measures. These results are fashioned as simple tools that can be tailor-made to address inferential questions arising in a wide range of Bayesian nonparametric and spatial statistical models. The Poisson disintegration method is based on the formal statement of two results concerning a Laplace functional change of measure and a Poisson Palm/Fubini calculus in terms of random partitions of the integers {1,...,n}. The techniques are analogous to, but much more general than, techniques for the Dirichlet process and weighted gamma process developed in [Ann. Statist. 12 (1984) 351-357] and [Ann. Inst. Statist. Math. 41 (1989) 227-245]. In order to illustrate the flexibility of the approach, large classes of random probability measures and random hazards or intensities which can be expressed as functionals of Poisson random measures are described. We describe a unified posterior analysis of classes of discrete random probability measures which identifies and exploits features common to all these models. The analysis circumvents many of the difficult issues involved in Bayesian nonparametric calculus, including a combinatorial component. This allows one to focus on the unique features of each process which are characterized via real-valued functions h. The applicability of the technique is further illustrated by obtaining explicit posterior expressions for Lévy-Cox moving average processes within the general setting of multiplicative intensity models. In addition, novel computational procedures, similar to efficient procedures developed for the Dirichlet process, are briefly discussed for these models.

Book
31 Mar 2005
TL;DR: In this book, the author presents a measure-theoretical treatment of probability theory for econometrics, supported by self-contained appendices: Appendix I gives a comprehensive review of linear algebra including all the proofs, Appendix II reviews a variety of mathematical topics and concepts used throughout the main text, and Appendix III reviews complex analysis.
Abstract: This book is intended for use in a rigorous introductory PhD level course in econometrics, or in a field course in econometric theory. It covers the measure-theoretical foundation of probability theory, the multivariate normal distribution with its application to classical linear regression analysis, various laws of large numbers, central limit theorems and related results for independent random variables as well as for stationary time series, with applications to asymptotic inference of M-estimators, and maximum likelihood theory. Some chapters have their own appendices containing the more advanced topics and/or difficult proofs. Moreover, there are three appendices with material that is supposed to be known. Appendix I contains a comprehensive review of linear algebra, including all the proofs. Appendix II reviews a variety of mathematical topics and concepts that are used throughout the main text, and Appendix III reviews complex analysis. Therefore, this book is uniquely self-contained.

Book
25 Apr 2005
TL;DR: In this book, the author introduces probability, utility, games and optimization, entropy, exchangeability and inference, extremes, and risk, safety and reliability, in the context of decision-making under uncertainty.
Abstract: 1. Uncertainty and decision-making 2. The concept of probability 3. Distributions and expectation 4. The concept of utility 5. Games and optimization 6. Entropy 7. Mathematical aspects 8. Exchangeability and inference 9. Extremes 10. Risk, safety and reliability 11. Data and simulation 12. Conclusion Appendix.

Journal ArticleDOI
TL;DR: In this article, the authors give conditions on the underlying distributions and on the parameters on which the generalized order statistics are based, to obtain stochastic comparisons in the stochastic, dispersive, hazard rate, and likelihood ratio orders.
Abstract: In this article, we give several results on (multivariate and univariate) stochastic comparisons of generalized order statistics. We give conditions on the underlying distributions and the parameters on which the generalized order statistics are based, to obtain stochastic comparisons in the stochastic, dispersive, hazard rate, and likelihood ratio orders. Our results generalize some recent results for order statistics, record values, and generalized order statistics and provide some new results for other models such as k-record values and order statistics under multivariate imperfect repair.
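For reference, the four orders named above have standard definitions; writing F and G for the distribution functions of random variables X and Y, \bar F and \bar G for the survival functions, and f and g for the densities, the usual forms (as in standard texts on stochastic orders) are:

```latex
\begin{align*}
X \le_{\mathrm{st}} Y &\iff \bar F(t) \le \bar G(t) \ \text{for all } t,\\
X \le_{\mathrm{hr}} Y &\iff \bar G(t)/\bar F(t) \ \text{is increasing in } t,\\
X \le_{\mathrm{lr}} Y &\iff g(t)/f(t) \ \text{is increasing in } t,\\
X \le_{\mathrm{disp}} Y &\iff F^{-1}(\beta)-F^{-1}(\alpha) \le G^{-1}(\beta)-G^{-1}(\alpha)
  \ \text{for all } 0<\alpha\le\beta<1.
\end{align*}
```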

Journal ArticleDOI
TL;DR: In this article, an extension of classical \chi^2 goodness-of-fit tests to Bayesian model assessment is described, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution; the statistic has the important property that it is asymptotically distributed as a \chi^2 random variable on K-1 degrees of freedom, independently of the dimension of the underlying parameter vector.
Abstract: This article describes an extension of classical \chi^2 goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a \chi^2 random variable on K-1 degrees of freedom, independently of the dimension of the underlying parameter vector. By examining the posterior distribution of this statistic, global goodness-of-fit diagnostics are obtained. Advantages of these diagnostics include ease of interpretation, computational convenience and favorable power properties. The proposed diagnostics can be used to assess the adequacy of a broad class of Bayesian models, essentially requiring only a finite-dimensional parameter vector and conditionally independent observations.
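A minimal sketch of the diagnostic described above, for an i.i.d. normal model under a conventional noninformative prior: draw (mu, sigma) from the posterior, transform the data to the unit interval by the model CDF, bin into K equiprobable cells, and evaluate Pearson's statistic. The model, prior, and all constants are illustrative.

```python
import numpy as np
from scipy.stats import norm, chi2

rng = np.random.default_rng(2)
y = rng.normal(1.0, 2.0, size=200)      # observed data (simulated here)
n, K = y.size, 10                       # K equiprobable cells

def pearson_stat(mu, sigma):
    u = norm.cdf(y, mu, sigma)          # probability integral transform
    counts = np.histogram(u, bins=np.linspace(0, 1, K + 1))[0]
    expected = n / K
    return np.sum((counts - expected) ** 2 / expected)

# Posterior draws under the standard noninformative prior for (mu, sigma^2):
# sigma^2 | y is scaled inverse chi-square, mu | sigma^2, y is normal.
stats = []
for _ in range(1000):
    sigma2 = (n - 1) * y.var(ddof=1) / rng.chisquare(n - 1)
    mu = rng.normal(y.mean(), np.sqrt(sigma2 / n))
    stats.append(pearson_stat(mu, np.sqrt(sigma2)))

# Each draw is approximately chi2(K-1) under the cited asymptotics; report
# the posterior probability of exceeding the 95th percentile.
print(np.mean(np.array(stats) > chi2.ppf(0.95, K - 1)))
```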

Book
28 Oct 2005
TL;DR: Practicing statisticians will find this book useful in that it is replete with statistical test procedures (both parametric and non-parametric) as well as numerous detailed examples.
Abstract: The highly readable text captures the flavor of a course in mathematical statistics without imposing too much rigor; students can concentrate on the statistical strategies without getting lost in the theory. Students who use this book will be well on their way to thinking like a statistician. Practicing statisticians will find this book useful in that it is replete with statistical test procedures (both parametric and non-parametric) as well as numerous detailed examples.
· Comprehensive coverage of descriptive statistics
· More detailed treatment of univariate and bivariate probability distributions
· Thorough coverage of probability theory with numerous event classifications

Journal ArticleDOI
TL;DR: The paper considers a minimum-entropy parametric estimator that minimizes an estimate of the entropy of the distribution of the residuals of a regression problem, taking a direct approach that does not involve any preliminary √n-consistent estimator.
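A minimal sketch of this idea under stated assumptions: a linear regression, a Vasicek-type spacing estimator of the residual entropy, and a derivative-free optimizer. This illustrates the general strategy, not the paper's estimator; all names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def vasicek_entropy(r, m=None):
    """Spacing-based entropy estimate of a sample r."""
    n = r.size
    m = m or max(1, int(np.sqrt(n)))
    s = np.sort(r)
    lo = np.clip(np.arange(n) - m, 0, n - 1)
    hi = np.clip(np.arange(n) + m, 0, n - 1)
    return np.mean(np.log(n * (s[hi] - s[lo]) / (2 * m) + 1e-12))

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.0, -2.0]) + rng.laplace(scale=0.5, size=300)

# Minimize the estimated entropy of the residuals over the coefficients;
# the least-squares fit serves only as an optimizer starting point.
beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
res = minimize(lambda b: vasicek_entropy(y - X @ b), beta0,
               method="Nelder-Mead")
print(res.x)        # close to (1, -2) under the non-Gaussian noise
```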

Journal ArticleDOI
TL;DR: The paper shows which ambiguities exist for specifying statistical distributions, and which complications can arise when uncertainty information is transferred from a database to an LCA program.
Abstract: Statistical information for LCA is increasingly becoming available in databases. At the same time, processing of statistical information is increasingly becoming easier with software for LCA. A practical problem is that there is no unique, unambiguous representation for statistical distributions. Representations: This paper discusses the most frequently encountered statistical distributions, their representation in mathematical statistics, EcoSpold and CMLCA, and the relationships between these representations. The distributions: Four statistical distributions are discussed: uniform, triangular, normal and lognormal. Software and examples: An easy-to-use software tool is available for supporting the conversion steps. Its use is illustrated with a simple example. This paper shows which ambiguities exist for specifying statistical distributions, and which complications can arise when uncertainty information is transferred from a database to an LCA program. This calls for a more extensive standardization of the vocabulary and symbols used to express such information. We invite suppliers of software and databases to provide their parameter representations in a clear and unambiguous way, and hope that a future revision of the ISO/TS 14048 document will standardize representation and terminology for statistical information.
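One concrete instance of the ambiguity discussed here is the lognormal distribution, which may be recorded by its geometric mean and geometric standard deviation, by the mean and standard deviation of the underlying normal, or by its arithmetic mean and variance. A short sketch of the conversions, using scipy's parameterization (values illustrative):

```python
import numpy as np
from scipy.stats import lognorm

gm, gsd = 10.0, 1.5                  # geometric mean and geometric std dev
mu, sigma = np.log(gm), np.log(gsd)  # parameters of the underlying normal

# scipy encodes this as lognorm(s=sigma, scale=exp(mu)).
dist = lognorm(s=sigma, scale=np.exp(mu))
print(dist.median())   # equals gm: the geometric mean is the median
print(dist.mean())     # equals exp(mu + sigma**2 / 2), NOT gm

# The arithmetic mean/variance pair is a third convention; converting back:
m, v = dist.mean(), dist.var()
sigma2 = np.log(1 + v / m**2)
mu_back = np.log(m) - 0.5 * sigma2
print(np.isclose(mu, mu_back), np.isclose(sigma**2, sigma2))
```

Mistaking one convention for another silently changes the distribution, which is exactly the kind of transfer error the paper warns about.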

Posted Content
TL;DR: In this paper, a general procedure is proposed for fitting semiparametric models with estimated weights to two-phase data for Cox regression with stratified case-cohort studies, other complex survey designs and missing data problems.
Abstract: Weighted likelihood, in which one solves Horvitz-Thompson or inverse probability weighted (IPW) versions of the likelihood equations, offers a simple and robust method for fitting models to two phase stratified samples. We consider semiparametric models for which solution of infinite dimensional estimating equations leads to $\sqrt{N}$ consistent and asymptotically Gaussian estimators of both Euclidean and nonparametric parameters. If the phase two sample is selected via Bernoulli (i.i.d.) sampling with known sampling probabilities, standard estimating equation theory shows that the influence function for the weighted likelihood estimator of the Euclidean parameter is the IPW version of the ordinary influence function. By proving weak convergence of the IPW empirical process, and borrowing results on weighted bootstrap empirical processes, we derive a parallel asymptotic expansion for finite population stratified sampling. Whereas the asymptotic variance for Bernoulli sampling involves the within strata second moments of the influence function, for finite population stratified sampling it involves only the within strata variances. The latter asymptotic variance also arises when the observed sampling fractions are used as estimates of those known a priori. A general procedure is proposed for fitting semiparametric models with estimated weights to two phase data. Several of our key results have already been derived for the special case of Cox regression with stratified case-cohort studies, other complex survey designs and missing data problems more generally. This paper is intended to help place this previous work in appropriate context and to pave the way for applications to other models.
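A minimal sketch of the weighted-likelihood idea for a simple two-phase design: a logistic regression fitted by inverse-probability-weighted likelihood equations, with phase-two inclusion by Bernoulli sampling at known, outcome-dependent probabilities. The model and constants are illustrative, and the design-based standard errors that are the focus of the paper's asymptotics are not computed here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
N = 5000
x = rng.normal(size=N)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.0 * x))))

# Phase two: Bernoulli sampling with known probabilities depending on the
# stratum (here simply on y), as in case-cohort-style designs.
pi = np.where(y == 1, 0.8, 0.2)
s = rng.binomial(1, pi).astype(bool)

# Solve the Horvitz-Thompson (IPW) weighted likelihood equations.
w = 1.0 / pi[s]
fit = sm.GLM(y[s], sm.add_constant(x[s]),
             family=sm.families.Binomial(), freq_weights=w).fit()
print(fit.params)   # near (0.5, 1.0); an unweighted fit would be biased
```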

Journal ArticleDOI
TL;DR: In this paper, the authors apply an idea of Robbins and Siegmund [Proc. Sixth Berkeley Symp. Math. Statist. Probab. 4 (1972) 37-41] to construct a class of sequential tests and detection schemes whereby the unknown post-change parameters are estimated.
Abstract: Suppose a process yields independent observations whose distributions belong to a family parameterized by \theta \in \Theta. When the process is in control, the observations are i.i.d. with a known parameter value \theta_0. When the process is out of control, the parameter changes. We apply an idea of Robbins and Siegmund [Proc. Sixth Berkeley Symp. Math. Statist. Probab. 4 (1972) 37-41] to construct a class of sequential tests and detection schemes whereby the unknown post-change parameters are estimated. This approach is especially useful in situations where the parametric space is intricate and mixture-type rules are operationally or conceptually difficult to formulate. We exemplify our approach by applying it to the problem of detecting a change in the shape parameter of a Gamma distribution, in both a univariate and a multivariate setting.
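In the spirit of the approach described above, the sketch below monitors for an upward mean shift in a N(\theta, 1) sequence rather than the paper's Gamma shape example: for each candidate change point, the unknown post-change mean is replaced by its estimate in the log-likelihood ratio. Threshold and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0.0, 1, 100),   # in control: theta0 = 0
                    rng.normal(0.8, 1, 100)])  # out of control afterwards

theta0, threshold = 0.0, 10.0
for n in range(1, x.size + 1):
    best = -np.inf
    for k in range(n):                          # candidate change points
        seg = x[k:n]
        theta_hat = max(seg.mean(), theta0)     # estimated post-change mean
        # log-likelihood ratio of N(theta_hat, 1) vs N(theta0, 1) on x[k:n]
        llr = np.sum((theta_hat - theta0) * (seg - theta0)
                     - 0.5 * (theta_hat - theta0) ** 2)
        best = max(best, llr)
    if best > threshold:
        print("alarm at time", n)
        break
```

Estimating the post-change parameter directly, as here, avoids specifying a mixing distribution over it, which is the operational advantage the abstract emphasizes for intricate parameter spaces.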

Journal ArticleDOI
TL;DR: In this article, the conditional distribution function of a random variable Y given a dependent random d-vector X is estimated under a least-squares criterion, where the unit vector is selected so that the approximation is opti- mal under the least square criterion.
Abstract: Motivated by applications to prediction and forecasting, we suggest methods for approximating the conditional distribution function of a random variable Y given a dependent random d-vector X. The idea is to estimate not the distribution of Y|X, but that of Y|\theta^T X, where the unit vector \theta is selected so that the approximation is optimal under a least-squares criterion. We show that \theta may be estimated root-n consistently. Furthermore, estimation of the conditional distribution function of Y, given \theta^T X, has the same first-order asymptotic properties that it would enjoy if \theta were known. The proposed method is illustrated using both simulated and real-data examples, showing its effectiveness for both independent datasets and data from time series. Numerical work corroborates the theoretical result that \theta can be estimated particularly accurately.
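A minimal sketch of the suggestion for d = 2, under stated simplifications: the direction \theta is chosen by minimizing a least-squares discrepancy between response indicators and a leave-one-out kernel estimate of P(Y \le y | \theta^T X), with \theta identified only up to sign. Bandwidth, grids, and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
X = rng.normal(size=(n, 2))
theta_true = np.array([0.6, 0.8])               # unit vector
Y = X @ theta_true + 0.3 * rng.standard_normal(n)

def cond_cdf(t, y0, h=0.3):
    """Leave-one-out kernel estimate of P(Y <= y0 | theta'X = t_i)."""
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)
    return (w @ (Y <= y0)) / w.sum(axis=1)

def criterion(angle):
    t = X @ np.array([np.cos(angle), np.sin(angle)])
    # squared error between indicators and the estimated conditional CDF,
    # averaged over a few y-quantiles
    return sum(np.mean(((Y <= y0) - cond_cdf(t, y0)) ** 2)
               for y0 in np.quantile(Y, [0.25, 0.5, 0.75]))

angles = np.linspace(0, np.pi, 181)             # directions up to sign
best = angles[np.argmin([criterion(a) for a in angles])]
print(np.cos(best), np.sin(best))               # close to theta_true
```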

Journal ArticleDOI
TL;DR: In this article, a new class of nonparametric prior distributions is introduced and studied, which are constructed via normalization of random measures driven by increasing additive processes, and results for the distribution of means under both prior and posterior conditions are presented.
Abstract: This paper introduces and studies a new class of nonparametric prior distributions. Random probability distribution functions are constructed via normalization of random measures driven by increasing additive processes. In particular, we present results for the distribution of means under both prior and posterior conditions and, via the use of strategic latent variables, undertake a full Bayesian analysis. Our class of priors includes the well-known and widely used mixture of a Dirichlet process.
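Schematically, the construction studied here normalizes an increasing additive process (a completely random measure) \tilde\mu on a space S:

```latex
P(A) \;=\; \frac{\tilde\mu(A)}{\tilde\mu(S)}, \qquad A \subseteq S,
\qquad 0 < \tilde\mu(S) < \infty \ \text{a.s.},
```

so P is a random probability measure; the Dirichlet process arises in exactly this way by normalizing a gamma process.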

Journal ArticleDOI
TL;DR: The problem is formulated and a solution strategy is presented that uses the α-cut technique in order to transform the problem into a stochastic program with linear partial information on probability distribution (SPI).
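For context, the α-cut of a fuzzy set \tilde A with membership function \mu_{\tilde A} is its α-level set; for a triangular fuzzy number with support [a, c] and mode b it is the closed interval

```latex
[\tilde A]_\alpha \;=\; \{\, x : \mu_{\tilde A}(x) \ge \alpha \,\}
\;=\; \bigl[\, a + \alpha (b - a),\; c - \alpha (c - b) \,\bigr],
\qquad \alpha \in (0, 1],
```

which is how fuzzy quantities are traded for interval-valued (and then stochastic) information in such reformulations.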


Journal ArticleDOI
TL;DR: In this article, the authors established a mathematical framework that formally validates the two-phase "super-population viewpoint" proposed by Hartley and Sielken by defining a product probability space which includes both the design space and the model space.
Abstract: We establish a mathematical framework that formally validates the two-phase "super-population viewpoint" proposed by Hartley and Sielken [Biometrics 31 (1975) 411-422] by defining a product probability space which includes both the design space and the model space. The methodology we develop combines finite population sampling theory and the classical theory of infinite population sampling to account for the underlying processes that produce the data under a unified approach. Our key results are the following: first, if the sample estimators converge in the design law and the model statistics converge in the model, then, under certain conditions, they are asymptotically independent, and they converge jointly in the product space; second, the sample estimating equation estimator is asymptotically normal around a super-population parameter.
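Schematically (our notation, not necessarily the authors'), the framework places design and model randomness on a single product space:

```latex
(\Omega, \mathcal{F}, P) \;=\; \bigl(\Omega_d \times \Omega_m,\;
\mathcal{F}_d \otimes \mathcal{F}_m,\; P_d \times P_m\bigr),
```

so that convergence of sample estimators in the design law and of model statistics in the model law can be combined into joint convergence in P.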

Book
15 Sep 2005
TL;DR: Classical Methods of Statistics.

Posted Content
TL;DR: In this article, the authors show that Kolmogorov complexity and estimators of it, such as universal codes, can be applied to hypothesis testing within the framework of classical mathematical statistics, and they suggest methods for identity testing and nonparametric testing of serial independence for time series.
Abstract: We show that Kolmogorov complexity and estimators of it such as universal codes (or data compression methods) can be applied to hypothesis testing within the framework of classical mathematical statistics. Methods for identity testing and nonparametric testing of serial independence for time series are suggested.
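A minimal sketch of the compression idea, using zlib as a stand-in for a universal code and a permutation p-value as a crude calibration; the Markov-chain data and all constants are illustrative, not the authors' procedure.

```python
import zlib
import numpy as np

rng = np.random.default_rng(7)
# A serially dependent binary series: a sticky two-state Markov chain.
x = np.zeros(4000, dtype=np.uint8)
for i in range(1, x.size):
    x[i] = x[i - 1] if rng.random() < 0.9 else 1 - x[i - 1]

clen = lambda a: len(zlib.compress(a.tobytes(), 9))
observed = clen(x)

# Permuting destroys serial structure but preserves the marginal; if the
# original compresses better than its permutations, independence is suspect.
perm = [clen(rng.permutation(x)) for _ in range(200)]
p = (1 + sum(c <= observed for c in perm)) / (1 + len(perm))
print(observed, float(np.mean(perm)), "p =", p)
```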


Book
30 Jun 2005
TL;DR: In this textbook, descriptive statistics, basic probability, the Central Limit Theorem, and hypothesis tests for population means and proportions are presented, together with chi-square, ANOVA, correlation, and regression methods for measuring association.
Abstract: Preface Unit I. Descriptive Statistics 1. Introduction: What Is Statistics All About? 2. Using Polystat to Do Statistical Analysis 3. Presentation of Data 4. Summarizing Data and Using Descriptive Statistics Unit II. Basic Probability and Probability Distributions 5. Basic Probability: Theory and Applications 6. Sampling and the Normal Distribution 7. The Central Limit Theorem Unit III. Hypothesis Testing 8. Introduction to Inferential Statistics 9. Estimating Population Means, Proportions, and Sample Size with Confidence 10. Validating a Hypothesis About a Single Population Mean Using a Sample 11. Validating Hypotheses Between Two Population Means 12. Validating Hypotheses About a Single Population Proportion 13. Validating Hypotheses About Two Population Proportions Unit IV. Measures of Association 14. Comparing More than Two Population Means with Anova 15. Comparing More than Two Population Proportions Using the Chi-Square Test 16. Determining Relationships for Two Variables Using Simple Correlation 17. Measuring Relationships Between Two Variables with Simple Regression Analysis 18. Measuring Multivariate Relationships with Multiple Regression Analysis 19. Planning Statistical Research Appendix Key Formulas

Book
01 Jan 2005
TL;DR: This introductory text covers data and descriptive statistics (tabular, graphical and numerical), probability and probability distributions, sampling, interval estimation, hypothesis tests, and simple and multiple regression.
Abstract: 1. Data and Statistics. 2. Descriptive Statistics: Tabular and Graphical Presentations. 3. Descriptive Statistics: Numerical Measures. 4. Introduction to Probability. 5. Discrete Probability Distributions. 6. Continuous Probability Distributions. 7. Sampling and Sampling Distributions. 8. Interval Estimation. 9. Hypothesis Tests. 10. Comparisons Involving Means, Experimental Design, and Analysis of Variance. 11. Comparisons Involving Proportions and a Test of Independence. 12. Simple Linear Regression. 13. Multiple Regression. Appendix A: References and Bibliography. Appendix B: Tables. Appendix C: Summation Notation. Appendix D: Self-Test Solutions and Answers to Even-Numbered Exercises. Appendix E: Microsoft Office Excel 2010 and Tools for Statistical Analysis. Appendix F: Computing p-Values Using Minitab and Excel.