Author

Kjell A. Doksum

Bio: Kjell A. Doksum is an academic researcher at the University of Wisconsin-Madison. He has contributed to research on topics including linear models and estimators, has an h-index of 32, and has co-authored 81 publications receiving 5,888 citations. Previous affiliations include Columbia University and the University of California, Berkeley.


Papers
Book
01 Jan 1977
TL;DR: This textbook covers statistical models and performance criteria, methods of estimation, measures of performance, testing and confidence regions, asymptotic approximations, and multiparameter inference, with appendices reviewing basic probability theory and additional topics in probability and analysis.
Abstract: (NOTE: Each chapter concludes with Problems and Complements, Notes, and References.)
1. Statistical Models, Goals, and Performance Criteria: Data, Models, Parameters, and Statistics; Bayesian Models; The Decision Theoretic Framework; Prediction; Sufficiency; Exponential Families.
2. Methods of Estimation: Basic Heuristics of Estimation; Minimum Contrast Estimates and Estimating Equations; Maximum Likelihood in Multiparameter Exponential Families; Algorithmic Issues.
3. Measures of Performance: Introduction; Bayes Procedures; Minimax Procedures; Unbiased Estimation and Risk Inequalities; Nondecision Theoretic Criteria.
4. Testing and Confidence Regions: Introduction; Choosing a Test Statistic: The Neyman-Pearson Lemma; Uniformly Most Powerful Tests and Monotone Likelihood Ratio Models; Confidence Bounds, Intervals and Regions; The Duality between Confidence Regions and Tests; Uniformly Most Accurate Confidence Bounds; Frequentist and Bayesian Formulations; Prediction Intervals; Likelihood Ratio Procedures.
5. Asymptotic Approximations: Introduction: The Meaning and Uses of Asymptotics; Consistency; First- and Higher-Order Asymptotics: The Delta Method with Applications; Asymptotic Theory in One Dimension; Asymptotic Behavior and Optimality of the Posterior Distribution.
6. Inference in the Multiparameter Case: Inference for Gaussian Linear Models; Asymptotic Estimation Theory in p Dimensions; Large Sample Tests and Confidence Regions; Large Sample Methods for Discrete Data; Generalized Linear Models; Robustness Properties and Semiparametric Models.
Appendix A: A Review of Basic Probability Theory — The Basic Model; Elementary Properties of Probability Models; Discrete Probability Models; Conditional Probability and Independence; Compound Experiments; Bernoulli and Multinomial Trials, Sampling with and without Replacement; Probabilities on Euclidean Space; Random Variables and Vectors: Transformations; Independence of Random Variables and Vectors; The Expectation of a Random Variable; Moments; Moment and Cumulant Generating Functions; Some Classical Discrete and Continuous Distributions; Modes of Convergence of Random Variables and Limit Theorems; Further Limit Theorems and Inequalities; Poisson Process.
Appendix B: Additional Topics in Probability and Analysis — Conditioning by a Random Variable or Vector; Distribution Theory for Transformations of Random Vectors; Distribution Theory for Samples from a Normal Population; The Bivariate Normal Distribution; Moments of Random Vectors and Matrices; The Multivariate Normal Distribution; Convergence for Random Vectors: $O_P$ and $o_P$ Notation; Multivariate Calculus; Convexity and Inequalities; Topics in Matrix Theory and Elementary Hilbert Space Theory.
Appendix C: Tables — The Standard Normal Distribution; Auxiliary Table of the Standard Normal Distribution; t Distribution Critical Values; χ² Distribution Critical Values; F Distribution Critical Values.
Index.

1,630 citations

Journal ArticleDOI
TL;DR: In this article, the authors consider consistency properties of the Box-Cox estimates (MLEs) of λ and the parameters in the linear model, as well as the asymptotic variances of these estimates.
Abstract: Following Box and Cox (1964), we assume that a transform Z i = h(Yi , λ) of our original data {Yi } satisfies a linear model. Consistency properties of the Box-Cox estimates (MLE's) of λ and the parameters in the linear model, as well as the asymptotic variances of these estimates, are considered. We find that in some structured models such as transformed linear regression with small to moderate error variances, the asymptotic variances of the estimates of the parameters in the linear model are much larger when the transformation parameter λ is unknown than when it is known. In some unstructured models such as transformed one-way analysis of variance with moderate to large error variances, the cost of not knowing λ is moderate to small. The case where the error distribution in the linear model is not normal but actually unknown is considered, and robust methods in the presence of transformations are introduced for this case. Asymptotics and simulation results for the transformed additive two-way ...
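As a rough illustration of the Box-Cox setup (not the authors' code), the sketch below estimates λ by maximum likelihood with scipy.stats.boxcox and then fits a linear model to the transformed response; the simulated data and the log-linear relationship are invented for the example.
```python
# Illustrative sketch: estimate the Box-Cox transformation parameter lambda
# by maximum likelihood, then fit a linear model to the transformed response.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 2, n)
# Simulated data where log(Y) = 1 + 0.5 x + error, i.e. lambda = 0 is "correct".
y = np.exp(1.0 + 0.5 * x + 0.2 * rng.standard_normal(n))

# scipy.stats.boxcox returns the transformed data and the MLE of lambda.
z, lam_hat = stats.boxcox(y)
print(f"estimated lambda: {lam_hat:.3f}")  # should be near 0

# Ordinary least squares fit of the transformed response on x.
slope, intercept = np.polyfit(x, z, 1)
print(f"slope {slope:.3f}, intercept {intercept:.3f}")
```
The paper's point about the cost of not knowing λ could be probed here by bootstrapping this fit with λ re-estimated on each resample versus held fixed at its true value.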

522 citations

Journal ArticleDOI
TL;DR: In this paper, it is shown that if parameters are allowed to be function valued, there is essentially only one function $\Delta(\cdot)$ such that $X + \Delta(X) =_\mathscr{L} Y$, and it can be defined by $\Delta(x) = G^{-1}(F(x)) - x$.
Abstract: Let $X$ and $Y$ be two random variables with continuous distribution functions $F$ and $G$ and means $\mu$ and $\xi$. In a linear model, the crucial property of the contrast $\Delta = \xi - \mu$ is that $X + \Delta =_\mathscr{L} Y$, where $= _\mathscr{L}$ denotes equality in law. When the linear model does not hold, there is no real number $\Delta$ such that $X + \Delta = _\mathscr{L} Y$. However, it is shown that if parameters are allowed to be function valued, there is essentially only one function $\Delta(\bullet)$ such that $X + \Delta(X) = _\mathscr{L} Y$, and this function can be defined by $\Delta(x) = G^{-1}(F(x)) - x$. The estimate $\hat{\Delta}_N(x) = G_n^{-1}(F_m(x)) - x$ of $\Delta(x)$ is considered, where $G_n$ and $F_m$ are the empirical distribution functions. Confidence bands based on this estimate are given and the asymptotic distribution of $\hat{\Delta}_N(\bullet)$ is derived. For general models in analysis of variance, contrasts that can be expressed as sums of differences of means can be replaced by sums of functions of the above kind.
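A minimal sketch (not the paper's code) of the plug-in estimate $\hat{\Delta}_N(x) = G_n^{-1}(F_m(x)) - x$, using empirical distribution and quantile functions; the two simulated samples and the grid are invented for illustration.
```python
# Illustrative sketch: the shift-function estimate
# Delta_hat(x) = G_n^{-1}(F_m(x)) - x computed from two samples.
import numpy as np

rng = np.random.default_rng(1)
x_sample = rng.normal(0.0, 1.0, 150)   # sample from F
y_sample = rng.normal(0.5, 1.0, 200)   # sample from G (pure location shift)

def shift_function(x_sample, y_sample, grid):
    """Empirical shift function evaluated on a grid of x values."""
    m = len(x_sample)
    # Empirical CDF F_m evaluated on the grid.
    F = np.searchsorted(np.sort(x_sample), grid, side="right") / m
    # Empirical quantile function G_n^{-1} applied to F_m(x).
    Ginv = np.quantile(y_sample, np.clip(F, 0.0, 1.0))
    return Ginv - grid

grid = np.linspace(-2, 2, 9)
print(np.round(shift_function(x_sample, y_sample, grid), 2))
# For a pure location shift, Delta(x) is roughly constant (about 0.5 here).
```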

343 citations

Journal ArticleDOI
TL;DR: In this article, the authors consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$.
Abstract: Variable-stress accelerated life testing trials are experiments in which each of the units in a random sample of units of a product is run under increasingly severe conditions to get information quickly on its life distribution. We consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$ …

342 citations

01 Jan 1992
TL;DR: In this paper, the authors consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$. Failure occurs the first time W(y) crosses a critical boundary w.
Abstract: Variable-stress accelerated life testing trials are experiments in which each of the units in a random sample of units of a product is run under increasingly severe conditions to get information quickly on its life distribution. We consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$. Continuously increasing stress is also considered. Failure occurs the first time W(y) crosses a critical boundary w. The distribution of time to failure for the models can be represented in terms of time-transformed inverse Gaussian distribution functions, and the parameters in models for experiments with censored data can be estimated using maximum likelihood methods. A common approach to the modeling of failure times for experimental units subject to increased stress at certain stress change points is to assume that the failure times follow a distribution that consists of segments of Weibull distributions with the same shape parameter. Our Wiener-process approach gives an alternative flexible class of time-transformed inverse Gaussian models in which time to failure is modeled in terms of accumulated decay reaching a critical level and in which parametric functions are used to express how higher stresses accelerate the rate of decay and the time to failure. Key parameters such as mean life under normal stress, quantiles of the normal stress distribution, and decay rate under normal and accelerated stress appear naturally in the model. A variety of possible parameterizations of the decay rate leads to flexible modeling. Model fit can be checked by percentage-percentage plots.
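As a sanity-check sketch of the single-stress case (not the authors' code), the snippet below simulates first-passage times of a Wiener process with drift over a boundary w and compares them with the corresponding inverse Gaussian distribution; the drift, volatility, and boundary values are invented for illustration.
```python
# Illustrative sketch: time to failure as the first passage of
# W(y) = nu*y + sigma*B(y) over a boundary w. The first-passage time is
# inverse Gaussian with mean w/nu and shape w^2/sigma^2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nu, sigma, w = 1.0, 0.5, 5.0            # drift, volatility, critical boundary
dt, n_steps, n_paths = 0.01, 2000, 2000

increments = nu * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
paths = np.cumsum(increments, axis=1)
crossed = paths >= w
hit = crossed.any(axis=1)
t_fail = crossed.argmax(axis=1)[hit] * dt   # first step at/above w

mean_theory = w / nu                         # inverse Gaussian mean
shape = w**2 / sigma**2                      # inverse Gaussian shape
# scipy parameterization: invgauss(mu=mean/shape, scale=shape).
ig = stats.invgauss(mu=mean_theory / shape, scale=shape)
print(f"simulated mean {t_fail.mean():.3f} vs theoretical {ig.mean():.3f}")
```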

298 citations


Cited by
Book ChapterDOI
TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Abstract: The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.
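A minimal sketch of fitting such a proportional hazards model, assuming the third-party lifelines package and an invented toy dataset with one covariate and right censoring.
```python
# Illustrative sketch (assumes the third-party `lifelines` package):
# fit a Cox proportional hazards model h(t|x) = h0(t) * exp(b*x)
# to right-censored failure times.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 300
x = rng.standard_normal(n)
# Exponential failure times whose rate increases with x, censored at
# a fixed follow-up time.
t_event = rng.exponential(1.0 / np.exp(0.7 * x))
follow_up = 2.0
df = pd.DataFrame({
    "x": x,
    "duration": np.minimum(t_event, follow_up),
    "event": (t_event <= follow_up).astype(int),
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")
cph.print_summary()  # the estimated coefficient should be near 0.7
```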

28,264 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
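A minimal sketch of the latent Dirichlet allocation idea, using scikit-learn's variational implementation on a tiny invented corpus; the documents and topic count are made up for illustration.
```python
# Illustrative sketch (not the paper's code): latent Dirichlet allocation
# on a toy corpus, using scikit-learn's variational implementation.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets traded lower",
    "investors sold stocks and bonds",
]

# Bag-of-words counts; each document becomes a row of word counts.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

# Two topics; the Dirichlet priors on doc-topic and topic-word proportions
# are set via doc_topic_prior / topic_word_prior (defaults used here).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)   # per-document topic proportions
print(doc_topics.round(2))

# Top words per topic.
vocab = vectorizer.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = comp.argsort()[-4:][::-1]
    print(f"topic {k}:", [vocab[i] for i in top])
```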

25,546 citations

Journal ArticleDOI
TL;DR: This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Abstract: Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree. Classification trees are designed for dependent variables that take a finite number of unordered values, with prediction error measured in terms of misclassification cost. Regression trees are for dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between the observed and predicted values. This article gives an introduction to the subject by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1, 14-23. DOI: 10.1002/widm.8
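A minimal sketch of recursive-partitioning prediction with scikit-learn: a classification tree scored by misclassification rate and a regression tree scored by squared error, on small invented datasets.
```python
# Illustrative sketch (not the article's code): classification and
# regression trees via recursive partitioning, using scikit-learn.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.metrics import accuracy_score, mean_squared_error

rng = np.random.default_rng(4)

# Classification: the label depends on which side of a hyperplane x lies.
X = rng.uniform(-1, 1, size=(200, 2))
y_class = (X[:, 0] + 0.3 * X[:, 1] > 0).astype(int)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y_class)
print("misclassification rate:", 1 - accuracy_score(y_class, clf.predict(X)))

# Regression: piecewise-constant fit to a smooth curve, squared-error loss.
x = np.sort(rng.uniform(0, 6, 200)).reshape(-1, 1)
y_reg = np.sin(x).ravel() + 0.1 * rng.standard_normal(200)
reg = DecisionTreeRegressor(max_depth=3).fit(x, y_reg)
print("training MSE:", mean_squared_error(y_reg, reg.predict(x)))
```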

16,974 citations

Journal ArticleDOI
TL;DR: Previous methods combine estimates of the cause-specific hazard functions under the proportional hazards formulation but do not let the analyst directly assess the effect of a covariate on the marginal probability function; this article addresses direct regression modeling of the cumulative incidence function for competing risks data.
Abstract: With explanatory covariates, the standard analysis for competing risks data involves modeling the cause-specific hazard functions via a proportional hazards assumption. Unfortunately, the cause-specific hazard function does not have a direct interpretation in terms of survival probabilities for the particular failure type. In recent years many clinicians have begun using the cumulative incidence function, the marginal failure probabilities for a particular cause, which is intuitively appealing and more easily explained to the nonstatistician. The cumulative incidence is especially relevant in cost-effectiveness analyses in which the survival probabilities are needed to determine treatment utility. Previously, authors have considered methods for combining estimates of the cause-specific hazard functions under the proportional hazards formulation. However, these methods do not allow the analyst to directly assess the effect of a covariate on the marginal probability function. In this article we propose …
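As a rough sketch of the cumulative incidence function itself (not the article's regression method), the code below computes the standard nonparametric estimator $\hat{F}_1(t) = \sum_{t_i \le t} \hat{S}(t_i^-)\, d_{1i}/n_i$ on invented competing-risks data, with event code 0 for censoring.
```python
# Illustrative sketch: nonparametric cumulative incidence for cause 1 with
# competing risks. Event codes: 0 = censored, 1 = cause of interest,
# 2 = competing cause.
import numpy as np

def cumulative_incidence(times, events, cause=1):
    order = np.argsort(times)
    times, events = times[order], events[order]
    at_risk = len(times)
    surv = 1.0      # overall (all-cause) Kaplan-Meier survival S(t-)
    cif = 0.0
    out_t, out_cif = [0.0], [0.0]
    for t, e in zip(times, events):
        if e == cause:
            cif += surv / at_risk          # S(t-) * d_1 / n at this time
        if e != 0:
            surv *= 1.0 - 1.0 / at_risk    # all-cause KM update
        at_risk -= 1
        out_t.append(t)
        out_cif.append(cif)
    return np.array(out_t), np.array(out_cif)

rng = np.random.default_rng(5)
t1 = rng.exponential(2.0, 300)   # latent time for cause 1
t2 = rng.exponential(3.0, 300)   # latent time for cause 2
c = rng.exponential(4.0, 300)    # censoring time
times = np.minimum.reduce([t1, t2, c])
events = np.select([times == t1, times == t2], [1, 2], default=0)
t_grid, cif = cumulative_incidence(times, events, cause=1)
print("CIF for cause 1 at last event time:", round(cif[-1], 3))
```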

11,109 citations

Journal ArticleDOI
TL;DR: This entry is P. Billingsley's monograph Convergence of Probability Measures (Wiley, 1968), a standard reference on weak convergence of probability measures.
Abstract: Convergence of Probability Measures. By P. Billingsley. Chichester, Sussex, Wiley, 1968. xii, 253 p. 9 1/4". 117s.

5,689 citations