Author

Kjell A. Doksum

Bio: Kjell A. Doksum is an academic researcher at the University of Wisconsin-Madison. He has contributed to research on topics including linear models and estimators, has an h-index of 32, and has co-authored 81 publications receiving 5,888 citations. Previous affiliations include Columbia University and the University of California, Berkeley.


Papers
Book
01 Jan 1977
TL;DR: This textbook covers statistical models and performance criteria, methods of estimation, measures of performance, testing and confidence regions, asymptotic approximations, and multiparameter inference, with appendices reviewing basic probability theory and additional topics in probability and analysis.
Abstract: (NOTE: Each chapter concludes with Problems and Complements, Notes, and References.)
1. Statistical Models, Goals, and Performance Criteria: Data, Models, Parameters, and Statistics; Bayesian Models; The Decision Theoretic Framework; Prediction; Sufficiency; Exponential Families.
2. Methods of Estimation: Basic Heuristics of Estimation; Minimum Contrast Estimates and Estimating Equations; Maximum Likelihood in Multiparameter Exponential Families; Algorithmic Issues.
3. Measures of Performance: Introduction; Bayes Procedures; Minimax Procedures; Unbiased Estimation and Risk Inequalities; Nondecision Theoretic Criteria.
4. Testing and Confidence Regions: Introduction; Choosing a Test Statistic: The Neyman-Pearson Lemma; Uniformly Most Powerful Tests and Monotone Likelihood Ratio Models; Confidence Bounds, Intervals and Regions; The Duality between Confidence Regions and Tests; Uniformly Most Accurate Confidence Bounds; Frequentist and Bayesian Formulations; Prediction Intervals; Likelihood Ratio Procedures.
5. Asymptotic Approximations: Introduction: The Meaning and Uses of Asymptotics; Consistency; First- and Higher-Order Asymptotics: The Delta Method with Applications; Asymptotic Theory in One Dimension; Asymptotic Behavior and Optimality of the Posterior Distribution.
6. Inference in the Multiparameter Case: Inference for Gaussian Linear Models; Asymptotic Estimation Theory in p Dimensions; Large Sample Tests and Confidence Regions; Large Sample Methods for Discrete Data; Generalized Linear Models; Robustness Properties and Semiparametric Models.
Appendix A: A Review of Basic Probability Theory — The Basic Model; Elementary Properties of Probability Models; Discrete Probability Models; Conditional Probability and Independence; Compound Experiments; Bernoulli and Multinomial Trials, Sampling with and without Replacement; Probabilities on Euclidean Space; Random Variables and Vectors: Transformations; Independence of Random Variables and Vectors; The Expectation of a Random Variable; Moments; Moment and Cumulant Generating Functions; Some Classical Discrete and Continuous Distributions; Modes of Convergence of Random Variables and Limit Theorems; Further Limit Theorems and Inequalities; Poisson Process.
Appendix B: Additional Topics in Probability and Analysis — Conditioning by a Random Variable or Vector; Distribution Theory for Transformations of Random Vectors; Distribution Theory for Samples from a Normal Population; The Bivariate Normal Distribution; Moments of Random Vectors and Matrices; The Multivariate Normal Distribution; Convergence for Random Vectors: $O_P$ and $o_P$ Notation; Multivariate Calculus; Convexity and Inequalities; Topics in Matrix Theory and Elementary Hilbert Space Theory.
Appendix C: Tables — The Standard Normal Distribution; Auxiliary Table of the Standard Normal Distribution; t Distribution Critical Values; χ² Distribution Critical Values; F Distribution Critical Values.
Index.

1,630 citations

Journal ArticleDOI
TL;DR: In this article, the authors consider consistency properties of the Box-Cox estimates (MLEs) of λ and the parameters in the linear model, as well as the asymptotic variances of these estimates.
Abstract: Following Box and Cox (1964), we assume that a transform Z i = h(Yi , λ) of our original data {Yi } satisfies a linear model. Consistency properties of the Box-Cox estimates (MLE's) of λ and the parameters in the linear model, as well as the asymptotic variances of these estimates, are considered. We find that in some structured models such as transformed linear regression with small to moderate error variances, the asymptotic variances of the estimates of the parameters in the linear model are much larger when the transformation parameter λ is unknown than when it is known. In some unstructured models such as transformed one-way analysis of variance with moderate to large error variances, the cost of not knowing λ is moderate to small. The case where the error distribution in the linear model is not normal but actually unknown is considered, and robust methods in the presence of transformations are introduced for this case. Asymptotics and simulation results for the transformed additive two-way ...
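As a rough illustration of the Box-Cox setup (not the authors' code), the sketch below estimates λ by maximum likelihood with scipy.stats.boxcox and then fits a linear model to the transformed response; the simulated data and the log-linear relationship are invented for the example.
```python
# Illustrative sketch: estimate the Box-Cox transformation parameter lambda
# by maximum likelihood, then fit a linear model to the transformed response.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 2, n)
# Simulated data where log(Y) = 1 + 0.5 x + error, i.e. lambda = 0 is "correct".
y = np.exp(1.0 + 0.5 * x + 0.2 * rng.standard_normal(n))

# scipy.stats.boxcox returns the transformed data and the MLE of lambda.
z, lam_hat = stats.boxcox(y)
print(f"estimated lambda: {lam_hat:.3f}")  # should be near 0

# Ordinary least squares fit of the transformed response on x.
slope, intercept = np.polyfit(x, z, 1)
print(f"slope {slope:.3f}, intercept {intercept:.3f}")
```
The paper's point about the cost of not knowing λ could be probed here by bootstrapping this fit with λ re-estimated on each resample versus held fixed at its true value.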

522 citations

Journal ArticleDOI
TL;DR: In this paper, it is shown that if parameters are allowed to be function valued, there is essentially only one function $\Delta(\cdot)$ such that $X + \Delta(X) =_\mathscr{L} Y$, and it can be defined by $\Delta(x) = G^{-1}(F(x)) - x$.
Abstract: Let $X$ and $Y$ be two random variables with continuous distribution functions $F$ and $G$ and means $\mu$ and $\xi$. In a linear model, the crucial property of the contrast $\Delta = \xi - \mu$ is that $X + \Delta =_\mathscr{L} Y$, where $= _\mathscr{L}$ denotes equality in law. When the linear model does not hold, there is no real number $\Delta$ such that $X + \Delta = _\mathscr{L} Y$. However, it is shown that if parameters are allowed to be function valued, there is essentially only one function $\Delta(\bullet)$ such that $X + \Delta(X) = _\mathscr{L} Y$, and this function can be defined by $\Delta(x) = G^{-1}(F(x)) - x$. The estimate $\hat{\Delta}_N(x) = G_n^{-1}(F_m(x)) - x$ of $\Delta(x)$ is considered, where $G_n$ and $F_m$ are the empirical distribution functions. Confidence bands based on this estimate are given and the asymptotic distribution of $\hat{\Delta}_N(\bullet)$ is derived. For general models in analysis of variance, contrasts that can be expressed as sums of differences of means can be replaced by sums of functions of the above kind.
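A minimal sketch (not the paper's code) of the plug-in estimate $\hat{\Delta}_N(x) = G_n^{-1}(F_m(x)) - x$, using empirical distribution and quantile functions; the two simulated samples and the grid are invented for illustration.
```python
# Illustrative sketch: the shift-function estimate
# Delta_hat(x) = G_n^{-1}(F_m(x)) - x computed from two samples.
import numpy as np

rng = np.random.default_rng(1)
x_sample = rng.normal(0.0, 1.0, 150)   # sample from F
y_sample = rng.normal(0.5, 1.0, 200)   # sample from G (pure location shift)

def shift_function(x_sample, y_sample, grid):
    """Empirical shift function evaluated on a grid of x values."""
    m = len(x_sample)
    # Empirical CDF F_m evaluated on the grid.
    F = np.searchsorted(np.sort(x_sample), grid, side="right") / m
    # Empirical quantile function G_n^{-1} applied to F_m(x).
    Ginv = np.quantile(y_sample, np.clip(F, 0.0, 1.0))
    return Ginv - grid

grid = np.linspace(-2, 2, 9)
print(np.round(shift_function(x_sample, y_sample, grid), 2))
# For a pure location shift, Delta(x) is roughly constant (about 0.5 here).
```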

343 citations

Journal ArticleDOI
TL;DR: In this article, the authors consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$.
Abstract: Variable-stress accelerated life testing trials are experiments in which each of the units in a random sample of units of a product is run under increasingly severe conditions to get information quickly on its life distribution. We consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$ …

342 citations

01 Jan 1992
TL;DR: In this paper, the authors consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$. Failure occurs the first time W(y) crosses a critical boundary w.
Abstract: Variable-stress accelerated life testing trials are experiments in which each of the units in a random sample of units of a product is run under increasingly severe conditions to get information quickly on its life distribution. We consider a fatigue failure model in which accumulated decay is governed by a continuous Gaussian process W(y) whose distribution changes at certain stress change points $t_0 < t_1 < \cdots < t_k$. Continuously increasing stress is also considered. Failure occurs the first time W(y) crosses a critical boundary w. The distribution of time to failure for the models can be represented in terms of time-transformed inverse Gaussian distribution functions, and the parameters in models for experiments with censored data can be estimated using maximum likelihood methods. A common approach to the modeling of failure times for experimental units subject to increased stress at certain stress change points is to assume that the failure times follow a distribution that consists of segments of Weibull distributions with the same shape parameter. Our Wiener-process approach gives an alternative flexible class of time-transformed inverse Gaussian models in which time to failure is modeled in terms of accumulated decay reaching a critical level and in which parametric functions are used to express how higher stresses accelerate the rate of decay and the time to failure. Key parameters such as mean life under normal stress, quantiles of the normal stress distribution, and decay rate under normal and accelerated stress appear naturally in the model. A variety of possible parameterizations of the decay rate leads to flexible modeling. Model fit can be checked by percentage-percentage plots.
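As a sanity-check sketch of the single-stress case (not the authors' code), the snippet below simulates first-passage times of a Wiener process with drift over a boundary w and compares them with the corresponding inverse Gaussian distribution; the drift, volatility, and boundary values are invented for illustration.
```python
# Illustrative sketch: time to failure as the first passage of
# W(y) = nu*y + sigma*B(y) over a boundary w. The first-passage time is
# inverse Gaussian with mean w/nu and shape w^2/sigma^2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nu, sigma, w = 1.0, 0.5, 5.0            # drift, volatility, critical boundary
dt, n_steps, n_paths = 0.01, 2000, 2000

increments = nu * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
paths = np.cumsum(increments, axis=1)
crossed = paths >= w
hit = crossed.any(axis=1)
t_fail = crossed.argmax(axis=1)[hit] * dt   # first step at/above w

mean_theory = w / nu                         # inverse Gaussian mean
shape = w**2 / sigma**2                      # inverse Gaussian shape
# scipy parameterization: invgauss(mu=mean/shape, scale=shape).
ig = stats.invgauss(mu=mean_theory / shape, scale=shape)
print(f"simulated mean {t_fail.mean():.3f} vs theoretical {ig.mean():.3f}")
```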

298 citations


Cited by
Book ChapterDOI
TL;DR: The analysis of censored failure times is considered in this paper, where the hazard function is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time.
Abstract: The analysis of censored failure times is considered. It is assumed that on each individual are available values of one or more explanatory variables. The hazard function (age-specific failure rate) is taken to be a function of the explanatory variables and unknown regression coefficients multiplied by an arbitrary and unknown function of time. A conditional likelihood is obtained, leading to inferences about the unknown regression coefficients. Some generalizations are outlined.
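A minimal sketch of fitting such a proportional hazards model, assuming the third-party lifelines package and an invented toy dataset with one covariate and right censoring.
```python
# Illustrative sketch (assumes the third-party `lifelines` package):
# fit a Cox proportional hazards model h(t|x) = h0(t) * exp(b*x)
# to right-censored failure times.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(3)
n = 300
x = rng.standard_normal(n)
# Exponential failure times whose rate increases with x, censored at
# a fixed follow-up time.
t_event = rng.exponential(1.0 / np.exp(0.7 * x))
follow_up = 2.0
df = pd.DataFrame({
    "x": x,
    "duration": np.minimum(t_event, follow_up),
    "event": (t_event <= follow_up).astype(int),
})

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")
cph.print_summary()  # the estimated coefficient should be near 0.7
```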

28,264 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
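A minimal sketch of the latent Dirichlet allocation idea, using scikit-learn's variational implementation on a tiny invented corpus; the documents and topic count are made up for illustration.
```python
# Illustrative sketch (not the paper's code): latent Dirichlet allocation
# on a toy corpus, using scikit-learn's variational implementation.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets traded lower",
    "investors sold stocks and bonds",
]

# Bag-of-words counts; each document becomes a row of word counts.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

# Two topics; the Dirichlet priors on doc-topic and topic-word proportions
# are set via doc_topic_prior / topic_word_prior (defaults used here).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)   # per-document topic proportions
print(doc_topics.round(2))

# Top words per topic.
vocab = vectorizer.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = comp.argsort()[-4:][::-1]
    print(f"topic {k}:", [vocab[i] for i in top])
```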

25,546 citations

Journal ArticleDOI
TL;DR: This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.
Abstract: Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree. Classification trees are designed for dependent variables that take a finite number of unordered values, with prediction error measured in terms of misclassification cost. Regression trees are for dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between the observed and predicted values. This article gives an introduction to the subject by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1, 14-23. DOI: 10.1002/widm.8
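A minimal sketch of recursive-partitioning prediction with scikit-learn: a classification tree scored by misclassification rate and a regression tree scored by squared error, on small invented datasets.
```python
# Illustrative sketch (not the article's code): classification and
# regression trees via recursive partitioning, using scikit-learn.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.metrics import accuracy_score, mean_squared_error

rng = np.random.default_rng(4)

# Classification: the label depends on which side of a hyperplane x lies.
X = rng.uniform(-1, 1, size=(200, 2))
y_class = (X[:, 0] + 0.3 * X[:, 1] > 0).astype(int)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y_class)
print("misclassification rate:", 1 - accuracy_score(y_class, clf.predict(X)))

# Regression: piecewise-constant fit to a smooth curve, squared-error loss.
x = np.sort(rng.uniform(0, 6, 200)).reshape(-1, 1)
y_reg = np.sin(x).ravel() + 0.1 * rng.standard_normal(200)
reg = DecisionTreeRegressor(max_depth=3).fit(x, y_reg)
print("training MSE:", mean_squared_error(y_reg, reg.predict(x)))
```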

16,974 citations

Journal ArticleDOI
TL;DR: Previous methods combine estimates of the cause-specific hazard functions under the proportional hazards formulation but do not let the analyst directly assess the effect of a covariate on the marginal probability function; this article addresses direct regression modeling of the cumulative incidence function for competing risks data.
Abstract: With explanatory covariates, the standard analysis for competing risks data involves modeling the cause-specific hazard functions via a proportional hazards assumption. Unfortunately, the cause-specific hazard function does not have a direct interpretation in terms of survival probabilities for the particular failure type. In recent years many clinicians have begun using the cumulative incidence function, the marginal failure probabilities for a particular cause, which is intuitively appealing and more easily explained to the nonstatistician. The cumulative incidence is especially relevant in cost-effectiveness analyses in which the survival probabilities are needed to determine treatment utility. Previously, authors have considered methods for combining estimates of the cause-specific hazard functions under the proportional hazards formulation. However, these methods do not allow the analyst to directly assess the effect of a covariate on the marginal probability function. In this article we propose …
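As a rough sketch of the cumulative incidence function itself (not the article's regression method), the code below computes the standard nonparametric estimator $\hat{F}_1(t) = \sum_{t_i \le t} \hat{S}(t_i^-)\, d_{1i}/n_i$ on invented competing-risks data, with event code 0 for censoring.
```python
# Illustrative sketch: nonparametric cumulative incidence for cause 1 with
# competing risks. Event codes: 0 = censored, 1 = cause of interest,
# 2 = competing cause.
import numpy as np

def cumulative_incidence(times, events, cause=1):
    order = np.argsort(times)
    times, events = times[order], events[order]
    at_risk = len(times)
    surv = 1.0      # overall (all-cause) Kaplan-Meier survival S(t-)
    cif = 0.0
    out_t, out_cif = [0.0], [0.0]
    for t, e in zip(times, events):
        if e == cause:
            cif += surv / at_risk          # S(t-) * d_1 / n at this time
        if e != 0:
            surv *= 1.0 - 1.0 / at_risk    # all-cause KM update
        at_risk -= 1
        out_t.append(t)
        out_cif.append(cif)
    return np.array(out_t), np.array(out_cif)

rng = np.random.default_rng(5)
t1 = rng.exponential(2.0, 300)   # latent time for cause 1
t2 = rng.exponential(3.0, 300)   # latent time for cause 2
c = rng.exponential(4.0, 300)    # censoring time
times = np.minimum.reduce([t1, t2, c])
events = np.select([times == t1, times == t2], [1, 2], default=0)
t_grid, cif = cumulative_incidence(times, events, cause=1)
print("CIF for cause 1 at last event time:", round(cif[-1], 3))
```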

11,109 citations

Journal ArticleDOI
TL;DR: This entry is P. Billingsley's monograph Convergence of Probability Measures (Wiley, 1968), a standard reference on weak convergence of probability measures.
Abstract: Convergence of Probability Measures. By P. Billingsley. Chichester, Sussex, Wiley, 1968. xii, 253 p. 9 1/4". 117s.

5,689 citations