Author

Peter J. Huber

Bio: Peter J. Huber is an academic researcher from the University of California. The author has contributed to research in topics: Robustness (computer science) & Minimax. The author has an h-index of 20 and has co-authored 29 publications receiving 17,338 citations.

Papers
Journal ArticleDOI
TL;DR: In this article, a new approach toward a theory of robust estimation is presented. It treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators.
Abstract: This paper contains a new approach toward a theory of robust estimation; it treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators—intermediaries between sample mean and sample median—that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators. For the general background, see Tukey (1960) (p. 448 ff.)
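
As a rough illustration of the estimators the abstract describes, the sketch below computes Huber's M-estimate of a location parameter by iteratively reweighted averaging; the tuning constant k, the MAD scale estimate, and all names are illustrative choices, not taken from the paper.

```python
import numpy as np

def huber_location(x, k=1.345, tol=1e-8, max_iter=100):
    """M-estimate of location with Huber's psi, computed by iteratively
    reweighted averaging. Small k approaches the median, large k the mean."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                                   # robust starting point
    scale = np.median(np.abs(x - mu)) / 0.6745 + 1e-12  # MAD scale estimate
    for _ in range(max_iter):
        r = np.abs(x - mu) / scale
        w = np.minimum(1.0, k / np.maximum(r, 1e-12))   # Huber weights
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

# Contaminated normal sample: mostly N(0,1), plus a few gross outliers.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0, 1, 95), rng.normal(10, 1, 5)])
print(huber_location(sample))   # lands between the sample median and mean
```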

5,628 citations

01 Jan 1967
TL;DR: In this paper, the author proves consistency and asymptotic normality of maximum likelihood estimators under weaker conditions than usual: it is not assumed that the true distribution underlying the observations belongs to the parametric family defining the estimator, and the regularity conditions do not involve the second and higher derivatives of the likelihood function.
Abstract: This paper proves consistency and asymptotic normality of maximum likelihood (ML) estimators under weaker conditions than usual. In particular, (i) it is not assumed that the true distribution underlying the observations belongs to the parametric family defining the ML estimator, and (ii) the regularity conditions do not involve the second and higher derivatives of the likelihood function. The need for theorems on asymptotic normality of ML estimators subject to (i) and (ii) becomes apparent in connection with robust estimation problems; for instance, if one tries to extend the author's results on robust estimation of a location parameter [4] to multivariate and other more general estimation problems. Wald's classical consistency proof [6] satisfies (ii) and can easily be modified to show that the ML estimator is consistent also in case (i), that is, it converges to the $\theta_0$ characterized by the property $E(\log f(x, \theta) - \log f(x, \theta_0)) < 0$ for $\theta \neq \theta_0$, where the expectation is taken with respect to the true underlying distribution. Asymptotic normality is more troublesome. Daniels [1] proved asymptotic normality subject to (ii), but unfortunately he overlooked that a crucial step in his proof (the use of the central limit theorem in (4.4)) is incorrect without condition (2.2) of Linnik [5]; this condition seems to be too restrictive for many purposes. In section 4 we shall prove asymptotic normality, assuming that the ML estimator is consistent. For the sake of completeness, sections 2 and 3 contain, therefore, two different sets of sufficient conditions for consistency. Otherwise, these sections are independent of each other. Section 5 presents two examples.
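
The asymptotic normality result for misspecified models underlies the now-standard "sandwich" covariance estimate. Below is a minimal numpy sketch, assuming a normal working model fitted to non-normal data; the score and Hessian formulas are the standard ones for the normal family, and the whole setup is illustrative rather than the paper's.

```python
import numpy as np

# "Sandwich" covariance A^{-1} B A^{-1} / n for an ML estimator when the
# model may be misspecified: A is the expected negative Hessian of the
# log-likelihood, B the covariance of the per-observation score.

rng = np.random.default_rng(1)
x = rng.exponential(2.0, size=1000)     # true law is not normal

# ML fit of a working normal(mu, s2) model (closed form):
mu, s2 = x.mean(), x.var()
n = len(x)

# Per-observation scores of the normal log-likelihood at (mu, s2):
score = np.column_stack([
    (x - mu) / s2,
    (x - mu) ** 2 / (2 * s2 ** 2) - 1 / (2 * s2),
])
B = score.T @ score / n                 # outer-product "meat"

# Expected negative Hessian ("bread") of the normal model at the fit:
A = np.array([[1 / s2, 0.0],
              [0.0, 1 / (2 * s2 ** 2)]])

Ainv = np.linalg.inv(A)
sandwich = Ainv @ B @ Ainv / n
print(np.sqrt(np.diag(sandwich)))       # robust standard errors for (mu, s2)
```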

5,339 citations

Journal ArticleDOI
TL;DR: In this paper, maximum likelihood type robust estimates of regression are studied in an asymptotic regime where the number of parameters is allowed to grow with the number of observations; the initial terms of a formal power series expansion (essentially in powers of $p/n$) agree well with Monte Carlo results, in most cases down to 4 observations per parameter.
Abstract: Maximum likelihood type robust estimates of regression are defined and their asymptotic properties are investigated both theoretically and empirically. Perhaps the most important new feature is that the number $p$ of parameters is allowed to increase with the number $n$ of observations. The initial terms of a formal power series expansion (essentially in powers of $p/n$) show an excellent agreement with Monte Carlo results, in most cases down to 4 observations per parameter.
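
A hedged sketch of a maximum likelihood type (M-) regression estimate of the kind studied here, computed by iteratively reweighted least squares with Huber weights; the tuning constant, scale estimate, and the 4-observations-per-parameter design are illustrative choices, not the paper's Monte Carlo setup.

```python
import numpy as np

def huber_regression(X, y, k=1.345, n_iter=50):
    """Maximum likelihood type (M-) estimate of regression, computed by
    iteratively reweighted least squares with Huber weights."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]        # least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # MAD scale of residuals
        u = np.maximum(np.abs(r) / scale, 1e-12)
        w = np.minimum(1.0, k / u)                     # Huber weights
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

# A design with n/p = 4 observations per parameter and heavy-tailed errors.
rng = np.random.default_rng(2)
p, n = 10, 40
X = rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.standard_t(df=2, size=n)
print(huber_regression(X, y))   # should sit near the true coefficients, all 1
```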

2,221 citations

01 Jan 2011
Numerical Analysis of Spectral Methods: Theory and Applications

1,425 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, a new estimate, the minimum AIC estimate (MAICE), is introduced for the purpose of statistical identification; it is free from the ambiguities inherent in the application of the conventional hypothesis testing procedure.
Abstract: The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly, and it is pointed out that the hypothesis testing procedure is not adequately defined as the procedure for statistical model identification. The classical maximum likelihood estimation procedure is reviewed, and a new estimate, the minimum information theoretic criterion (AIC) estimate (MAICE), which is designed for the purpose of statistical identification, is introduced. When there are several competing models, the MAICE is defined by the model and the maximum likelihood estimates of the parameters which give the minimum of AIC, defined by AIC = (-2) log(maximum likelihood) + 2 (number of independently adjusted parameters within the model). MAICE provides a versatile procedure for statistical model identification which is free from the ambiguities inherent in the application of the conventional hypothesis testing procedure. The practical utility of MAICE in time series analysis is demonstrated with some numerical examples.
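
A minimal sketch of the MAICE procedure under assumed Gaussian autoregressive models: fit several candidate orders, evaluate AIC for each, and keep the minimizer. The AR setting and all parameter choices are illustrative, not from the paper.

```python
import numpy as np

def ar_aic(x, k):
    """AIC of a Gaussian AR(k) model fitted by least squares."""
    n = len(x) - k
    X = np.column_stack([x[k - j - 1 : k - j - 1 + n] for j in range(k)])
    y = x[k:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    s2 = np.mean((y - X @ coef) ** 2)                  # ML variance estimate
    loglik = -0.5 * n * (np.log(2 * np.pi * s2) + 1)   # maximized log-likelihood
    return -2 * loglik + 2 * (k + 1)                   # k coefficients + variance

# Simulate an AR(2) series, then let MAICE pick the order.
rng = np.random.default_rng(3)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()

orders = list(range(1, 8))
aics = [ar_aic(x, k) for k in orders]
print("MAICE order:", orders[int(np.argmin(aics))])    # expect 2
```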

47,133 citations

Book
01 Jan 2001
TL;DR: Jeffrey Wooldridge's widely used graduate text Econometric Analysis of Cross Section and Panel Data (MIT Press, 2001) provides a unified treatment of cross section and panel data methods in contemporary econometric research.
Abstract: The second edition of this acclaimed graduate text provides a unified treatment of two methods used in contemporary econometric research, cross section and panel data methods. By focusing on assumptions that can be given behavioral content, the book maintains an appropriate level of rigor while emphasizing intuitive thinking. The analysis covers both linear and nonlinear models, including models with dynamics and/or individual heterogeneity. In addition to general estimation frameworks (particularly method of moments and maximum likelihood), specific linear and nonlinear methods are covered in detail, including probit and logit models and their multivariate extensions, Tobit models, models for count data, censored and missing data schemes, causal (or treatment) effects, and duration analysis. Econometric Analysis of Cross Section and Panel Data was the first graduate econometrics text to focus on microeconomic data structures, allowing assumptions to be separated into population and sampling assumptions. This second edition has been substantially updated and revised. Improvements include a broader class of models for missing data problems; more detailed treatment of cluster problems, an important topic for empirical researchers; expanded discussion of "generalized instrumental variables" (GIV) estimation; new coverage (based on the author's own recent research) of inverse probability weighting; a more complete framework for estimating treatment effects with panel data; and a firmly established link between econometric approaches to nonlinear panel data and the "generalized estimating equation" literature popular in statistics and other fields. New attention is given to explaining when particular econometric methods can be applied; the goal is not only to tell readers what does work, but why certain "obvious" procedures do not. The numerous included exercises, both theoretical and computer-based, allow the reader to extend methods covered in the text and discover new insights.
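
As one concrete instance of the treatment-effect methods the blurb mentions, here is a toy sketch of inverse probability weighting with a logistic propensity model fitted by Newton-Raphson; the data-generating process and every name in it are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)                       # a single confounder
p_true = 1 / (1 + np.exp(-(0.5 + x)))        # true propensity score
d = rng.uniform(size=n) < p_true             # treatment indicator
y = 1.0 * d + 2.0 * x + rng.normal(size=n)   # true treatment effect = 1.0

# Fit a logistic propensity model P(d=1 | x) by Newton-Raphson:
X = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ b))
    W = p * (1 - p)
    b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (d - p))

# IPW estimate of the average treatment effect:
p_hat = 1 / (1 + np.exp(-X @ b))
ate = np.mean(d * y / p_hat - (~d) * y / (1 - p_hat))
print(ate)                                   # close to the true effect 1.0
```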

28,298 citations

Journal ArticleDOI
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
Abstract: Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient descent "boosting" paradigm is developed for additive expansions based on any fitting criterion. Specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification. Special enhancements are derived for the particular case where the individual additive components are regression trees, and tools for interpreting such "TreeBoost" models are presented. Gradient boosting of regression trees produces competitive, highly robust, interpretable procedures for both regression and classification, especially appropriate for mining less than clean data. Connections between this approach and the boosting methods of Freund and Schapire and Friedman, Hastie and Tibshirani are discussed.
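
A minimal sketch of the gradient-boosting idea described above, using squared-error loss (where the negative gradient is simply the residual) and scikit-learn regression trees as the additive components; the learning rate, depth, and tree count are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_trees=100, lr=0.1, depth=3):
    """Stagewise additive fit: each tree approximates the negative gradient
    of the loss at the current fit (for squared loss, the residual)."""
    f = np.full(len(y), y.mean())            # initial constant fit
    trees = []
    for _ in range(n_trees):
        residual = y - f                     # = -dL/df for squared-error loss
        tree = DecisionTreeRegressor(max_depth=depth).fit(X, residual)
        f += lr * tree.predict(X)            # small step in function space
        trees.append(tree)
    return y.mean(), trees

def predict(model, X, lr=0.1):
    f0, trees = model
    return f0 + lr * sum(t.predict(X) for t in trees)

rng = np.random.default_rng(5)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=400)
model = gradient_boost(X, y)
print(np.mean((predict(model, X) - y) ** 2))  # training MSE
```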

17,764 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for ℓ1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
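
A compact sketch of ADMM specialized to the lasso, one of the applications the review discusses: alternate a ridge-like x-update, a soft-thresholding z-update, and a scaled dual update. The penalty parameter rho, the fixed iteration count, and the problem sizes are illustrative choices.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iter=200):
    """ADMM for: minimize 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))  # factor once, reuse
    Atb = A.T @ b
    for _ in range(n_iter):
        # x-update: ridge-like linear solve via the cached Cholesky factor
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: soft thresholding at lam/rho
        v = x + u
        z = np.maximum(v - lam / rho, 0) - np.maximum(-v - lam / rho, 0)
        # scaled dual update
        u = u + x - z
    return z

rng = np.random.default_rng(6)
A = rng.normal(size=(100, 30))
x_true = np.zeros(30)
x_true[:3] = [2.0, -1.5, 1.0]                  # sparse ground truth
b = A @ x_true + 0.1 * rng.normal(size=100)
print(np.round(admm_lasso(A, b, lam=5.0), 2))  # recovers the sparse support
```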

17,433 citations