
Showing papers in "Journal of the American Statistical Association in 2002"


Journal ArticleDOI
TL;DR: This work reviews a general methodology for model-based clustering that provides a principled statistical approach to important practical questions that arise in cluster analysis, such as how many clusters are there, which clustering method should be used, and how should outliers be handled.
Abstract: Cluster analysis is the automated search for groups of related observations in a dataset. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as how many clusters are there, which clustering method should be used, and how should outliers be handled. We review a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology and discuss recent development...

4,123 citations
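
As a rough illustration of the workflow described in this abstract (choosing the number of clusters and the covariance structure of a Gaussian mixture by a model-selection criterion such as BIC), here is a minimal sketch using scikit-learn's GaussianMixture. It is an analogy to the general approach, not a reproduction of the authors' software or examples; the dataset and candidate settings are toy choices.

```python
# Minimal sketch: select the number of clusters and covariance structure
# by BIC under a Gaussian mixture model (illustrative only).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.2, random_state=0)

best = None
for cov_type in ["spherical", "diag", "full"]:        # candidate covariance structures
    for k in range(1, 9):                             # candidate numbers of clusters
        gm = GaussianMixture(n_components=k, covariance_type=cov_type,
                             random_state=0).fit(X)
        bic = gm.bic(X)                               # lower BIC is better
        if best is None or bic < best[0]:
            best = (bic, k, cov_type, gm)

bic, k, cov_type, model = best
print(f"chosen model: {k} clusters, '{cov_type}' covariances, BIC={bic:.1f}")
labels = model.predict(X)                             # hard cluster assignments
```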


Journal ArticleDOI
TL;DR: In this paper, the authors consider forecasting a single time series when there are many predictors (N) and time series observations (T), and they show that the difference between the feasible forecasts and the infeasible forecasts constructed using the actual values of the factors converges in probability to 0 as both N and T grow large.
Abstract: This article considers forecasting a single time series when there are many predictors (N) and time series observations (T). When the data follow an approximate factor model, the predictors can be summarized by a small number of indexes, which we estimate using principal components. Feasible forecasts are shown to be asymptotically efficient in the sense that the difference between the feasible forecasts and the infeasible forecasts constructed using the actual values of the factors converges in probability to 0 as both N and T grow large. The estimated factors are shown to be consistent, even in the presence of time variation in the factor model.

2,866 citations
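
A minimal sketch of the forecasting recipe summarized above: estimate a few factors from a large predictor panel by principal components, then regress the one-step-ahead target on the estimated factors. The simulated panel, the number of factors, and the lag structure below are illustrative assumptions, not choices taken from the article.

```python
# Sketch: principal-components ("diffusion index") forecast with many predictors.
import numpy as np

rng = np.random.default_rng(0)
T, N, k = 200, 100, 3                      # time periods, predictors, factors used

# Simulate a panel driven by a small number of common factors (toy data).
F = rng.standard_normal((T, k))
Lam = rng.standard_normal((N, k))
X = F @ Lam.T + rng.standard_normal((T, N))
y = F @ np.array([1.0, -0.5, 0.25]) + 0.5 * rng.standard_normal(T)

# Step 1: estimate factors by principal components of the standardized panel.
Xs = (X - X.mean(0)) / X.std(0)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
F_hat = Xs @ Vt[:k].T                      # estimated factors (up to rotation)

# Step 2: regress y_{t+1} on the estimated factors at time t (plus an intercept).
Z = np.column_stack([np.ones(T - 1), F_hat[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)

# Step 3: feasible forecast of y_{T+1} from the latest estimated factors.
y_next = np.concatenate([[1.0], F_hat[-1]]) @ coef
print("forecast of y_{T+1}:", round(float(y_next), 3))
```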


Journal ArticleDOI
TL;DR: Different discrimination methods for the classification of tumors based on gene expression data include nearest-neighbor classifiers, linear discriminant analysis, and classification trees, which are applied to datasets from three recently published cancer gene expression studies.
Abstract: A reliable and precise classification of tumors is essential for successful diagnosis and treatment of cancer. cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies increasingly used in cancer research. By allowing the monitoring of expression levels in cells for thousands of genes simultaneously, microarray experiments may lead to a more complete understanding of the molecular variations among tumors and hence to a finer and more informative classification. The ability to successfully distinguish between tumor classes (already known or yet to be discovered) using gene expression data is an important aspect of this novel approach to cancer classification. This article compares the performance of different discrimination methods for the classification of tumors based on gene expression data. The methods include nearest-neighbor classifiers, linear discriminant analysis, and classification trees. Recent machine learning approaches, such as bagging and boosting, are also considere...

2,810 citations
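
In the same spirit as the comparison described above, the sketch below cross-validates a nearest-neighbor classifier, linear discriminant analysis, and a classification tree on a synthetic, expression-like dataset. The published study used real cancer microarray data and additional methods (bagging, boosting), which are not reproduced here.

```python
# Sketch: compare nearest-neighbor, LDA, and tree classifiers by cross-validation.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for gene expression data: many features, few informative ones.
X, y = make_classification(n_samples=120, n_features=500, n_informative=20,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

classifiers = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=3),
    "linear discriminant analysis": LinearDiscriminantAnalysis(),
    "classification tree": DecisionTreeClassifier(random_state=0),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)      # 5-fold CV accuracy
    print(f"{name:30s} mean accuracy = {scores.mean():.3f}")
```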


Journal ArticleDOI
TL;DR: In this article, Modelling Extremal Events for Insurance and Finance is discussed. But the authors focus on the modeling of extreme events for insurance and finance, and do not consider the effects of cyber-attacks.
Abstract: (2002). Modelling Extremal Events for Insurance and Finance. Journal of the American Statistical Association: Vol. 97, No. 457, pp. 360-360.

2,729 citations


Journal ArticleDOI
TL;DR: A short review of the book The Psychology of Survey Response.
Abstract: (2002). The Psychology of Survey Response. Journal of the American Statistical Association: Vol. 97, No. 457, pp. 358-359.

1,434 citations


Journal ArticleDOI
TL;DR: A short review of the book Modeling Survival Data: Extending the Cox Model.
Abstract: (2002). Modeling Survival Data: Extending the Cox Model. Journal of the American Statistical Association: Vol. 97, No. 457, pp. 353-354.

982 citations


Journal ArticleDOI
TL;DR: A short review of the book Wavelet Methods for Time Series Analysis.
Abstract: (2002). Wavelet Methods for Time Series Analysis. Journal of the American Statistical Association: Vol. 97, No. 457, pp. 362-363.

843 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose general classes of nonseparable, stationary covariance functions for spatiotemporal random processes, which are directly in the space-time domain and do not depend on closed-form Fourier inversions.
Abstract: Geostatistical approaches to spatiotemporal prediction in environmental science, climatology, meteorology, and related fields rely on appropriate covariance models. This article proposes general classes of nonseparable, stationary covariance functions for spatiotemporal random processes. The constructions are directly in the space–time domain and do not depend on closed-form Fourier inversions. The model parameters can be associated with the data's spatial and temporal structures, respectively; and a covariance model with a readily interpretable space–time interaction parameter is fitted to wind data from Ireland.

753 citations
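
To make the idea of a space-time interaction parameter concrete, here is a small sketch of a nonseparable covariance of the general type discussed above, in which a parameter beta controls how the temporal lag dampens the spatial decay and beta = 0 collapses to a separable (product) model. The exact parameterization below is written from memory of this literature and should be treated as illustrative, not as the article's fitted model for the Irish wind data.

```python
# Sketch: a nonseparable space-time covariance with an interaction parameter beta.
# beta = 0 gives a separable model (temporal factor times spatial factor);
# beta > 0 lets the temporal lag u slow down the spatial decay. Illustrative only.
import numpy as np

def spacetime_cov(h, u, sigma2=1.0, a=1.0, alpha=1.0, c=1.0, beta=0.5):
    """Covariance at spatial distance h and temporal lag u (placeholder parameters)."""
    temporal = a * np.abs(u) ** (2 * alpha) + 1.0
    return sigma2 / temporal * np.exp(-c * np.abs(h) / temporal ** (beta / 2.0))

h, u = 2.0, 1.5
for beta in (0.0, 0.5, 1.0):
    print(f"beta={beta:.1f}: C(h,u) = {spacetime_cov(h, u, beta=beta):.4f}")

# With beta = 0 the function factorizes as
#   C(h, u) = [sigma2 / (a|u|^(2*alpha) + 1)] * exp(-c|h|),
# i.e. a purely temporal factor times a purely spatial factor (the separable case).
```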


Journal ArticleDOI
TL;DR: In this paper, the authors give an overview of the statistical issues associated with combining spatial data collected at different scales, resolutions, and locations for modeling and inference, drawing on work from geography, ecology, agriculture, geology, and statistics.
Abstract: Global positioning systems (GPSs) and geographical information systems (GISs) have been widely used to collect and synthesize spatial data from a variety of sources. New advances in satellite imagery and remote sensing now permit scientists to access spatial data at several different resolutions. The Internet facilitates fast and easy data acquisition. In any one study, several different types of data may be collected at differing scales and resolutions, at different spatial locations, and in different dimensions. Many statistical issues are associated with combining such data for modeling and inference. This article gives an overview of these issues and the approaches for integrating such disparate data, drawing on work from geography, ecology, agriculture, geology, and statistics. Emphasis is on state-of-the-art statistical solutions to this complex and important problem.

630 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of assessing the distributional consequences of a treatment on some outcome variable of interest when treatment intake is (possibly) nonrandomized, but there is a binary instrument available to the researcher.
Abstract: This article considers the problem of assessing the distributional consequences of a treatment on some outcome variable of interest when treatment intake is (possibly) nonrandomized, but there is a binary instrument available to the researcher. Such a scenario is common in observational studies and in randomized experiments with imperfect compliance. One possible approach to this problem is to compare the counterfactual cumulative distribution functions of the outcome with and without the treatment. This article shows how to estimate these distributions using instrumental variable methods, and a simple bootstrap procedure is proposed to test distributional hypotheses, such as equality of distributions and first- and second-order stochastic dominance. These tests and estimators are applied to the study of the effects of veteran status on the distribution of civilian earnings. The results show a negative effect of military service during the Vietnam era that appears to be concentrated on the lower tail of ...

614 citations


Journal ArticleDOI
TL;DR: This article is concerned with objective estimation of the spatial intensity function of the background earthquake occurrences from an earthquake catalog that includes numerous clustered events in space and time, and also with an algorithm for producing declustered catalogs from the original catalog.
Abstract: This article is concerned with objective estimation of the spatial intensity function of the background earthquake occurrences from an earthquake catalog that includes numerous clustered events in space and time, and also with an algorithm for producing declustered catalogs from the original catalog. A space-time branching process model (the ETAS model) is used for describing how each event generates offspring events. It is shown that the background intensity function can be evaluated if the total spatial seismicity intensity and the branching structure can be estimated. In fact, the whole space-time process is split into two subprocesses, the background events and the clustered events. The proposed algorithm combines a parametric maximum likelihood estimate for the clustering structures using the space-time ETAS model and a nonparametric estimate of the background seismicity that we call a variable weighted kernel estimate. To demonstrate the present methods, we estimate the background seismic activities...
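
The key computational idea in this abstract, splitting the process into background and clustered subprocesses, can be sketched as follows: given ETAS-style parameters (here simply assumed known; the article estimates them by maximum likelihood), each event receives a "background probability" equal to the background intensity divided by the total conditional intensity at that event, and those probabilities are then used as weights in a kernel estimate of the background rate, iterating between the two steps. All functional forms and parameter values below are placeholders, not the fitted values from the article.

```python
# Sketch of the thinning / weighted-kernel idea behind stochastic declustering.
import numpy as np

rng = np.random.default_rng(1)
n = 300
t = np.sort(rng.uniform(0, 100, n))              # event times (toy catalog)
xy = rng.uniform(0, 10, (n, 2))                  # event locations
m = rng.exponential(1.0, n)                      # magnitudes above threshold (toy)

# Placeholder triggering kernel: productivity * Omori-type decay * Gaussian in space.
def triggering(dt, dx2, mag, K=0.1, c=0.01, p=1.2, d=0.5):
    return K * np.exp(mag) * (dt + c) ** (-p) * np.exp(-dx2 / (2 * d**2)) / (2 * np.pi * d**2)

mu = np.full(n, 0.05)                            # initial flat background intensity
for _ in range(10):                              # iterate: probabilities <-> background
    lam = mu.copy()
    for j in range(n):
        prev = t < t[j]
        if prev.any():
            dt = t[j] - t[prev]
            dx2 = np.sum((xy[j] - xy[prev]) ** 2, axis=1)
            lam[j] += triggering(dt, dx2, m[prev]).sum()
    phi = mu / lam                               # probability each event is background
    # Variable-weight kernel estimate of the background rate at the event locations.
    h = 1.0
    d2 = np.sum((xy[:, None, :] - xy[None, :, :]) ** 2, axis=-1)
    K2 = np.exp(-d2 / (2 * h**2)) / (2 * np.pi * h**2)
    mu = (K2 * phi[None, :]).sum(axis=1) / 100.0 # divide by the time-window length

print("estimated background probabilities (first 5):", np.round(phi[:5], 3))
```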

Journal ArticleDOI
TL;DR: A short review of the book Probabilistic Risk Analysis: Foundations and Methods.
Abstract: (2002). Probabilistic Risk Analysis: Foundations and Methods. Journal of the American Statistical Association: Vol. 97, No. 459, pp. 925-925.

Journal ArticleDOI
TL;DR: It is shown how recursive computing allows the statistically efficient use of MCMC output when estimating the hidden states, and the use of log-likelihood for assessing MCMC convergence is illustrated.
Abstract: Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods used in practice (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) can be improved by incorporating established recursive algorithms. The most important of these is a set of forward-backward recursions calculating conditional distributions of the hidden states given observed data and model parameters. I show how to use the recursive algorithms in an MCMC context and demonstrate mathematical and empirical results showing that a Gibbs sampler using the forward-backward recursions mixes more rapidly than another sampler often used for HMMs. I introduce an augmented variables technique for obtaining unique state labels in HMMs and finite mixture models. I show how recursive computing allows the statistically efficient use of MCMC output when estimating the hidden states. I...
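
A minimal sketch of the forward-backward machinery mentioned above, in the forward-filtering backward-sampling form typically used inside a Gibbs sampler: the forward pass computes filtered state probabilities, and the backward pass draws a full hidden-state sequence from its joint conditional distribution. Model dimensions and parameter values are toy choices, not taken from the article.

```python
# Sketch: forward-filtering backward-sampling for a discrete HMM (one Gibbs step
# for the hidden states given the observations and current parameter values).
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0.9, 0.1],           # transition matrix (toy values)
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],           # emission probabilities P(obs | state)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])           # initial state distribution
obs = rng.integers(0, 2, size=50)   # toy observed sequence

# Forward pass: filtered probabilities P(state_t | obs_1..t).
T, S = len(obs), A.shape[0]
alpha = np.zeros((T, S))
alpha[0] = pi * B[:, obs[0]]
alpha[0] /= alpha[0].sum()
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    alpha[t] /= alpha[t].sum()

# Backward pass: sample the whole state path from P(state_1..T | obs, parameters).
states = np.empty(T, dtype=int)
states[-1] = rng.choice(S, p=alpha[-1])
for t in range(T - 2, -1, -1):
    w = alpha[t] * A[:, states[t + 1]]
    states[t] = rng.choice(S, p=w / w.sum())

print("sampled hidden states (first 20):", states[:20])
```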


Journal ArticleDOI
TL;DR: In this article, the authors proposed penalized spline (P-spline) estimation of η0(·) in partially linear single-index models, where the mean function has the form η0(α0Tx) + β0Tz.
Abstract: Single-index models are potentially important tools for multivariate nonparametric regression. They generalize linear regression by replacing the linear combination α0Tx with a nonparametric component, η0(α0Tx), where η0(·) is an unknown univariate link function. By reducing the dimensionality from that of a general covariate vector x to a univariate index α0Tx, single-index models avoid the so-called “curse of dimensionality.” We propose penalized spline (P-spline) estimation of η0(·) in partially linear single-index models, where the mean function has the form η0(α0Tx) + β0Tz. The P-spline approach offers a number of advantages over other fitting methods for single-index models. All parameters in the P-spline single-index model can be estimated simultaneously by penalized nonlinear least squares. As a direct least squares fitting method, our approach is rapid and computationally stable. Standard nonlinear least squares software can be used. Moreover, joint inference for η0(·), α0, and β0 is possible by...
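
A rough sketch of the penalized-nonlinear-least-squares idea in this abstract: for a candidate index direction α, the spline coefficients and the linear part solve a ridge-type least squares problem, and α itself is found by minimizing the resulting profiled criterion. The basis, penalty weight, and optimizer below are illustrative assumptions, not the exact construction in the article.

```python
# Sketch: partially linear single-index model  y ≈ eta(alpha'x) + beta'z,
# with eta represented by a penalized (truncated power) spline basis.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 400
X = rng.standard_normal((n, 3))
Z = rng.standard_normal((n, 2))
alpha_true = np.array([2.0, 1.0, -1.0]) / np.sqrt(6.0)
y = np.sin(2 * X @ alpha_true) + Z @ np.array([0.5, -0.3]) + 0.2 * rng.standard_normal(n)

knots = np.linspace(-2, 2, 15)               # interior knots for the spline basis

def design(u):
    # Linear truncated power basis in the index u, plus the parametric part Z.
    spline = np.column_stack([np.ones_like(u), u] +
                             [np.clip(u - kn, 0, None) for kn in knots])
    return np.column_stack([spline, Z])

def profiled_rss(alpha, lam=1.0):
    alpha = alpha / np.linalg.norm(alpha)    # identify alpha up to scale
    C = design(X @ alpha)
    # Penalize only the truncated-power coefficients (ridge-type penalty).
    pen = np.zeros(C.shape[1]); pen[2:2 + len(knots)] = lam
    coef = np.linalg.solve(C.T @ C + np.diag(pen), C.T @ y)
    return np.sum((y - C @ coef) ** 2)

res = minimize(profiled_rss, x0=np.array([1.0, 0.0, 0.0]), method="Nelder-Mead")
alpha_hat = res.x / np.linalg.norm(res.x)
print("estimated index direction:", np.round(alpha_hat, 3))
```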

Journal ArticleDOI
TL;DR: A review of a monograph on asymptotic statistics whose chapters cover comparison of experiments and deficiencies, contiguity and Hellinger transforms, Gaussian shift and Poisson experiments, limit laws for likelihood ratios, local asymptotic normality, independent identically distributed observations, and Bayes procedures.
Abstract: 1 Introduction.- 2 Experiments, Deficiencies, Distances.- 2.1 Comparing Risk Functions.- 2.2 Deficiency and Distance between Experiments.- 2.3 Likelihood Ratios and Blackwell's Representation.- 2.4 Further Remarks on the Convergence of Distributions of Likelihood Ratios.- 2.5 Historical Remarks.- 3 Contiguity - Hellinger Transforms.- 3.1 Contiguity.- 3.2 Hellinger Distances, Hellinger Transforms.- 3.3 Historical Remarks.- 4 Gaussian Shift and Poisson Experiments.- 4.1 Introduction.- 4.2 Gaussian Experiments.- 4.3 Poisson Experiments.- 4.4 Historical Remarks.- 5 Limit Laws for Likelihood Ratios.- 5.1 Introduction.- 5.2 Auxiliary Results.- 5.2.1 Lindeberg's Procedure.- 5.2.2 Lévy Splittings.- 5.2.3 Paul Lévy's Symmetrization Inequalities.- 5.2.4 Conditions for Shift-Compactness.- 5.2.5 A Central Limit Theorem for Infinitesimal Arrays.- 5.2.6 The Special Case of Gaussian Limits.- 5.2.7 Peano Differentiable Functions.- 5.3 Limits for Binary Experiments.- 5.4 Gaussian Limits.- 5.5 Historical Remarks.- 6 Local Asymptotic Normality.- 6.1 Introduction.- 6.2 Locally Asymptotically Quadratic Families.- 6.3 A Method of Construction of Estimates.- 6.4 Some Local Bayes Properties.- 6.5 Invariance and Regularity.- 6.6 The LAMN and LAN Conditions.- 6.7 Additional Remarks on the LAN Conditions.- 6.8 Wald's Tests and Confidence Ellipsoids.- 6.9 Possible Extensions.- 6.10 Historical Remarks.- 7 Independent, Identically Distributed Observations.- 7.1 Introduction.- 7.2 The Standard i.i.d. Case: Differentiability in Quadratic Mean.- 7.3 Some Examples.- 7.4 Some Nonparametric Considerations.- 7.5 Bounds on the Risk of Estimates.- 7.6 Some Cases Where the Number of Observations Is Random.- 7.7 Historical Remarks.- 8 On Bayes Procedures.- 8.1 Introduction.- 8.2 Bayes Procedures Behave Nicely.- 8.3 The Bernstein-von Mises Phenomenon.- 8.4 A Bernstein-von Mises Result for the i.i.d. Case.- 8.5 Bayes Procedures Behave Miserably.- 8.6 Historical Remarks.- Author Index.

Journal ArticleDOI
TL;DR: In this article, the authors review the literature and present methodologies in terms of coverage probability for all of the aforementioned measurements when the target values are fixed and when the error structure is homogenous or heterogeneous.
Abstract: Measurements of agreement are needed to assess the acceptability of a new or generic process, methodology, and formulation in areas of laboratory performance, instrument or assay validation, method comparisons, statistical process control, goodness of fit, and individual bioequivalence. In all of these areas, one needs measurements that capture a large proportion of data that are within a meaningful boundary from target values. Target values can be considered random (measured with error) or fixed (known), depending on the situation. Various meaningful measures to cope with such diverse and complex situations have become available only in the last decade. These measures often assume that the target values are random. This article reviews the literature and presents methodologies in terms of “coverage probability.” In addition, analytical expressions are introduced for all of the aforementioned measurements when the target values are fixed and when the error structure is homogenous or heterogeneous (proport...

Journal ArticleDOI
TL;DR: Analysis of Time Series Structure: SSA and Related Techniques provides a careful, lucid description of its general theory and methodology, and offers an outstanding opportunity to obtain a working knowledge of why, when, and how SSA works.
Abstract: (2002). Analysis of Time Series Structure: SSA and Related Techniques. Journal of the American Statistical Association: Vol. 97, No. 460, pp. 1207-1208.

Journal ArticleDOI
TL;DR: The emphasis of this work is on how to define the concept of importance in an unambiguous way and how to assess it in the simultaneous occurrence of correlated input factors and non-additive models.
Abstract: This article deals with global quantitative sensitivity analysis of the Level E model, a computer code used in safety assessment for nuclear waste disposal. The Level E code has been the subject of two international benchmarks of risk assessment codes and Monte Carlo methods and is well known in the literature. We discuss the Level E model with reference to two different settings. In the first setting, the objective is to find the input factor that drives most of the output variance. In the second setting, we strive to achieve a preestablished reduction in the variance of the model output by fixing the smallest number of factors. The emphasis of this work is on how to define the concept of importance in an unambiguous way and how to assess it in the simultaneous occurrence of correlated input factors and non-additive models.

Journal ArticleDOI
TL;DR: In this paper, a new methodology is presented that extends hidden Markov models to the spatial domain and uses this class of models to analyze spatial heterogeneity of count data on a rare phenomenon.
Abstract: We present new methodology to extend hidden Markov models to the spatial domain, and use this class of models to analyze spatial heterogeneity of count data on a rare phenomenon. This situation occurs commonly in many domains of application, particularly in disease mapping. We assume that the counts follow a Poisson model at the lowest level of the hierarchy, and introduce a finite-mixture model for the Poisson rates at the next level. The novelty lies in the model for allocation to the mixture components, which follows a spatially correlated process, the Potts model, and in treating the number of components of the spatial mixture as unknown. Inference is performed in a Bayesian framework using reversible jump Markov chain Monte Carlo. The model introduced can be viewed as a Bayesian semiparametric approach to specifying flexible spatial distribution in hierarchical models. Performance of the model and comparison with an alternative well-known Markov random field specification for the Poisson rates are de...

Journal ArticleDOI
TL;DR: In this article, the authors analyzed a national data base of air pollution and mortality for the 88 largest U.S. cities for the period 1987-1994, to estimate relative rates of mortality associated with airborne particulate matter smaller than 10 microns (PM10) and the form of the relationship between PM10 concentration and mortality.
Abstract: We analyzed a national data base of air pollution and mortality for the 88 largest U.S. cities for the period 1987–1994, to estimate relative rates of mortality associated with airborne particulate matter smaller than 10 microns (PM10) and the form of the relationship between PM10 concentration and mortality. To estimate city-specific relative rates of mortality associated with PM10, we built log-linear models that included nonparametric adjustments for weather variables and longer term trends. To estimate PM10 mortality dose-response curves, we modeled the logarithm of the expected value of daily mortality as a function of PM10 using natural cubic splines with unknown numbers and locations of knots. We also developed spatial models to investigate the heterogeneity of relative mortality rates and of the shapes of PM10 mortality dose-response curves across cities and geographical regions. To determine whether variability in effect estimates can be explained by city-specific factors, we explored the depende...

Journal ArticleDOI
TL;DR: In this article, the Blinder-Oaxaca (B-O) method is used to decompose the mean intergroup difference in a given variable into the portion attributable to differences in the distribution of one or more explanatory variables and the part due to the difference in the conditional expectation function.
Abstract: Many applications involve a decomposition of the mean intergroup difference in a given variable into the portion attributable to differences in the distribution of one or more explanatory variables and that due to differences in the conditional expectation function. This article notes two interrelated reasons why the Blinder–Oaxaca (B–O) method—the approach most commonly used in the literature—may yield misleading results. We suggest a natural solution that both provides a more reliable answer to the original problem and affords a richer examination of the sources of intergroup differences in the variable of interest. The conventional application of the B–O method requires a parametric assumption about the form of the conditional expectation function. Furthermore, it often uses estimates based on that functional form to extrapolate outside the range of the observed explanatory variables. We show that misspecification of the conditional expectation function is likely to result in nontrivial errors in infer...
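
For reference, the conventional decomposition that this article critiques splits the mean gap into a part attributed to covariate differences, (x̄_A - x̄_B)'β̂_B, and a part attributed to differences in the conditional expectation function, x̄_A'(β̂_A - β̂_B). The sketch below implements this textbook version, under one common reference-group convention, on simulated data; the article's proposed alternative is not reproduced here.

```python
# Sketch: conventional Blinder-Oaxaca decomposition of a mean gap (the method
# the article critiques), using one of its standard reference-group conventions.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, beta, mean_x):
    X = np.column_stack([np.ones(n), rng.normal(mean_x, 1.0, n)])
    y = X @ beta + rng.standard_normal(n)
    return X, y

XA, yA = simulate(1000, beta=np.array([1.0, 2.0]), mean_x=1.0)   # group A
XB, yB = simulate(1000, beta=np.array([0.5, 1.5]), mean_x=0.5)   # group B

bA, *_ = np.linalg.lstsq(XA, yA, rcond=None)
bB, *_ = np.linalg.lstsq(XB, yB, rcond=None)
xbarA, xbarB = XA.mean(0), XB.mean(0)

gap = yA.mean() - yB.mean()
explained = (xbarA - xbarB) @ bB          # attributed to covariate distributions
unexplained = xbarA @ (bA - bB)           # attributed to differing coefficients
print(f"gap={gap:.3f}  explained={explained:.3f}  unexplained={unexplained:.3f}")
```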

Journal ArticleDOI
TL;DR: In this paper, the estimation of conditional quantiles of counts is studied and it is shown that it is possible to smooth the data in a way that allows inference to be performed using standard quantile regression techniques.
Abstract: This article studies the estimation of conditional quantiles of counts. Given the discreteness of the data, some smoothness must be artificially imposed on the problem. We show that it is possible to smooth the data in a way that allows inference to be performed using standard quantile regression techniques. The performance and implementation of the estimators are illustrated by simulations and an application.
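
One standard way to impose the needed smoothness on count data is to add uniform jitter to the counts and then apply ordinary quantile regression, averaging the estimates over jitter draws. The sketch below illustrates that general idea only; the construction in the article also involves further steps (such as a transformation of the jittered outcome) that are omitted here.

```python
# Sketch: conditional quantiles for counts via jittering + standard quantile regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
x = rng.standard_normal(n)
y = rng.poisson(np.exp(0.5 + 0.7 * x))           # count outcome (toy data)
X = sm.add_constant(x)

tau, n_jitter = 0.75, 50
estimates = []
for _ in range(n_jitter):
    z = y + rng.uniform(0.0, 1.0, n)             # smooth the discrete outcome
    res = sm.QuantReg(z, X).fit(q=tau)
    estimates.append(res.params)

# Average over jitter draws to remove the influence of the added noise.
print("averaged quantile-regression coefficients at tau=0.75:",
      np.round(np.mean(estimates, axis=0), 3))
```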

Journal ArticleDOI
TL;DR: This text is a revision of the book by Arnold, Castillo, and Sarabia (1992), of similar breadth but with much more depth than the original, and it offers a lively overview of conditionally specified models.
Abstract: of the conditional distribution specifications. Chapters 8 and 10 extend these methods from two to more dimensions. Chapter 9 investigates estimation in conditionally specified models. Chapter 11 considers models specified by conditioning on events specified by one variable exceeding a value rather than equaling a value, and Chapter 12 considers models for extreme-value data. Chapter 13 extends conditional specification to Bayesian analysis. Chapter 14 describes the related simultaneous-equation models, and Chapter 15 ties in some additional topics. An appendix describes methods of simulation from conditionally specified models. Chapters 1–4, plus Chapters 9 and 13, comprise a lively overview of conditionally specified models. The remainder of the text constitutes a detailed catalog of results specific to different conditional distributions. Although this catalog is certainly of value, the reader desiring a briefer and less detailed introduction to the subject might skip the remainder at first reading. This text is a revision of the book by Arnold, Castillo, and Sarabia (1992). The current version is of similar breadth, but with much more depth than the original. The text is clearly written and accessible with relatively few mathematical prerequisites. I found surprisingly few typographical errors; the authors are to be congratulated for this. In a few cases, regularity conditions for results are not given in full. Generally, this causes little confusion, although something does appear to be missing in the statement of Aczél's key theorem (Theorem 1.3). Fortunately, most of the results in the sequel are derived from corollaries to this theorem, and the corollaries are stated more precisely. I noted few gaps in the material covered. The only area that I thought was insufficiently represented was application to Markov chain Monte Carlo. Conditional specification is particularly important in Gibbs sampling. I believe that many practitioners would benefit from a discussion of the issues involved in these sampling schemes. Each chapter contains numerous exercises. These exercises appear to be at an appropriate level for a graduate course in statistics, and appear to provide appropriate reinforcement for the material in the preceding chapters.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a simple three-step estimator for censored quantile regression models with a separation restriction on the censoring probability, which is asymptotically as efficient as the celebrated Powell's censored least absolute deviation estimator.
Abstract: This article suggests very simple three-step estimators for censored quantile regression models with a separation restriction on the censoring probability. The estimators are theoretically attractive (i.e., asymptotically as efficient as the celebrated Powell's censored least absolute deviation estimator). At the same time, they are conceptually simple and have trivial computational expenses. They are especially useful in samples of small size or models with many regressors, with desirable finite-sample properties and small bias. The separation restriction costs a small reduction of generality relative to the canonical censored regression quantile model, yet its main plausible features remain intact. The estimator can also be used to estimate a large class of traditional models, including the normal Amemiya–Tobin model and many accelerated failure and proportional hazard models. We illustrate the approach with an extramarital affairs example and contrast our findings with those of Fair.

Journal ArticleDOI
TL;DR: In this paper, the error distribution in the standard linear model is modeled as a mixture of absolutely continuous Polya trees constrained to have median 0, so that the predictive error density has a derivative everywhere except 0.
Abstract: We model the error distribution in the standard linear model as a mixture of absolutely continuous Polya trees constrained to have median 0. By considering a mixture, we smooth out the partitioning effects of a simple Polya tree and the predictive error density has a derivative everywhere except 0. The error distribution is centered around a standard parametric family of distributions and thus may be viewed as a generalization of standard models in which important, data-driven features, such as skewness and multimodality, are allowed. By marginalizing the Polya tree, exact inference is possible up to Markov chain Monte Carlo error.

Journal ArticleDOI
TL;DR: In this paper, the unconditional nonparametric maximum likelihood estimator (NPMLE) of the unbiased survivor function in the presence of random left truncation of the survival times is evaluated.
Abstract: When survival data arise from prevalent cases ascertained through a cross-sectional study, it is well known that the survivor function corresponding to these data is length biased and different from the survivor function derived from incident cases. Length-biased data have been treated both unconditionally and conditionally in the literature. In the latter case, where length bias is viewed as being induced by random left truncation of the survival times, the truncating distribution is assumed to be unknown. Conditioning on the observed truncation times hence causes very little loss of information. In many instances, however, it can be supposed that the truncating distribution is uniform, and it has been pointed out that under these circumstances, an unconditional analysis will be more informative. There are no results in the current literature that give the asymptotic properties of the unconditional nonparametric maximum likelihood estimator (NPMLE) of the unbiased survivor function in the presence of cen...

Journal ArticleDOI
TL;DR: In this article, the authors present a method that extends the flexibility of adaptive designs to the number of interim analyses and to the choice of decision boundaries based on a recursive application of the two-stage combination tests for p values.
Abstract: We present a method that extends the flexibility of adaptive designs to the number of interim analyses and to the choice of decision boundaries. At each stage of the trial, the design of the next stage can be determined using all of the information gathered so far. Additionally, one can specify the next stage as the final stage or can plan a further interim analysis. The method is based on a recursive application of the two-stage combination tests for p values. The crucial point is the appropriate definition of a p value function that combines p values from two separate stages to a single p value. Formally, we start with a two-stage combination test. However, the p value of the second stage can be replaced by the p value of a further combination test. This applies if the experimenter decides in the first interim analysis to perform another interim analysis, thereby extending the design to at least three stages. Obviously, this can be also done in a recursive way in the following interim analyses. The test...
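
The recursive construction can be made concrete with an inverse-normal two-stage combination function: the overall p value combines the stage-1 p value with the stage-2 p value, and the "stage-2 p value" may itself be the result of another combination if a further interim analysis was added. The weights and combination function below are one common choice, shown here only to illustrate the recursion, not as the article's prescribed design.

```python
# Sketch: recursive inverse-normal combination of stage-wise p values.
import math
from scipy.stats import norm

def combine(p1, p2, w1=math.sqrt(0.5), w2=math.sqrt(0.5)):
    """Two-stage inverse-normal combination test (weights satisfy w1^2 + w2^2 = 1)."""
    z = w1 * norm.ppf(1.0 - p1) + w2 * norm.ppf(1.0 - p2)
    return 1.0 - norm.cdf(z)

# A three-stage trial via recursion: the "second-stage" p value of the outer
# combination is itself a combination of the stage-2 and stage-3 p values.
p_stage = [0.09, 0.20, 0.04]
p_inner = combine(p_stage[1], p_stage[2])
p_overall = combine(p_stage[0], p_inner)
print(f"inner combined p = {p_inner:.4f}, overall p = {p_overall:.4f}")
```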

Journal ArticleDOI
TL;DR: A data-driven method to identify parsimony in the covariance matrix of longitudinal data and to exploit any such parsimony to produce a statistically efficient estimator of the covariances matrix is proposed.
Abstract: This article proposes a data-driven method to identify parsimony in the covariance matrix of longitudinal data and to exploit any such parsimony to produce a statistically efficient estimator of the covariance matrix. The approach parameterizes the covariance matrix through the Cholesky decomposition of its inverse. For longitudinal data, this is a one-step-ahead predictive representation, and the Cholesky factor is likely to have off-diagonal elements that are zero or close to zero. A hierarchical Bayesian model is used to identify any such zeros in the Cholesky factor, similar to approaches that have been successful in Bayesian variable selection. The model is estimated using a Markov chain Monte Carlo sampling scheme that is computationally efficient and can be applied to covariance matrices of high dimension. It is demonstrated through simulations that the proposed method compares favorably in terms of statistical efficiency with a highly regarded competing approach. The estimator is applied to three...
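
The parameterization in this abstract has a simple regression interpretation for longitudinal data: writing the inverse covariance as T'D^{-1}T with T unit lower triangular, the below-diagonal entries of T are the negatives of coefficients from regressing each measurement on its predecessors, so zeros in T correspond to conditional independencies. The sketch below computes this modified Cholesky factor from a sample covariance matrix; it is a frequentist illustration of the representation, not the article's hierarchical Bayesian selection model.

```python
# Sketch: modified Cholesky factor of the inverse covariance of longitudinal data,
# obtained from sequential autoregressions (near-zero entries in T suggest parsimony).
import numpy as np

rng = np.random.default_rng(0)
T_len, n = 6, 500

# Toy longitudinal data: an AR(1)-type process observed at T_len occasions.
Y = np.zeros((n, T_len))
Y[:, 0] = rng.standard_normal(n)
for t in range(1, T_len):
    Y[:, t] = 0.7 * Y[:, t - 1] + rng.standard_normal(n)
Y -= Y.mean(axis=0)

S = Y.T @ Y / n                       # sample covariance matrix

Tmat = np.eye(T_len)                  # unit lower-triangular factor
D = np.zeros(T_len)
D[0] = S[0, 0]
for t in range(1, T_len):
    phi = np.linalg.solve(S[:t, :t], S[:t, t])   # regress y_t on y_1..y_{t-1}
    Tmat[t, :t] = -phi
    D[t] = S[t, t] - S[:t, t] @ phi              # innovation variance

# Check: Tmat S Tmat' is (approximately) diagonal, so S^{-1} = Tmat' D^{-1} Tmat.
print(np.round(Tmat, 2))
print("innovation variances:", np.round(D, 2))
```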

Journal ArticleDOI
TL;DR: The authors propose an adaptive model selection procedure that uses a data-adaptive complexity penalty based on a concept of generalized degrees of freedom; by combining the benefits of a class of nonadaptive procedures, it approximates the best performance of this class across a variety of situations.
Abstract: Most model selection procedures use a fixed penalty penalizing an increase in the size of a model. These nonadaptive selection procedures perform well only in one type of situation. For instance, Bayesian information criterion (BIC) with a large penalty performs well for “small” models and poorly for “large” models, and Akaike's information criterion (AIC) does just the opposite. This article proposes an adaptive model selection procedure that uses a data-adaptive complexity penalty based on a concept of generalized degrees of freedom. The proposed procedure, combining the benefit of a class of nonadaptive procedures, approximates the best performance of this class of procedures across a variety of different situations. This class includes many well-known procedures, such as AIC, BIC, Mallows's Cp, and risk inflation criterion (RIC). The proposed procedure is applied to wavelet thresholding in nonparametric regression and variable selection in least squares regression. Simulation results and an asymptotic...
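
Generalized degrees of freedom can be estimated for essentially any selection-plus-fitting procedure by Monte Carlo perturbation: add small Gaussian noise to the response, rerun the procedure, and measure how sensitively each fitted value responds to its own perturbation. The sketch below does this for a simple hard-thresholding rule; the perturbation device follows the general recipe in this literature and only illustrates the concept of generalized degrees of freedom, not the article's full adaptive selection procedure.

```python
# Sketch: Monte Carlo estimate of generalized degrees of freedom (GDF)
# for a selection-plus-estimation procedure (here: hard thresholding).
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200, 1.0
mu = np.concatenate([np.full(10, 3.0), np.zeros(n - 10)])   # sparse true means
y = mu + sigma * rng.standard_normal(n)

def fit(y, lam=2.0):
    """Hard thresholding: keep observations whose magnitude exceeds lam."""
    return np.where(np.abs(y) > lam, y, 0.0)

# Perturbation estimate of GDF: sum_i cov(fit_i, delta_i) / tau^2.
tau, n_mc = 0.5 * sigma, 200
deltas = tau * rng.standard_normal((n_mc, n))
fits = np.array([fit(y + d) for d in deltas])
gdf = np.sum(np.mean(fits * deltas, axis=0) -
             fits.mean(axis=0) * deltas.mean(axis=0)) / tau**2

k_selected = np.count_nonzero(fit(y))
print(f"selected terms: {k_selected},  estimated GDF: {gdf:.1f}")
# The GDF estimate typically exceeds the naive count of selected terms,
# reflecting the extra cost of data-driven selection.
```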