# Thomas C. M. Lee

Other affiliations: The Chinese University of Hong Kong, University of Chicago, Macquarie University ...

Bio: Thomas C. M. Lee is an academic researcher at the University of California, Davis. The author has contributed to research on topics including smoothing and frequentist inference. The author has an h-index of 27 and has co-authored 131 publications, which have received 2484 citations. Previous affiliations of Thomas C. M. Lee include The Chinese University of Hong Kong and the University of Chicago.

##### Papers

••

TL;DR: This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes, and the minimum description length principle is applied to compare various segmented AR fits to the data.

Abstract: This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes. The number and locations of the piecewise AR segments, as well as the orders of the respective AR processes, are assumed unknown. The minimum description length principle is applied to compare various segmented AR fits to the data. The goal is to find the “best” combination of the number of segments, the lengths of the segments, and the orders of the piecewise AR processes. Such a “best” combination is implicitly defined as the optimizer of an objective function, and a genetic algorithm is implemented to solve this difficult optimization problem. Numerical results from simulation experiments and real data analyses show that the procedure has excellent empirical properties. The segmentation of multivariate time series is also considered. Assuming that the true underlying model is a segmented autoregression, this procedure is shown to be consistent for estimating the location of...

418 citations

••

TL;DR: Generalized fiducial inference (GFI) generalizes Fisher's fiducial approach by transferring randomness from the data to the parameter space using an inverse of a data-generating equation, without the use of Bayes' theorem.

Abstract: R. A. Fisher, the father of modern statistics, proposed the idea of fiducial inference during the first half of the 20th century. While his proposal led to interesting methods for quantifying uncertainty, other prominent statisticians of the time did not accept Fisher’s approach as it became apparent that some of Fisher’s bold claims about the properties of fiducial distribution did not hold up for multi-parameter problems. Beginning around the year 2000, the authors and collaborators started to reinvestigate the idea of fiducial inference and discovered that Fisher’s approach, when properly generalized, would open doors to solve many important and difficult inference problems. They termed their generalization of Fisher’s idea as generalized fiducial inference (GFI). The main idea of GFI is to carefully transfer randomness from the data to the parameter space using an inverse of a data-generating equation without the use of Bayes’ theorem. The resulting generalized fiducial distribution (GFD) can ...

182 citations

••

TL;DR: This paper proposes an iterative estimation procedure for functional principal component analysis, aimed at functional or longitudinal data in which repeated measurements from the same subject are correlated. The data resulting from the iteration are shown theoretically to be asymptotically equivalent (in probability) to a set of independent data.

Abstract: We propose an iterative estimation procedure for performing functional principal component analysis. The procedure aims at functional or longitudinal data where the repeated measurements from the same subject are correlated. An increasingly popular smoothing approach, penalized spline regression, is used to represent the mean function. This allows straightforward incorporation of covariates and simple implementation of approximate inference procedures for coefficients. For the handling of the within-subject correlation, we develop an iterative procedure which reduces the dependence between the repeated measurements that are made for the same subject. The resulting data after iteration are theoretically shown to be asymptotically equivalent (in probability) to a set of independent data. This suggests that the general theory of penalized spline regression that has been developed for independent data can also be applied to functional data. The effectiveness of the proposed procedure is demonstrated via a simulation study and an application to yeast cell cycle gene expression data.

146 citations

••

TL;DR: This article considers the problem of detecting break points for a nonstationary time series that follows a parametric nonlinear time-series model in which the parameters may change values at fixed times.

Abstract: This article considers the problem of detecting break points for a nonstationary time series. Specifically, the time series is assumed to follow a parametric nonlinear time-series model in which the parameters may change values at fixed times. In this formulation, the number and locations of the break points are assumed unknown. The minimum description length (MDL) is used as a criterion for estimating the number of break points, the locations of break points and the parametric model in each segment. The best segmentation found by minimizing MDL is obtained using a genetic algorithm. The implementation of this approach is illustrated using generalized autoregressive conditionally heteroscedastic (GARCH) models, stochastic volatility models and generalized state-space models as the parametric model for the segments. Empirical results show good performance of the estimates of the number of breaks and their locations for these various models.

96 citations

••

TL;DR: A simulation study of several smoothing parameter selection methods, including two so-called risk estimation methods, finds that the popular method, generalized cross-validation, was outperformed by another method, an improved Akaike Information criterion, that shares the same assumptions and computational complexity.

Abstract: Smoothing splines are a popular method for performing nonparametric regression. Most important in the implementation of this method is the choice of the smoothing parameter. This article provides a simulation study of several smoothing parameter selection methods, including two so-called risk estimation methods. To the best of the author's knowledge, the empirical performances of these two risk estimation methods have never been reported in the literature. Empirical conclusions from and recommendations based on the simulation results will be provided. One noteworthy empirical observation is that the popular method, generalized cross-validation, was outperformed by another method, an improved Akaike Information criterion, that shares the same assumptions and computational complexity.

80 citations
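The comparison described in the smoothing-parameter study above can be illustrated with a minimal sketch. It is not the study's own setup: a discrete second-difference penalty smoother stands in for a smoothing spline, and the function names, test signal, noise level, and grid of smoothing parameters are all assumptions made for illustration. The two criteria are written in their standard forms for a linear smoother with hat matrix H, namely GCV = n·RSS / (n − tr H)² and the improved AIC = log(RSS/n) + 1 + 2(tr H + 1) / (n − tr H − 2).

```python
import numpy as np

def hat_matrix(n, lam):
    """Hat matrix of a second-difference penalty smoother (stand-in for a smoothing spline)."""
    D = np.diff(np.eye(n), n=2, axis=0)          # (n-2) x n second-difference operator
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, np.eye(n))

def gcv_score(y, H):
    """Generalized cross-validation score for a linear smoother y_hat = H y."""
    n = len(y)
    rss = np.sum((y - H @ y) ** 2)
    return n * rss / (n - np.trace(H)) ** 2

def aicc_score(y, H):
    """Improved (corrected) AIC score for a linear smoother y_hat = H y."""
    n = len(y)
    rss = np.sum((y - H @ y) ** 2)
    tr = np.trace(H)
    return np.log(rss / n) + 1 + 2 * (tr + 1) / (n - tr - 2)

def select_lambda(y, grid, score):
    """Return the grid value that minimizes the given selection criterion."""
    scores = [score(y, hat_matrix(len(y), lam)) for lam in grid]
    return grid[int(np.argmin(scores))]

# Hypothetical test signal: a noisy sine curve on 100 equally spaced points.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(100)

grid = [10.0 ** k for k in range(-1, 5)]         # assumed candidate smoothing parameters
lam_gcv = select_lambda(y, grid, gcv_score)
lam_aicc = select_lambda(y, grid, aicc_score)
```

Both criteria need only the residual sum of squares and the trace of the hat matrix, which is why they share the same computational cost.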

##### Cited by

••

6,278 citations

••

TL;DR: A book review listing for Multiple Imputation for Nonresponse in Surveys by D. B. Rubin (Wiley, Chichester, 1987).

Abstract: 25. Multiple Imputation for Nonresponse in Surveys. By D. B. Rubin. ISBN 0 471 08705 X. Wiley, Chichester, 1987. 258 pp. £30.25.

3,216 citations

••

TL;DR: This work considers the problem of detecting multiple changepoints in large data sets by minimizing a cost function over possible numbers and locations of changepoints. It introduces a new method for finding that minimum, and hence the optimal number and location of changepoints, whose computational cost is, under mild conditions, linear in the number of observations.

Abstract: In this article, we consider the problem of detecting multiple changepoints in large datasets. Our focus is on applications where the number of changepoints will increase as we collect more data: for example, in genetics as we analyze larger regions of the genome, or in finance as we observe time series over longer periods. We consider the common approach of detecting changepoints through minimizing a cost function over possible numbers and locations of changepoints. This includes several established procedures for detecting changing points, such as penalized likelihood and minimum description length. We introduce a new method for finding the minimum of such cost functions and hence the optimal number and location of changepoints that has a computational cost, which, under mild conditions, is linear in the number of observations. This compares favorably with existing methods for the same problem whose computational cost can be quadratic or even cubic. In simulation studies, we show that our new method can...

1,647 citations
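The idea in the changepoint abstract above, minimizing a penalized cost over the number and locations of changepoints, can be sketched as a dynamic program with candidate pruning. This is a minimal illustration and not the authors' implementation: the change-in-mean squared-error cost, the fixed penalty `beta`, and the function name `pruned_changepoints` are assumptions. The pruning step relies on the fact that, for this cost, splitting a segment never increases the total cost.

```python
import numpy as np

def pruned_changepoints(y, beta):
    """Minimize (sum of segment costs + beta per changepoint) for a change in mean."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))        # prefix sums of y
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))   # prefix sums of y^2

    def cost(s, t):
        # Residual sum of squares of y[s:t] around its own mean.
        return s2[t] - s2[s] - (s1[t] - s1[s]) ** 2 / (t - s)

    F = np.full(n + 1, np.inf)       # F[t]: optimal penalized cost of y[:t]
    F[0] = -beta
    last = np.zeros(n + 1, dtype=int)  # last[t]: start of the final segment of y[:t]
    candidates = [0]
    for t in range(1, n + 1):
        vals = [F[s] + cost(s, t) + beta for s in candidates]
        best = int(np.argmin(vals))
        F[t] = vals[best]
        last[t] = candidates[best]
        # Drop candidates that can never again start the optimal final segment.
        candidates = [s for s, v in zip(candidates, vals) if v - beta <= F[t]]
        candidates.append(t)

    # Backtrack through the stored segment starts to recover the changepoints.
    cps = []
    t = n
    while last[t] > 0:
        cps.append(int(last[t]))
        t = last[t]
    return sorted(cps)

# Hypothetical noise-free example: the mean jumps from 0 to 5 after 50 observations.
step = [0.0] * 50 + [5.0] * 50
found = pruned_changepoints(step, beta=10.0)
```

A larger `beta` favors fewer changepoints; in practice the penalty would be chosen by a criterion such as penalized likelihood or minimum description length, as the abstract notes.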