Showing papers by "Trevor Hastie" published in 2002


Journal ArticleDOI
TL;DR: The method of “nearest shrunken centroids” identifies subsets of genes that best characterize each class; it was highly efficient in finding genes for classifying small round blue cell tumors and leukemias.
Abstract: We have devised an approach to cancer class prediction from gene expression profiling, based on an enhancement of the simple nearest prototype (centroid) classifier. We shrink the prototypes and hence obtain a classifier that is often more accurate than competing methods. Our method of "nearest shrunken centroids" identifies subsets of genes that best characterize each class. The technique is general and can be used in many other classification problems. To demonstrate its effectiveness, we show that the method was highly efficient in finding genes for classifying small round blue cell tumors and leukemias.

2,954 citations


Journal ArticleDOI
TL;DR: Introduces a series of papers prepared within the framework of an international workshop entitled "Advances in GLMs/GAMs modeling: from species distribution to environmental management", held in Riederalp, Switzerland, 6–11 August 2001.

2,006 citations


Journal ArticleDOI
TL;DR: In this paper, the authors evaluated the rate of progression of cartilage loss in the knee joint using magnetic resonance imaging (MRI) and evaluated potential risk factors for more rapid loss.
Abstract: Objective: To evaluate the rate of progression of cartilage loss in the knee joint using magnetic resonance imaging (MRI) and to evaluate potential risk factors for more rapid cartilage loss. Methods: We evaluated baseline and followup MRIs of the knees in 43 patients (minimum time interval of 1 year, mean 1.8 years, range 52–285 weeks). Cartilage loss was graded in the anterior, central, and posterior regions of the medial and lateral knee compartments. Knee joints were also evaluated for other pathology. Data were analyzed using analysis of variance models. Results: Patients who had sustained meniscal tears showed a higher average rate of progression of cartilage loss (22%) than that seen in those who had intact menisci (14.9%) (P ≤ 0.018). Anterior cruciate ligament (ACL) tears had a borderline significant influence (P ≤ 0.06) on the progression of cartilage pathology. Lesions located in the central region of the medial compartment were more likely to progress to more advanced cartilage pathology (progression rate 28%; P ≤ 0.003) than lesions in the anterior (19%; P ≤ 0.564) and posterior (17%; P ≤ 0.957) regions or lesions located in the lateral compartment (average progression rate 15%; P ≤ 0.707). Lesions located in the anterior region of the lateral compartment showed less progression of cartilage degradation (6%; P ≤ 0.001). No specific grade of lesion identified at baseline had a predilection for more rapid cartilage loss (P ≤ 0.93). Conclusion: MRI can detect interval cartilage loss in patients over a short period (<2 years). The presence of meniscal and ACL tears was associated with more rapid cartilage loss. Cartilage lesions located in the central region of the medial compartment showed more rapid progression of cartilage loss than cartilage lesions in the anterior and posterior portions of the medial compartment. The findings in this study suggest that patients entering clinical trials investigating antiarthritis regimens may need to be randomized based on location of the lesion.

226 citations


Journal ArticleDOI
TL;DR: T7-based linear amplification reproducibly generates amplified RNA that closely approximates the original sample for gene expression profiling using cDNA microarrays, a result that is not affected by decreasing the amount of input total RNA in the 0.3–3 microgram range.
Abstract: Background: T7-based linear amplification of RNA is used to obtain sufficient antisense RNA for microarray expression profiling. We optimized and systematically evaluated the fidelity and reproducibility of different amplification protocols using total RNA obtained from primary human breast carcinomas and high-density cDNA microarrays.

142 citations


Journal ArticleDOI
TL;DR: Evidence is provided that the function of the hypothalamic-pituitary-adrenal axis may have an independent association with behavioral problems in children with fragile X syndrome.

120 citations


Proceedings Article
01 Jan 2002
TL;DR: This work presents a simple direct approach for solving the ICA problem using density estimation and maximum likelihood: given a candidate orthogonal frame, each coordinate is modeled with a semi-parametric density estimate based on cubic splines.
Abstract: We present a simple direct approach for solving the ICA problem, using density estimation and maximum likelihood. Given a candidate orthogonal frame, we model each of the coordinates using a semi-parametric density estimate based on cubic splines. Since our estimates have two continuous derivatives, we can easily run a second order search for the frame parameters. Our method performs very favorably when compared to state-of-the-art techniques.
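
As a rough sketch of the objective the abstract describes, the log-likelihood for a candidate orthogonal frame can be written as below. The notation is an assumption made for illustration and is not taken from the listing: $x_i$ are pre-whitened observations, $w_j$ are the columns of the orthogonal frame $W$, and $f_j$ is the density estimate for coordinate $j$.

```latex
\ell\bigl(W, \{f_j\}\bigr) \;=\; \sum_{i=1}^{N} \sum_{j=1}^{p} \log f_j\!\bigl(w_j^{\top} x_i\bigr),
\qquad W^{\top} W = I .
```

Because $W$ is orthogonal, $|\det W| = 1$ and no Jacobian term appears. If each $\log f_j$ has two continuous derivatives, as the abstract states for the spline-based estimates, the objective is twice differentiable in the frame parameters, which is what permits a second-order search.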

66 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of choosing the smoothing parameter is addressed in the framework of a mixed-effects model, whose assumptions ensure that the resulting estimator is unbiased, and a likelihood-ratio-type test statistic is proposed.
Abstract: When using smoothing splines to estimate a function, the user faces the problem of choosing the smoothing parameter. Several techniques are available for selecting this parameter according to certain optimality criteria. Here, we take a different point of view and we propose a technique for choosing between two alternatives, for example allowing for two different levels of degrees of freedom. The problem is addressed in the framework of a mixed‐effects model, whose assumptions ensure that the resulting estimator is unbiased. A likelihood‐ratio‐type test statistic is proposed, and its exact distribution is derived. Tests of linearity and overall effect follow directly. We then extend this idea to additive models where it provides a more attractive alternative than multi‐parameter optimisation, and where it gives exact distributional results that can be used in an analysis‐of‐deviance‐type approach. Examples on real data and a simulation study of level and power complete the paper.

63 citations


Book ChapterDOI
24 Jun 2002
TL;DR: A new approach for classification is proposed, called the import vector machine, which is built on kernel logistic regression (KLR), and it is shown on some examples that the IVM performs as well as the SVM in binary classification.
Abstract: The support vector machine is known for its excellent performance in binary classification, i.e., the response y ∈ {-1, 1}, but its appropriate extension to the multi-class case is still an on-going research issue. Another weakness of the SVM is that it only estimates sign[p(x) - 1/2], while the probability p(x) is often of interest itself, where p(x) = P(Y = 1|X = x) is the conditional probability of a point being in class 1 given X = x. We propose a new approach for classification, called the import vector machine, which is built on kernel logistic regression (KLR). We show on some examples that the IVM performs as well as the SVM in binary classification. The IVM can naturally be generalized to the multi-class case. Furthermore, the IVM provides an estimate of the underlying class probabilities. Similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM. This can give the IVM a computational advantage over the SVM, especially when the size of the training data set is large. We illustrate these techniques on some examples, and make connections with boosting, another popular machine-learning method for classification.
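
To make the contrast between estimating p(x) and estimating only sign[p(x) - 1/2] concrete, here is a minimal sketch of plain kernel logistic regression fitted by gradient descent. It is not the import vector machine itself: it uses every training point as a kernel basis function, whereas the IVM selects a subset. The RBF kernel, the toy data, and all function names and hyperparameters (`gamma`, `ridge`, `lr`, `steps`) are assumptions chosen for illustration.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_klr(X, y01, gamma=0.5, ridge=0.01, lr=0.5, steps=2000):
    """Fit f(x) = bias + sum_j alpha_j K(x, x_j) by gradient descent.

    Minimizes the mean logistic loss plus (ridge / 2) * alpha^T K alpha.
    Labels y01 must be coded 0/1. Hyperparameters are illustrative only.
    """
    n = len(X)
    K = [[rbf_kernel(xi, xj, gamma) for xj in X] for xi in X]
    alpha = [0.0] * n
    bias = 0.0
    for _ in range(steps):
        f = [bias + sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        resid = [sigmoid(f[i]) - y01[i] for i in range(n)]
        # Gradient w.r.t. alpha is K (resid / n + ridge * alpha), K symmetric.
        grad = [
            sum(K[i][j] * (resid[j] / n + ridge * alpha[j]) for j in range(n))
            for i in range(n)
        ]
        alpha = [a - lr * g for a, g in zip(alpha, grad)]
        bias -= lr * sum(resid) / n
    return alpha, bias


def predict_proba(x, X, alpha, bias, gamma=0.5):
    """Estimated p(x) = P(Y = 1 | X = x)."""
    f = bias + sum(a * rbf_kernel(x, xj, gamma) for a, xj in zip(alpha, X))
    return sigmoid(f)


def svm_style_label(p):
    """The quantity an SVM targets: sign[p(x) - 1/2], coded as -1 / +1."""
    return 1 if p > 0.5 else -1


# Toy one-dimensional data: class 0 on the left, class 1 on the right.
X_train = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
y_train = [0, 0, 0, 1, 1, 1]

alpha, bias = fit_klr(X_train, y_train)
p_pos = predict_proba([1.5], X_train, alpha, bias)
p_neg = predict_proba([-1.5], X_train, alpha, bias)
print(p_pos, svm_style_label(p_pos))
print(p_neg, svm_style_label(p_neg))
```

The probability estimate carries more information than the label: two points can both be labelled +1 while having quite different estimated values of p(x).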

14 citations


Book ChapterDOI
01 Jan 2002
TL;DR: A simple approach to class prediction for DNA microarrays, based on an enhancement of the nearest centroid classifier, which significantly improves prediction performance and identifies a subset of the genes most responsible for class separation.
Abstract: Gene expression arrays pose challenging problems for most traditional supervised learning techniques. We present a discussion of some of the issues involved. We then propose a simple approach to class prediction for DNA microarrays, based on an enhancement of the nearest centroid classifier. Our technique uses soft-thresholded class centroids as prototypes for each class. The shrinkage significantly improves prediction performance, and identifies a subset of the genes most responsible for class separation. The method performs as well as or better than competitors from the literature, and is easy to understand and interpret. We illustrate the technique on data from three studies: small round blue cell tumors, leukemia and breast cancer.
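
The idea of soft-thresholded class centroids can be sketched in a few lines. This is a simplified illustration, not the published procedure: it shrinks each class centroid toward the overall centroid by a fixed amount `delta` and omits any per-gene standardization. The function names, the toy data, and the value of `delta` are all assumptions.

```python
def soft_threshold(value, delta):
    """Move value toward zero by delta; return 0.0 if |value| <= delta."""
    if value > delta:
        return value - delta
    if value < -delta:
        return value + delta
    return 0.0


def fit_shrunken_centroids(X, y, delta):
    """Return the overall centroid and one shrunken centroid per class.

    Each class centroid is shrunk toward the overall centroid feature by
    feature (simplified: no standardization by within-class variation).
    """
    n_features = len(X[0])
    overall = [sum(row[j] for row in X) / len(X) for j in range(n_features)]
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        mean = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        centroids[label] = [
            overall[j] + soft_threshold(mean[j] - overall[j], delta)
            for j in range(n_features)
        ]
    return overall, centroids


def active_features(overall, centroids, tol=1e-12):
    """Features whose shrunken centroid still differs from the overall one."""
    return [
        j
        for j in range(len(overall))
        if any(abs(c[j] - overall[j]) > tol for c in centroids.values())
    ]


def predict(x, centroids):
    """Assign x to the class with the nearest shrunken centroid."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


# Toy data: only the first feature separates the two classes.
X = [[2.0, 0.1, 5.0], [2.2, -0.1, 5.1], [-2.0, 0.0, 4.9], [-2.2, 0.2, 5.0]]
y = ["A", "A", "B", "B"]

overall, centroids = fit_shrunken_centroids(X, y, delta=0.5)
selected = active_features(overall, centroids)
print(selected)
print(predict([1.0, 0.3, 4.0], centroids))
```

Features whose class centroids are shrunk all the way to the overall centroid no longer influence the nearest-centroid rule, which is how the shrinkage doubles as feature (gene) selection.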

3 citations