Author

George Casella

Bio: George Casella is an academic researcher from the University of Florida. The author has contributed to research in topics: Gibbs sampling & Prior probability. The author has an h-index of 59 and has co-authored 310 publications receiving 34,727 citations. Previous affiliations of George Casella include Charles III University of Madrid & Cornell University.


Papers
Book
01 Jan 1999
TL;DR: This new edition contains five completely new chapters covering new developments; the first edition (1999) sold 4300 copies worldwide.
Abstract: We have sold 4300 copies worldwide of the first edition (1999). This new edition contains five completely new chapters covering new developments.

6,884 citations

Book
24 Apr 1990

6,235 citations

Journal ArticleDOI
TL;DR: The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors.
Abstract: The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors. Gibbs sampling from this posterior is possible using an expanded hierarchy with conjugate normal priors for the regression parameters and independent exponential priors on their variances. A connection with the inverse-Gaussian distribution provides tractable full conditional distributions. The Bayesian Lasso provides interval estimates (Bayesian credible intervals) that can guide variable selection. Moreover, the structure of the hierarchical model provides both Bayesian and likelihood methods for selecting the Lasso parameter. Slight modifications lead to Bayesian versions of other Lasso-related estimation methods, including bridge regression and a robust variant.

2,897 citations

Journal ArticleDOI
TL;DR: A simple explanation of how and why the Gibbs sampler works is given, its properties are established analytically in a simple case, and insight is provided for more complicated cases.
Abstract: Computer-intensive algorithms, such as the Gibbs sampler, have become increasingly popular statistical tools, both in applied and theoretical work. The properties of such algorithms, however, may sometimes not be obvious. Here we give a simple explanation of how and why the Gibbs sampler works. We analytically establish its properties in a simple case and provide insight for more complicated cases. There are also a number of examples.

2,656 citations

Journal ArticleDOI
TL;DR: This work is the most comprehensive examination to date of bacterial diversity in soil and suggests that agricultural management of soil may significantly influence the diversity of bacteria and archaea.
Abstract: Estimates of the number of species of bacteria per gram of soil vary between 2000 and 8.3 million (Gans et al., 2005; Schloss and Handelsman, 2006). The highest estimate suggests that the number may be so large as to be impractical to test by amplification and sequencing of the highly conserved 16S rRNA gene from soil DNA (Gans et al., 2005). Here we present the use of high throughput DNA pyrosequencing and statistical inference to assess bacterial diversity in four soils across a large transect of the western hemisphere. The number of bacterial 16S rRNA sequences obtained from each site varied from 26,140 to 53,533. The most abundant bacterial groups in all four soils were the Bacteroidetes, Betaproteobacteria and Alphaproteobacteria. Using three estimators of diversity, the maximum number of unique sequences (operational taxonomic units roughly corresponding to the species level) never exceeded 52,000 in these soils at the lowest level of dissimilarity. Furthermore, the bacterial diversity of the forest soil was phylum rich compared to the agricultural soils, which are species rich but phylum poor. The forest site also showed far less diversity of the Archaea with only 0.009% of all sequences from that site being from this group as opposed to 4%-12% of the sequences from the three agricultural sites. This work is the most comprehensive examination to date of bacterial diversity in soil and suggests that agricultural management of soil may significantly influence the diversity of bacteria and archaea.

1,732 citations


Cited by
Journal ArticleDOI
Simon Haykin1
TL;DR: Following the discussion of interference temperature as a new metric for the quantification and management of interference, the paper addresses three fundamental cognitive tasks: radio-scene analysis, channel-state estimation and predictive modeling, and transmit-power control and dynamic spectrum management. It also discusses the emergent behavior of cognitive radio.
Abstract: Cognitive radio is viewed as a novel approach for improving the utilization of a precious natural resource: the radio electromagnetic spectrum. The cognitive radio, built on a software-defined radio, is defined as an intelligent wireless communication system that is aware of its environment and uses the methodology of understanding-by-building to learn from the environment and adapt to statistical variations in the input stimuli, with two primary objectives in mind: (i) highly reliable communication whenever and wherever needed; (ii) efficient utilization of the radio spectrum. Following the discussion of interference temperature as a new metric for the quantification and management of interference, the paper addresses three fundamental cognitive tasks: 1) radio-scene analysis; 2) channel-state estimation and predictive modeling; 3) transmit-power control and dynamic spectrum management. This work also discusses the emergent behavior of cognitive radio.

12,172 citations

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined and derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. Adding pD to the posterior mean deviance gives a deviance information criterion, which is related to other information criteria and has an approximate decision theoretic justification.
Abstract: We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. Using an information theoretic argument we derive a measure pD for the effective number of parameters in a model as the difference between the posterior mean of the deviance and the deviance at the posterior means of the parameters of interest. In general pD approximately corresponds to the trace of the product of Fisher's information and the posterior covariance, which in normal models is the trace of the ‘hat’ matrix projecting observations onto fitted values. Its properties in exponential families are explored. The posterior mean deviance is suggested as a Bayesian measure of fit or adequacy, and the contributions of individual observations to the fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. Adding pD to the posterior mean deviance gives a deviance information criterion for comparing models, which is related to other information criteria and has an approximate decision theoretic justification. The procedure is illustrated in some examples, and comparisons are drawn with alternative Bayesian and classical proposals. Throughout it is emphasized that the quantities required are trivial to compute in a Markov chain Monte Carlo analysis.

11,691 citations

Journal ArticleDOI
22 Apr 2013-PLOS ONE
TL;DR: The phyloseq project for R is a new open-source software package dedicated to the object-oriented representation and analysis of microbiome census data in R, which supports importing data from a variety of common formats, as well as many analysis techniques.
Abstract: Background: The analysis of microbial communities through DNA sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data. Results: Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research. Conclusions: The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

11,272 citations

Book
21 Mar 2002
TL;DR: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. It begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced.
Abstract: An essential textbook for any student or researcher in biology needing to design experiments, sample programs or analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

9,509 citations

19 Oct 2012
TL;DR: In this paper, the authors present the likelihood methods for the analysis of cointegration in VAR models with Gaussian errors, seasonal dummies, and constant terms, and show that the asymptotic distribution of the maximum likelihood estimator is mixed Gaussian.
Abstract: Presents the likelihood methods for the analysis of cointegration in VAR models with Gaussian errors, seasonal dummies, and constant terms. Discusses likelihood ratio tests of cointegration rank and finds the asymptotic distribution of the test statistics. Shows that the asymptotic distribution of the maximum likelihood estimator is mixed Gaussian.

9,355 citations