Anthony D. Long

Other affiliations: University of California, University of California, Berkeley, McMaster University ...

Bio: Anthony D. Long is an academic researcher at the University of California, Irvine. The author has contributed to research on the topics Population and Quantitative trait locus. The author has an h-index of 40 and has co-authored 101 publications receiving 9,438 citations. Previous affiliations of Anthony D. Long include the University of California and the University of California, Berkeley.

Topics: Population, Quantitative trait locus, Genome, Genetic variation, Allele ...

Papers published on a yearly basis (chart not reproduced; years shown: 1992–1996, 1998–2023)

Papers

Journal Article

A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes.

Pierre Baldi¹, Anthony D. Long¹•Institutions (1)

University of California, Irvine¹

01 Jun 2001-Bioinformatics

TL;DR: A Bayesian probabilistic framework for microarray data analysis is developed that derives point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes.

Abstract: Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data. Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
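
The variance regularization described in this abstract can be illustrated with a minimal sketch. It assumes a simplified weighted average of each gene's empirical variance and a background variance, with a hypothetical weight `nu0` controlling the strength of the background. The function names, the exact weighting, and the use of a fixed background value (rather than one estimated from neighboring genes) are assumptions for illustration; the paper's own estimator may differ.

```python
import math
from statistics import mean, variance


def regularized_variance(values, background_var, nu0):
    """Shrink the sample variance toward a background variance.

    Assumed simplified form (not taken from the paper):
        (nu0 * background_var + (n - 1) * s2) / (nu0 + n - 1)
    nu0 = 0 returns the plain sample variance; larger nu0 gives the
    background variance more weight.
    """
    n = len(values)
    s2 = variance(values)  # unbiased sample variance, needs n >= 2
    return (nu0 * background_var + (n - 1) * s2) / (nu0 + n - 1)


def regularized_t(control, treatment, background_var, nu0):
    """t-like statistic that uses regularized variances for both groups."""
    v1 = regularized_variance(control, background_var, nu0)
    v2 = regularized_variance(treatment, background_var, nu0)
    standard_error = math.sqrt(v1 / len(control) + v2 / len(treatment))
    return (mean(treatment) - mean(control)) / standard_error


# Hypothetical log-expression values for one gene with two replicates each.
control = [1.0, 3.0]
treatment = [5.0, 7.0]
t = regularized_t(control, treatment, background_var=2.0, nu0=1.0)
print(f"regularized t = {t:.3f}")
```

With only two replicates per condition, the sample variance alone is very noisy; the background term keeps the denominator from collapsing when the replicates happen to agree closely.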

1,763 citations

Journal Article

The Molecular Diversity of Adaptive Convergence

Olivier Tenaillon¹,²,³, Alejandra Rodríguez-Verdugo³, Rebecca L. Gaut³, Pamela McDonald³, Albert F. Bennett³, Anthony D. Long³, Brandon S. Gaut³•Institutions (3)

Paris Diderot University¹, French Institute of Health and Medical Research², University of California, Irvine³

27 Jan 2012-Science

TL;DR: The pervasive presence of epistasis among beneficial mutations was inferred, which shaped adaptive trajectories into at least two distinct pathways involving mutations either in the RNA polymerase complex or the termination factor rho.

Abstract: To estimate the number and diversity of beneficial mutations, we experimentally evolved 115 populations of Escherichia coli to 42.2°C for 2000 generations and sequenced one genome from each population. We identified 1331 total mutations, affecting more than 600 different sites. Few mutations were shared among replicates, but a strong pattern of convergence emerged at the level of genes, operons, and functional complexes. Our experiment uncovered a set of primary functional targets of high temperature, but we estimate that many other beneficial mutations could contribute to similar adaptive outcomes. We inferred the pervasive presence of epistasis among beneficial mutations, which shaped adaptive trajectories into at least two distinct pathways involving mutations either in the RNA polymerase complex or the termination factor rho.

735 citations

Journal Article

Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.)

Maud I. Tenaillon¹, Mark C. Sawkins, Anthony D. Long, Rebecca L. Gaut, John Doebley, Brandon S. Gaut•Institutions (1)

University of California, Irvine¹

31 Jul 2001-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Inbreds retained 77% of the level of diversity of landraces, on average, and intragenic linkage disequilibrium declined within 100–200 bp, suggesting that genome-wide surveys for association analyses require SNPs every 100–200 bp.

Abstract: We measured sequence diversity in 21 loci distributed along chromosome 1 of maize (Zea mays ssp. mays L.). For each locus, we sequenced a common sample of 25 individuals representing 16 exotic landraces and nine U.S. inbred lines. The data indicated that maize has an average of one single nucleotide polymorphism (SNP) every 104 bp between two randomly sampled sequences, a level of diversity higher than that of either humans or Drosophila melanogaster. A comparison of genetic diversity between the landrace and inbred samples showed that inbreds retained 77% of the level of diversity of landraces, on average. In addition, Tajima's D values suggest that the frequency distribution of polymorphisms in inbreds was skewed toward fewer rare variants. Tests for selection were applied to all loci, and deviations from neutrality were detected in three loci. Sequence diversity was heterogeneous among loci, but there was no pattern of diversity along the genetic map of chromosome 1. Nonetheless, diversity was correlated (r = 0.65) with sequence-based estimates of the recombination rate. Recombination in our sample was sufficient to break down linkage disequilibrium among SNPs. Intragenic linkage disequilibrium declines within 100-200 bp on average, suggesting that genome-wide surveys for association analyses require SNPs every 100-200 bp.
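
The figure of one SNP every 104 bp between two randomly sampled sequences corresponds to an average pairwise difference of roughly 1/104 ≈ 0.0096 per site. The following is a minimal sketch of how such a per-site pairwise diversity could be computed from aligned sequences. The function name and the toy alignment are hypothetical, and the sketch assumes gap-free sequences of equal length, which real data would not guarantee.

```python
from itertools import combinations


def pairwise_diversity(sequences):
    """Average proportion of differing sites over all pairs of sequences.

    Assumes the sequences are already aligned, equal in length, and free
    of gaps or missing data (a simplification for illustration).
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    # Count mismatching positions for every pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment of three hypothetical 8-bp sequences.
toy = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
pi = pairwise_diversity(toy)
print(f"pairwise diversity per site: {pi:.4f}")
print(f"about one difference every {1 / pi:.1f} bp")
```

The reciprocal of the per-site diversity gives the average spacing between differences for a random pair of sequences, which is the form in which the abstract reports it.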

692 citations

Journal Article

The Power of Association Studies to Detect the Contribution of Candidate Genetic Loci to Variation in Complex Traits

Anthony D. Long¹, Charles H. Langley•Institutions (1)

University of California, Irvine¹

01 Aug 1999-Genome Research

TL;DR: Estimates of 4Nc for a number of gene regions and human populations will be of use in determining the density of SNPs that are likely to be required for successful association studies.

Abstract: The statistical power of five association study test statistics (two haplotype-based tests, two marker-based tests, and the Transmission Disequilibrium Test-Q5) to detect single nucleotide polymorphism (SNP)/phenotype associations in a linkage-disequilibrium-based candidate gene scan employing a number of SNPs is examined. Power is estimated as a function of realistic parameters expected to affect the likelihood of detecting a significant association: the number of SNPs examined, the scaled recombination size of the region examined, the proportion of variance in the trait attributable to a hidden causative polymorphism within the region, and the number of individuals or families examined. For the different combinations of parameter values, power is estimated from a large number of realizations of a simulated coalescent describing a single random mating population with mutation, random genetic drift, and recombination. This explicit population genetics model results in a distribution of DNA marker heterozygosities and linkage disequilibria that are likely to resemble those expected in actual population samples. The study concludes that (1) marker-based permutation tests are more powerful than simple haplotype-based tests, (2) there is sufficient power to detect the presence of causative polymorphisms of small effect if on the order of 500 individuals are sampled, (3) greater power is achieved by increasing the sample size than by increasing the number of polymorphisms, (4) association studies are generally more powerful than transmission disequilibrium-based tests, and (5) for the range of parameters considered association studies have a low repeatability unless sample sizes are on the order of 500 individuals. Estimates of 4Nc for a number of gene regions and human populations will be of use in determining the density of SNPs that are likely to be required for successful association studies.

455 citations

Journal Article

Genome-wide analysis of a long-term evolution experiment with Drosophila

Molly K. Burke¹, Joseph P. Dunham², Parvin Shahrestani¹, Kevin R. Thornton¹, Michael R. Rose¹, Anthony D. Long¹•Institutions (2)

University of California, Irvine¹, University of Southern California²

30 Sep 2010-Nature

TL;DR: In this article, the authors present whole-genome resequencing data from Drosophila melanogaster populations that have experienced over 600 generations of laboratory selection for accelerated development.

Abstract: Experimental evolution systems allow the genomic study of adaptation, and so far this has been done primarily in asexual systems with small genomes, such as bacteria and yeast. Here we present whole-genome resequencing data from Drosophila melanogaster populations that have experienced over 600 generations of laboratory selection for accelerated development. Flies in these selected populations develop from egg to adult ∼20% faster than flies of ancestral control populations, and have evolved a number of other correlated phenotypes. On the basis of 688,520 intermediate-frequency, high-quality single nucleotide polymorphisms, we identify several dozen genomic regions that show strong allele frequency differentiation between a pooled sample of five replicate populations selected for accelerated development and pooled controls. On the basis of resequencing data from a single replicate population with accelerated development, as well as single nucleotide polymorphism data from individual flies from each replicate population, we infer little allele frequency differentiation between replicate populations within a selection treatment. Signatures of selection are qualitatively different than what has been observed in asexual species; in our sexual populations, adaptation is not associated with 'classic' sweeps whereby newly arising, unconditionally advantageous mutations become fixed. More parsimonious explanations include 'incomplete' sweep models, in which mutations have not had enough time to fix, and 'soft' sweep models, in which selection acts on pre-existing, common genetic variants. We conclude that, at least for life history characters such as development time, unconditionally advantageous alleles rarely arise, are associated with small net fitness gains or cannot fix because selection coefficients change over time.

450 citations

Cited by

Journal Article

Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments

Gordon K. Smyth¹•Institutions (1)

Walter and Eliza Hall Institute of Medical Research¹

12 Feb 2004-Statistical Applications in Genetics and Molecular Biology

TL;DR: The hierarchical model of Lonnstedt and Speed (2002) is developed into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples and the moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom.

Abstract: The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lonnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lonnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single channel and two color microarray experiments. Consistent, closed form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.

11,864 citations

Journal Article

Statistical method for testing the neutral mutation hypothesis by DNA polymorphism.

Fumio Tajima¹•Institutions (1)

Kyushu University¹

30 Oct 1989-Genetics

TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article

Human biochemical genetics

Grüneberg H

01 Jul 1960-The Eugenics Review

TL;DR: For the next few weeks the course is going to be exploring a field that’s actually older than classical population genetics, although the approach it’ll be taking to it involves the use of population genetic machinery.

Abstract: So far in this course we have dealt entirely with the evolution of characters that are controlled by simple Mendelian inheritance at a single locus. There are notes on the course website about gametic disequilibrium and how allele frequencies change at two loci simultaneously, but we didn’t discuss them. In every example we’ve considered we’ve imagined that we could understand something about evolution by examining the evolution of a single gene. That’s the domain of classical population genetics. For the next few weeks we’re going to be exploring a field that’s actually older than classical population genetics, although the approach we’ll be taking to it involves the use of population genetic machinery. If you know a little about the history of evolutionary biology, you may know that after the rediscovery of Mendel’s work in 1900 there was a heated debate between the “biometricians” (e.g., Galton and Pearson) and the “Mendelians” (e.g., de Vries, Correns, Bateson, and Morgan). Biometricians asserted that the really important variation in evolution didn’t follow Mendelian rules. Height, weight, skin color, and similar traits seemed to

9,847 citations

Journal Article

Survey of clustering algorithms

Rui Xu¹, Donald C. Wunsch¹•Institutions (1)

Missouri University of Science and Technology¹

01 May 2005-IEEE Transactions on Neural Networks

TL;DR: Clustering algorithms for data sets appearing in statistics, computer science, and machine learning are surveyed, and their applications to some benchmark data sets, the traveling salesman problem, and bioinformatics are illustrated.

Abstract: Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, proximity measure, and cluster validation, are also discussed.

5,744 citations

Journal Article

A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species

Robert J. Elshire¹, Jeffrey C. Glaubitz¹, Qi-ying Sun¹, Jesse Poland², Ken Kawamoto¹, Edward S. Buckler¹,², Sharon E. Mitchell¹•Institutions (2)

Cornell University¹, United States Department of Agriculture²

04 May 2011-PLOS ONE

TL;DR: A procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs) is reported, which is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches.

Abstract: Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two to three fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.

5,163 citations
