Author

Shawn Leonard

Other affiliations: Florida State University

Bio: Shawn Leonard is an academic researcher at Washington University in St. Louis. The author has contributed to research on topics including the genome and Salmonella enterica. The author has an h-index of 8 and has co-authored 10 publications receiving 8,173 citations. Previous affiliations of Shawn Leonard include Florida State University.

Topics: Genome, Salmonella enterica, Euchromatin, Heterochromatin, Gene ...

Papers

Journal Article•DOI•

The B73 Maize Genome: Complexity, Diversity, and Dynamics

Patrick S. Schnable¹, Doreen Ware², Robert S. Fulton³, Joshua C. Stein² +156 more•Institutions (18)

20 Nov 2009-Science

TL;DR: The maize genome sequence reveals it to be the most complex genome known to date. The paper also reports the correlation of methylation-poor regions with Mu transposon insertions and recombination, and how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state.

Abstract: We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.

3,761 citations

Journal Article•DOI•

The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes.

Helen Skaletsky¹, Tomoko Kuroda-Kawaguchi¹, Patrick Minx², Holland S. Cordum², LaDeana W. Hillier², Laura G. Brown¹, Sjoerd Repping, Tatyana Pyntikova¹, Johar Ali², Tamberlyn Bieri², Asif T. Chinwalla², Andrew Delehaunty², Kim D. Delehaunty², Hui Du², Ginger A. Fewell², Lucinda Fulton², Robert S. Fulton², Tina Graves², Shunfang Hou², Philip Latrielle², Shawn Leonard², Elaine R. Mardis², Rachel Maupin², John Douglas Mcpherson², Tracie L. Miner², William E. Nash², Christine Nguyen², Philip Ozersky², Kymberlie H. Pepin², Susan M. Rock², Tracy Rohlfing², Kelsi Scott², Brian Schultz², Cindy Strong², Aye Mon Tin-Wollam², Shiaw-Pyng Yang², Robert H. Waterston², Richard K. Wilson², Steve Rozen¹, David C. Page¹•Institutions (2)

Massachusetts Institute of Technology¹, Washington University in St. Louis²

19 Jun 2003-Nature

TL;DR: The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. It is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic.

Abstract: The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. Here, we report that the MSY is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic. These classes contain all 156 known transcription units, which include 78 protein-coding genes that collectively encode 27 distinct proteins. The X-transposed sequences exhibit 99% identity to the X chromosome. The X-degenerate sequences are remnants of ancient autosomes from which the modern X and Y chromosomes evolved. The ampliconic class includes large regions (about 30% of the MSY euchromatin) where sequence pairs show greater than 99.9% identity, which is maintained by frequent gene conversion (non-reciprocal transfer). The most prominent features here are eight massive palindromes, at least six of which contain testis genes.

2,022 citations

Journal Article•DOI•

Complete genome sequence of Salmonella enterica serovar Typhimurium LT2

Michael McClelland, Kenneth E. Sanderson¹, John Spieth², Sandra W. Clifton², Phil Latreille², Laura Courtney², Steffen Porwollik, Johar Ali², Mike Dante², Feiyu Du², Shunfang Hou², Dan Layman², Shawn Leonard², Christine Nguyen², Kelsi Scott², Andrea Holmes², Neenu Grewal², Elizabeth Mulvaney², Ellen E. Ryan², Hui Sun², Liliana Florea³,⁴, Webb Miller³, Tamberlyn Stoneking², Michael Nhan², Robert H. Waterston², Richard K. Wilson²•Institutions (4)

University of Calgary¹, Washington University in St. Louis², Pennsylvania State University³, Celera Corporation⁴

25 Oct 2001-Nature

TL;DR: The 4,857-kb chromosome and 94-kb virulence plasmid of S. typhimurium strain LT2 were sequenced, and the distribution of close homologues of S. typhimurium LT2 genes in eight related enterobacteria was determined using previously completed genomes of three related bacteria, sample sequencing of both S. enterica serovar Paratyphi A and Klebsiella pneumoniae, and microarray hybridization of three unsequenced genomes.

Abstract: Salmonella enterica subspecies I, serovar Typhimurium (S. typhimurium), is a leading cause of human gastroenteritis, and is used as a mouse model of human typhoid fever. The incidence of non-typhoid salmonellosis is increasing worldwide, causing millions of infections and many deaths in the human population each year. Here we sequenced the 4,857-kilobase (kb) chromosome and 94-kb virulence plasmid of S. typhimurium strain LT2. The distribution of close homologues of S. typhimurium LT2 genes in eight related enterobacteria was determined using previously completed genomes of three related bacteria, sample sequencing of both S. enterica serovar Paratyphi A (S. paratyphi A) and Klebsiella pneumoniae, and hybridization of three unsequenced genomes to a microarray of S. typhimurium LT2 genes. Lateral transfer of genes is frequent, with 11% of the S. typhimurium LT2 genes missing from S. enterica serovar Typhi (S. typhi), and 29% missing from Escherichia coli K12. The 352 gene homologues of S. typhimurium LT2 confined to subspecies I of S. enterica (which contains most mammalian and bird pathogens) are useful for studies of epidemiology, host specificity and pathogenesis. Most of these homologues were previously unknown, and 50 may be exported to the periplasm or outer membrane, rendering them accessible as therapeutic or vaccine targets.

1,850 citations

Journal Article•DOI•

Comparison of genome degradation in Paratyphi A and Typhi, human-restricted serovars of Salmonella enterica that cause typhoid.

Michael McClelland, Kenneth E. Sanderson¹, Sandra W. Clifton², Phil Latreille², Steffen Porwollik, Aniko Sabo², Rekha Meyer², Tamberlyn Bieri², Phil Ozersky², Michael D. McLellan², C Richard Harkins², Chunyan Wang², Christine Nguyen², Amy Berghoff², Glendoria Elliott², Sara Kohlberg², Cindy Strong², Feiyu Du², Jason Carter², Colin Kremizki², Dan Layman², Shawn Leonard², Hui Sun², Lucinda Fulton², William E. Nash², Tracie L. Miner², Patrick Minx², Kim D. Delehaunty², Catrina Fronick², Vincent Magrini², Michael Nhan², Wesley C. Warren², Liliana Florea³, John Spieth², Richard K. Wilson²•Institutions (3)

University of Calgary¹, Washington University in St. Louis², Applied Biosystems³

07 Nov 2004-Nature Genetics

TL;DR: The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin.

Abstract: Salmonella enterica serovars often have a broad host range, and some cause both gastrointestinal and systemic disease. But the serovars Paratyphi A and Typhi are restricted to humans and cause only systemic disease. It has been estimated that Typhi arose in the last few thousand years. The sequence and microarray analysis of the Paratyphi A genome indicates that it is similar to the Typhi genome but suggests that it has a more recent evolutionary origin. Both genomes have independently accumulated many pseudogenes among their approximately 4,400 protein coding sequences: 173 in Paratyphi A and approximately 210 in Typhi. The recent convergence of these two similar genomes on a similar phenotype is subtly reflected in their genotypes: only 30 genes are degraded in both serovars. Nevertheless, these 30 genes include three known to be important in gastroenteritis, which does not occur in these serovars, and four for Salmonella-translocated effectors, which are normally secreted into host cells to subvert host functions. Loss of function also occurs by mutation in different genes in the same pathway (e.g., in chemotaxis and in the production of fimbriae).

392 citations

Journal Article•DOI•

Evaluation of 16S rDNA-based community profiling for human microbiome research

Doyle V. Ward¹, Dirk Gevers¹, Georgia Giannoukos¹, Ashlee M. Earl¹, Barbara A. Methé², Erica Sodergren³, Michael Feldgarden¹, Dawn Ciulla¹, Diana Tabbaa¹, Cesar Arze⁴, Elizabeth L. Appelbaum³, Leigh Aird¹, Scott Anderson¹, Tulin Ayvaz⁵, Edward A. Belter³, Monika Bihan², Toby Bloom¹, Jonathan Crabtree⁴, Laura Courtney³, Lynn K. Carmichael³, David J. Dooling³, Rachel L. Erlich¹, Candace N. Farmer³, Lucinda Fulton³, Robert S. Fulton³, Hongyu Gao³, John Gill², Brian J. Haas¹, Lisa Hemphill⁵, Otis Hall³, Susanna Hamilton¹, Theresa A. Hepburn¹, Niall J. Lennon¹, Vandita Joshi⁵, Cristyn Kells¹, Christie Kovar⁵, Divya Kalra⁵, Kelvin Li², Lora Lewis⁵, Shawn Leonard³, Donna M. Muzny⁵, Elaine R. Mardis³, Kathie A. Mihindukulasuriya³, Vincent Magrini³, Michelle O'Laughlin³, Craig Pohl³, Xiang Qin⁵, Keenan Ross¹, Matthew C. Ross⁵, Yu Hui A. Rogers², Navjeet Singh⁶, Yue Shang⁵, Katarzyna Wilczek-Boney⁵, Jennifer R. Wortman⁴, Kim C. Worley⁵, Bonnie P. Youmans, Shibu Yooseph², Yanjiao Zhou³, Patrick D. Schloss⁷, Richard K. Wilson³, Richard A. Gibbs⁵, Karen E. Nelson², George M. Weinstock³, Todd Z. DeSantis⁶, Joseph F. Petrosino⁵, Sarah K. Highlander⁵, Bruce W. Birren¹•Institutions (7)

Broad Institute¹, J. Craig Venter Institute², Washington University in St. Louis³, University of Maryland, Baltimore⁴, Baylor College of Medicine⁵, Lawrence Berkeley National Laboratory⁶, University of Michigan⁷

13 Jun 2012-PLOS ONE

TL;DR: The data production protocols used for this work are those used by the participating centers to produce 16S rDNA sequence for the Human Microbiome Project, so these results can be informative for interpreting the large body of clinical 16S rDNA data produced for this project.

Abstract: The Human Microbiome Project will establish a reference data set for analysis of the microbiome of healthy adults by surveying multiple body sites from 300 people and generating data from over 12,000 samples. To characterize these samples, the participating sequencing centers evaluated and adopted 16S rDNA community profiling protocols for ABI 3730 and 454 FLX Titanium sequencing. In the course of establishing protocols, we examined the performance and error characteristics of each technology, and the relationship of sequence error to the utility of 16S rDNA regions for classification- and OTU-based analysis of community structure. The data production protocols used for this work are those used by the participating centers to produce 16S rDNA sequence for the Human Microbiome Project. Thus, these results can be informative for interpreting the large body of clinical 16S rDNA data produced for this project.

285 citations

Cited by

Changes in the switching rate of Plasmodium var genes lead to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al // Proc Natl Acad Sci U S A

Ning Beifang (宁北芳), Zhu Huaimin (朱淮民)

28 Jul 2005

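TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.

Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.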

18,940 citations

Journal Article•DOI•

GSVA: gene set variation analysis for microarray and RNA-seq data.

Sonja Hänzelmann, Robert Castelo¹, Justin Guinney²•Institutions (2)

Pompeu Fabra University¹, Sage Bionetworks²

16 Jan 2013-BMC Bioinformatics

TL;DR: This work introduces Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner and constitutes a starting point to build pathway-centric models of biology.

Abstract: Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state-of-the-art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need of GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.

6,125 citations

Journal Article•DOI•

A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species

Robert J. Elshire¹, Jeffrey C. Glaubitz¹, Qi-ying Sun¹, Jesse Poland², Ken Kawamoto¹, Edward S. Buckler¹,², Sharon E. Mitchell¹•Institutions (2)

Cornell University¹, United States Department of Agriculture²

04 May 2011-PLOS ONE

TL;DR: A procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs) is reported, which is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches.

Abstract: Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two to three fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.

5,163 citations

Journal Article•DOI•

voom: precision weights unlock linear model analysis tools for RNA-seq read counts

Charity W. Law¹,², Yunshun Chen¹,², Wei Shi¹,², Gordon K. Smyth¹,²•Institutions (2)

University of Melbourne¹, Walter and Eliza Hall Institute of Medical Research²

03 Feb 2014-Genome Biology

TL;DR: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments, and the voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline.

Abstract: New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.

4,475 citations

Journal Article•DOI•

The COG database: an updated version includes eukaryotes

Roman L. Tatusov¹, Natalie D. Fedorova¹, John D. Jackson¹, Aviva R. Jacobs¹, Boris Kiryutin¹, Eugene V. Koonin¹, Dmitri M. Krylov¹, Raja Mazumder², Sergei L. Mekhedov¹, Anastasia N. Nikolskaya², B Sridhar Rao¹, Sergei Smirnov¹, Alexander V. Sverdlov¹, Sona Vasudevan¹, Yuri I. Wolf¹, Jodie J. Yin¹, Darren A. Natale²•Institutions (2)

National Institutes of Health¹, Georgetown University Medical Center²

11 Sep 2003-BMC Bioinformatics

TL;DR: A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described. The updated collection is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and for genome-wide evolutionary studies.

Abstract: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

4,167 citations
