James E. Galagan

Other affiliations: Broad Institute, Massachusetts Institute of Technology, Stanford University

Bio: James E. Galagan is an academic researcher at Boston University. The author has contributed to research on the topics Genome and Gene, has an h-index of 57, and has co-authored 102 publications receiving 42,515 citations. Previous affiliations of James E. Galagan include the Broad Institute and the Massachusetts Institute of Technology.

Topics: Genome, Gene, Fungal genetics, Mycobacterium tuberculosis, Biosensor

Papers published on a yearly basis (chart not reproduced; years shown: 1985–1986 and 2001–2023)

Papers

Journal Article

Initial sequencing and analysis of the human genome.

Eric S. Lander¹, Lauren Linton¹, Bruce W. Birren¹, Chad Nusbaum¹ +245 more•Institutions (29)

15 Feb 2001-Nature

TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article

The genome sequence of the filamentous fungus Neurospora crassa

James E. Galagan¹, Sarah E. Calvo¹, Katherine A. Borkovich², Eric U. Selker³, Nick O. Read⁴, David B. Jaffe¹, William Fitzhugh⁵, Li-Jun Ma¹, Serge Smirnov¹, Seth Purcell¹, Bushra Rehman¹, Timothy Elkins¹, Reinhard Engels¹, Shunguang Wang¹, Cydney B. Nielsen¹, Jonathan Butler¹, Matthew G. Endrizzi¹, Dayong Qui¹, Peter Ianakiev¹, Deborah Bell-Pedersen⁶, Mary Anne Nelson⁷, Margaret Werner-Washburne⁷, Claude P. Selitrennikoff⁸, John A. Kinsey⁹, Edward L. Braun¹⁰, Alex Zelter⁴, Alex Zelter¹¹, Ulrich Schulte¹², Gregory O. Kothe³, Gregory Jedd¹³, Werner Mewes¹⁴, Chuck Staben¹⁵, Edward M. Marcotte¹⁶, David Greenberg¹⁷, Alice Roy¹, Karen Foley¹, Jerome Naylor¹, Nicole Stange-Thomann¹, Robert Barrett¹, Sante Gnerre¹, Michael Kamal¹, Manolis Kamvysselis¹, Evan Mauceli¹, Cord Bielke¹⁴, Stephen Rudd, Dmitrij Frishman, Svetlana Krystofova², Carolyn G. Rasmussen¹⁸, Robert L. Metzenberg¹⁹, David D. Perkins¹⁹, Scott Kroken¹⁸, Carlo Cogoni²⁰, Giuseppe Macino²⁰, David E. A. Catcheside²¹, Weixi Li¹⁵, Robert J. Pratt⁶, Stephen A. Osmani²², Colin P.C. DeSouza²², Louise Glass¹⁸, Marc J. Orbach²³, J. Andrew Berglund³, Rodger B. Voelker³, Oded Yarden¹¹, Michael Plamann²⁴, Stephan Seiler²⁴, Jay C. Dunlap²⁵, Alan Radford²⁶, Rodolfo Aramayo⁶, Donald O. Natvig⁷, Lisa A. Alex²⁷, Gertrud Mannhaupt¹⁴, Daniel J. Ebbole⁶, Michael Freitag³, Ian T. Paulsen¹⁷, Matthew S. Sachs²⁸, Eric S. Lander¹, Chad Nusbaum¹, Bruce W. Birren¹•Institutions (28)

Massachusetts Institute of Technology¹, University of California, Riverside², University of Oregon³, University of Edinburgh⁴, Celera Corporation⁵, Texas A&M University⁶, University of New Mexico⁷, University of Colorado Denver⁸, University of Kansas⁹, University of Florida¹⁰, Hebrew University of Jerusalem¹¹, University of Düsseldorf¹², Rockefeller University¹³, Technische Universität München¹⁴, University of Kentucky¹⁵, University of Texas at Austin¹⁶, J. Craig Venter Institute¹⁷, University of California, Berkeley¹⁸, University of California, Los Angeles¹⁹, Sapienza University of Rome²⁰, Flinders University²¹, Ohio State University²², University of Arizona²³, University of Missouri–Kansas City²⁴, Dartmouth College²⁵, University of Leeds²⁶, California State Polytechnic University, Pomona²⁷, Oregon Health & Science University²⁸

24 Apr 2003-Nature

TL;DR: A high-quality draft sequence of the N. crassa genome is reported. Genome analysis suggests that RIP has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes.

Abstract: Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes, more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis of the gene set yields insights into unexpected aspects of Neurospora biology including the identification of genes potentially associated with red light photobiology, genes implicated in secondary metabolism, and important differences in Ca²⁺ signalling as compared with plants and animals. Neurospora possesses the widest array of genome defence mechanisms known for any eukaryotic organism, including a process unique to fungi called repeat-induced point mutation (RIP). Genome analysis suggests that RIP has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes.

1,659 citations

Journal Article

The genome sequence of the rice blast fungus Magnaporthe grisea

Ralph A. Dean¹, Nicholas J. Talbot², Daniel J. Ebbole³, Mark L. Farman⁴, Thomas K. Mitchell¹, Marc J. Orbach⁵, Michael R. Thon³, Resham Kulkarni⁶, Resham Kulkarni¹, Jin-Rong Xu⁷, Huaqin Pan¹, Nick D. Read⁸, Yong-Hwan Lee⁹, Ignazio Carbone¹, Doug Brown¹, Yeonyee Oh¹, Nicole M. Donofrio¹, Jun Seop Jeong¹, Darren M. Soanes², Slavica Djonovic³, Elena A. Kolomiets³, Cathryn J. Rehmeyer⁴, Weixi Li⁴, Michael W. Harding⁵, Soonok Kim⁹, Marc-Henri Lebrun¹⁰, Heidi U. Böhnert¹⁰, Sean J. Coughlan¹¹, Jonathan Butler¹², Sarah E. Calvo¹², Li-Jun Ma¹², Robert Nicol¹², Seth Purcell¹², Chad Nusbaum¹², James E. Galagan¹², Bruce W. Birren¹²•Institutions (12)

North Carolina State University¹, University of Exeter², Texas A&M University³, University of Kentucky⁴, University of Arizona⁵, Research Triangle Park⁶, Purdue University⁷, University of Edinburgh⁸, Seoul National University⁹, Bayer¹⁰, Agilent Technologies¹¹, Broad Institute¹²

21 Apr 2005-Nature

TL;DR: The draft sequence of the M. grisea genome is reported. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease, and the genome has been subject to invasion and proliferation of active transposable elements, reflecting the clonal nature of this fungus imposed by widespread rice cultivation.

Abstract: Magnaporthe grisea is the most destructive pathogen of rice worldwide and the principal model organism for elucidating the molecular basis of fungal disease of plants. Here, we report the draft sequence of the M. grisea genome. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease. The genome encodes a large and diverse set of secreted proteins, including those defined by unusual carbohydrate-binding domains. This fungus also possesses an expanded family of G-protein-coupled receptors, several new virulence-associated genes and large suites of enzymes involved in secondary metabolism. Consistent with a role in fungal pathogenesis, the expression of several of these genes is upregulated during the early stages of infection-related development. The M. grisea genome has been subject to invasion and proliferation of active transposable elements, reflecting the clonal nature of this fungus imposed by widespread rice cultivation.

1,520 citations

Journal Article

Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

Li-Jun Ma¹, H. Charlotte van der Does², Katherine A. Borkovich³, Jeffrey J. Coleman⁴, Marie Josée Daboussi⁵, Antonio Di Pietro⁶, Marie Dufresne⁵, Michael Freitag⁷, Manfred Grabherr¹, Bernard Henrissat⁸, Petra M. Houterman², Seogchan Kang⁹, Won-Bo Shim¹⁰, Charles P. Woloshuk¹¹, Xiaohui Xie¹², Jin-Rong Xu¹¹, John F. Antoniw¹³, Scott E. Baker¹⁴, B. H. Bluhm¹¹, Andrew Breakspear¹⁵, Daren W. Brown¹⁶, Robert A. E. Butchko¹⁶, Sinéad B. Chapman¹, Richard M.R. Coulson, Pedro M. Coutinho⁸, Etienne Danchin¹⁷, Etienne Danchin¹⁸, Andrew C. Diener¹⁹, Liane R. Gale¹⁵, Donald M. Gardiner²⁰, Stephen A. Goff⁴, Kim E. Hammond-Kosack¹³, Karen Hilburn¹⁵, Aurélie Hua-Van⁵, Wilfried Jonkers², Kemal Kazan²⁰, Chinnappa D. Kodira¹, Michael Koehrsen¹, Lokesh Kumar¹, Yong-Hwan Lee²¹, Liande Li³, Liande Li²², John M. Manners²⁰, Diego Miranda-Saavedra²³, Mala Mukherjee¹⁰, Gyungsoon Park³, Jongsun Park²¹, Sook Young Park⁹, Sook Young Park²¹, Robert H. Proctor¹⁶, Aviv Regev¹, M. Carmen Ruiz-Roldán⁶, Divya Sain³, Sharadha Sakthikumar¹, Sean M. Sykes¹, David C. Schwartz²⁴, B. Gillian Turgeon²⁵, Ilan Wapinski¹, Olen C. Yoder, Sarah Young¹, Qiandong Zeng¹, Shiguo Zhou²⁴, James E. Galagan¹, Christina A. Cuomo¹, H. Corby Kistler¹⁵, Martijn Rep²•Institutions (25)

Broad Institute¹, University of Amsterdam², University of California, Riverside³, University of Arizona⁴, Université Paris-Saclay⁵, University of Córdoba (Spain)⁶, Oregon State University⁷, Aix-Marseille University⁸, Pennsylvania State University⁹, Texas A&M University¹⁰, Purdue University¹¹, University of California, Irvine¹², Rothamsted Research¹³, Pacific Northwest National Laboratory¹⁴, University of Minnesota¹⁵, United States Department of Agriculture¹⁶, Centre national de la recherche scientifique¹⁷, University of Texas Southwestern Medical Center¹⁸, University of California, Los Angeles¹⁹, Commonwealth Scientific and Industrial Research Organisation²⁰, Seoul National University²¹, University of Texas at Dallas²², University of Cambridge²³, University of Wisconsin-Madison²⁴, Cornell University²⁵

18 Mar 2010-Nature

TL;DR: Comparison of genomes of three phenotypically diverse Fusarium species revealed lineage-specific genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome, putting the evolution of fungal pathogenicity into a new perspective.

Abstract: Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective.

1,386 citations

Journal Article

Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus

William C. Nierman¹, William C. Nierman², Arnab Pain³, Michael J. Anderson⁴, Jennifer R. Wortman¹, Jennifer R. Wortman², H. Stanley Kim², H. Stanley Kim¹, Javier Arroyo⁵, Matthew Berriman³, Keietsu Abe⁶, David B. Archer⁷, Clara Bermejo⁵, Joan W. Bennett⁸, Paul Bowyer⁴, Dan Chen¹, Dan Chen², Matthew Collins³, Richard Coulsen, Robert L. Davies³, Paul S. Dyer⁷, Mark L. Farman⁹, Nadia Fedorova², Nadia Fedorova¹, Natalie D. Fedorova², Natalie D. Fedorova¹, T. Feldblyum¹, T. Feldblyum², Reinhard Fischer¹⁰, Nigel Fosker³, Audrey Fraser³, José Luis García¹¹, María Josefa Marcos García¹², Ariette Goble³, Gustavo H. Goldman¹³, Katsuya Gomi⁶, Sam Griffith-Jones³, R. Gwilliam³, Brian J. Haas¹, Brian J. Haas², Hubertus Haas¹⁴, David Harris³, H. Horiuchi¹⁵, Jiaqi Huang¹, Jiaqi Huang², Sean Humphray³, Javier Jiménez¹², Nancy P. Keller¹⁵, H. Khouri², H. Khouri¹, Katsuhiko Kitamoto¹⁶, Tetsuo Kobayashi¹⁷, Sven Konzack¹⁰, Resham Kulkarni¹, Resham Kulkarni², Toshitaka Kumagai¹⁸, Anne Lafton¹⁹, Jean-Paul Latgé¹⁹, Weixi Li⁹, Angela Lord³, Charles Lu¹, Charles Lu², William H. Majoros¹, William H. Majoros², Gregory S. May²⁰, Bruce L. Miller²¹, Yasmin Ali Mohamoud¹, Yasmin Ali Mohamoud², María Molina⁵, Michel Monod²², Isabelle Mouyna¹⁹, Stephanie Mulligan², Stephanie Mulligan¹, Lee Murphy³, Susan O'Neil³, Ian T. Paulsen², Ian T. Paulsen¹, Miguel A. Peñalva¹¹, Mihaela Pertea¹, Mihaela Pertea², Claire Price³, Bethan L. Pritchard⁴, Michael A. Quail³, Ester Rabbinowitsch³, Neil Rawlins³, Marie Adele Rajandream³, Utz Reichard²³, Hubert Renauld³, Geoffrey D. Robson⁴, Santiago Rodríguez de Córdoba¹¹, José Manuel Rodríguez-Peña⁵, Catherine M. Ronning², Catherine M. Ronning¹, Simon Rutter³, Steven L. Salzberg¹, Steven L. Salzberg², Miguel del Nogal Sánchez¹², Juan C. Sánchez-Ferrero¹¹, David L. Saunders³, Kathy Seeger³, Rob Squares³, S. Squares³, Michio Takeuchi²⁴, Fredj Tekaia¹⁹, Geoffrey Turner²⁵, Carlos R. Vázquez de Aldana¹², J. Weidman¹, J. Weidman², Owen White¹, Owen White², John Woodward³, Jae-Hyuk Yu¹⁵, Claire M. Fraser¹, Claire M. Fraser², James E. Galagan²⁶, Kiyoshi Asai¹⁸, Masayuki Machida¹⁸, Neil Hall¹, Neil Hall³, Bart Barrell³, David W. Denning⁴•Institutions (26)

J. Craig Venter Institute¹, Washington University in St. Louis², Wellcome Trust Sanger Institute³, University of Manchester⁴, Complutense University of Madrid⁵, Tohoku University⁶, University of Nottingham⁷, Tulane University⁸, University of Kentucky⁹, Max Planck Society¹⁰, Spanish National Research Council¹¹, University of Salamanca¹², University of São Paulo¹³, Innsbruck Medical University¹⁴, University of Wisconsin-Madison¹⁵, University of Tokyo¹⁶, Nagoya University¹⁷, National Institute of Advanced Industrial Science and Technology¹⁸, Pasteur Institute¹⁹, University of Texas MD Anderson Cancer Center²⁰, University of Idaho²¹, University of Lausanne²², University of Göttingen²³, Tokyo University of Agriculture and Technology²⁴, University of Sheffield²⁵, Broad Institute²⁶

22 Dec 2005-Nature

TL;DR: The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus and revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype.

Abstract: Aspergillus fumigatus is exceptional among microorganisms in being both a primary and opportunistic pathogen as well as a major allergen. Its conidia production is prolific, and so human respiratory tract exposure is almost constant. A. fumigatus is isolated from human habitats and vegetable compost heaps. In immunocompromised individuals, the incidence of invasive infection can be as high as 50% and the mortality rate is often about 50% (ref. 2). The interaction of A. fumigatus and other airborne fungi with the immune system is increasingly linked to severe asthma and sinusitis. Although the burden of invasive disease caused by A. fumigatus is substantial, the basic biology of the organism is mostly obscure. Here we show the complete 29.4-megabase genome sequence of the clinical isolate Af293, which consists of eight chromosomes containing 9,926 predicted genes. Microarray analysis revealed temperature-dependent expression of distinct sets of genes, as well as 700 A. fumigatus genes not present or significantly diverged in the closely related sexual species Neosartorya fischeri, many of which may have roles in the pathogenicity phenotype. The Af293 genome sequence provides an unparalleled resource for the future understanding of this remarkable fungus.

1,356 citations

Cited by

Journal Article

The Pfam protein families database

Marco Punta¹, Penny Coggill¹, Ruth Y. Eberhardt¹, Jaina Mistry¹, John Tate¹, Chris Boursnell¹, Ningze Pang¹, Kristoffer Forslund¹, Goran Ceric¹, Jody Clements¹, Andreas Heger¹, Liisa Holm¹, Erik L. L. Sonnhammer¹, Sean R. Eddy¹, Alex Bateman¹, Robert D. Finn¹•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Jan 2000-Nucleic Acids Research

TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.

Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article

The sequence of the human genome.

J. Craig Venter¹, Mark Raymond Adams¹, Eugene W. Myers¹, Peter W. Li¹ +269 more•Institutions (12)

16 Feb 2001-Science

TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.

Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations

Journal Article

Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets

Benjamin P. Lewis¹, Christopher B. Burge¹, David P. Bartel¹•Institutions (1)

Massachusetts Institute of Technology¹

14 Jan 2005-Cell

TL;DR: In a four-genome analysis of 3' UTRs, approximately 13,000 regulatory relationships were detected above the estimate of false-positive predictions, thereby implicating as miRNA targets more than 5300 human genes, which represented 30% of the gene set.

11,624 citations

Journal Article

The Human Genome Browser at UCSC

W. James Kent¹, Charles W. Sugnet¹, Terrence S. Furey¹, Krishna M. Roskin¹, Tom H. Pringle, Alan M. Zahler¹, David Haussler¹•Institutions (1)

University of California, Santa Cruz¹

01 Jun 2002-Genome Research

TL;DR: A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu.

Abstract: As vertebrate genome sequences near completion and research refocuses to their analysis, the issue of effective genome annotation display becomes critical. A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu. This browser displays assembly contigs and gaps, mRNA and expressed sequence tag alignments, multiple gene predictions, cross-species homologies, single nucleotide polymorphisms, sequence-tagged sites, radiation hybrid data, transposon repeats, and more as a stack of coregistered tracks. Text and sequence-based searches provide quick and precise access to any region of specific interest. Secondary links from individual features lead to sequence details and supplementary off-site databases. One-half of the annotation tracks are computed at the University of California, Santa Cruz from publicly available sequence data; collaborators worldwide provide the rest. Users can stably add their own custom tracks to the browser for educational or research purposes. The conceptual and technical framework of the browser, its underlying MYSQL database, and overall use are described. The web site currently serves over 50,000 pages per day to over 3000 different users.

9,605 citations

Journal Article

Velvet: Algorithms for de novo short read assembly using de Bruijn graphs

Daniel R. Zerbino¹, Ewan Birney¹•Institutions (1)

European Bioinformatics Institute¹

01 May 2008-Genome Research

TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies. On real Solexa data sets without read pairs, its contigs were in close agreement with simulated results without read-pair information.

Abstract: We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.

9,389 citations
