Author

Tracey Chillingworth

Other affiliations: Wellcome Trust

Bio: Tracey Chillingworth is an academic researcher at the Wellcome Trust Sanger Institute who has contributed to research on the topics Genome and Gene. The author has an h-index of 28 and has co-authored 32 publications that have received 24,155 citations. Previous affiliations of Tracey Chillingworth include the Wellcome Trust.

Topics: Genome, Gene, Mycobacterium tuberculosis, Genomics, Plasmid ...

Papers

Journal Article

Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence

Stewart T. Cole, Roland Brosch, Julian Parkhill¹, Thierry Garnier, Carol Churcher¹, David Harris¹, Stephen V. Gordon, Karin Eiglmeier, S. Gas, Clifton E. Barry², Fredj Tekaia, K. Badcock¹, D. Basham¹, D. Brown¹, Tracey Chillingworth¹, R. Connor¹, Robert L. Davies¹, K. Devlin¹, Theresa Feltwell¹, S. Gentles¹, N. Hamlin¹, S. Holroyd¹, T. Hornsby¹, Kay Jagels¹, Anders Krogh³, J. McLean¹, Sharon Moule¹, Lee Murphy¹, K. Oliver¹, J. Osborne¹, Michael A. Quail¹, Marie-Adèle Rajandream¹, Jane Rogers¹, S. Rutter¹, K. Seeger¹, Jason Skelton¹, Rob Squares¹, S. Squares¹, John Sulston¹, K. Taylor¹, Sally Whitehead¹, Bart Barrell¹

Wellcome Trust¹, National Institutes of Health², Technical University of Denmark³

11 Jun 1998-Nature

TL;DR: The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve the understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions.

Abstract: Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.

7,779 citations

Journal Article

The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences

Julian Parkhill¹, Brendan W. Wren², Karen Mungall¹, Julian M. Ketley³, Carol Churcher¹, D. Basham¹, Tracey Chillingworth¹, Robert L. Davies¹, Theresa Feltwell¹, S. Holroyd¹, Kay Jagels¹, Andrey V. Karlyshev², Sharon Moule¹, Mark J. Pallen⁴, Charles W. Penn⁵, Michael A. Quail¹, Marie-Adèle Rajandream¹, Kim Rutherford¹, A. H. M. van Vliet⁶, Sally Whitehead¹, Bart Barrell¹

Wellcome Trust¹, University of London², University of Leicester³, Queen's University Belfast⁴, University of Birmingham⁵, VU University Amsterdam⁶

10 Feb 2000-Nature

TL;DR: The genome sequence of C. jejuni NCTC11168 is reported. Hypervariable short homopolymeric runs of nucleotides were commonly found in genes encoding the biosynthesis or modification of surface structures, or in closely linked genes of unknown function.

Abstract: Campylobacter jejuni, from the delta-epsilon group of proteobacteria, is a microaerophilic, Gram-negative, flagellate, spiral bacterium, properties it shares with the related gastric pathogen Helicobacter pylori. It is the leading cause of bacterial food-borne diarrhoeal disease throughout the world. In addition, infection with C. jejuni is the most frequent antecedent to a form of neuromuscular paralysis known as Guillain–Barré syndrome. Here we report the genome sequence of C. jejuni NCTC11168. C. jejuni has a circular chromosome of 1,641,481 base pairs (30.6% G+C) which is predicted to encode 1,654 proteins and 54 stable RNA species. The genome is unusual in that there are virtually no insertion sequences or phage-associated sequences and very few repeat sequences. One of the most striking findings in the genome was the presence of hypervariable sequences. These short homopolymeric runs of nucleotides were commonly found in genes encoding the biosynthesis or modification of surface structures, or in closely linked genes of unknown function. The apparently high rate of variation of these homopolymeric tracts may be important in the survival strategy of C. jejuni.

1,979 citations

Journal Article

The genome sequence of Schizosaccharomyces pombe

Valerie Wood¹, R. Gwilliam¹, Marie-Adèle Rajandream¹, M. Lyne¹, Rachel Lyne¹, A. Stewart², J. Sgouros², N. Peat², Jacqueline Hayles², Stephen Baker¹, D. Basham¹, Sharen Bowman¹, Karen Brooks¹, D. Brown¹, Steve D.M. Brown¹, Tracey Chillingworth¹, Carol Churcher¹, Mark O. Collins¹, R. Connor¹, Ann Cronin¹, P. Davis¹, Theresa Feltwell¹, Andrew G. Fraser¹, S. Gentles¹, Arlette Goble¹, N. Hamlin¹, David Harris¹, J. Hidalgo¹, Geoffrey M. Hodgson¹, S. Holroyd¹, T. Hornsby¹, S. Howarth¹, Elizabeth J. Huckle¹, Sarah E. Hunt¹, Kay Jagels¹, Kylie R. James¹, L. Jones¹, Matthew Jones¹, S. Leather¹, S. McDonald¹, J. McLean¹, P. Mooney¹, Sharon Moule¹, Karen Mungall¹, Lee Murphy¹, D. Niblett¹, C. Odell¹, Karen Oliver¹, Susan O'Neil¹, D. Pearson¹, Michael A. Quail¹, Ester Rabbinowitsch¹, Kim Rutherford¹, Simon Rutter¹, David L. Saunders¹, Kathy Seeger¹, Sarah Sharp¹, Jason Skelton¹, Mark Simmonds¹, R. Squares¹, S. Squares¹, K. Stevens¹, K. Taylor¹, Ruth Taylor¹, Adrian Tivey¹, S. Walsh¹, T. Warren¹, S. Whitehead¹, John Woodward¹, Guido Volckaert³, Rita Aert³, Johan Robben³, B. Grymonprez³, I. Weltjens³, E. Vanstreels³, Michael A. Rieger, M. Schafer, S. Muller-Auer, C. Gabel, M. Fuchs, C. Fritzc, E. Holzer, D. Moestl, H. Hilbert, K. Borzym⁴, I. Langer⁴, Alfred Beck⁴, Hans Lehrach⁴, Richard Reinhardt⁴, Thomas M. Pohl⁵, P. Eger⁵, Wolfgang Zimmermann, H. Wedler, R. Wambutt, Bénédicte Purnelle⁶, André Goffeau⁶, Edouard Cadieu⁷, Stéphane Dréano⁷, Stéphanie Gloux⁷, Valerie Lelaure⁷, Stéphanie Mottier⁷, Francis Galibert⁷, Stephen J. Aves⁸, Z. Xiang⁸, Cherryl Hunt⁸, Karen Moore⁸, S. M. Hurst⁸, M. Lucas⁹, M. Rochet⁹, Claude Gaillardin⁹, Victor A. Tallada¹⁰, Victor A. Tallada¹¹, Andrés Garzón¹¹, Andrés Garzón¹⁰, G. Thode¹¹, Rafael R. Daga¹⁰, Rafael R. Daga¹¹, L. Cruzado¹¹, Juan Jimenez¹¹, Juan Jimenez¹⁰, Miguel del Nogal Sánchez¹², F. del Rey¹², J. Benito¹², Angel Domínguez¹², José L. Revuelta¹², Sergio Moreno¹², John Armstrong¹³, Susan L. Forsburg¹⁴, L. Cerrutti¹, Todd M. Lowe¹⁵, W. R. 
McCombie¹⁶, Ian T. Paulsen¹⁷, Judith A. Potashkin¹⁸, G. V. Shpakovski¹⁹, David W. Ussery²⁰, Bart Barrell¹, Paul Nurse²

Wellcome Trust Sanger Institute¹, London Research Institute², Katholieke Universiteit Leuven³, Max Planck Society⁴, GATC Biotech⁵, Université catholique de Louvain⁶, Centre national de la recherche scientifique⁷, University of Exeter⁸, Institut national agronomique Paris Grignon⁹, Pablo de Olavide University¹⁰, University of Málaga¹¹, University of Salamanca¹², University of Sussex¹³, Salk Institute for Biological Studies¹⁴, Stanford University¹⁵, Cold Spring Harbor Laboratory¹⁶, TigerLogic¹⁷, Rosalind Franklin University of Medicine and Science¹⁸, Russian Academy of Sciences¹⁹, Technical University of Denmark²⁰

21 Feb 2002-Nature

TL;DR: The genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote, is sequenced, and highly conserved genes important for eukaryotic cell organization, including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing, are identified.

Abstract: We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.

1,686 citations

Journal Article

Massive gene decay in the leprosy bacillus

Stewart T. Cole¹, Karin Eiglmeier¹, Julian Parkhill², Keith D. James², Nicholas R. Thomson², Paul R. Wheeler³, Nadine Honoré¹, Thierry Garnier¹, Carol Churcher², David Harris², Karen Mungall², D. Basham², D. Brown², Tracey Chillingworth², R. Connor², Robert L. Davies², K. Devlin², Stephanie Duthoy¹, Theresa Feltwell², Audrey Fraser², N. Hamlin², S. Holroyd², T. Hornsby², Kay Jagels², Céline Lacroix¹, J. Maclean², Sharon Moule², Lee Murphy², K. Oliver², Michael A. Quail², Marie-Adèle Rajandream², Kim Rutherford², S. Rutter², K. Seeger², Sylvie Simon¹, Mark Simmonds², Jason Skelton², Rob Squares², S. Squares², K. Stevens², K. Taylor², Sally Whitehead², J. R. Woodward², Bart Barrell²

Pasteur Institute¹, Wellcome Trust², Veterinary Laboratories Agency³

22 Feb 2001-Nature

TL;DR: Comparing the 3.27-megabase genome sequence of an armadillo-derived Indian isolate of the leprosy bacillus with that of Mycobacterium tuberculosis provides clear explanations for the bacillus's exceptionally long doubling time and its resistance to laboratory culture, and reveals an extreme case of reductive evolution.

Abstract: Leprosy, a chronic human neurological disease, results from infection with the obligate intracellular pathogen Mycobacterium leprae, a close relative of the tubercle bacillus. Mycobacterium leprae has the longest doubling time of all known bacteria and has thwarted every effort at culture in the laboratory. Comparing the 3.27-megabase (Mb) genome sequence of an armadillo-derived Indian isolate of the leprosy bacillus with that of Mycobacterium tuberculosis (4.41 Mb) provides clear explanations for these properties and reveals an extreme case of reductive evolution. Less than half of the genome contains functional genes but pseudogenes, with intact counterparts in M. tuberculosis, abound. Genome downsizing and the current mosaic arrangement appear to have resulted from extensive recombination events between dispersed repetitive sequences. Gene deletion and decay have eliminated many important metabolic activities including siderophore production, part of the oxidative and most of the microaerophilic and anaerobic respiratory chains, and numerous catabolic systems and their regulatory circuits.

1,620 citations

Journal Article

Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18

Julian Parkhill¹, Gordon Dougan², Keith D. James¹, Nicholas R. Thomson¹, Derek Pickard², John Wain², Carol Churcher¹, Karen Mungall¹, Stephen D. Bentley¹, Matthew T. G. Holden¹, Mohammed Sebaihia¹, Stephen Baker¹, D. Basham¹, Karen Brooks¹, Tracey Chillingworth¹, Phillippa L. Connerton², A. Cronin¹, P. Davis¹, Robert L. Davies¹, L. Dowd¹, Nicholas J. White³, Jeremy Farrar³, Theresa Feltwell¹, N. Hamlin¹, Ashraful Haque², Tran Tinh Hien, S. Holroyd¹, Kay Jagels¹, Anders Krogh⁴, Torben Larsen⁴, S. Leather¹, Sharon Moule¹, Peadar O'Gaora², Christopher M. Parry, Michael A. Quail¹, Kim Rutherford¹, Mark Simmonds¹, Jason Skelton¹, K. Stevens¹, Sally Whitehead¹, Bart Barrell¹

Wellcome Trust¹, Imperial College London², University of Oxford³, Technical University of Denmark⁴

25 Oct 2001-Nature

TL;DR: The genome of a S. typhi strain (CT18) that is resistant to multiple drugs is sequenced, revealing the presence of hundreds of insertions and deletions compared with the Escherichia coli genome, ranging in size from single genes to large islands.

Abstract: Salmonella enterica serovar Typhi (S. typhi) is the aetiological agent of typhoid fever, a serious invasive bacterial disease of humans with an annual global burden of approximately 16 million cases, leading to 600,000 fatalities. Many S. enterica serovars actively invade the mucosal surface of the intestine but are normally contained in healthy individuals by the local immune defence mechanisms. However, S. typhi has evolved the ability to spread to the deeper tissues of humans, including liver, spleen and bone marrow. Here we have sequenced the 4,809,037-base pair (bp) genome of a S. typhi (CT18) that is resistant to multiple drugs, revealing the presence of hundreds of insertions and deletions compared with the Escherichia coli genome, ranging in size from single genes to large islands. Notably, the genome sequence identifies over two hundred pseudogenes, several corresponding to genes that are known to contribute to virulence in Salmonella typhimurium. This genetic degradation may contribute to the human-restricted host range for S. typhi. CT18 harbours a 218,150-bp multiple-drug-resistance incH1 plasmid (pHCM1), and a 106,516-bp cryptic plasmid (pHCM2), which shows recent common ancestry with a virulence plasmid of Yersinia pestis.

1,211 citations

Cited by

Journal Article

A greedy algorithm for aligning DNA sequences.

Zheng Zhang¹, Scott Schwartz, Lukas Wagner, Webb Miller

Pennsylvania State University¹

01 Feb 2000-Journal of Computational Biology

TL;DR: A new greedy alignment algorithm is introduced with particularly good performance and it is shown that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data.

Abstract: For aligning DNA sequences that differ only by sequencing errors, or by equivalent errors from other sources, a greedy algorithm can be much faster than traditional dynamic programming approaches and yet produce an alignment that is guaranteed to be theoretically optimal. We introduce a new greedy alignment algorithm with particularly good performance and show that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data. An implementation of this algorithm is currently used in a program that assembles the UniGene database at the National Center for Biotechnology Information.

4,628 citations

Journal Article

Genome sequence of the human malaria parasite Plasmodium falciparum

Malcolm J. Gardner¹, Neil Hall¹, Eula Fung¹, Owen White¹, Matthew Berriman¹, Richard W. Hyman¹, Jane M. Carlton¹, Arnab Pain¹, Karen E. Nelson¹, Sharen Bowman¹, Ian T. Paulsen¹, Keith D. James¹, Jonathan A. Eisen¹, Kim Rutherford¹, Steven L. Salzberg¹, Alister Craig¹, Sue Kyes¹, Man Suen Chan¹, Vishvanath Nene¹, Shamira J. Shallom¹, Bernard B. Suh¹, Jeremy Peterson¹, Samuel V. Angiuoli¹, Mihaela Pertea¹, Jonathan E. Allen¹, Jeremy D. Selengut¹, Daniel H. Haft¹, Michael W. Mather¹, Akhil B. Vaidya¹, David M. A. Martin¹, Alan H. Fairlamb¹, Martin Fraunholz¹, David S. Roos¹, Stuart A. Ralph¹, Geoffrey I. McFadden¹, Leda M. Cummings¹, G. Mani Subramanian¹, Christopher J. Mungall¹, J. Craig Venter¹, Daniel J. Carucci¹, Stephen L. Hoffman¹, Chris I. Newbold¹, Ronald W. Davis¹, Claire M. Fraser¹, Bart Barrell¹

J. Craig Venter Institute¹

03 Oct 2002-Nature

TL;DR: The genome sequence of P. falciparum clone 3D7 is reported, which is the most (A + T)-rich genome sequenced to date and is being exploited in the search for new drugs and vaccines to fight malaria.

Abstract: The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.

4,312 citations

Journal Article

Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen.

Charles K. Stover, X. Q. Pham¹, A. L. Erwin, S. D. Mizoguchi, Paul Warrener, Mark J. Hickey, Fiona S. L. Brinkman², W. O. Hufnagle, D. J. Kowalik, M. J. Lagrou, R. L. Garber, L. Goltry, E. Tolentino, S. Westbrock-Wadman, Ying Yuan, L. L. Brody, S. N. Coulter, K. R. Folger, Arnold Kas¹, K. Larbig³, R. Lim¹, Kelly D. Smith¹, David H. Spencer¹, Gane Ka-Shu Wong¹, Z. Wu¹, Ian T. Paulsen⁴, Ian T. Paulsen⁵, Jonathan Reizer⁴, Milton H. Saier⁴, Robert E. W. Hancock², Stephen Lory¹, Maynard V. Olson¹

University of Washington¹, University of British Columbia², Hochschule Hannover³, University of California, San Diego⁴, Research Medical Center⁵

31 Aug 2000-Nature

TL;DR: It is proposed that the size and complexity of the P. aeruginosa genome reflect an evolutionary adaptation permitting it to thrive in diverse environments and resist the effects of a variety of antimicrobial substances.

Abstract: Pseudomonas aeruginosa is a ubiquitous environmental bacterium that is one of the top three causes of opportunistic human infections. A major factor in its prominence as a pathogen is its intrinsic resistance to antibiotics and disinfectants. Here we report the complete sequence of P. aeruginosa strain PAO1. At 6.3 million base pairs, this is the largest bacterial genome sequenced, and the sequence provides insights into the basis of the versatility and intrinsic drug resistance of P. aeruginosa. Consistent with its larger genome size and environmental adaptability, P. aeruginosa contains the highest proportion of regulatory genes observed for a bacterial genome and a large number of genes involved in the catabolism, transport and efflux of organic compounds as well as four potential chemotaxis systems. We propose that the size and complexity of the P. aeruginosa genome reflect an evolutionary adaptation permitting it to thrive in diverse environments and resist the effects of a variety of antimicrobial substances.

4,220 citations

Journal Article

The COG database: an updated version includes eukaryotes

Roman L. Tatusov¹, Natalie D. Fedorova¹, John D. Jackson¹, Aviva R. Jacobs¹, Boris Kiryutin¹, Eugene V. Koonin¹, Dmitri M. Krylov¹, Raja Mazumder², Sergei L. Mekhedov¹, Anastasia N. Nikolskaya², B Sridhar Rao¹, Sergei Smirnov¹, Alexander V. Sverdlov¹, Sona Vasudevan¹, Yuri I. Wolf¹, Jodie J. Yin¹, Darren A. Natale²

National Institutes of Health¹, Georgetown University Medical Center²

11 Sep 2003-BMC Bioinformatics

TL;DR: A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described and is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

Abstract: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.

4,167 citations

Journal Article

Mauve: multiple alignment of conserved genomic sequence with rearrangements.

Aaron E. Darling¹, Bob Mau, Frederick R. Blattner, Nicole T. Perna

University of Wisconsin-Madison¹

01 Jul 2004-Genome Research

TL;DR: This work presents methods for identification and alignment of conserved genomic DNA in the presence of rearrangements and horizontal transfer, evaluates the quality of Mauve alignments, and compares them with other methods through extensive simulations of genome evolution.

Abstract: As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under consideration. Furthermore, the linear order of these segments may be shuffled among genomes. We present methods for identification and alignment of conserved genomic DNA in the presence of rearrangements and horizontal transfer. Our methods have been implemented in a software package called Mauve. Mauve has been applied to align nine enterobacterial genomes and to determine global rearrangement structure in three mammalian genomes. We have evaluated the quality of Mauve alignments and drawn comparison to other methods through extensive simulations of genome evolution.

3,741 citations
