Genome sequence of the human malaria parasite Plasmodium falciparum
Malcolm J. Gardner,Neil Hall,Eula Fung,Owen White,Matthew Berriman,Richard W. Hyman,Jane M. Carlton,Arnab Pain,Karen E. Nelson,Sharen Bowman,Ian T. Paulsen,Keith D. James,Jonathan A. Eisen,Kim Rutherford,Steven L. Salzberg,Alister Craig,Sue Kyes,Man Suen Chan,Vishvanath Nene,Shamira J. Shallom,Bernard B. Suh,Jeremy Peterson,Samuel V. Angiuoli,Mihaela Pertea,Jonathan E. Allen,Jeremy D. Selengut,Daniel H. Haft,Michael W. Mather,Akhil B. Vaidya,David M. A. Martin,Alan H. Fairlamb,Martin Fraunholz,David S. Roos,Stuart A. Ralph,Geoffrey I. McFadden,Leda M. Cummings,G. Mani Subramanian,Christopher J. Mungall,J. Craig Venter,Daniel J. Carucci,Stephen L. Hoffman,Chris I. Newbold,Ronald W. Davis,Claire M. Fraser,Bart Barrell +44 more
TLDR
The genome sequence of P. falciparum clone 3D7 is reported; it is the most (A + T)-rich genome sequenced to date and is being exploited in the search for new drugs and vaccines to fight malaria.
Abstract:
The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.
Citations
Journal Article
OrthoMCL: identification of ortholog groups for eukaryotic genomes.
TL;DR: OrthoMCL provides a scalable method for constructing orthologous groups across multiple eukaryotic taxa, using a Markov Cluster algorithm to group (putative) orthologs and paralogs.
Journal Article
The COG database: an updated version includes eukaryotes
Roman L. Tatusov,Natalie D. Fedorova,John D. Jackson,Aviva R. Jacobs,Boris Kiryutin,Eugene V. Koonin,Dmitri M. Krylov,Raja Mazumder,Sergei L. Mekhedov,Anastasia N. Nikolskaya,B Sridhar Rao,Sergei Smirnov,Alexander V. Sverdlov,Sona Vasudevan,Yuri I. Wolf,Jodie J. Yin,Darren A. Natale +16 more
TL;DR: A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described and is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies.
Journal Article
Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting
Tomas Cermak,Erin L. Doyle,Michelle Christian,Li-Li Wang,Yong Zhang,Clarice Schmidt,Joshua A. Baller,Nikunj V. Somia,Adam J. Bogdanove,Daniel F. Voytas +9 more
TL;DR: A method and reagents for efficiently assembling TALEN constructs with custom repeat arrays are presented and design guidelines based on naturally occurring TAL effectors and their binding sites are described.
Journal Article
A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers.
Michael A. Quail,Miriam Smith,Paul Coupland,Thomas D. Otto,Simon R. Harris,Thomas R. Connor,Anna Bertoni,Harold Swerdlow,Yong Gu +8 more
TL;DR: All three fast-turnaround sequencers evaluated here were able to generate usable sequence; however, there are key differences in the quality of that data and in the applications it will support.
Journal Article
Sequencing and comparison of yeast species to identify genes and regulatory elements
TL;DR: A comparative analysis of the yeast Saccharomyces cerevisiae based on high-quality draft sequences of three related species, which inferred a putative function for most of these motifs, and provided insights into their combinatorial interactions.
References
Journal Article
Gene Ontology: tool for the unification of biology
M Ashburner,Catherine A. Ball,Judith A. Blake,David Botstein,Heather Butler,J. M. Cherry,Allan Peter Davis,Kara Dolinski,Selina S. Dwight,J.T. Eppig,Midori A. Harris,David P. Hill,Laurie Issel-Tarver,Andrew Kasarskis,Suzanna E. Lewis,John C. Matese,Joel E. Richardson,M. Ringwald,Gerald M. Rubin,Gavin Sherlock +19 more
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal Article
The Pfam protein families database
Marco Punta,Penny Coggill,Ruth Y. Eberhardt,Jaina Mistry,John Tate,Chris Boursnell,Ningze Pang,Kristoffer Forslund,Goran Ceric,Jody Clements,Andreas Heger,Liisa Holm,Erik L. L. Sonnhammer,Sean R. Eddy,Alex Bateman,Robert D. Finn +15 more
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Journal Article
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
TL;DR: A new membrane protein topology prediction method, TMHMM, based on a hidden Markov model is described and validated, and it is discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies.
Journal Article
Analysis of the genome sequence of the flowering plant Arabidopsis thaliana.
TL;DR: This is the first complete genome sequence of a plant and provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, identifying a wide range of plant-specific gene functions and establishing rapid systematic ways to identify genes for crop improvement.
Journal Article
Tandem repeats finder: a program to analyze DNA sequences
TL;DR: A new algorithm for finding tandem repeats, which works without the need to specify either the pattern or the pattern size, is presented, and its ability to detect tandem repeats that have undergone extensive mutational change is demonstrated.
Related Papers (5)
A proteomic view of the Plasmodium falciparum life cycle
Laurence Florens,Michael P. Washburn,J. Dale Raine,Robert M. Anthony,Munira Grainger,J. David Haynes,J. Kathleen Moch,Nemone Muster,John B. Sacci,David L. Tabb,Adam A. Witney,Dirk Wolters,Yimin Wu,Malcolm J. Gardner,Anthony A. Holder,Robert E. Sinden,John R. Yates,Daniel J. Carucci +23 more