Topic

Genome

About: Genome is a research topic. Over its lifetime, 74,231 publications have been published within this topic, receiving 3,819,713 citations.

Papers

Journal Article

NOVOPlasty: de novo assembly of organelle genomes from whole genome data

Nicolas Dierckxsens¹, Patrick Mardulyn¹, Guillaume Smits¹

Université libre de Bruxelles¹

24 Oct 2016-Nucleic Acids Research

TL;DR: NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig.

Abstract: The evolution in next-generation sequencing (NGS) technology has led to the development of many different assembly algorithms, but few of them focus on assembling the organelle genomes. These genomes are used in phylogenetic studies, food identification and are the most deposited eukaryotic genomes in GenBank. Producing organelle genome assembly from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed a seed-and-extend algorithm that assembles organelle genomes from whole genome sequencing (WGS) data, starting from a related or distant single seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty.

2,008 citations

Journal Article

The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences

Julian Parkhill¹, Brendan W. Wren², Karen Mungall¹, Julian M. Ketley³, Carol Churcher¹, D. Basham¹, Tracey Chillingworth¹, Robert L. Davies¹, Theresa Feltwell¹, S. Holroyd¹, Kay Jagels¹, Andrey V. Karlyshev², Sharon Moule¹, Mark J. Pallen⁴, Charles W. Penn⁵, Michael A. Quail¹, Marie-Adèle Rajandream¹, Kim Rutherford¹, A. H. M. van Vliet⁶, Sally Whitehead¹, Bart Barrell¹

Wellcome Trust¹, University of London², University of Leicester³, Queen's University Belfast⁴, University of Birmingham⁵, VU University Amsterdam⁶

10 Feb 2000-Nature

TL;DR: The genome sequence of C. jejuni NCTC11168 is reported; short homopolymeric runs of nucleotides were commonly found in genes encoding the biosynthesis or modification of surface structures, or in closely linked genes of unknown function.

Abstract: Campylobacter jejuni, from the delta-epsilon group of proteobacteria, is a microaerophilic, Gram-negative, flagellate, spiral bacterium, properties it shares with the related gastric pathogen Helicobacter pylori. It is the leading cause of bacterial food-borne diarrhoeal disease throughout the world [1]. In addition, infection with C. jejuni is the most frequent antecedent to a form of neuromuscular paralysis known as Guillain–Barré syndrome [2]. Here we report the genome sequence of C. jejuni NCTC11168. C. jejuni has a circular chromosome of 1,641,481 base pairs (30.6% G+C) which is predicted to encode 1,654 proteins and 54 stable RNA species. The genome is unusual in that there are virtually no insertion sequences or phage-associated sequences and very few repeat sequences. One of the most striking findings in the genome was the presence of hypervariable sequences. These short homopolymeric runs of nucleotides were commonly found in genes encoding the biosynthesis or modification of surface structures, or in closely linked genes of unknown function. The apparently high rate of variation of these homopolymeric tracts may be important in the survival strategy of C. jejuni.

1,979 citations

Journal Article

A draft map of the human proteome

Min-Sik Kim¹, Sneha M. Pinto, Derese Getnet¹, Raja Sekhar Nirujogi, Srikanth S. Manda, Raghothama Chaerkady², Anil K. Madugundu, Dhanashree S. Kelkar, Ruth Isserlin³, Shobhit Jain³, Joji Kurian Thomas, Babylakshmi Muthusamy, Pamela Leal-Rojas¹,⁴, Praveen Kumar, Nandini A. Sahasrabuddhe, Lavanya Balakrishnan, Jayshree Advani, Bijesh George, Santosh Renuse, Lakshmi Dhevi N. Selvan, Arun H. Patil, Vishalakshi Nanjappa, Aneesha Radhakrishnan, Samarjeet Prasad¹, Tejaswini Subbannayya, Rajesh Raju, Manish Kumar, Sreelakshmi K. Sreenivasamurthy, Arivusudar Marimuthu, Gajanan Sathe, Sandip Chavan, Keshava K. Datta, Yashwanth Subbannayya, Apeksha Sahu, Soujanya D. Yelamanchi, Savita Jayaram, Pavithra Rajagopalan, Jyoti Sharma, Krishna R Murthy, Nazia Syed, Renu Goel, Aafaque Ahmad Khan, Sartaj Ahmad, Gourav Dey, Keshav Mudgal⁵, Aditi Chatterjee, Tai-Chung Huang¹, Jun Zhong¹, Xinyan Wu², Patrick G. Shaw¹, Donald Freed¹, Muhammad Saddiq Zahari¹, Kanchan K Mukherjee⁶, Subramanian Shankar⁷, Anita Mahadevan⁸, Henry H N Lam⁹, Chris J. Mitchell¹, Susarla K. Shankar⁸, Parthasarathy Satishchandra⁸, John T. Schroeder¹, Ravi Sirdeshmukh, Anirban Maitra¹, Steven D. Leach¹, Charles G. Drake¹, Marc K. Halushka¹, T. S. Keshava Prasad, Ralph H. Hruban¹, Candace L. Kerr¹,¹⁰, Gary D. Bader³, Christine A. Iacobuzio-Donahue¹, Harsha Gowda, Akhilesh Pandey

Johns Hopkins University¹, Johns Hopkins University School of Medicine², University of Toronto³, University of La Frontera⁴, Imperial College London⁵, Post Graduate Institute of Medical Education and Research⁶, Armed Forces Medical College⁷, National Institute of Mental Health and Neurosciences⁸, Hong Kong University of Science and Technology⁹, University of Maryland, Baltimore¹⁰

29 May 2014-Nature

TL;DR: A draft map of the human proteome is presented using high-resolution Fourier-transform mass spectrometry to discover a number of novel protein-coding regions, including translated pseudogenes, non-coding RNAs and upstream open reading frames.

Abstract: The availability of human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which includes translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.

1,965 citations

Journal Article

Genome sequence of the Brown Norway rat yields insights into mammalian evolution

Richard A. Gibbs¹, George M. Weinstock¹, Michael L. Metzker¹, Donna M. Muzny¹ (+239 more authors; 35 institutions)

01 Apr 2004-Nature

TL;DR: The first comprehensive analysis of the genome sequence of the Brown Norway (BN) rat strain, the third complete mammalian genome to be deciphered, is reported; three-way comparisons with the human and mouse genomes resolve details of mammalian evolution.

Abstract: The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.

1,964 citations

Journal Article

CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes.

Genís Parra¹, Keith Bradnam¹, Ian F Korf¹

University of California, Davis¹

01 May 2007-Bioinformatics

TL;DR: This study reports a computational method, CEGMA (Core Eukaryotic Genes Mapping Approach), for building a highly reliable set of gene annotations in the absence of experimental data. It defines a set of conserved protein families that occur in a wide range of eukaryotes and presents a mapping procedure that accurately identifies their exon-intron structures in a novel genomic sequence.

Abstract: Motivation: The numbers of finished and ongoing genome projects are increasing at a rapid rate, and providing the catalog of genes for these new genomes is a key challenge. Obtaining a set of well-characterized genes is a basic requirement in the initial steps of any genome annotation process. An accurate set of genes is needed in order to learn about species-specific properties, to train gene-finding programs, and to validate automatic predictions. Unfortunately, many new genome projects lack comprehensive experimental data to derive a reliable initial set of genes. Results: In this study, we report a computational method, CEGMA (Core Eukaryotic Genes Mapping Approach), for building a highly reliable set of gene annotations in the absence of experimental data. We define a set of conserved protein families that occur in a wide range of eukaryotes, and present a mapping procedure that accurately identifies their exon-intron structures in a novel genomic sequence. CEGMA includes the use of profile-hidden Markov models to ensure the reliability of the gene structures. Our procedure allows one to build an initial set of reliable gene annotations in potentially any eukaryotic genome, even those in draft stages. Availability: Software and data sets are available online at http://korflab.ucdavis.edu/Datasets.

1,963 citations

Network Information

Performance

Metrics

Papers: 95,794
Citations: 4,362,306

No. of papers in the topic in previous years
Year	Papers
2024	2
2023	7,313
2022	14,209
2021	4,955
2020	5,080
2019	4,839