Showing papers on "Genome" published in 2018

Open Access

Journal Article•DOI•

High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries.

Chirag Jain¹, Luis M. Rodriguez-R¹, Adam M. Phillippy², Konstantinos T. Konstantinidis¹, Srinivas Aluru¹•Institutions (2)

Georgia Institute of Technology¹, National Institutes of Health²

30 Nov 2018-Nature Communications

TL;DR: FastANI is developed, a method to compute ANI using alignment-free approximate sequence mapping, and it is shown that 95% ANI is an accurate threshold for demarcating prokaryotic species by analyzing about 90,000 prokaryotic genomes.

Abstract: A fundamental question in microbiology is whether there is a continuum of genetic diversity among genomes, or clear species boundaries prevail instead. Whole-genome similarity metrics such as Average Nucleotide Identity (ANI) help address this question by facilitating high resolution taxonomic analysis of thousands of genomes from diverse phylogenetic lineages. To scale to available genomes and beyond, we present FastANI, a new method to estimate ANI using alignment-free approximate sequence mapping. FastANI is accurate for both finished and draft genomes, and is up to three orders of magnitude faster compared to alignment-based approaches. We leverage FastANI to compute pairwise ANI values among all prokaryotic genomes available in the NCBI database. Our results reveal clear genetic discontinuity, with 99.8% of the total 8 billion genome pairs analyzed conforming to >95% intra-species and <83% inter-species ANI values. This discontinuity is manifested with or without the most frequently sequenced species, and is robust to historic additions in the genome databases. Average Nucleotide Identity (ANI) is a robust and useful measure to gauge genetic relatedness between two genomes. Here, the authors develop FastANI, a method to compute ANI using alignment-free approximate sequence mapping, and show 95% ANI is an accurate threshold for demarcating prokaryotic species by analyzing about 90,000 prokaryotic genomes.

2,176 citations

Journal Article•DOI•

Shifting the limits in wheat research and breeding using a fully annotated reference genome

Rudi Appels¹, Rudi Appels², Kellye Eversole, Nils Stein³ +204 more•Institutions (45)

17 Aug 2018-Science

TL;DR: This annotated reference sequence of wheat is a resource that can now drive disruptive innovation in wheat improvement, as this community resource establishes the foundation for accelerating wheat research and application through improved understanding of wheat biology and genomics-assisted breeding.

Abstract: An annotated reference sequence representing the hexaploid bread wheat genome in 21 pseudomolecules has been analyzed to identify the distribution and genomic context of coding and noncoding elements across the A, B, and D subgenomes. With an estimated coverage of 94% of the genome and containing 107,891 high-confidence gene models, this assembly enabled the discovery of tissue- and developmental stage-related coexpression networks by providing a transcriptome atlas representing major stages of wheat development. Dynamics of complex gene families involved in environmental adaptation and end-use quality were revealed at subgenome resolution and contextualized to known agronomic single-gene or quantitative trait loci. This community resource establishes the foundation for accelerating wheat research and application through improved understanding of wheat biology and genomics-assisted breeding.

2,118 citations

Journal Article•DOI•

Proposed minimal standards for the use of genome data for the taxonomy of prokaryotes

Jongsik Chun¹, Aharon Oren², Antonio Ventosa³, Henrik Christensen⁴, David R. Arahal⁵, Milton S. da Costa⁶, Alejandro P. Rooney⁷, Hana Yi⁸, Xue-Wei Xu⁹, Sofie E. De Meyer¹⁰, Martha E. Trujillo¹¹•Institutions (11)

Seoul National University¹, Hebrew University of Jerusalem², University of Seville³, University of Copenhagen⁴, University of Valencia⁵, University of Coimbra⁶, United States Department of Agriculture⁷, Korea University⁸, State Oceanic Administration⁹, Murdoch University¹⁰, University of Salamanca¹¹

01 Jan 2018-International Journal of Systematic and Evolutionary Microbiology

TL;DR: The minimal standards for the quality of genome sequences and how they can be applied for taxonomic purposes are described.

Abstract: Advancement of DNA sequencing technology allows the routine use of genome sequences in the various fields of microbiology. The information held in genome sequences proved to provide objective and reliable means in the taxonomy of prokaryotes. Here, we describe the minimal standards for the quality of genome sequences and how they can be applied for taxonomic purposes.

1,908 citations

Journal Article•DOI•

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

Robert M. Bowers¹, Nikos C. Kyrpides¹, Ramunas Stepanauskas², Miranda Harmon-Smith¹, Devin F. R. Doud¹, T. B. K. Reddy¹, Frederik Schulz¹, Jessica K. Jarett¹, Adam R. Rivers¹, Adam R. Rivers³, Emiley A. Eloe-Fadrosh¹, Susannah G. Tringe⁴, Susannah G. Tringe¹, Natalia Ivanova¹, Alex Copeland¹, Alicia Clum¹, Eric D. Becraft², Rex R. Malmstrom¹, Bruce W. Birren⁵, Mircea Podar⁶, Peer Bork, George M. Weinstock, George M. Garrity⁷, Jeremy A. Dodsworth⁸, Shibu Yooseph⁹, Granger G. Sutton⁹, Frank Oliver Gloeckner¹⁰, Jack A. Gilbert¹¹, William C. Nelson¹², Steven J. Hallam¹³, Sean P. Jungbluth¹, Sean P. Jungbluth¹⁴, Thijs J. G. Ettema¹⁵, Scott Tighe¹⁶, Konstantinos T. Konstantinidis¹⁷, Wen Tso Liu¹⁸, Brett J. Baker¹⁹, Thomas Rattei²⁰, Jonathan A. Eisen²¹, Brian P. Hedlund²², Katherine D. McMahon²³, Noah Fierer²⁴, Rob Knight²⁵, Robert D. Finn²⁶, Guy Cochrane²⁶, Ilene Karsch-Mizrachi²⁷, Gene W. Tyson²⁸, Christian Rinke²⁸, Alla Lapidus²⁹, Folker Meyer¹¹, Pelin Yilmaz¹⁰, Donovan H. Parks²⁸, A. M. Eren, Lynn M. Schriml, Jillian F. Banfield³⁰, Philip Hugenholtz²⁸, Tanja Woyke¹⁰•Institutions (30)

Joint Genome Institute¹, Bigelow Laboratory For Ocean Sciences², United States Department of Agriculture³, University of California, Merced⁴, Broad Institute⁵, Oak Ridge National Laboratory⁶, Michigan State University⁷, California State University, San Bernardino⁸, J. Craig Venter Institute⁹, Max Planck Society¹⁰, Argonne National Laboratory¹¹, Pacific Northwest National Laboratory¹², University of British Columbia¹³, University of Southern California¹⁴, Science for Life Laboratory¹⁵, University of Vermont¹⁶, Georgia Institute of Technology¹⁷, University of Illinois at Urbana–Champaign¹⁸, University of Texas at Austin¹⁹, University of Vienna²⁰, University of California, Davis²¹, University of Nevada, Las Vegas²², University of Wisconsin-Madison²³, Cooperative Institute for Research in Environmental Sciences²⁴, University of California, San Diego²⁵, European Bioinformatics Institute²⁶, National Institutes of Health²⁷, University of Queensland²⁸, Saint Petersburg State University²⁹, University of California, Berkeley³⁰

01 Jul 2018-Nature Biotechnology

TL;DR: Two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences are presented: the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), which cover assembly quality and estimates of genome completeness and contamination.

Abstract: We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

1,171 citations

Journal Article•DOI•

MUMmer4: A fast and versatile genome alignment system.

Guillaume Marçais¹, Arthur L. Delcher², Adam M. Phillippy³, Rachel Coston², Steven L. Salzberg², Aleksey V. Zimin¹, Aleksey V. Zimin²•Institutions (3)

University of Maryland, College Park¹, Johns Hopkins University², National Institutes of Health³

26 Jan 2018-PLOS Computational Biology

TL;DR: MUMmer4 is described, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences.

Abstract: The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very large sequence data sets that are common today. In this paper we describe MUMmer4, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences. With a theoretical limit on the input size of 141 Tbp, MUMmer4 can now work with input sequences of any biologically realistic length. We show that as a result of these enhancements, the nucmer program in MUMmer4 is easily able to handle alignments of large genomes; we illustrate this with an alignment of the human and chimpanzee genomes, which allows us to compute that the two species are 98% identical across 96% of their length. With the enhancements described here, MUMmer4 can also be used to efficiently align reads to reference genomes, although it is less sensitive and accurate than the dedicated read aligners. The nucmer aligner in MUMmer4 can now be called from scripting languages such as Perl, Python and Ruby. These improvements make MUMmer4 one of the most versatile genome alignment packages available.

1,131 citations

Journal Article•DOI•

CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens.

Jean-Paul Concordet¹, Maximilian Haeussler²•Institutions (2)

University of Paris¹, University of California, Santa Cruz²

02 Jul 2018-Nucleic Acids Research

TL;DR: CRISPOR tries to provide a comprehensive solution from selection, cloning and expression of guide RNA as well as providing primers needed for testing guide activity and potential off-targets.

Abstract: CRISPOR.org is a web tool for genome editing experiments with the CRISPR-Cas9 system. It finds guide RNAs in an input sequence and ranks them according to different scores that evaluate potential off-targets in the genome of interest and predict on-target activity. The list of genomes is continuously expanded, with more than 150 genomes added in the last two years. CRISPOR tries to provide a comprehensive solution from selection, cloning and expression of guide RNA as well as providing primers needed for testing guide activity and potential off-targets. Recent developments include batch design for genome-wide CRISPR and saturation screens, creating custom oligonucleotides for guide cloning and the design of next generation sequencing primers to test for off-target mutations. CRISPOR is available from http://crispor.org, including the full source code of the website and a stand-alone, command-line version.

864 citations

Journal Article•DOI•

The functions and unique features of long intergenic non-coding RNA.

Julia D. Ransohoff¹, Yuning Wei¹, Paul A. Khavari¹, Paul A. Khavari²•Institutions (2)

Stanford University¹, Veterans Health Administration²

01 Mar 2018-Nature Reviews Molecular Cell Biology

TL;DR: Long intergenic non-coding RNA genes have diverse features that distinguish them from mRNA-encoding genes and exercise functions such as remodelling chromatin and genome architecture, RNA stabilization and transcription regulation, including enhancer-associated activity.

Abstract: Long intergenic non-coding RNA (lincRNA) genes have diverse features that distinguish them from mRNA-encoding genes and exercise functions such as remodelling chromatin and genome architecture, RNA stabilization and transcription regulation, including enhancer-associated activity. Some genes currently annotated as encoding lincRNAs include small open reading frames (smORFs) and encode functional peptides and thus may be more properly classified as coding RNAs. lincRNAs may broadly serve to fine-tune the expression of neighbouring genes with remarkable tissue specificity through a diversity of mechanisms, highlighting our rapidly evolving understanding of the non-coding genome.

829 citations

Journal Article•DOI•

The chromatin accessibility landscape of primary human cancers

M. Ryan Corces¹, Jeffrey M. Granja¹, Shadi Shams¹, Bryan H. Louie¹, Jose A. Seoane¹, Wanding Zhou², Tiago C. Silva³, Tiago C. Silva⁴, Clarice S. Groeneveld⁵, Christopher K. Wong⁶, Seung Woo Cho¹, Ansuman T. Satpathy¹, Maxwell R. Mumbach¹, Katherine A. Hoadley⁷, A. Gordon Robertson⁸, Nathan C. Sheffield⁹, Ina Felau, Mauro A. A. Castro⁵, Benjamin P. Berman⁴, Louis M. Staudt, Jean C. Zenklusen, Peter W. Laird², Christina Curtis¹, William J. Greenleaf, Howard Y. Chang•Institutions (9)

Stanford University¹, Van Andel Institute², University of São Paulo³, Cedars-Sinai Medical Center⁴, Federal University of Paraná⁵, University of California, Santa Cruz⁶, University of North Carolina at Chapel Hill⁷, BC Cancer Agency⁸, University of Virginia⁹

26 Oct 2018-Science

TL;DR: These chromatin accessibility profiles identify cancer- and tissue-specific DNA regulatory elements that enable classification of tumor subtypes with newly recognized prognostic importance, and identify distinct TF activities in cancer based on differences in the inferred patterns of TF-DNA interaction and gene expression.

Abstract: INTRODUCTION Cancer is one of the leading causes of death worldwide. Although the 2% of the human genome that encodes proteins has been extensively studied, much remains to be learned about the noncoding genome and gene regulation in cancer. Genes are turned on and off in the proper cell types and cell states by transcription factor (TF) proteins acting on DNA regulatory elements that are scattered over the vast noncoding genome and exert long-range influences. The Cancer Genome Atlas (TCGA) is a global consortium that aims to accelerate the understanding of the molecular basis of cancer. TCGA has systematically collected DNA mutation, methylation, RNA expression, and other comprehensive datasets from primary human cancer tissue. TCGA has served as an invaluable resource for the identification of genomic aberrations, altered transcriptional networks, and cancer subtypes. Nonetheless, the gene regulatory landscapes of these tumors have largely been inferred through indirect means. RATIONALE A hallmark of active DNA regulatory elements is chromatin accessibility. Eukaryotic genomes are compacted in chromatin, a complex of DNA and proteins, and only the active regulatory elements are accessible by the cell’s machinery such as TFs. The assay for transposase-accessible chromatin using sequencing (ATAC-seq) quantifies DNA accessibility through the use of transposase enzymes that insert sequencing adapters at these accessible chromatin sites. ATAC-seq enables the genome-wide profiling of TF binding events that orchestrate gene expression programs and give a cell its identity. RESULTS We generated high-quality ATAC-seq data in 410 tumor samples from TCGA, identifying diverse regulatory landscapes across 23 cancer types. These chromatin accessibility profiles identify cancer- and tissue-specific DNA regulatory elements that enable classification of tumor subtypes with newly recognized prognostic importance. 
We identify distinct TF activities in cancer based on differences in the inferred patterns of TF-DNA interaction and gene expression. Genome-wide correlation of gene expression and chromatin accessibility predicts tens of thousands of putative interactions between distal regulatory elements and gene promoters, including key oncogenes and targets in cancer immunotherapy, such as MYC, SRC, BCL2, and PDL1. Moreover, these regulatory interactions inform known genetic risk loci linked to cancer predisposition, nominating biochemical mechanisms and target genes for many cancer-linked genetic variants. Lastly, integration with mutation profiling by whole-genome sequencing identifies cancer-relevant noncoding mutations that are associated with altered gene expression. A single-base mutation located 12 kilobases upstream of the FGD4 gene, a regulator of the actin cytoskeleton, generates a putative de novo binding site for an NKX TF and is associated with an increase in chromatin accessibility and a concomitant increase in FGD4 gene expression. CONCLUSION The accessible genome of primary human cancers provides a wealth of information on the susceptibility, mechanisms, prognosis, and potential therapeutic strategies of diverse cancer types. Prediction of interactions between DNA regulatory elements and gene promoters sets the stage for future integrative gene regulatory network analyses. The discovery of hundreds of noncoding somatic mutations that exhibit allele-specific regulatory effects suggests a pervasive mechanism for cancer cells to manipulate gene expression and increase cellular fitness. These data may serve as a foundational resource for the cancer research community.

774 citations

Journal Article•DOI•

FastQ Screen: A tool for multi-genome mapping and quality control

Steven W. Wingett¹, Simon Andrews¹•Institutions (1)

Babraham Institute¹

24 Aug 2018-F1000Research

TL;DR: FastQ Screen is a tool to validate the origin of DNA samples by quantifying the proportion of reads that map to a panel of reference genomes and is intended to be used routinely as a quality control measure and for analysing samples in which the origin of the DNA is uncertain or has multiple sources.

Abstract: DNA sequencing analysis typically involves mapping reads to just one reference genome. Mapping against multiple genomes is necessary, however, when the genome of origin requires confirmation. Mapping against multiple genomes is also advisable for detecting contamination or for identifying sample swaps which, if left undetected, may lead to incorrect experimental conclusions. Consequently, we present FastQ Screen, a tool to validate the origin of DNA samples by quantifying the proportion of reads that map to a panel of reference genomes. FastQ Screen is intended to be used routinely as a quality control measure and for analysing samples in which the origin of the DNA is uncertain or has multiple sources.

738 citations

Journal Article•DOI•

Genome evolution across 1,011 Saccharomyces cerevisiae isolates

Jackson Peter¹, Matteo De Chiara², Anne Friedrich¹, Jia-Xing Yue², David Pflieger¹, Anders Bergström², Anastasie Sigwalt¹, Benjamin Barré², Kelle C. Freel¹, Agnès Llored², Corinne Cruaud³, Karine Labadie³, Jean-Marc Aury³, Benjamin Istace³, Kevin Lebrigand⁴, Pascal Barbry⁴, Stefan Engelen³, Arnaud Lemainque³, Patrick Wincker³, Patrick Wincker⁵, Gianni Liti², Joseph Schacherer¹•Institutions (5)

University of Strasbourg¹, French Institute of Health and Medical Research², French Alternative Energies and Atomic Energy Commission³, Centre national de la recherche scientifique⁴, University of Évry Val d'Essonne⁵

11 Apr 2018-Nature

TL;DR: Whole-genome sequencing and phenotyping of 1,011 natural isolates of the yeast Saccharomyces cerevisiae reveal its evolutionary history, including a single out-of-China origin and multiple domestication events, and provides a framework for genotype–phenotype studies in this model organism.

Abstract: Large-scale population genomic surveys are essential to explore the phenotypic diversity of natural populations. Here we report the whole-genome sequencing and phenotyping of 1,011 Saccharomyces cerevisiae isolates, which together provide an accurate evolutionary picture of the genomic variants that shape the species-wide phenotypic landscape of this yeast. Genomic analyses support a single ‘out-of-China’ origin for this species, followed by several independent domestication events. Although domesticated isolates exhibit high variation in ploidy, aneuploidy and genome content, genome evolution in wild isolates is mainly driven by the accumulation of single nucleotide polymorphisms. A common feature is the extensive loss of heterozygosity, which represents an essential source of inter-individual variation in this mainly asexual species. Most of the single nucleotide polymorphisms, including experimentally identified functional polymorphisms, are present at very low frequencies. The largest numbers of variants identified by genome-wide association are copy-number changes, which have a greater phenotypic effect than do single nucleotide polymorphisms. This resource will guide future population genomics and genotype–phenotype studies in this classic model system. Whole-genome sequencing of 1,011 natural isolates of the yeast Saccharomyces cerevisiae reveals its evolutionary history, including a single out-of-China origin and multiple domestication events, and provides a framework for genotype–phenotype studies in this model organism.

727 citations

Journal Article•DOI•

Genomic and Functional Approaches to Understanding Cancer Aneuploidy

Alison M. Taylor¹, Alison M. Taylor², Juliann Shih¹, Gavin Ha² +729 more•Institutions (4)

09 Apr 2018-Cancer Cell

TL;DR: The genomic and phenotypic correlates of cancer aneuploidy are defined and genome engineering is applied to delete 3p in lung cells, causing decreased proliferation rescued in part by chromosome 3 duplication.

Journal Article•DOI•

Uncovering the essential genes of the human malaria parasite Plasmodium falciparum by saturation mutagenesis

Min Zhang¹, Chengqi Wang¹, Thomas D. Otto², Jenna Oberstaller¹, Xiangyun Liao¹, Swamy R. Adapa¹, Kenneth O. Udenze¹, Iraad F. Bronner², Deborah Casandra¹, Matthew Mayho², Jacqueline Brown², Suzanne Li¹, Justin Swanson¹, Julian C. Rayner², Rays H. Y. Jiang¹, John H. Adams¹•Institutions (2)

University of South Florida¹, Wellcome Trust Sanger Institute²

04 May 2018-Science

TL;DR: Saturation-scale mutagenesis allows prioritization of intervention targets in the genome of the most important cause of malaria, and confirms the proteasome-degradation pathway is a high-value druggable target.

Abstract: INTRODUCTION Malaria remains a devastating global parasitic disease, with the majority of malaria deaths caused by the highly virulent Plasmodium falciparum. The extreme AT-bias of the P. falciparum genome has hampered genetic studies through targeted approaches such as homologous recombination or CRISPR-Cas9, and only a few hundred P. falciparum mutants have been experimentally generated in the past decades. In this study, we have used high-throughput piggyBac transposon insertional mutagenesis and quantitative insertion site sequencing (QIseq) to reach saturation-level mutagenesis of this parasite. RATIONALE Our study exploits the AT-richness of the P. falciparum genome, which provides numerous piggyBac transposon insertion targets within both gene coding and noncoding flanking sequences, to generate more than 38,000 P. falciparum mutants. At this level of mutagenesis, we could distinguish essential genes as nonmutable and dispensable genes as mutable. Subsequently, we identified 2680 genes essential for in vitro asexual blood-stage growth. RESULTS We calculated mutagenesis index scores (MISs) and mutagenesis fitness scores (MFSs) in order to functionally define the relative fitness cost of disruption for 5399 genes. A competitive growth phenotype screen confirmed that MIS and MFS were predictive of the fitness cost for in vitro asexual growth. Genes predicted to be essential included genes implicated in drug resistance (such as the “K13” Kelch propeller, mdr, and dhfr-ts) as well as targets considered to be high value for drug development, such as pkg and cdpk5. The screen revealed essential genes that are specific to human Plasmodium parasites but absent from rodent-infective species, such as lipid metabolic genes that may be crucial to transmission commitment in human infections. MIS and MFS profiling provides a clear ranking of the relative essentiality of gene ontology (GO) functions in P. falciparum.
GO pathways associated with translation, RNA metabolism, and cell cycle control are more essential, whereas genes associated with protein phosphorylation, virulence factors, and transcription are more likely to be dispensable. Last, we confirm that the proteasome-degradation pathway is a high-value druggable target on the basis of its high ratio of essential to dispensable genes, and by functionally confirming its link to the mode of action of artemisinin, the current front-line antimalarial. CONCLUSION Saturation-scale mutagenesis allows prioritization of intervention targets in the genome of the most important cause of malaria. The identification of more than 2680 essential genes, including ~1000 Plasmodium-conserved essential genes, will be valuable for antimalarial therapeutic research.

Journal Article•DOI•

Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality.

Chaoling Wei¹, Hua Yang¹, Songbo Wang², Jian Zhao¹, Chun Liu², Liping Gao¹, En-Hua Xia¹, Ying Lu³, Yuling Tai¹, Guangbiao She¹, Jun Sun¹, Haisheng Cao¹, Wei Tong¹, Qiang Gao², Yeyun Li¹, Wei-Wei Deng¹, Xiaolan Jiang¹, Wenzhao Wang¹, Qi Chen¹, Shihua Zhang¹, Haijing Li¹, Junlan Wu¹, Ping Wang¹, Penghui Li¹, Chengying Shi¹, Fengya Zheng², Jianbo Jian², Bei Huang¹, Dai Shan², Mingming Shi², Congbing Fang¹, Yi Yue¹, Fangdong Li¹, Daxiang Li¹, Shu Wei¹, Bin Han⁴, Chang-Jun Jiang¹, Ye Yin², Tao Xia¹, Zhengzhu Zhang¹, Jeffrey L. Bennetzen¹, Shancen Zhao², Xiaochun Wan¹•Institutions (4)

Anhui Agricultural University¹, Beijing Genomics Institute², Shanghai Ocean University³, Chinese Academy of Sciences⁴

01 May 2018-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A high-quality genome assembly of Camellia sinensis var. sinensis, built with both Illumina and PacBio sequencing technologies, is presented.

Abstract: Tea, one of the world’s most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ∼0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ∼30 to 40 and ∼90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties.

Journal Article•DOI•

Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies.

Michael J. Roach¹, Simon A. Schmidt¹, Anthony R. Borneman¹•Institutions (1)

Australian Wine Research Institute¹

29 Nov 2018-BMC Bioinformatics

TL;DR: Purge Haplotigs improves the haploid and diploid representations of third-gen sequencing based genome assemblies by identifying and reassigning allelic contigs and is less likely to over-purge repetitive or paralogous elements compared to alignment-only based methods.

Abstract: Recent developments in third-gen long read sequencing and diploid-aware assemblers have resulted in the rapid release of numerous reference-quality assemblies for diploid genomes. However, assembly of highly heterozygous genomes is still problematic when regional heterogeneity is so high that haplotype homology is not recognised during assembly. This results in regional duplication rather than consolidation into allelic variants and can cause issues with downstream analysis, for example variant discovery, or haplotype reconstruction using the diploid assembly with unpaired allelic contigs. A new pipeline—Purge Haplotigs—was developed specifically for third-gen sequencing-based assemblies to automate the reassignment of allelic contigs, and to assist in the manual curation of genome assemblies. The pipeline uses a draft haplotype-fused assembly or a diploid assembly, read alignments, and repeat annotations to identify allelic variants in the primary assembly. The pipeline was tested on a simulated dataset and on four recent diploid (phased) de novo assemblies from third-generation long-read sequencing, and compared with a similar tool. After processing with Purge Haplotigs, haploid assemblies were less duplicated with minimal impact on genome completeness, and diploid assemblies had more pairings of allelic contigs. Purge Haplotigs improves the haploid and diploid representations of third-gen sequencing based genome assemblies by identifying and reassigning allelic contigs. The implementation is fast and scales well with large genomes, and it is less likely to over-purge repetitive or paralogous elements compared to alignment-only based methods. The software is available at https://bitbucket.org/mroachawri/purge_haplotigs under a permissive MIT licence.

Journal Article•DOI•

High-resolution TADs reveal DNA sequences underlying genome organization in flies.

Fidel Ramírez¹, Vivek Bhardwaj¹, Vivek Bhardwaj², Laura Arrigoni¹, Kin Chung Lam¹, Björn Grüning², Jose M. Villaveces¹, Bianca Habermann¹, Asifa Akhtar¹, Thomas Manke¹•Institutions (2)

Max Planck Society¹, University of Freiburg²

15 Jan 2018-Nature Communications

TL;DR: Software to identify high-resolution TAD boundaries and reveal their relationship to underlying DNA motifs is developed, and it is demonstrated that boundaries can be accurately predicted using only the motif sequences at open chromatin sites.

Abstract: Despite an abundance of new studies about topologically associating domains (TADs), the role of genetic information in TAD formation is still not fully understood. Here we use our software, HiCExplorer (hicexplorer.readthedocs.io), to annotate >2800 high-resolution (570 bp) TAD boundaries in Drosophila melanogaster. We identify eight DNA motifs enriched at boundaries, including a motif bound by the M1BP protein, and two new boundary motifs. In contrast to mammals, the CTCF motif is only enriched on a small fraction of boundaries flanking inactive chromatin, while most active boundaries contain the motifs bound by the M1BP or Beaf-32 proteins. We demonstrate that boundaries can be accurately predicted using only the motif sequences at open chromatin sites. We propose that DNA sequence guides the genome architecture by allocation of boundary proteins in the genome. Finally, we present an interactive online database to access and explore the spatial organization of fly, mouse and human genomes, available at http://chorogenome.ie-freiburg.mpg.de. Although topologically associating domains (TADs) have been extensively investigated, it is not clear to what extent DNA sequence contributes to their formation. Here the authors develop software to identify high-resolution TAD boundaries and reveal their relationship to underlying DNA motifs.

Journal Article•DOI•

Structural variation in the 3D genome.

Malte Spielmann¹, Darío G. Lupiáñez², Stefan Mundlos³•Institutions (3)

University of Washington¹, Max Delbrück Center for Molecular Medicine², Max Planck Society³

01 Jul 2018-Nature Reviews Genetics

TL;DR: The authors review the role of genetic structural variation in disease and the pathogenic potential of changes to the 3D genome.

Abstract: Structural and quantitative chromosomal rearrangements, collectively referred to as structural variation (SV), contribute to a large extent to the genetic diversity of the human genome and thus are of high relevance for cancer genetics, rare diseases and evolutionary genetics. Recent studies have shown that SVs can not only affect gene dosage but also modulate basic mechanisms of gene regulation. SVs can alter the copy number of regulatory elements or modify the 3D genome by disrupting higher-order chromatin organization such as topologically associating domains. As a result of these position effects, SVs can influence the expression of genes distant from the SV breakpoints, thereby causing disease. The impact of SVs on the 3D genome and on gene expression regulation has to be considered when interpreting the pathogenic potential of these variant types.

Journal Article•DOI•

IRscope: an online program to visualize the junction sites of chloroplast genomes.

Ali Amiryousefi, Jaakko Hyvönen¹, Péter Poczai¹•Institutions (1)

American Museum of Natural History¹

01 Sep 2018-Bioinformatics

TL;DR: A new visualization tool specifically designed for chloroplast genomes is announced; it allows users to depict the genetic architecture of up to ten chloroplast genomes in the vicinity of the sites connecting the inverted repeats to the short and long single copy regions.

Abstract: Motivation: Genome plotting is performed using a wide range of visualization tools, each with emphasis on a different informative dimension of the genome. These tools can provide a deeper insight into the genomic structure of the organism. Results: Here, we announce a new visualization tool that is specifically designed for chloroplast genomes. It allows users to depict the genetic architecture of up to ten chloroplast genomes in the vicinity of the sites connecting the inverted repeats to the short and long single copy regions. The software and its dependent libraries are fully coded in R, and the plot is drawn to a realistic base-pair scale in the vicinity of the junction sites. We provide a website for easier use of the program, as well as the R source code for users who wish to change settings or integrate the software into their own pipelines. The input is a GenBank annotation (.gb) file, the accession or GI number of the sequence, or a DOGMA output file. The software was tested using over 100 embryophyte chloroplast genomes, and in all cases a reliable output was obtained. Availability and implementation: Source code and the online suite are available at https://irscope.shinyapps.io/irapp/ or https://github.com/Limpfrog/irscope.

Journal Article•DOI•

Whole genome landscapes of major melanoma subtypes

Richard A. Scolyer

01 Feb 2018-Pathology

TL;DR: Nicholas K. Hayward*, James S. Wilmott*, Nicola Waddell*, Peter A. Johansson*, Matthew A. Spillane, Robyn P. Lau, Rebecca A. Dagg, Sarah-Jane Schramm, Antonia Pritchard, Ken Dutton-Regester, Felicity Newell, Anna Fitzgerald, Catherine A. Shang, Sean M.

Journal Article•DOI•

Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice.

Qiang Zhao¹, Qi Feng¹, Hengyun Lu¹, Yan Li¹, Ahong Wang¹, Qilin Tian¹, Qilin Zhan¹, Yiqi Lu¹, Lei Zhang¹, Tao Huang¹, Yongchun Wang¹, Danlin Fan¹, Yan Zhao¹, Ziqun Wang¹, Congcong Zhou¹, Jiaying Chen¹, Chuanrang Zhu¹, Wen-Jun Li¹, Qijun Weng¹, Qun Xu², Zi-Xuan Wang¹, Xinghua Wei², Bin Han¹, Xuehui Huang³, Xuehui Huang¹•Institutions (3)

Chinese Academy of Sciences¹, Rice University², Shanghai Normal University³

15 Jan 2018-Nature Genetics

TL;DR: A pan-genome dataset of the Oryza sativa–Oryza rufipogon species complex generated through deep sequencing and de novo genome assembly of 66 divergent accessions will be helpful in pinpointing new causal variants underlying complex traits and in promoting evolutionary and functional studies in rice.

Abstract: The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main source for rice breeding. Large-scale resequencing has been undertaken to discover allelic variants in rice, but much of the information for genetic variation is often lost by direct mapping of short sequence reads onto the O. sativa japonica Nipponbare reference genome. Here we constructed a pan-genome dataset of the O. sativa–O. rufipogon species complex through deep sequencing and de novo assembly of 66 divergent accessions. Intergenomic comparisons identified 23 million sequence variants in the rice genome. This catalog of sequence variations includes many known quantitative trait nucleotides and will be helpful in pinpointing new causal variants that underlie complex traits. In particular, we systematically investigated the whole set of coding genes using this pan-genome data, which revealed extensive presence and absence of variation among rice accessions. This pan-genome resource will further promote evolutionary and functional studies in rice.

Journal Article•DOI•

Mutant phenotypes for thousands of bacterial genes of unknown function

Morgan N. Price¹, Kelly M. Wetmore¹, Robert Jordan Waters¹, Mark Callaghan¹, Jayashree Ray¹, Hualan Liu¹, Jennifer V. Kuehl¹, Ryan A. Melnyk¹, Jacob S. Lamson¹, Yumi Suh¹, Hans K. Carlson¹, Zuelma Esquivel¹, Harini Sadeeshkumar¹, Romy Chakraborty¹, Grant M. Zane², Benjamin E. Rubin³, Judy D. Wall², Axel Visel¹, Axel Visel⁴, James Bristow¹, Matthew J. Blow¹, Adam P. Arkin¹, Adam P. Arkin⁵, Adam M. Deutschbauer¹, Adam M. Deutschbauer⁵•Institutions (5)

Lawrence Berkeley National Laboratory¹, University of Missouri², University of California, San Diego³, University of California, Merced⁴, University of California, Berkeley⁵

16 May 2018-Nature

TL;DR: A large-scale mutagenesis screen identifies mutant phenotypes for over 11,000 protein-coding genes in bacteria that had previously not been assigned a specific function, demonstrating the scalability of microbial genetics and its utility for improving gene annotations.

Abstract: One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.

Journal Article•DOI•

CRISPR for Crop Improvement: An Update Review.

Deepa Jaganathan¹, Karthikeyan Ramasamy¹, Gothandapani Sellamuthu¹, Shilpha Jayabalan¹, Gayatri Venkataraman¹•Institutions (1)

M S Swaminathan Research Foundation¹

17 Jul 2018-Frontiers in Plant Science

TL;DR: Application of CRISPR/Cas9 techniques will result in the development of non-genetically modified (Non-GMO) crops with the desired trait that can contribute to increased yield potential under biotic and abiotic stress conditions.

Abstract: The availability of genome sequences for several crops and advances in genome editing approaches have opened up possibilities to breed for almost any given desirable trait. Advancements in genome editing technologies such as zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) have made it possible for molecular biologists to more precisely target any gene of interest. However, these methodologies are expensive and time-consuming as they involve complicated steps that require protein engineering. Unlike these first-generation genome editing tools, CRISPR/Cas9 genome editing involves simple design and cloning methods, with the same Cas9 being potentially available for use with different guide RNAs targeting multiple sites in the genome. After proof-of-concept demonstrations in crop plants involving the primary CRISPR-Cas9 module, several modified Cas9 cassettes have been utilized in crop plants for improving target specificity and reducing off-target cleavage (e.g., Nmcas9, Sacas9, Stcas9). Further, the availability of Cas9 enzymes from additional bacterial species has provided options to enhance the specificity and efficiency of gene editing methodologies. This review summarizes the options available to plant biotechnologists to bring about crop improvement using CRISPR/Cas9-based genome editing tools and also presents studies where CRISPR/Cas9 has been used for enhancing biotic and abiotic stress tolerance. Application of these techniques will result in the development of non-genetically modified (Non-GMO) crops with the desired trait that can contribute to increased yield potential under biotic and abiotic stress conditions.

Journal Article•DOI•

Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L.

Jisen Zhang¹, Xingtan Zhang², Haibao Tang², Qing Zhang², Xiuting Hua², Xiaokai Ma², Fan Zhu², Tyler Jones, Xin-Guang Zhu³, John E. Bowers⁴, Ching Man Wai⁵, Chunfang Zheng⁶, Yan Shi², Shuai Chen², Xiuming Xu², Jingjing Yue², David R. Nelson⁷, Lixian Huang², Zhen Li², Huimin Xu², Dong Zhou², Yongjun Wang², Weichang Hu², Jishan Lin², Youjin Deng², Neha Pandey², Melina Cristina Mancini², Dessireé Zerpa², Julie K. Nguyen², Liming Wang², Liang Yu², Yinghui Xin², Liangfa Ge², Jie Arro², Jennifer Han², Setu Chakrabarty², Marija Pushko², Wenping Zhang², Yanhong Ma², Panpan Ma², Mingju Lv³, Faming Chen⁸, Guangyong Zheng⁸, Jingsheng Xu², Zhenhui Yang², Fang Deng², Xuequn Chen², Zhenyang Liao², Xunxiao Zhang², Zhicong Lin², Hai Lin², Hansong Yan², Zheng Kuang², Weimin Zhong², Pingping Liang², Guofeng Wang², Yuan Yuan², Jiaxian Shi², Jinxiang Hou², Jingxian Lin², Jingjing Jin, Peijian Cao, Qiaochu Shen², Qing Jiang², Ping Zhou², Yaying Ma², Xiaodan Zhang², Rongrong Xu², Juan Liu², Yongmei Zhou², Haifeng Jia², Qing Ma², Rui Qi², Zhiliang Zhang², Jingping Fang², Hongkun Fang², Jinjin Song², Mengjuan Wang², Guangrui Dong², Gang Wang², Zheng Chen², Teng Ma², Hong Liu², Singha R. Dhungana⁹, Sarah E. Huss², Xiping Yang¹⁰, Anupma Sharma¹¹, Jhon H. Trujillo, Maria C. Martinez, Matthew E. Hudson², John J. Riascos, Mary A. Schuler², Li Qing Chen², David M. Braun⁹, Lei Li², Qingyi Yu¹¹, Jianping Wang¹, Jianping Wang¹⁰, Kai Wang², Michael C. Schatz¹², David Heckerman¹³, Marie-Anne Van Sluys¹⁴, Glaucia Mendes Souza¹⁴, Paul H. Moore, David Sankoff⁶, Robert VanBuren⁵, Andrew H. Paterson⁴, Chifumi Nagai, Ray Ming², Ray Ming¹•Institutions (14)

Fujian Agriculture and Forestry University¹, University of Illinois at Urbana–Champaign², Chinese Academy of Sciences³, University of Georgia⁴, Michigan State University⁵, University of Ottawa⁶, University of Tennessee⁷, CAS-MPG Partner Institute for Computational Biology⁸, University of Missouri⁹, University of Florida¹⁰, Texas A&M University System¹¹, Johns Hopkins University¹², Microsoft¹³, University of São Paulo¹⁴

08 Oct 2018-Nature Genetics

TL;DR: In this article, sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined.

Abstract: Modern sugarcanes are polyploid interspecific hybrids, combining high sugar content from Saccharum officinarum with hardiness, disease resistance and ratooning of Saccharum spontaneum. Sequencing of a haploid S. spontaneum, AP85-441, facilitated the assembly of 32 pseudo-chromosomes comprising 8 homologous groups of 4 members each, bearing 35,525 genes with alleles defined. The reduction of basic chromosome number from 10 to 8 in S. spontaneum was caused by fissions of 2 ancestral chromosomes followed by translocations to 4 chromosomes. Surprisingly, 80% of nucleotide binding site-encoding genes associated with disease resistance are located in 4 rearranged chromosomes, and 51% of those are in rearranged regions. Resequencing of 64 S. spontaneum genomes identified balancing selection in rearranged regions, maintaining their diversity. Introgressed S. spontaneum chromosomes in modern sugarcanes are randomly distributed in the AP85-441 genome, indicating random recombination among homologs in different S. spontaneum accessions. The allele-defined Saccharum genome offers new knowledge and resources to accelerate sugarcane improvement.

Journal Article•DOI•

Genomic features of bacterial adaptation to plants.

Asaf Levy¹, Isai Salas González², Isai Salas González³, Maximilian Mittelviefhaus⁴, Scott Clingenpeel¹, Sur Herrera Paredes², Sur Herrera Paredes³, Sur Herrera Paredes⁵, Jiamin Miao⁶, Jiamin Miao⁷, Kunru Wang⁶, Giulia Devescovi⁸, Kyra Stillman¹, Freddy Monteiro³, Freddy Monteiro², Bryan Rangel Alvarez¹, Derek S. Lundberg², Derek S. Lundberg³, Tse-Yuan Lu⁹, Sarah L. Lebeis¹⁰, Zhao Jin¹¹, Meredith McDonald³, Meredith McDonald², Andrew P. Klein³, Andrew P. Klein², Meghan E. Feltcher², Meghan E. Feltcher¹², Meghan E. Feltcher³, Tijana Glavina del Rio¹, Sarah R. Grant³, Sharon L. Doty¹³, Ruth E. Ley¹⁴, Bingyu Zhao⁶, Vittorio Venturi⁸, Dale A. Pelletier⁹, Julia A. Vorholt⁴, Susannah G. Tringe¹, Susannah G. Tringe¹⁵, Tanja Woyke¹⁵, Tanja Woyke¹, Jeffery L. Dangl•Institutions (15)

Joint Genome Institute¹, Howard Hughes Medical Institute², University of North Carolina at Chapel Hill³, ETH Zurich⁴, Stanford University⁵, Virginia Tech⁶, Gansu Agricultural University⁷, International Centre for Genetic Engineering and Biotechnology⁸, Oak Ridge National Laboratory⁹, University of Tennessee¹⁰, Cornell University¹¹, Research Triangle Park¹², University of Washington¹³, Max Planck Society¹⁴, University of California, Merced¹⁵

01 Jan 2018-Nature Genetics

TL;DR: This work sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize and validated candidates from two sets of plant-associated genes, one involved in plant colonization and the other serving in microbe–microbe competition between plant-associated bacteria.

Abstract: Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe-microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. This work expands the genome-based understanding of plant-microbe interactions and provides potential leads for efficient and sustainable agriculture through microbiome engineering.

Journal Article•DOI•

The axolotl genome and the evolution of key tissue formation regulators

Sergej Nowoshilow¹, Sergej Nowoshilow², Sergej Nowoshilow³, Siegfried Schloissnig⁴, Ji-Feng Fei⁵, Andreas Dahl³, Andy Wing Chun Pang, Martin Pippel⁴, Sylke Winkler², Alex Hastie, George R. Young⁶, Juliana G. Roscito², Francisco Falcon⁷, Dunja Knapp³, Sean Powell⁴, Alfredo Cruz⁷, Han Cao, Bianca Habermann⁸, Michael Hiller², Elly M. Tanaka¹, Elly M. Tanaka², Elly M. Tanaka³, Eugene W. Myers²•Institutions (8)

Research Institute of Molecular Pathology¹, Max Planck Society², Dresden University of Technology³, Heidelberg Institute for Theoretical Studies⁴, South China Normal University⁵, Francis Crick Institute⁶, CINVESTAV⁷, Aix-Marseille University⁸

24 Jan 2018-Nature

TL;DR: The sequencing and assembly of the 32-gigabase-pair axolotl genome is reported using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL).

Abstract: Salamanders serve as important tetrapod models for developmental, regeneration and evolutionary studies. An extensive molecular toolkit makes the Mexican axolotl (Ambystoma mexicanum) a key representative salamander for molecular investigations. Here we report the sequencing and assembly of the 32-gigabase-pair axolotl genome using an approach that combined long-read sequencing, optical mapping and development of a new genome assembler (MARVEL). We observed a size expansion of introns and intergenic regions, largely attributable to multiplication of long terminal repeat retroelements. We provide evidence that intron size in developmental genes is under constraint and that species-restricted genes may contribute to limb regeneration. The axolotl genome assembly does not contain the essential developmental gene Pax3. However, mutation of the axolotl Pax3 paralogue Pax7 resulted in an axolotl phenotype that was similar to those seen in Pax3-/- and Pax7-/- mutant mice. The axolotl genome provides a rich biological resource for developmental and evolutionary studies.

Journal Article•DOI•

Comparative genome and phenotypic analysis of three Clostridioides difficile strains isolated from a single patient provide insight into multiple infection of C. difficile.

Uwe Groß¹, Elzbieta Brzuszkiewicz¹, Katrin Gunka¹, Jessica Starke¹, Thomas Riedel², Boyke Bunk², Cathrin Spröer², Daniela Wetzel¹, Anja Poehlein¹, Cynthia Maria Chibani¹, Wolfgang Bohne¹, Jörg Overmann², Ortrud Zimmermann¹, Rolf Daniel¹, Heiko Liesegang¹•Institutions (2)

University of Göttingen¹, Leibniz Association²

02 Jan 2018-BMC Genomics

TL;DR: Findings show that evolutionary events based on horizontal gene transfer occur within an ongoing CDI and contribute to the adaptation of the species by the introduction of new genes into the genomes.

Abstract: Clostridioides difficile infections (CDI) have emerged over the past decade, causing symptoms that range from mild, antibiotic-associated diarrhea (AAD) to life-threatening toxic megacolon. In this study, we describe a multiple and isochronal (mixed) CDI caused by the isolates DSM 27638, DSM 27639 and DSM 27640, which showed different morphotypes on solid media from the outset. The three isolates, belonging to the ribotypes (RT) 012 (DSM 27639) and 027 (DSM 27638 and DSM 27640), were phenotypically characterized, and high-quality closed genome sequences were generated. The genomes were compared with seven reference strains including three strains of the RT 027, two of the RT 017, and one of the RT 078 as well as a multi-resistant RT 012 strain. The analysis of horizontal gene transfer events revealed gene acquisition incidents that sort the strains within the time line of the spread of their RTs within Germany. We also show that horizontal gene transfer between the members of different RTs occurred within this multiple infection. In addition, acquisition and exchange of virulence-related features including antibiotic resistance genes were observed. Analysis of the two genomes assigned to RT 027 revealed three single nucleotide polymorphisms (SNPs) and apparently a regional genome modification within the flagellar switch that regulates the fli operon. Our findings show that (i) evolutionary events based on horizontal gene transfer occur within an ongoing CDI and contribute to the adaptation of the species by the introduction of new genes into the genomes, and (ii) within a multiple infection of a single patient, the exchange of genetic material was responsible for a much higher genome variation than the observed SNPs.

Journal Article•DOI•

The Enhancement of Plant Disease Resistance Using CRISPR/Cas9 Technology.

Virginia Maria Grazia Borrelli¹, Vittoria Brambilla², Peter M. Rogowsky³, Adriano Marocco¹, Alessandra Lanubile¹•Institutions (3)

Catholic University of the Sacred Heart¹, University of Milan², Institut national de la recherche agronomique³

24 Aug 2018-Frontiers in Plant Science

TL;DR: Recently, CRISPR/Cas9 has largely overtaken the other genome editing technologies due to the fact that it is easier to design and implement, has a higher success rate, and is more versatile and less expensive.

Abstract: Genome editing technologies have progressed rapidly and become one of the most important genetic tools in the implementation of pathogen resistance in plants. Recent years have witnessed the emergence of site-directed modification methods using meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9). Recently, CRISPR/Cas9 has largely overtaken the other genome editing technologies because it is easier to design and implement, has a higher success rate, and is more versatile and less expensive. This review focuses on the recent advances in plant protection using CRISPR/Cas9 technology in model plants and crops in response to viral, fungal and bacterial diseases. As regards the achievement of viral disease resistance, the main strategies employed in model species such as Arabidopsis and Nicotiana benthamiana, which include the integration of CRISPR-encoding sequences that target and interfere with the viral genome and the induction of a CRISPR-mediated targeted mutation in the host plant genome, will be discussed. Furthermore, as regards fungal and bacterial disease resistance, the strategies based on CRISPR/Cas9 targeted modification of susceptibility genes in crop species such as rice, tomato, wheat, and citrus will be reviewed. After spending years deciphering and reading genomes, researchers are now editing and rewriting them to develop crop plants resistant to specific pests and pathogens.

Journal Article•DOI•

Cancer Genome Interpreter annotates the biological and clinical relevance of tumor alterations

David Tamborero¹, Carlota Rubio-Perez¹, Jordi Deu-Pons¹, Michael P Schroeder¹, Michael P Schroeder², Ana Vivancos³, Ana Rovira, Ignasi Tusquets³, Joan Albanell¹, Jordi Rodon³, Josep Tabernero³, Carmen de Torres⁴, Rodrigo Dienstmann³, Abel Gonzalez-Perez¹, Nuria Lopez-Bigas⁵, Nuria Lopez-Bigas¹•Institutions (5)

Pompeu Fabra University¹, Charité², Autonomous University of Barcelona³, Hospital Sant Joan de Déu Barcelona⁴, Catalan Institution for Research and Advanced Studies⁵

28 Mar 2018-Genome Medicine

TL;DR: The Cancer Genome Interpreter is presented, a versatile platform that automates the interpretation of newly sequenced cancer genomes, annotating the potential of alterations detected in tumors to act as drivers and their possible effect on treatment response.

Abstract: While tumor genome sequencing has become widely available in clinical and research settings, the interpretation of tumor somatic variants remains an important bottleneck. Here we present the Cancer Genome Interpreter, a versatile platform that automates the interpretation of newly sequenced cancer genomes, annotating the potential of alterations detected in tumors to act as drivers and their possible effect on treatment response. The results are organized in different levels of evidence according to current knowledge, which we envision can support a broad range of oncology use cases. The resource is publicly available at http://www.cancergenomeinterpreter.org.

Journal Article•DOI•

I-motif DNA structures are formed in the nuclei of human cells

Mahdi Zeraati¹, Mahdi Zeraati², David B. Langley¹, Peter R. Schofield¹, Aaron L. Moye³, Romain Rouet¹, William E. Hughes², William E. Hughes¹, Tracy M. Bryan³, Marcel E. Dinger¹, Marcel E. Dinger², Daniel Christ¹, Daniel Christ²•Institutions (3)

Garvan Institute of Medical Research¹, University of New South Wales², Children's Medical Research Institute³

23 Apr 2018-Nature Chemistry

TL;DR: The generation and characterization of an antibody fragment (iMab) is reported that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells and providing evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions.

Abstract: Human genome function is underpinned by the primary storage of genetic information in canonical B-form DNA, with a second layer of DNA structure providing regulatory control. I-motif structures are thought to form in cytosine-rich regions of the genome and to have regulatory functions; however, in vivo evidence for the existence of such structures has so far remained elusive. Here we report the generation and characterization of an antibody fragment (iMab) that recognizes i-motif structures with high selectivity and affinity, enabling the detection of i-motifs in the nuclei of human cells. We demonstrate that the in vivo formation of such structures is cell-cycle and pH dependent. Furthermore, we provide evidence that i-motif structures are formed in regulatory regions of the human genome, including promoters and telomeric regions. Our results support the notion that i-motif structures provide key regulatory roles in the genome.

Journal Article•DOI•

The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level

Luis M. Rodriguez-R¹, Santosh Gunturu², William T. Harvey¹, Ramon Rosselló-Móra³, James M. Tiedje², James R. Cole², Konstantinos T. Konstantinidis¹•Institutions (3)

Georgia Institute of Technology¹, Michigan State University², University of the Balearic Islands³

02 Jul 2018-Nucleic Acids Research

TL;DR: The Microbial Genomes Atlas (MiGA) is a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity concepts.

Abstract: The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called ‘Clade Project’) to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.

Journal Article•DOI•

A-to-I RNA editing — immune protector and transcriptome diversifier

Eli Eisenberg¹, Erez Y. Levanon²•Institutions (2)

Tel Aviv University¹, Bar-Ilan University²

01 Aug 2018-Nature Reviews Genetics

TL;DR: Next-generation sequencing technologies have enabled the comparison of editomes from multiple individuals and from multiple species and the results have changed the understanding of the extent and distribution of A-to-I editing and its role in evolution and disease.

Abstract: Modifications of RNA affect its function and stability. RNA editing is unique among these modifications because it not only alters the cellular fate of RNA molecules but also alters their sequence relative to the genome. The most common type of RNA editing is A-to-I editing by double-stranded RNA-specific adenosine deaminase (ADAR) enzymes. Recent transcriptomic studies have identified a number of 'recoding' sites at which A-to-I editing results in non-synonymous substitutions in protein-coding sequences. Many of these recoding sites are conserved within (but not usually across) lineages, are under positive selection and have functional and evolutionary importance. However, systematic mapping of the editome across the animal kingdom has revealed that most A-to-I editing sites are located within mobile elements in non-coding parts of the genome. Editing of these non-coding sites is thought to have a critical role in protecting against activation of innate immunity by self-transcripts. Both recoding and non-coding events have implications for genome evolution and, when deregulated, may lead to disease. Finally, ADARs are now being adapted for RNA engineering purposes.
