Author

Marie E. Bolger

Bio: Marie E. Bolger is an academic researcher from Forschungszentrum Jülich. The author has contributed to research in topics: Genome & Sequence assembly. The author has an h-index of 11 and has co-authored 16 publications receiving 1169 citations. Previous affiliations of Marie E. Bolger include RWTH Aachen University & Max Planck Society.

Papers
Journal Article
TL;DR: A high-quality genome assembly of the parents of the IL population of S. pennellii is described, defining candidate genes for stress tolerance and providing evidence that transposable elements had a role in the evolution of these traits.
Abstract: Solanum pennellii is a wild tomato species endemic to Andean regions in South America, where it has evolved to thrive in arid habitats. Because of its extreme stress tolerance and unusual morphology, it is an important donor of germplasm for the cultivated tomato Solanum lycopersicum. Introgression lines (ILs) in which large genomic regions of S. lycopersicum are replaced with the corresponding segments from S. pennellii can show remarkably superior agronomic performance. Here we describe a high-quality genome assembly of the parents of the IL population. By anchoring the S. pennellii genome to the genetic map, we define candidate genes for stress tolerance and provide evidence that transposable elements had a role in the evolution of these traits. Our work paves a path toward further tomato improvement and for deciphering the mechanisms underlying the myriad other agronomic traits that can be improved with S. pennellii germplasm.

378 citations

Journal Article
TL;DR: A redesigned and significantly enhanced MapMan4 framework is presented, together with a revised version of the associated online Mercator annotation tool, providing protein annotations for all embryophytes with a comparably high quality.

276 citations

Journal Article
TL;DR: Though many of the published genomes are considered incomplete, they nevertheless have proved a valuable tool to understand important crop traits such as fruit ripening, grain traits and flowering time adaptation.

271 citations

Journal Article
TL;DR: The generation of a comprehensive nanopore sequencing data set with a median read length of 11,979 bp for a self-compatible accession of the wild tomato species Solanum pennellii indicates that such long read sequencing data can be used to affordably sequence and assemble gigabase-sized plant genomes.
Abstract: Updates in nanopore technology have made it possible to obtain gigabases of sequence data. Prior to this, nanopore sequencing technology was mainly used to analyze microbial samples. Here, we describe the generation of a comprehensive nanopore sequencing data set with a median read length of 11,979 bp for a self-compatible accession of the wild tomato species Solanum pennellii. We describe the assembly of its genome to a contig N50 of 2.5 MB. The assembly pipeline comprised initial read correction with Canu and assembly with SMARTdenovo. The resulting raw nanopore-based de novo genome is structurally highly similar to that of the reference S. pennellii LA716 accession but has a high error rate and was rich in homopolymer deletions. After polishing the assembly with Illumina reads, we obtained an error rate of <0.02% when assessed versus the same Illumina data. We obtained a gene completeness of 96.53%, slightly surpassing that of the reference S. pennellii. Taken together, our data indicate that such long read sequencing data can be used to affordably sequence and assemble gigabase-sized plant genomes.

179 citations

Journal Article
TL;DR: The authors describe the genome sequence of a parasitic plant, Cuscuta campestris, and find that gene losses and host gene acquisitions reflect the independence from photosynthesis and the ability to retain and express chunks of foreign genomic DNA.
Abstract: A parasitic lifestyle, where plants procure some or all of their nutrients from other living plants, has evolved independently in many dicotyledonous plant families and is a major threat for agriculture globally. Nevertheless, no genome sequence of a parasitic plant has been reported to date. Here we describe the genome sequence of the parasitic field dodder, Cuscuta campestris. The genome contains signatures of a fairly recent whole-genome duplication and lacks genes for pathways superfluous to a parasitic lifestyle. Specifically, genes needed for high photosynthetic activity are lost, explaining the low photosynthesis rates displayed by the parasite. Moreover, several genes involved in nutrient uptake processes from the soil are lost. On the other hand, evidence for horizontal gene transfer by way of genomic DNA integration from the parasite’s hosts is found. We conclude that the parasitic lifestyle has left characteristic footprints in the C. campestris genome.

122 citations


Cited by
01 Jun 2012
TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).
Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article
TL;DR: The current landscape of long-read analysis tools is reviewed, with a focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis, and the challenges that remain are highlighted.
Abstract: Long-read technologies are overcoming early limitations in accuracy and throughput, broadening their application domains in genomics. Dedicated analysis tools that take into account the characteristics of long-read data are thus required, but the fast pace of development of such tools can be overwhelming. To assist in the design and analysis of long-read sequencing projects, we review the current landscape of available tools and present an online interactive database, long-read-tools.org, to facilitate their browsing. We further focus on the principles of error correction, base modification detection, and long-read transcriptomics analysis and highlight the challenges that remain.

1,172 citations

Journal Article
TL;DR: An effective new paradigm is the targeted identification of specific genetic determinants of stress adaptation that have evolved in nature and their precise introgression into elite varieties.
Abstract: Crop yield reduction as a consequence of increasingly severe climatic events threatens global food security. Genetic loci that ensure productivity in challenging environments exist within the germplasm of crops, their wild relatives and species that are adapted to extreme environments. Selective breeding for the combination of beneficial loci in germplasm has improved yields in diverse environments throughout the history of agriculture. An effective new paradigm is the targeted identification of specific genetic determinants of stress adaptation that have evolved in nature and their precise introgression into elite varieties. These loci are often associated with distinct regulation or function, duplication and/or neofunctionalization of genes that maintain plant homeostasis.

727 citations

Journal Article
TL;DR: A comprehensive analysis of tomato evolution based on the genome sequences of 360 accessions provides evidence that domestication and improvement focused on two independent sets of quantitative trait loci (QTLs), resulting in modern tomato fruit ∼100 times larger than its ancestor.
Abstract: The histories of crop domestication and breeding are recorded in genomes. Although tomato is a model species for plant biology and breeding, the nature of human selection that altered its genome remains largely unknown. Here we report a comprehensive analysis of tomato evolution based on the genome sequences of 360 accessions. We provide evidence that domestication and improvement focused on two independent sets of quantitative trait loci (QTLs), resulting in modern tomato fruit ∼100 times larger than its ancestor. Furthermore, we discovered a major genomic signature for modern processing tomatoes, identified the causative variants that confer pink fruit color and precisely visualized the linkage drag associated with wild introgressions. This study outlines the accomplishments as well as the costs of historical selection and provides molecular insights toward further improvement.

678 citations

Journal Article
TL;DR: Genotyping-by-sequencing (GBS) has been developed and applied in sequencing multiplexed samples that combine molecular marker discovery and genotyping, and has been successfully used for genome-wide association studies (GWAS), genomic diversity studies, genetic linkage analysis, molecular marker discovery and genomic selection in large-scale plant breeding programs.
Abstract: Marker-assisted selection (MAS) refers to the use of molecular markers to assist phenotypic selections in crop improvement. Several types of molecular markers, such as single nucleotide polymorphism (SNP), have been identified and effectively used in plant breeding. The application of next-generation sequencing (NGS) technologies has led to remarkable advances in whole genome sequencing, which provides ultra-throughput sequences to revolutionize plant genotyping and breeding. To further broaden NGS usages to large crop genomes such as maize and wheat, genotyping-by-sequencing (GBS) has been developed and applied in sequencing multiplexed samples that combine molecular marker discovery and genotyping. GBS is a novel application of NGS protocols for discovering and genotyping SNPs in crop genomes and populations. The GBS approach includes the digestion of genomic DNA with restriction enzymes followed by the ligation of barcode adapter, PCR amplification and sequencing of the amplified DNA pool on a single lane of flow cells. Bioinformatic pipelines are needed to analyze and interpret GBS datasets. As an ultimate MAS tool and a cost-effective technique, GBS has been successfully used in implementing genome-wide association study (GWAS), genomic diversity study, genetic linkage analysis, molecular marker discovery and genomic selection under a large scale of plant breeding programs.

500 citations