Author

Anthony Bolger

Bio: Anthony Bolger is an academic researcher from RWTH Aachen University. The author has contributed to research on the topics Genome and Gene. The author has an h-index of 17 and has co-authored 29 publications receiving 28,766 citations. Previous affiliations of Anthony Bolger include the Max Planck Society.

Topics: Genome, Gene, Nanopore sequencing, Parthenocarpy, Transcriptome

Papers

Journal Article

Trimmomatic: a flexible trimmer for Illumina sequence data

Anthony Bolger¹, Marc Lohse¹, Bjoern Usadel¹

Max Planck Society¹

01 Aug 2014-Bioinformatics

TL;DR: Trimmomatic is developed as a more flexible and efficient preprocessing tool that correctly handles paired-end data, and is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.

Abstract: Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. Availability and implementation: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic Contact: usadel@bio1.rwth-aachen.de Supplementary information: Supplementary data are available at Bioinformatics online.

39,291 citations

Journal Article

RobiNA: a user-friendly, integrated software solution for RNA-Seq-based transcriptomics

Marc Lohse¹, Anthony Bolger¹, Axel Nagel¹, Alisdair R. Fernie¹, John E. Lunn¹, Mark Stitt¹, Björn Usadel¹

Max Planck Society¹

01 Jul 2012-Nucleic Acids Research

TL;DR: RobiNA is an integrated solution that consolidates all steps of RNA-Seq-based differential gene-expression analysis in one user-friendly cross-platform application with a rich graphical user interface; it supports quality checking, flexible filtering and statistical analysis of differential gene expression based on state-of-the-art biostatistical methods developed in the R/Bioconductor projects.

Abstract: Recent rapid advances in next-generation RNA sequencing (RNA-Seq) provide researchers with unprecedentedly large data sets and open new perspectives in transcriptomics. Furthermore, RNA-Seq-based transcript profiling can be applied to non-model and newly discovered organisms because it does not require a predefined measuring platform (e.g. microarrays). However, these novel technologies pose new challenges: the raw data need to be rigorously quality checked and filtered prior to analysis, and proper statistical methods have to be applied to extract biologically relevant information. Given the sheer volume of data, this is no trivial task and requires a combination of considerable technical resources along with bioinformatics expertise. To aid the individual researcher, we have developed RobiNA as an integrated solution that consolidates all steps of RNA-Seq-based differential gene-expression analysis in one user-friendly cross-platform application featuring a rich graphical user interface. RobiNA accepts raw FastQ files, SAM/BAM alignment files and counts tables as input. It supports quality checking, flexible filtering and statistical analysis of differential gene expression based on state-of-the-art biostatistical methods developed in the R/Bioconductor projects. In-line help and a step-by-step manual guide users through the analysis. Installer packages for Mac OS X, Windows and Linux are available under the LGPL licence from http://mapman.gabipd.org/web/guest/robin.

782 citations

Journal Article

The genome of the stress-tolerant wild tomato species Solanum pennellii

Anthony Bolger¹, Federico Scossa¹, Marie E. Bolger¹, Christa Lanz¹, Florian Maumus², Takayuki Tohge¹, Hadi Quesneville², Saleh Alseekh¹, Iben Sørensen³, Gabriel Lichtenstein⁴, Eric A. Fich³, Mariana Conte⁴, Heike Keller¹, Korbinian Schneeberger¹, Rainer Schwacke¹, Itai Ofner⁵, Julia Vrebalov⁶, Yimin Xu⁶, Sonia Osorio¹, Saulo Alves Aflitos⁷, Elio Schijlen⁷, José M. Jiménez-Gómez¹, Małgorzata Ryngajłło¹, Seisuke Kimura⁸, Ravi Kumar⁸, Daniel Koenig⁸, Lauren R. Headland⁸, Julin N. Maloof⁸, Neelima Sinha⁸, Roeland C. H. J. van Ham⁷, René Klein Lankhorst⁷, Linyong Mao⁶, Alexander Vogel⁹, Borjana Arsova¹⁰, Ralph Panstruga⁹, Zhangjun Fei¹, Jocelyn K. C. Rose³, Dani Zamir⁵, Fernando Carrari⁴, James J. Giovannoni⁶, Detlef Weigel¹, Björn Usadel¹, Alisdair R. Fernie¹

Max Planck Society¹, Institut national de la recherche agronomique², Cornell University³, National Scientific and Technical Research Council⁴, Hebrew University of Jerusalem⁵, Boyce Thompson Institute for Plant Research⁶, Wageningen University and Research Centre⁷, University of California, Davis⁸, RWTH Aachen University⁹, University of Düsseldorf¹⁰

01 Sep 2014-Nature Genetics

TL;DR: A high-quality genome assembly of the parents of the IL population of S. pennellii is described, defining candidate genes for stress tolerance and providing evidence that transposable elements had a role in the evolution of these traits.

Abstract: Solanum pennellii is a wild tomato species endemic to Andean regions in South America, where it has evolved to thrive in arid habitats. Because of its extreme stress tolerance and unusual morphology, it is an important donor of germplasm for the cultivated tomato Solanum lycopersicum. Introgression lines (ILs) in which large genomic regions of S. lycopersicum are replaced with the corresponding segments from S. pennellii can show remarkably superior agronomic performance. Here we describe a high-quality genome assembly of the parents of the IL population. By anchoring the S. pennellii genome to the genetic map, we define candidate genes for stress tolerance and provide evidence that transposable elements had a role in the evolution of these traits. Our work paves a path toward further tomato improvement and for deciphering the mechanisms underlying the myriad other agronomic traits that can be improved with S. pennellii germplasm.

378 citations

Journal Article

Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato

Daniel Koenig¹, José M. Jiménez-Gómez², Seisuke Kimura³, Daniel Fulop, Daniel H. Chitwood, Lauren R. Headland, Ravi Kumar, Michael F. Covington, Upendra K. Devisetty, An V. Tat, Takayuki Tohge², Anthony Bolger², Korbinian Schneeberger², Stephan Ossowski², Christa Lanz², Guangyan Xiong⁴, Mallorie Taylor-Teeples¹, Siobhan M. Brady¹, Markus Pauly⁴, Detlef Weigel², Björn Usadel², Alisdair R. Fernie², Jie Peng¹, Neelima Sinha, Julin N. Maloof

University of California, Davis¹, Max Planck Society², Kyoto Sangyo University³, University of California, Berkeley⁴

09 Jul 2013-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: High-throughput sequencing is used to identify changes in DNA sequence and gene expression that differentiate cultivated tomato from its wild relatives, revealing hundreds of candidate genes that have evolved new protein sequences or changed expression levels in response to natural selection in wild tomato relatives.

Abstract: Although applied over extremely short timescales, artificial selection has dramatically altered the form, physiology, and life history of cultivated plants. We have used RNAseq to define both gene sequence and expression divergence between cultivated tomato and five related wild species. Based on sequence differences, we detect footprints of positive selection in over 50 genes. We also document thousands of shifts in gene-expression level, many of which resulted from changes in selection pressure. These rapidly evolving genes are commonly associated with environmental response and stress tolerance. The importance of environmental inputs during evolution of gene expression is further highlighted by large-scale alteration of the light response coexpression network between wild and cultivated accessions. Human manipulation of the genome has heavily impacted the tomato transcriptome through directed admixture and by indirectly favoring nonsynonymous over synonymous substitutions. Taken together, our results shed light on the pervasive effects artificial and natural selection have had on the transcriptomes of tomato and its wild relatives.

331 citations

Journal Article

MapMan4: A Refined Protein Classification and Annotation Framework Applicable to Multi-Omics Data Analysis.

Rainer Schwacke¹, Gabriel Y. Ponce-Soto¹, Kirsten Krause, Anthony Bolger², Borjana Arsova¹, Asis Hallab¹, Kristina Gruden, Mark Stitt³, Marie E. Bolger¹, Björn Usadel²

Forschungszentrum Jülich¹, RWTH Aachen University², Max Planck Society³

03 Jun 2019-Molecular Plant

TL;DR: A redesigned and significantly enhanced MapMan4 framework is presented, together with a revised version of the associated online Mercator annotation tool, providing protein annotations for all embryophytes with a comparably high quality.

276 citations

Cited by

Journal Article

HTSeq—a Python framework to work with high-throughput sequencing data

Simon Anders, Paul Theodor Pyl, Wolfgang Huber

15 Jan 2015-Bioinformatics

TL;DR: This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.

Abstract: Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de

15,744 citations
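
Illustration: the idea of "counting the overlap of reads with genes" from the abstract above can be sketched in a few lines of plain Python. This is a toy sketch and does not use the HTSeq library or reproduce htseq-count. The function name `count_reads_per_gene`, the half-open interval convention, and the `no_feature` / `ambiguous` bookkeeping labels are assumptions made for this example only.

```python
def count_reads_per_gene(genes, reads):
    """Count reads that overlap exactly one gene.

    genes: dict mapping gene name -> (chromosome, start, end), half-open [start, end)
    reads: iterable of (chromosome, start, end) tuples, half-open [start, end)
    Returns a dict of counts per gene, plus two hypothetical bookkeeping keys.
    """
    counts = {name: 0 for name in genes}
    counts["no_feature"] = 0   # read overlaps no gene (label is an assumption)
    counts["ambiguous"] = 0    # read overlaps more than one gene (label is an assumption)

    for chrom, start, end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        ]
        if not hits:
            counts["no_feature"] += 1
        elif len(hits) == 1:
            counts[hits[0]] += 1
        else:
            counts["ambiguous"] += 1
    return counts


if __name__ == "__main__":
    example_genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 150, 300)}
    example_reads = [("chr1", 90, 120), ("chr1", 160, 180), ("chr1", 250, 260)]
    print(count_reads_per_gene(example_genes, example_reads))
```

The linear scan over all genes for every read keeps the sketch short; a real tool would use an indexed structure that can be queried by genomic coordinates, as the abstract describes.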

SPAdes, a new genome assembly algorithm and its applications to single-cell sequencing (7th Annual SFAF Meeting, 2012)

Glenn Tesler

01 Jun 2012

TL;DR: SPAdes is a new assembler for both single-cell and standard (multicell) assembly that improves on the recently released E+V-SC assembler and on the popular assemblers Velvet and SoapDeNovo (for multicell data).

Abstract: The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online ( http://bioinf.spbau.ru/spades ). It is distributed as open source software.

10,124 citations

Journal Article

A new coronavirus associated with human respiratory disease in China.

Fan Wu¹, Su Zhao², Bin Yu³, Yan-Mei Chen¹, Wen Wang³, Zhi gang Song¹, Yi Hu², Zhao Wu Tao², Jun Hua Tian³, Yuan Yuan Pei¹, Ming Li Yuan², Yu Ling Zhang¹, Fa Hui Dai¹, Yi Liu¹, Qi Min Wang¹, Jiao Jiao Zheng¹, Lin Xu¹, Edward C. Holmes⁴, Edward C. Holmes¹, Yong-Zhen Zhang³, Yong-Zhen Zhang¹

Fudan University¹, Huazhong University of Science and Technology², Centers for Disease Control and Prevention³, University of Sydney⁴

03 Feb 2020-Nature

TL;DR: Phylogenetic and metagenomic analyses of the complete viral genome of a new coronavirus from the family Coronaviridae reveal that the virus is closely related to a group of SARS-like coronaviruses found in bats in China.

Abstract: Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health¹⁻³. Despite intense research efforts, how, when and where new diseases appear are still a source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to the Central Hospital of Wuhan on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing⁴ of a sample of bronchoalveolar lavage fluid from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (and has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China⁵. This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans.

9,231 citations

Journal Article

fastp: an ultra-fast all-in-one FASTQ preprocessor.

Shifu Chen¹, Yanqing Zhou, Yaru Chen, Jia Gu¹

Chinese Academy of Sciences¹

01 Sep 2018-Bioinformatics

TL;DR: fastp is developed as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features that can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data.

Abstract: Motivation: Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most are developed using high-level programming languages (e.g. Python and Java) and provide limited multi-threading support. Reading and loading data multiple times also renders preprocessing slow and I/O inefficient. Results: We developed fastp as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features. It can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data. This tool is developed in C++ and has multi-threading support. Based on our evaluation, fastp is 2-5 times faster than other FASTQ preprocessing tools such as Trimmomatic or Cutadapt despite performing far more operations than similar tools. Availability and implementation: The open-source code and corresponding instructions are available at https://github.com/OpenGene/fastp.

7,461 citations

Journal Article

De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis

Brian J. Haas¹, Alexie Papanicolaou², Moran Yassour³, Moran Yassour⁴, Manfred Grabherr⁵, Philip D. Blood⁶, Joshua C. Bowden², M. B. Couger⁷, David Eccles⁸, Bo Li⁹, Matthias Lieber¹⁰, Matthew D. MacManes¹¹, Michael Ott², Joshua Orvis, Nathalie Pochet¹², Nathalie Pochet³, Francesco Strozzi¹³, Nathan T. Weeks¹⁴, Rick Westerman¹⁵, Thomas William, Colin N. Dewey⁹, Robert Henschel¹⁶, Richard D. LeDuc¹⁶, Nir Friedman⁴, Aviv Regev³

Broad Institute¹, Commonwealth Scientific and Industrial Research Organisation², Massachusetts Institute of Technology³, Hebrew University of Jerusalem⁴, Science for Life Laboratory⁵, Pittsburgh Supercomputing Center⁶, Oklahoma State University–Stillwater⁷, Griffith University⁸, University of Wisconsin-Madison⁹, Dresden University of Technology¹⁰, California Institute for Quantitative Biosciences¹¹, Flanders Institute for Biotechnology¹², Parco Tecnologico Padano¹³, United States Department of Agriculture¹⁴, Purdue University¹⁵, Indiana University¹⁶

01 Aug 2013-Nature Protocols

TL;DR: This protocol provides a workflow for genome-independent transcriptome analysis leveraging the Trinity platform and presents Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes.

Abstract: De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.

6,369 citations
